Published on Fri Jan 29 2021

Statistical Inference After Kernel Ridge Regression Imputation Under Item Nonresponse

Hengfang Wang, Jae-Kwang Kim

Abstract

Imputation is a popular technique for handling missing data. We consider a nonparametric approach to imputation using the kernel ridge regression technique and propose consistent variance estimation. The proposed variance estimator is based on a linearization approach which employs the entropy method to estimate the density ratio. The root-n consistency of the imputation estimator is established when a Sobolev space is utilized in the kernel ridge regression imputation, which enables us to develop the proposed variance estimator. Synthetic data experiments are presented to confirm our theory.
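As a rough illustration of the imputation step described above (not the paper's proposed variance estimator), kernel ridge regression can be fit on the respondents and then used to predict the missing responses. The Gaussian kernel, the penalty `lam`, the bandwidth `gamma`, and the simulated data below are all illustrative assumptions:

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Gaussian kernel matrix K[i, j] = exp(-gamma * ||a_i - b_j||^2)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def krr_impute_mean(x, y, observed, lam=0.1, gamma=1.0):
    # Fit KRR on respondents: alpha = (K + n*lam*I)^{-1} y_obs,
    # then impute nonrespondents with f(x) = k(x, X_obs) @ alpha.
    x_obs, y_obs = x[observed], y[observed]
    K = rbf_kernel(x_obs, x_obs, gamma)
    alpha = np.linalg.solve(K + len(x_obs) * lam * np.eye(len(x_obs)), y_obs)
    y_imp = y.copy()
    y_imp[~observed] = rbf_kernel(x[~observed], x_obs, gamma) @ alpha
    return y_imp.mean()    # imputation estimator of the population mean

rng = np.random.default_rng(0)
n = 300
x = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.sin(np.pi * x[:, 0]) + rng.normal(scale=0.1, size=n)
observed = rng.uniform(size=n) < 0.7   # MCAR response indicator
theta_hat = krr_impute_mean(x, y, observed)
```

Since sin(pi x) is odd and x is uniform on [-1, 1], the estimator should land near zero here; the paper's contribution is the root-n consistency of this kind of estimator and a linearization-based variance estimator for it.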

Thu Jun 07 2018
Machine Learning
Kernel Machines With Missing Responses
Missing responses are a form of missing data in which outcomes are not always observed. In this work we develop kernel machines that can handle missing responses. We establish oracle inequalities to show consistency and to derive convergence rates.
Wed Apr 07 2021
Machine Learning
Prediction with Missing Data
Fri Jan 24 2020
Machine Learning
Imputation for High-Dimensional Linear Regression
A common strategy in practice is to impute the missing entries with an appropriate substitute and then implement a standard statistical procedure. We show that this imputation scheme coupled with standard off-the-shelf procedures such as the LASSO retains the minimax estimation rate.
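A minimal sketch of this impute-then-estimate strategy, assuming columnwise mean imputation followed by scikit-learn's off-the-shelf LASSO (the sparsity pattern, missingness rate, and penalty are illustrative choices, not the paper's setup):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p = 100, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                     # sparse ground truth
y = X @ beta + rng.normal(scale=0.1, size=n)

# Knock out ~10% of the covariate entries, then mean-impute each column
mask = rng.uniform(size=X.shape) < 0.1
X_miss = X.copy()
X_miss[mask] = np.nan
col_means = np.nanmean(X_miss, axis=0)
X_imp = np.where(np.isnan(X_miss), col_means, X_miss)

# Standard off-the-shelf LASSO on the imputed design matrix
fit = Lasso(alpha=0.05).fit(X_imp, y)
```

Under this scheme the large true coefficients should remain clearly nonzero and the null coefficients near zero, which is the behavior the paper quantifies via minimax rates.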
Tue Nov 17 2015
Machine Learning
Optimized Linear Imputation
In real-world datasets, especially high-dimensional ones, some feature values are missing. Since most data analysis and statistical methods do not gracefully handle missing values, the first step in the analysis requires imputing them.
Tue Feb 19 2019
Machine Learning
On the consistency of supervised learning with missing values
In many application settings, the data have missing entries, which makes analyzing them challenging. Here, we consider supervised-learning settings: predicting a target when missing values appear in both training and testing data. We show that a predictor suited for complete observations can predict optimally on incomplete data.
Wed Feb 06 2019
Machine Learning
Weak consistency of the 1-nearest neighbor measure with applications to missing data
1-nearest neighbor (1NN) importance weighting estimates moments by replacing each missing observation with its nearest neighbor among the complete data. We define an empirical measure, the 1NN measure, and show that it is weakly consistent for the measure of the missing data.
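A toy sketch of the 1NN replacement rule with a one-dimensional covariate (the data and the tie-breaking by first index are illustrative assumptions, not from the paper):

```python
import numpy as np

def one_nn_impute_mean(x, y, observed):
    # 1NN imputation: each nonrespondent's outcome is replaced by the
    # outcome of its nearest respondent in covariate space.
    x_obs, y_obs = x[observed], y[observed]
    y_imp = y.astype(float).copy()
    for i in np.where(~observed)[0]:
        j = np.argmin(np.abs(x_obs - x[i]))   # ties broken by first index
        y_imp[i] = y_obs[j]
    return y_imp.mean()

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, np.nan, 20.0, 30.0])      # y[1] is missing
observed = np.array([True, False, True, True])
print(one_nn_impute_mean(x, y, observed))    # y[1] gets the y of x=0, -> 12.5
```

The paper's result is that the empirical measure built from such replacements (the 1NN measure) is weakly consistent for the distribution of the missing data, so moment estimates of this form are justified.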