Working Paper 99-10

Department of Economics

University of North Carolina, Chapel Hill


Overfitting and Biases in Nonparametric Kernel Regressions Using Cross-Validated Bandwidths: A Cautionary Note

Thomas A. Mroz

Timothy H. Savage

September 1999


Using Monte Carlo experiments, we examine the performance of nonparametric kernel regression methods. With cross-validated bandwidths and normal kernels, the kernel estimator fails a simple test that classically unbiased estimators should pass. In finite, realistically sized samples, the implicit functional form of the nonparametric kernel regression is actually quite restrictive: it captures neither a classical OLS model nor a simple discrete outcome model with multiple regressors well. We also find that nonparametric regression techniques tend to overfit data at sample sizes common in microeconomic analyses, and we present two measures that help researchers gauge the extent of the overfitting. Researchers should be exceptionally wary of interpreting estimates from nonparametric regressions as conditional expected values in multivariate settings.
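For readers unfamiliar with the estimator class discussed above, the following is a minimal illustrative sketch, not the authors' code or experimental design. It assumes a Nadaraya-Watson estimator with a product normal kernel, a single scalar bandwidth shared by all regressors, and a leave-one-out cross-validation criterion minimized over a grid. The function names, the simulated linear model, the sample size, and the bandwidth grid are all hypothetical choices made for illustration.

```python
import numpy as np


def nw_fit(x_train, y_train, x_eval, h):
    """Nadaraya-Watson estimate at x_eval using a normal (Gaussian) kernel."""
    # Scaled differences, shape (n_eval, n_train, d)
    diff = (x_eval[:, None, :] - x_train[None, :, :]) / h
    # Log of the product normal kernel, up to a constant that cancels
    logw = -0.5 * np.sum(diff ** 2, axis=2)
    logw -= logw.max(axis=1, keepdims=True)  # guard against underflow
    w = np.exp(logw)
    return (w @ y_train) / w.sum(axis=1)


def loo_cv_score(x, y, h):
    """Leave-one-out mean squared prediction error for bandwidth h."""
    diff = (x[:, None, :] - x[None, :, :]) / h
    logw = -0.5 * np.sum(diff ** 2, axis=2)
    np.fill_diagonal(logw, -np.inf)  # exclude each observation from its own fit
    logw -= logw.max(axis=1, keepdims=True)
    w = np.exp(logw)
    pred = (w @ y) / w.sum(axis=1)
    return np.mean((y - pred) ** 2)


def select_bandwidth(x, y, grid):
    """Return the grid value that minimizes the leave-one-out score."""
    scores = np.array([loo_cv_score(x, y, h) for h in grid])
    return grid[int(np.argmin(scores))], scores


# Hypothetical data: a classical linear model with two regressors.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=(n, 2))
y = 1.0 + x[:, 0] + x[:, 1] + rng.normal(size=n)

grid = np.linspace(0.1, 2.0, 20)  # assumed bandwidth grid
h_star, scores = select_bandwidth(x, y, grid)

# In-sample fit at the selected bandwidth versus the leave-one-out error.
in_sample_mse = np.mean((y - nw_fit(x, y, x, h_star)) ** 2)
loo_mse = loo_cv_score(x, y, h_star)
print(f"h* = {h_star:.2f}, in-sample MSE = {in_sample_mse:.3f}, LOO MSE = {loo_mse:.3f}")
```

For this estimator, each in-sample residual equals the corresponding leave-one-out residual scaled by one minus the observation's own kernel weight, so the in-sample mean squared error can never exceed the leave-one-out error at the same bandwidth. The gap between the two gives a rough sense of how much the fit leans on each observation's own outcome; it is not one of the two overfitting measures the paper presents.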



Thomas Mroz is a professor of economics at the University of North Carolina at Chapel Hill and a fellow of the Carolina Population Center. Timothy Savage is a senior associate at Charles River Associates, Inc. The authors thank Hyungtaik Ahn for his programs, data, and advice. Ron Gallant, two referees, and the associate editor also made important contributions to this paper.