1. G. Lugosi and A.B. Nobel,
"Strong consistency of data-driven
histogram methods for density estimation and classification",
Annals of Statistics, vol.24, pp.687-706, 1996.
Summary: General sufficient conditions are given for the almost sure L1-consistency of multivariate histogram density estimates based on data-dependent partitions. Analogous conditions are shown to ensure the Bayes risk consistency of histogram classification schemes using data-dependent partitions. The results of the paper are based in part on a new Vapnik-Chervonenkis type inequality for partition families. The results are applied to several known partitioning estimates, including classifiers based on statistically equivalent blocks, classifiers based on multivariate clustering, and the k-spacing density estimate.
2. A.B. Nobel,
"Histogram regression estimation using data-dependent
partitions", Annals of Statistics, vol. 24,
pp.1084-1105, 1996.
Summary: General sufficient conditions are given for the strong L2-consistency of multivariate histogram regression estimates based on data-dependent partitions. The same conditions are shown to ensure the consistency of partitioning regression estimates using local polynomial fitting, and, with an additional regularity assumption, the consistency of histogram estimates for conditional medians. The results are applied to k-thresholding regression estimates with variable weights, regression estimates based on nearest neighbor clustering, and empirically optimal regression trees.
3. G. Lugosi and A.B. Nobel,
"Adaptive model selection using
empirical complexities",
Annals of Statistics, vol. 27,
pp.1830-1864, 1999.
Summary: This paper considers the problem of adaptive model selection from independent observations. In the proposed method the available observations are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select from each cover a candidate estimate having minimal empirical risk. From the list of candidates a distinguished estimate is chosen that minimizes the sum of class complexity and empirical risk. A distinguishing feature of the method is that the complexity of each model class is assessed empirically, based on the size of its empirical cover. Finite sample performance bounds are established for the estimate, which is shown to achieve a favorable tradeoff between approximation and estimation error, and to perform essentially as well as if the distribution-dependent complexities of the model classes were known beforehand. The performance bounds are used to establish near optimal rates of convergence for the estimates in several non-parametric problems.
4. A.B. Nobel,
"Analysis of a complexity based pruning method
for classification trees". To appear in the
IEEE Transactions on Information Theory.
Summary: A complexity based pruning procedure for classification trees is described, and bounds on its finite sample performance are established. The procedure selects a subtree of a (possibly random) initial tree in order to minimize a complexity penalized measure of empirical risk. The complexity assigned to a subtree is proportional to the square root of its size. Two cases are considered. In the first the growing and pruning data sets are identical, and in the second they are independent. In each case the expected performance of the pruning scheme is comparable to a penalized search among a sequence of idealized pruning schemes, where the k'th such scheme selects a subtree of size k having minimal probability of error. Using the performance bound, we establish the Bayes risk consistency of subtrees obtained from the procedure when the sequence of initial trees satisfies suitable geometric and structural constraints. The pruning method and its analysis are motivated by work on adaptive model selection using complexity regularization.