Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01pz50gw21f
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Fan, Jianqing | en_US
dc.contributor.author | Barut, Ahmet Emre | en_US
dc.contributor.other | Operations Research and Financial Engineering Department | en_US
dc.date.accessioned | 2013-09-16T17:26:27Z | -
dc.date.available | 2013-09-16T17:26:27Z | -
dc.date.issued | 2013 | en_US
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01pz50gw21f | -
dc.description.abstract | The aim of this thesis is to develop methods for variable selection and statistical prediction in high-dimensional statistical problems. Along with proposing new procedures, the thesis studies their theoretical properties and establishes bounds on the statistical error of the resulting estimators. The main body of the thesis is divided into three parts. Chapter 1 discusses a variable screening method for generalized linear models, with emphasis on reducing the number of variables quickly and reliably. Chapter 2 considers the linear regression problem in high dimensions when the noise has heavy tails; to perform robust variable selection, a new method, called adaptive robust Lasso, is introduced. Chapter 3 addresses high-dimensional classification problems, proposing a robust approach and establishing its theoretical properties. Overall, the methods proposed in this thesis collectively attempt to solve many of the issues arising in high-dimensional statistics, from screening to variable selection. In Chapter 1, we study the variable screening problem for generalized linear models. In many applications, researchers have prior knowledge that a certain set of variables is related to the response. In such a situation, a natural way to assess the relative importance of the other predictors is through their individual conditional contributions in the presence of the known set of variables. This leads to conditional sure independence screening (CSIS). We propose and study CSIS in the context of generalized linear models. For ultrahigh-dimensional statistical problems, we give conditions under which sure screening is possible and derive an upper bound on the number of selected variables. We also spell out the conditions under which CSIS yields model selection consistency. In Chapter 2, we consider the heavy-tailed high-dimensional linear regression problem. In the ultrahigh-dimensional setting, where the dimensionality can grow exponentially with the sample size, we investigate the model selection oracle property and establish the asymptotic normality of a quantile-regression-based method called WR-Lasso. We show that only mild conditions on the model error distribution are needed. Our theoretical results also reveal that an adaptive choice of the weight vector is essential for the WR-Lasso to enjoy these asymptotic properties. To make the WR-Lasso practically feasible, we propose a two-step procedure, called adaptive robust Lasso (AR-Lasso), in which the weight vector in the second step is constructed from the L_1-penalized quantile regression estimate obtained in the first step. In Chapter 3, we analyze the issue of measurement errors in high-dimensional linear classification problems. For such settings, we propose a new estimator, called the robust sparse linear discriminant, which recovers the sparsity signal and adapts to the unknown noise level simultaneously. In contrast to existing methods, we show that this new method has low risk even in the presence of measurement errors. Moreover, we propose a new algorithm that recovers the solution paths for a continuum of regularization parameter values. | en_US
dc.language.iso | en | en_US
dc.publisher | Princeton, NJ : Princeton University | en_US
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog (http://catalog.princeton.edu) | en_US
dc.subject | Classification | en_US
dc.subject | Fisher Discriminant | en_US
dc.subject | Generalized Linear Models | en_US
dc.subject | High Dimensional Models | en_US
dc.subject | Penalized Estimators | en_US
dc.subject | Statistics | en_US
dc.subject.classification | Statistics | en_US
dc.subject.classification | Mathematics | en_US
dc.subject.classification | Biostatistics | en_US
dc.title | Variable Selection and Prediction in High Dimensional Models | en_US
dc.type | Academic dissertations (Ph.D.) | en_US
pu.projectgrantnumber | 690-2143 | en_US
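
Illustrative note on the abstract (not part of the deposited record): the conditional screening idea summarized for Chapter 1 can be sketched for the simplest case, a linear model fitted by least squares. This is a minimal sketch under stated assumptions, not the thesis's implementation. The function names (csis_linear, least_squares, solve, standardize), the use of the absolute fitted coefficient as the ranking utility, the top-`keep` cutoff, and the simulated data are all hypothetical choices made for illustration. Each candidate predictor is fitted together with the known conditioning set, and candidates are ranked by the size of their own coefficient.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def least_squares(columns, y):
    """Ordinary least squares coefficients via the normal equations."""
    k = len(columns)
    gram = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(u * v for u, v in zip(columns[i], y)) for i in range(k)]
    return solve(gram, rhs)


def standardize(col):
    """Center a column and scale it to unit standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]


def csis_linear(x_cols, y, conditioning, keep):
    """Rank predictors outside `conditioning` by |coefficient| when each is
    fitted jointly with the conditioning set; return the top `keep` indices."""
    cols = [standardize(c) for c in x_cols]
    base = [[1.0] * len(y)] + [cols[j] for j in conditioning]
    utility = {}
    for j in range(len(cols)):
        if j in conditioning:
            continue
        beta = least_squares(base + [cols[j]], y)
        utility[j] = abs(beta[-1])  # coefficient of the candidate predictor
    return sorted(utility, key=utility.get, reverse=True)[:keep]


# Simulated data (an assumption for illustration): predictors 0 and 3 drive y,
# and predictor 0 is treated as known in advance.
random.seed(0)
n, p = 200, 20
x_cols = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [2.0 * x_cols[0][i] + 1.5 * x_cols[3][i] + random.gauss(0, 1) for i in range(n)]
selected = csis_linear(x_cols, y, conditioning={0}, keep=2)
print(selected)
```

In this sketch the conditioning variable is never a candidate, so the returned indices are always drawn from the remaining predictors. For a generalized linear model other than the Gaussian case, the least squares fit would be replaced by the corresponding likelihood fit.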
Appears in Collections: Operations Research and Financial Engineering

Files in This Item:
File | Description | Size | Format
Barut_princeton_0181D_10632.pdf | | 558.08 kB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.