Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01wd375z72c
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Engelhardt, Barbara | -
dc.contributor.author | Kwok, Jonathan | -
dc.date.accessioned | 2016-06-22T15:23:52Z | -
dc.date.available | 2016-06-22T15:23:52Z | -
dc.date.created | 2016-04-29 | -
dc.date.issued | 2016-06-22 | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01wd375z72c | -
dc.description.abstract | Genome-based data is becoming more accessible as the time and cost to sequence genomes decrease. A majority of studies using genome-based data focus on association tests to find relationships between mutations and traits, but fewer studies look at using the data to produce disease prediction models. We look towards linear logistic regression, specifically a technique called elastic net, to build stable, sparse, and interpretable prediction models and compare the performance of the model to common forms of linear logistic regression, support vector machine, and principal component analysis. We find that elastic net produces sparse models but does not perform as well in practice as LASSO, another linear regression technique which also produces sparse models. We conclude from the experiment that LASSO is a better model to use, but suggest that we can use elastic net to verify the findings of the LASSO model. | en_US
dc.format.extent | 35 pages | *
dc.language.iso | en_US | en_US
dc.title | The Performance of Elastic Net in Genome-based Disease Classification | en_US
dc.type | Princeton University Senior Theses | -
pu.date.classyear | 2016 | en_US
pu.department | Computer Science | en_US
pu.pdf.coverpage | SeniorThesisCoverPage | -
Appears in Collections: Computer Science, 1988-2020

Files in This Item:
File | Size | Format
Kwok_Jonathan_thesis.pdf | 420.66 kB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.