Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01wd375z72c
Title: The Performance of Elastic Net in Genome-based Disease Classification
Authors: Kwok, Jonathan
Advisors: Engelhardt, Barbara
Department: Computer Science
Class Year: 2016
Abstract: Genome-based data is becoming more accessible as the time and cost of sequencing genomes decrease. Most studies using genome-based data focus on association tests that find relationships between mutations and traits; fewer use the data to build disease prediction models. We turn to linear logistic regression, specifically a technique called elastic net, to build stable, sparse, and interpretable prediction models, and we compare the model's performance with that of common forms of linear logistic regression, support vector machines, and principal component analysis. We find that elastic net produces sparse models but does not perform as well in practice as LASSO, another linear regression technique that also produces sparse models. We conclude from the experiment that LASSO is the better model to use, but suggest that elastic net can be used to verify the findings of the LASSO model.
Extent: 35 pages
URI: http://arks.princeton.edu/ark:/88435/dsp01wd375z72c
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1988-2020

Files in This Item:
File    Size    Format
Kwok_Jonathan_thesis.pdf    420.66 kB    Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.