Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01wd375z72c
Title: | The Performance of Elastic Net in Genome-based Disease Classification |
Authors: | Kwok, Jonathan |
Advisors: | Engelhardt, Barbara |
Department: | Computer Science |
Class Year: | 2016 |
Abstract: | Genome-based data is becoming more accessible as the time and cost to sequence genomes decrease. Most studies using genome-based data focus on association tests to find relationships between mutations and traits, but fewer studies use the data to produce disease prediction models. We turn to linear logistic regression, specifically a technique called elastic net, to build stable, sparse, and interpretable prediction models, and we compare its performance with common forms of linear logistic regression, support vector machines, and principal component analysis. We find that elastic net produces sparse models but does not perform as well in practice as LASSO, another linear regression technique that also produces sparse models. We conclude from the experiment that LASSO is the better model to use, but suggest that elastic net can be used to verify the findings of the LASSO model. |
Extent: | 35 pages |
URI: | http://arks.princeton.edu/ark:/88435/dsp01wd375z72c |
Type of Material: | Princeton University Senior Theses |
Language: | en_US |
Appears in Collections: | Computer Science, 1988-2020 |
Files in This Item:
File | Size | Format |
---|---|---|
Kwok_Jonathan_thesis.pdf | 420.66 kB | Adobe PDF |