Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp013b591c54v
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Singh, Mona | - |
dc.contributor.author | Todd, David | - |
dc.date.accessioned | 2020-08-12T13:11:39Z | - |
dc.date.available | 2020-08-12T13:11:39Z | - |
dc.date.created | 2020-05-03 | - |
dc.date.issued | 2020-08-12 | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp013b591c54v | - |
dc.description.abstract | Characterizing proteins, which mediate a wide array of cellular processes by binding various ligands, is a major aim of computational biology. While proteins may contain hundreds of amino acids, often only a few are typically involved in interactions with biologically relevant ligands. The most direct approach to determine which amino acid residues within a protein are involved in binding is through experimental methods, but only relatively few proteins have been captured in complex with a relevant ligand. To bridge this gap, we train a bidirectional Long Short-Term Memory (BiLSTM) model to predict the binding properties of each amino acid position from sequence-based features for five ligand groups: DNA, RNA, protein, ion, and metabolite. To increase power, we extend our set of true labels beyond the limited experimental data by using protein domain-based inferred binding scores. We then evaluate our model by measuring performance on a held-out test set, and compare performance to a baseline XGBoost model, as well as an existing method. In both these comparisons, our model performs at least as well or better for all ligand groups. Because they reflect the binding potential of individual amino acid sites, our predictions can also provide insight into both healthy and diseased protein function. | en_US |
dc.format.mimetype | application/pdf | - |
dc.language.iso | en | en_US |
dc.title | Identifying Binding Positions in Proteins Using Neural Networks | en_US |
dc.type | Princeton University Senior Theses | - |
pu.date.classyear | 2020 | en_US |
pu.department | Computer Science | en_US |
pu.pdf.coverpage | SeniorThesisCoverPage | - |
pu.contributor.authorid | 961272339 | - |
Appears in Collections: | Computer Science, 1988-2020 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
TODD-DAVID-THESIS.pdf | | 885.86 kB | Adobe PDF | Request a copy |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.