Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01w3763950z
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Abou Donia, Mohamed | -
dc.contributor.author | Chang, Allison | -
dc.date.accessioned | 2018-08-14T15:46:45Z | -
dc.date.available | 2018-08-14T15:46:45Z | -
dc.date.created | 2018-05-03 | -
dc.date.issued | 2018-08-14 | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01w3763950z | -
dc.description.abstract | Current algorithms use profile Hidden Markov Models (pHMMs) for the automated detection of biosynthetic gene clusters (BGCs) in microbial genomes. However, these algorithms rely on well-assembled genomic input and cannot tolerate unassembled or mixed genomes. Here, we evaluated the performance of available pHMMs in identifying the Condensation domain of non-ribosomal peptide synthetases (NRPSs) in metagenomic sequence data from healthy American patients. We found that sensitivity increased from 56.22% using pHMMs to >70% using several segments of the pHMM (spHMMs). The spHMMs performed better than the original pHMM from which they were built, showing that several conserved regions of residues within a model are more essential than others for detecting Condensation domains. Our results provide a rapid, less computationally expensive detection method for profiling the biosynthetic capacity of a large-scale cohort. | en_US
dc.format.mimetype | application/pdf | -
dc.language.iso | en | en_US
dc.title | Profile Hidden Markov Models for the Detection of Non-Ribosomal Peptide Synthetases within Metagenomic Sequence Data | en_US
dc.type | Princeton University Senior Theses | -
pu.date.classyear | 2018 | en_US
pu.department | Computer Science | en_US
pu.pdf.coverpage | SeniorThesisCoverPage | -
pu.contributor.authorid | 961071772 | -
pu.certificate | Engineering Biology Program | en_US
Appears in Collections: Computer Science, 1988-2020

Files in This Item:
File | Description | Size | Format
CHANG-ALLISON-THESIS.pdf | | 1.16 MB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.