Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01r781wj79x
Title: | Integrative network-based approaches to analyze genomics data |
Authors: | Yao, Victoria |
Advisors: | Troyanskaya, Olga G |
Contributors: | Computer Science Department |
Keywords: | computational biology; data integration; neurodegenerative diseases; systems biology |
Subjects: | Computer science; Biology; Bioinformatics |
Issue Date: | 2018 |
Publisher: | Princeton, NJ : Princeton University |
Abstract: | The generation of diverse genome-scale data across organisms and experimental conditions is becoming increasingly commonplace, creating unprecedented opportunities for understanding the molecular underpinnings of human disease. However, these large data are often noisy, highly heterogeneous, and lack the resolution required to study key aspects of metazoan complexity, such as tissue and cell-type specificity. Furthermore, targeted data collection and experimental verification are often infeasible in humans, underscoring the need for methods that can integrate -omics data, computational predictions, and biological knowledge across organisms. In this dissertation, I describe several novel, integrative computational approaches to address these challenges. First, I will describe a statistical and machine learning approach that takes advantage of high-quality neuron-specific molecular profiles of cells that vary in vulnerability to Alzheimer's disease obtained in mouse to generate neuron-specific functional networks in human. We then combine these network models with human quantitative genetics data to prioritize likely Alzheimer's disease candidates. Next, I present an in-depth analysis of all major adult C. elegans tissues and genome-wide expression predictions across 76 tissues and cell types. The tissue expression prediction method is one of the building blocks of diseaseQUEST, an integrative computational-experimental framework that combines human quantitative genetics with in silico functional network representations of model organism biology to systematically identify disease gene candidates. This framework leverages a novel semi-supervised Bayesian network integration approach to predict tissue- and cell-type-specific functional relationships between genes in model organisms. We use diseaseQUEST to construct 203 tissue- and cell-type-specific functional networks and predict candidate genes for 25 different human diseases and traits using C. elegans as a model system, with a particular focus on Parkinson's disease. Finally, I will present a network-based approach that systematically identifies differential isoform interactions. We apply this approach to the study of tissue and environment dynamics in Alzheimer's disease. Together, these approaches provide a framework for addressing the challenges of data heterogeneity, noise, and biological resolution in human molecular data to better understand the etiology of human disease. |
URI: | http://arks.princeton.edu/ark:/88435/dsp01r781wj79x |
Alternate format: | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu |
Type of Material: | Academic dissertations (Ph.D.) |
Language: | en |
Appears in Collections: | Computer Science |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Yao_princeton_0181D_12741.pdf | | 46.55 MB | Adobe PDF | View/Download |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.