Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/99999/fk4w96n176
Title: Analysis of Distal eQTLs across Multiple Human Tissues and Methods to Improve Their Quality
Authors: Jo, Brian
Advisors: Engelhardt, Barbara E
Contributors: Quantitative Computational Biology Department
Keywords: eQTL
False Discovery Rate
GTEx
GWAS
Human Genetics
Statistical Genetics
Subjects: Genetics
Bioinformatics
Issue Date: 2021
Publisher: Princeton, NJ : Princeton University
Abstract: As the amount of available human sequencing data grows exponentially, we are gaining a detailed understanding of the genetic basis of variation in gene expression. One route to this understanding is genome-wide association mapping of distal expression quantitative trait loci (distal eQTLs). Unlike cis-eQTL mapping, distal eQTL mapping faces numerous challenges, including weak signals and a large number of hypotheses (both contributing to low power), as well as numerous and often cryptic sources of confounding. Despite these challenges, an extensive repository of human distal eQTLs can prove highly valuable: used in conjunction with cis-eQTLs, GWAS of complex traits, gene networks and, more recently, single-cell datasets, it can yield a highly detailed picture of gene regulation. This work first presents the most extensive mapping of human distal eQTLs to date, and explores approaches to improve their quality. Using the Genotype-Tissue Expression (GTEx) dataset, with 838 donors and samples across 49 human tissues, I present a repository of over 5000 distal eQTLs across multiple tissues. This work was one of the major contributions to the GTEx consortium publications, which contribute to the field of human genomics an extensive public repository of genotype, expression, and other phenotype data, along with summary statistics. This work also explores the relationship among the cis-eGene, the trans-eGene, and sample heterogeneity, or cell type information when available. I show that in many reported distal eQTLs, sample heterogeneity can play a regulatory role, while in other cases, a relationship between cis-eGene and trans-eGene can be narrowed down to specific cell types. Finally, I take a statistical approach and analyze the landscape of association statistics, especially in the context of covariance structures among variants, genes and tissues. I find that if these covariance structures are properly taken into account, the significance testing procedure can yield results that depart significantly from those obtained using common methods, which treat all tissues, genes and variants identically. Together, these results contribute to human genetics research by identifying potential issues with the ever-increasing collection of human distal eQTLs and showing how they can be addressed.
URI: http://arks.princeton.edu/ark:/99999/fk4w96n176
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Quantitative Computational Biology

Files in This Item:
File: Jo_princeton_0181D_13795.pdf
Size: 10.5 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.