Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01hd76s238c
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Storey, John D | en_US
dc.contributor.author | Robinson, David Garrett | en_US
dc.contributor.other | Quantitative Computational Biology Department | en_US
dc.date.accessioned | 2015-06-23T19:38:29Z | -
dc.date.available | 2015-06-23T19:38:29Z | -
dc.date.issued | 2015 | en_US
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01hd76s238c | -
dc.description.abstract | The development of quantitative sequencing technologies, such as RNA-Seq, Bar-Seq, ChIP-Seq, and metagenomics, has offered great insight into molecular biology. Proper design and analysis of these experiments require statistical models and techniques that consider the specific nature of sequencing data, which typically consists of a matrix of read counts per feature. An issue of particular importance to the development of these methods is the role of read depth in statistical accuracy and power. The depth of an experiment affects the power to make biological conclusions, meaning an experiment design must consider the tradeoff between cost, power, and the number of samples that are examined. Similarly, per-gene read depth affects each gene's power and accuracy, and must be taken into account in any downstream analysis. Here I explore many facets of the role of read depth in the design and analysis of sequencing experiments, and offer computational and statistical methods for addressing them. To assist in the design of sequencing experiments, I present subSeq, which examines the effect of depth in an experiment by subsampling reads to simulate lower depths. I use this method to examine the extent of read saturation across a variety of RNA-Seq experiments, and demonstrate a statistical model for predicting the effect of increasing depth in any experiment. I consider intensity-dependence in a technology comparison between microarrays and RNA-Seq, and show that the variance added by RNA-Seq depends more on depth than the variance in microarray depends on fluorescence intensity. I demonstrate that Bar-Seq data shares these depth-dependent properties with RNA-Seq and can be analyzed by the same tools, and further provide suggestions on the appropriate depth for Bar-Seq experiments. Finally, I show that per-gene read depth can be taken into account in multiple hypothesis testing to improve power, and introduce the method of functional false discovery rate (fFDR) control. | en_US
dc.language.iso | en | en_US
dc.publisher | Princeton, NJ : Princeton University | en_US
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog (http://catalog.princeton.edu) | en_US
dc.subject | Differential expression | en_US
dc.subject | Experimental design | en_US
dc.subject | False discovery rate | en_US
dc.subject | Read depth | en_US
dc.subject | RNA-Seq | en_US
dc.subject | Sequencing | en_US
dc.subject.classification | Bioinformatics | en_US
dc.subject.classification | Statistics | en_US
dc.subject.classification | Genetics | en_US
dc.title | The Role of Read Depth in the Design and Analysis of Sequencing Experiments | en_US
dc.type | Academic dissertations (Ph.D.) | en_US
pu.projectgrantnumber | 690-2143 | en_US
Appears in Collections: Quantitative Computational Biology

Files in This Item:
File | Description | Size | Format
Robinson_princeton_0181D_11406.pdf | | 5.02 MB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.