Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp010k225f039
Title: Identifying Structural Variants from Single-Cell DNA Sequencing Data
Authors: Du, Claire
Advisors: Raphael, Ben
Department: Computer Science
Class Year: 2020
Abstract: Recent single-cell DNA sequencing technologies enable the sequencing of thousands of cells at single-cell resolution, providing an unprecedented opportunity to study intra-tumor genetic heterogeneity. One tradeoff of these technologies is that the data collected per cell is extremely sparse (0.02-0.05X coverage). As a result, recent studies have focused on the analysis of copy-number aberrations (CNAs) in single-cell data. However, CNAs are only a subclass of a larger class of mutations known as structural variants (SVs), which contribute significantly to human genetic variation and disease. SVs can affect clonal composition in ways that are undetectable from an analysis of CNAs alone; the study of SVs in single-cell data can therefore provide important insights into tumor evolution. No study has yet focused on the identification and analysis of SVs from ultra-low-coverage single-cell DNA sequencing data. We develop a pipeline to identify SVs from sparse single-cell sequencing data by pooling cells into a pseudo-bulk sample, calling SVs on that sample, and then reassigning the called SVs to individual cells. We test our pipeline on three single-cell datasets sequenced on the 10X Genomics Chromium platform and evaluate the validity of the detected SVs. We find that although the detected SVs are too sparse to recover the clonal composition of any of the datasets, we are still able to identify individual SVs that explain previously reported copy-number events in the same datasets. Through simulated experiments, we also conclude that as single-cell technologies improve to provide higher-fidelity barcoding and higher coverage, it will become more feasible to use SVs detected by this pipeline to infer clonal structure.
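The pool, call, and reassign approach described in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: the function names, the (sv_id, cell_barcode) input format, and the 1,000-cell example are assumptions made for illustration, and the SV-calling step itself is left out, since the record does not say which caller is used.

```python
from collections import defaultdict


def pooled_coverage(n_cells, per_cell_coverage):
    """Expected pseudo-bulk coverage when reads from all cells are merged."""
    return n_cells * per_cell_coverage


def assign_svs_to_cells(supporting_reads):
    """Map each SV to the set of cell barcodes with a supporting read.

    supporting_reads: iterable of (sv_id, cell_barcode) pairs, one per read
    that supports an SV called on the pseudo-bulk sample (hypothetical format).
    """
    cells_by_sv = defaultdict(set)
    for sv_id, barcode in supporting_reads:
        cells_by_sv[sv_id].add(barcode)
    return dict(cells_by_sv)


# Assumed example: 1,000 cells at the lower bound of 0.02X each.
coverage = pooled_coverage(1000, 0.02)

# Toy supporting reads; barcodes and SV ids are made up.
reads = [("sv1", "AAAC"), ("sv1", "AAAC"), ("sv1", "GGTT"), ("sv2", "GGTT")]
assignment = assign_svs_to_cells(reads)
```

Under these assumptions, pooling raises coverage to a level where SV calling is plausible, while each cell still contributes very few supporting reads, which is consistent with the abstract's finding that per-cell SV assignments are sparse.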
URI: http://arks.princeton.edu/ark:/88435/dsp010k225f039
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1988-2020

Files in This Item:
File: DU-CLAIRE-THESIS.pdf
Size: 4.57 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.