Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/99999/fk4n31k22f
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | TROYANSKAYA, OLGA G. |
dc.contributor.author | Thistlethwaite, William |
dc.contributor.other | Quantitative Computational Biology Department |
dc.date.accessioned | 2025-02-11T15:40:09Z | -
dc.date.available | 2025-02-11T15:40:09Z | -
dc.date.created | 2024-01-01 |
dc.date.issued | 2025 |
dc.identifier.uri | http://arks.princeton.edu/ark:/99999/fk4n31k22f | -
dc.description.abstract | Single-cell technologies have enabled us to profile the internal state of individual cells with increasingly high granularity, but computational methods to extract deep biological insight from these complex, multi-omic datasets remain underdeveloped. Unstandardized workflows with arbitrary quality-control (QC) thresholds lead to low reproducibility, and it remains challenging for scientists without coding skills to gain biological insight using these valuable data. Here, we develop an end-to-end computational pipeline for rigorous, reproducible analysis of single-cell data, and then we apply this pipeline, along with other analytical techniques, to epigenomic and transcriptomic data gathered from an influenza challenge study to better understand how influenza infection shapes innate immune memory. We first describe our work on SPEEDI (Single-cell Pipeline for End to End Data Integration), a computational end-to-end pipeline that processes single-cell RNA-seq (scRNA-seq), single-cell ATAC-seq (scATAC-seq), or multiome data in a reproducible, robust manner. After reading input data, the pipeline automatically filters the data using algorithmically determined thresholds for common QC metrics, integrates data using a novel data-derived batch inference method, annotates cell types using an internal or user-provided reference object, and then performs preliminary downstream analyses within each cell type. Importantly, SPEEDI is available both as an R package for advanced users and as an interactive web server for biologists with no prior coding experience. We next apply SPEEDI and other analytical techniques to investigate how innate immune memory develops following influenza infection. We leverage blood samples from a human influenza virus challenge study to conduct integrative multi-omic data analyses, focusing specifically on the epigenetic and transcriptomic profiles of subjects at 1 day pre-challenge and 28 days post-challenge. We find that the innate immune system enters a state of suppressed inflammation after resolution of infection, with decreased cytokine and AP-1 gene expression and decreased chromatin accessibility at AP-1 targeted loci and promoter regions of interleukin-related genes. However, increased chromatin accessibility at promoter regions of interferon-related genes and increased MAP kinase gene expression may suggest that the innate immune system is primed to respond to subsequent infection. |
dc.format.mimetype | application/pdf |
dc.language.iso | en |
dc.publisher | Princeton, NJ : Princeton University |
dc.subject | data science |
dc.subject | influenza |
dc.subject | multi-omics |
dc.subject | software development |
dc.subject.classification | Bioinformatics |
dc.subject.classification | Computer science |
dc.subject.classification | Biology |
dc.title | INTEGRATIVE MULTI-OMIC DATA ANALYSIS AND SOFTWARE DEVELOPMENT |
dc.type | Academic dissertations (Ph.D.) |
pu.date.classyear | 2025 |
pu.department | Quantitative Computational Biology |
Appears in Collections: Quantitative Computational Biology

Files in This Item:
File | Size | Format
Thistlethwaite_princeton_0181D_15342.pdf | 10.6 MB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.