Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/99999/fk4n31k22f
Title: INTEGRATIVE MULTI-OMIC DATA ANALYSIS AND SOFTWARE DEVELOPMENT
Authors: Thistlethwaite, William
Advisors: Troyanskaya, Olga G.
Contributors: Quantitative Computational Biology Department
Keywords: data science; influenza; multi-omics; software development
Subjects: Bioinformatics; Computer science; Biology
Issue Date: 2025
Publisher: Princeton, NJ : Princeton University
Abstract: Single-cell technologies have enabled us to profile the internal state of individual cells with increasingly high granularity, but computational methods to extract deep biological insight from these complex, multi-omic datasets remain underdeveloped. Unstandardized workflows with arbitrary quality-control (QC) thresholds lead to low reproducibility, and it remains challenging for scientists without coding skills to gain biological insight using these valuable data. Here, we develop an end-to-end computational pipeline for rigorous, reproducible analysis of single-cell data, and then we apply this pipeline, along with other analytical techniques, to epigenomic and transcriptomic data gathered from an influenza challenge study to better understand how influenza infection shapes innate immune memory. We first describe our work on SPEEDI (Single-cell Pipeline for End to End Data Integration), a computational end-to-end pipeline that processes single-cell RNA-seq (scRNA-seq), single-cell ATAC-seq (scATAC-seq), or multiome data in a reproducible, robust manner. After reading input data, the pipeline automatically filters the data using algorithmically determined thresholds for common QC metrics, integrates data using a novel data-derived batch inference method, annotates cell types using an internal or user-provided reference object, and then performs preliminary downstream analyses within each cell type. Importantly, SPEEDI is available both as an R package for advanced users and as an interactive web server for biologists with no prior coding experience. We next apply SPEEDI and other analytical techniques to investigate how innate immune memory develops following influenza infection. We leverage blood samples from a human influenza virus challenge study to conduct integrative multi-omic data analyses, focusing specifically on the epigenetic and transcriptomic profiles of subjects at 1 day pre-challenge and 28 days post-challenge. 
We find that the innate immune system enters a state of suppressed inflammation after resolution of infection, with decreased cytokine and AP-1 gene expression and decreased chromatin accessibility at AP-1 targeted loci and promoter regions of interleukin-related genes. However, increased chromatin accessibility at promoter regions of interferon-related genes and increased MAP kinase gene expression may suggest that the innate immune system is primed to respond to subsequent infection.
URI: http://arks.princeton.edu/ark:/99999/fk4n31k22f
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Quantitative Computational Biology

Files in This Item:
File: Thistlethwaite_princeton_0181D_15342.pdf
Size: 10.6 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.