CKavity Library: Next-Generation Sequencing

Leach, Robert; Hecht, Michael; Karas, Christina

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp015999n626m

Full metadata record

DC Field	Value	Language
dc.contributor	Karas, Christina	-
dc.contributor.other	National Science Foundation	en_US
dc.coverage.spatial	United States--New Jersey--Princeton	en_US
dc.coverage.temporal	start=2018-04-20; end=2018-05-20	en_US
dc.creator	Leach, Robert	-
dc.creator	Hecht, Michael	-
dc.creator	Karas, Christina	-
dc.date.accessioned	2019-08-27T14:36:47Z	-
dc.date.available	2019-08-27T14:36:47Z	-
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/dsp015999n626m	-
dc.description.abstract	Protein sequence space is vast; nature uses only an infinitesimal fraction of possible sequences to sustain life. Are there solutions to biological problems other than those provided by nature? Can we create artificial proteins that sustain life? To investigate this question, the Hecht lab has created combinatorial collections, or libraries, of novel sequences with no homology to those found in living organisms. These libraries were subjected to screens and selections, leading to the identification of sequences with roles in catalysis, modulating gene regulation, and metal homeostasis. However, the resulting functional proteins formed dynamic rather than well-ordered structures. This impeded structural characterization and made it difficult to ascertain a mechanism of action. To address this, Christina Karas's thesis work focuses on developing a new model of libraries based on the de novo protein S-824, a four-helix bundle with a very stable three-dimensional structure. The first part of this research focused on mutagenesis of S-824 and characterization of the resulting proteins, revealing that this scaffold tolerates amino acid substitutions, including buried polar residues and the removal of hydrophobic side chains to create a putative cavity. Distinct from previous libraries, Karas targeted variability to a specific region of the protein, seeking to create a cavity and potential active site. The second part of this work details the design and creation of a library encoding 1.7 x 10^6 unique proteins, assembled from degenerate oligonucleotides. The third and fourth parts of this work cover the screening effort for a range of activities, both in vitro and in vivo. Karas found that this collection binds heme readily, leading to abundant peroxidase activity. Hits for lipase and phosphatase activity were also detected. This work details the development of a new strategy for creating de novo sequences geared toward function rather than structure.	en_US
dc.description.tableofcontents	CKavity_Lib_Reference: Reference information explaining the library design at the DNA level and the encoded amino acids at the protein level; Folder 1-Raw_Data_20180425: Contains raw data obtained from sequencing in commonly used FASTQ format. Two separate reads are provided in the contained files; 2-Results_Summary: Analyses done on the raw data; ANALYSIS: Contains text files that can be opened in Excel. Filtered results exclude frameshifts while raw results do not; AA_FREQS: Amino acid frequencies for each of the designed variable codons; NT_FREQS: Nucleotide frequencies for each of the designed variable base positions; PIPELINE: Scripts used in the generation of this data; QC: Quality control information generated during DNA sequencing; REFERENCE: Material to assist in interpretation of the data; amplicon: FASTA DNA file showing the linear PCR product derived from the CKavity library, which was then used for all sequencing; probes: Sequences of the oligonucleotide probes used for Next-Generation Sequencing; 3-Random_Invariant: Contains CSV files repeating this analysis for random positions not designed to have combinatorial diversity, for comparison purposes.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Princeton University Lewis-Sigler Institute	en_US
dc.source	https://htseq.princeton.edu/cgi-bin/batchDownload.pl?mode=download;assay_id=2948	en_US
dc.subject	de novo genes	en_US
dc.subject	synthetic biology	en_US
dc.subject	Next-generation sequencing	en_US
dc.subject	DNA library	en_US
dc.title	CKavity Library: Next-Generation Sequencing	en_US
dc.title.alternative	A library of novel genes with combinatorially diverse cavities, built on a stably folded structural template	en_US
dc.type	Dataset	en_US
pu.embargo.lift	2022-07-01	en_US
pu.embargo.terms	2022-07-01	en_US
Appears in Collections:	Research DataSets

Files in This Item:

This content is embargoed until 2022-07-01. For questions about theses and dissertations, please contact the Mudd Manuscript Library. For questions about research datasets, as well as other inquiries, please contact the DataSpace curators.
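
The table of contents above describes FASTQ raw reads and per-position nucleotide frequency tables (NT_FREQS). As a rough illustration of what such a tally involves, here is a minimal Python sketch. It is not the pipeline shipped in the PIPELINE folder: the function names, the toy reads, and the chosen positions are all hypothetical, and it assumes plain four-line FASTQ records with no quality filtering, read pairing, or frameshift handling.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def read_fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line of each record, assuming 4 lines per record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record holds the bases
            yield line.strip().upper()


def nucleotide_frequencies(
    sequences: Iterable[str], positions: List[int]
) -> Dict[int, Dict[str, float]]:
    """Return base frequencies at the given 0-based read positions.

    A read too short to cover a position is skipped for that position.
    """
    counts = {pos: Counter() for pos in positions}
    for seq in sequences:
        for pos in positions:
            if pos < len(seq):
                counts[pos][seq[pos]] += 1
    freqs: Dict[int, Dict[str, float]] = {}
    for pos, counter in counts.items():
        total = sum(counter.values())
        freqs[pos] = {b: n / total for b, n in counter.items()} if total else {}
    return freqs


# Hypothetical toy input: three made-up reads, not data from this item.
toy_fastq = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "ATGT", "+", "IIII",
    "@read3", "ACGA", "+", "IIII",
]
toy_freqs = nucleotide_frequencies(read_fastq_sequences(toy_fastq), [1, 3])
```

For a real file, the list of lines could be replaced by an open file handle, and the positions would come from the library design described in CKavity_Lib_Reference rather than being chosen by hand.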
