Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp015999n626m
Full metadata record
DC Field | Value | Language
dc.contributor | Karas, Christina | -
dc.contributor.other | National Science Foundation | en_US
dc.coverage.spatial | United States--New Jersey--Princeton | en_US
dc.coverage.temporal | start=2018-04-20; end=2018-05-20 | en_US
dc.creator | Leach, Robert | -
dc.creator | Hecht, Michael | -
dc.creator | Karas, Christina | -
dc.date.accessioned | 2019-08-27T14:36:47Z | -
dc.date.available | 2019-08-27T14:36:47Z | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp015999n626m | -
dc.description.abstract | Protein sequence space is vast; nature uses only an infinitesimal fraction of possible sequences to sustain life. Are there solutions to biological problems other than those provided by nature? Can we create artificial proteins that sustain life? To investigate these questions, the Hecht lab has created combinatorial collections, or libraries, of novel sequences with no homology to those found in living organisms. These libraries were subjected to screens and selections, leading to the identification of sequences with roles in catalysis, gene regulation, and metal homeostasis. However, the resulting functional proteins formed dynamic rather than well-ordered structures. This impeded structural characterization and made it difficult to ascertain a mechanism of action. To address this, Christina Karas's thesis work focused on developing a new model of libraries based on the de novo protein S-824, a four-helix bundle with a very stable three-dimensional structure. The first part of this research focused on mutagenesis of S-824 and characterization of the resulting proteins, revealing that this scaffold tolerates amino acid substitutions, including buried polar residues and the removal of hydrophobic side chains to create a putative cavity. Unlike previous libraries, this one targets variability to a specific region of the protein, with the aim of creating a cavity and potential active site. The second part of this work details the design and creation of a library encoding 1.7 x 10^6 unique proteins, assembled from degenerate oligonucleotides. The third and fourth parts of this work cover the screening effort for a range of activities, both in vitro and in vivo. The collection was found to bind heme readily, leading to abundant peroxidase activity. Hits for lipase and phosphatase activity were also detected. This work details the development of a new strategy for creating de novo sequences geared toward function rather than structure. | en_US
dc.description.tableofcontents | CKavity_Lib_Reference: Reference information explaining the library design at the DNA level and the encoded amino acids at the protein level.
    Folder 1-Raw_Data_20180425: Contains raw data obtained from sequencing in the commonly used FASTQ format. Two separate reads are provided in the contained files.
    2-Results_Summary: Analyses done on the raw data.
    ANALYSIS: Contains text files that can be opened in Excel. Filtered results exclude frameshifts while raw results do not.
    AA_FREQS: Amino acid frequencies for each of the designed variable codons.
    NT_FREQS: Nucleotide frequencies for each of the designed variable base positions.
    PIPELINE: Scripts used in the generation of this data.
    QC: Quality control information generated during DNA sequencing.
    REFERENCE: Material to assist in interpretation of the data.
    amplicon: FASTA DNA file showing the linear PCR product derived from the CKavity library, which was then used for all sequencing.
    probes: Sequences of the oligonucleotide probes used for Next-Generation Sequencing.
    3-Random_Invariant: Contains CSV files repeating this analysis for random positions not designed to have combinatorial diversity, for comparison purposes. | en_US
dc.language.iso | en_US | en_US
dc.publisher | Princeton University Lewis-Sigler Institute | en_US
dc.source | https://htseq.princeton.edu/cgi-bin/batchDownload.pl?mode=download;assay_id=2948 | en_US
dc.subject | de novo genes | en_US
dc.subject | synthetic biology | en_US
dc.subject | Next-generation sequencing | en_US
dc.subject | DNA library | en_US
dc.title | CKavity Library: Next-Generation Sequencing | en_US
dc.title.alternative | A library of novel genes with combinatorially diverse cavities, built on a stably folded structural template | en_US
dc.type | Dataset | en_US
pu.embargo.lift | 2022-07-01 | en_US
pu.embargo.terms | 2022-07-01 | en_US
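
The AA_FREQS entry in the table of contents describes amino acid frequencies tallied at each designed variable codon. As a rough illustration of what such a tally involves, the following Python sketch reads sequences from FASTQ-formatted lines and counts the amino acids encoded at one codon offset. It is not the code from the PIPELINE folder: the function names `fastq_sequences` and `aa_frequencies` are hypothetical, and the sketch assumes every read starts at the same position in the amplicon, so that a single offset (`codon_start`) locates the codon in all reads.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def fastq_sequences(lines):
    """Yield the sequence line (the 2nd of every 4) from FASTQ-formatted lines."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip().upper()


def aa_frequencies(sequences, codon_start):
    """Return {amino acid: fraction of reads} for the codon at codon_start.

    Assumption: all reads share the same start, so one offset fits all.
    Reads that are too short or have an ambiguous base (e.g. N) are skipped.
    """
    counts = Counter()
    for seq in sequences:
        codon = seq[codon_start:codon_start + 3]
        aa = CODON_TABLE.get(codon)  # None if codon is incomplete or ambiguous
        if aa is not None:
            counts[aa] += 1
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()} if total else {}


# Toy input in FASTQ layout (made-up reads, not data from this item).
example = ["@r1", "ATGGCT", "+", "IIIIII", "@r2", "ATGGAT", "+", "IIIIII"]
print(aa_frequencies(fastq_sequences(example), codon_start=3))
```

With real data, the lines would come from an opened FASTQ file rather than a list, and the tally would be repeated for each designed variable codon.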
Appears in Collections: Research DataSets

Files in This Item:
This content is embargoed until 2022-07-01. For questions about theses and dissertations, please contact the Mudd Manuscript Library. For questions about research datasets, as well as other inquiries, please contact the DataSpace curators.


Items in DataSpace are protected by copyright, with all rights reserved, unless otherwise indicated.