Title: Conditioning Language Models for Domain
Authors: Arora, Karan
Advisors: Narasimhan, Karthik
Department: Computer Science
Class Year: 2019
Abstract: We consider a setting in which a language model, given access to some information about an input's domain, is trained to learn a task over an entire distribution of domains, with the goal of generalizing to inputs from domains that do not appear in its training data. Drawing inspiration from existing methods outside our problem setting, we develop a mechanism that conditions an operation in a language model to modify its representation of an input based on information about the input's domain. This mechanism is designed to be trained jointly with the task-performing model and makes few assumptions about the model architecture. We run experiments on language modeling and sentiment analysis tasks, comparing the performance of a model augmented with our mechanism against an unaugmented baseline. While the conditioning mechanism does not currently provide a performance improvement on real data, experiments with synthetic data suggest that it is capable of doing so, and that some fine-tuning and further experimentation may enable it to work better. (An illustrative sketch of one possible such mechanism follows this record.)
URI: http://arks.princeton.edu/ark:/88435/dsp01h415pd38k
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1988-2020
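The abstract does not specify the form of the conditioning mechanism, so the sketch below is a purely illustrative reading, not the thesis's method. It shows one common way to condition an operation in a language model on domain information: a FiLM-style scale-and-shift of an LSTM language model's hidden states, driven by a per-example domain descriptor and trained jointly with the task model. All class names, dimensions, and the choice of FiLM-style modulation are assumptions introduced for illustration.

```python
# Illustrative sketch only: the thesis's actual mechanism is not described in
# the abstract. This shows a FiLM-style conditioning layer (an assumption)
# that modulates an LSTM language model's hidden states using a domain
# descriptor, with the whole model trained jointly end to end.
import torch
import torch.nn as nn


class DomainConditioner(nn.Module):
    """Maps a domain descriptor to a per-feature scale (gamma) and shift
    (beta) that modulate an intermediate representation of the task model."""

    def __init__(self, domain_dim: int, hidden_dim: int):
        super().__init__()
        # One linear map produces both gamma and beta, concatenated.
        self.to_film = nn.Linear(domain_dim, 2 * hidden_dim)

    def forward(self, hidden: torch.Tensor, domain: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_dim); domain: (batch, domain_dim)
        gamma, beta = self.to_film(domain).chunk(2, dim=-1)
        # Broadcast the modulation over the sequence dimension.
        return gamma.unsqueeze(1) * hidden + beta.unsqueeze(1)


class ConditionedLM(nn.Module):
    """Toy LSTM language model whose hidden states are conditioned on
    domain information before the output projection."""

    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, domain_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.conditioner = DomainConditioner(domain_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens: torch.Tensor, domain: torch.Tensor) -> torch.Tensor:
        hidden, _ = self.lstm(self.embed(tokens))
        hidden = self.conditioner(hidden, domain)  # inject domain information
        return self.out(hidden)  # next-token logits


if __name__ == "__main__":
    model = ConditionedLM()
    tokens = torch.randint(0, 1000, (4, 20))  # batch of token ids
    domain = torch.randn(4, 16)               # per-example domain descriptor
    logits = model(tokens, domain)
    print(logits.shape)  # torch.Size([4, 20, 1000])
```

Because the modulation touches only a single intermediate tensor, this style of conditioning makes few assumptions about the surrounding architecture, which is consistent with the property the abstract claims for the thesis's mechanism.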
Files in This Item:
| File | Size | Format | Access |
|---|---|---|---|
| ARORA-KARAN-THESIS.pdf | 857.02 kB | Adobe PDF | Request a copy |