Title: Conditioning Language Models for Domain
Authors: Arora, Karan
Advisors: Narasimhan, Karthik
Department: Computer Science
Class Year: 2019
Abstract: We consider a setting in which a language model, given access to some information about an input's domain, is trained to learn a task over an entire distribution of domains, with the goal of generalizing to inputs from domains that do not appear in its training data. Drawing inspiration from existing methods outside our problem setting, we develop a mechanism that conditions an operation in a language model to modify its representation of an input based on information about the input's domain. This mechanism is designed to be trained jointly with the task-performing model and makes few assumptions about the model architecture. We run experiments on language modeling and sentiment analysis tasks, comparing the performance of a model augmented with our mechanism against an unaugmented baseline. While the conditioning mechanism does not currently provide a performance improvement on real data, experiments with synthetic data suggest that it is capable of doing so, and that some fine-tuning and further experimentation may enable it to work better. (An illustrative sketch of one possible such mechanism follows this record.)
URI: http://arks.princeton.edu/ark:/88435/dsp01h415pd38k
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1988-2020
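The abstract does not specify the form of the conditioning mechanism, so the sketch below is a purely illustrative reading, not the thesis's method. It shows one common way to condition an operation in a language model on domain information: a FiLM-style scale-and-shift of an LSTM language model's hidden states, driven by a per-example domain descriptor and trained jointly with the task model. All class names, dimensions, and the choice of FiLM-style modulation are assumptions introduced for illustration.

```python
# Illustrative sketch only: the thesis's actual mechanism is not described in
# the abstract. This shows a FiLM-style conditioning layer (an assumption)
# that modulates an LSTM language model's hidden states using a domain
# descriptor, with the whole model trained jointly end to end.
import torch
import torch.nn as nn


class DomainConditioner(nn.Module):
    """Maps a domain descriptor to a per-feature scale (gamma) and shift
    (beta) that modulate an intermediate representation of the task model."""

    def __init__(self, domain_dim: int, hidden_dim: int):
        super().__init__()
        # One linear map produces both gamma and beta, concatenated.
        self.to_film = nn.Linear(domain_dim, 2 * hidden_dim)

    def forward(self, hidden: torch.Tensor, domain: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_dim); domain: (batch, domain_dim)
        gamma, beta = self.to_film(domain).chunk(2, dim=-1)
        # Broadcast the modulation over the sequence dimension.
        return gamma.unsqueeze(1) * hidden + beta.unsqueeze(1)


class ConditionedLM(nn.Module):
    """Toy LSTM language model whose hidden states are conditioned on
    domain information before the output projection."""

    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, domain_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.conditioner = DomainConditioner(domain_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens: torch.Tensor, domain: torch.Tensor) -> torch.Tensor:
        hidden, _ = self.lstm(self.embed(tokens))
        hidden = self.conditioner(hidden, domain)  # inject domain information
        return self.out(hidden)  # next-token logits


if __name__ == "__main__":
    model = ConditionedLM()
    tokens = torch.randint(0, 1000, (4, 20))  # batch of token ids
    domain = torch.randn(4, 16)               # per-example domain descriptor
    logits = model(tokens, domain)
    print(logits.shape)  # torch.Size([4, 20, 1000])
```

Because the modulation touches only a single intermediate tensor, this style of conditioning makes few assumptions about the surrounding architecture, which is consistent with the property the abstract claims for the thesis's mechanism.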
Files in This Item:
| File | Size | Format | Access |
|---|---|---|---|
| ARORA-KARAN-THESIS.pdf | 857.02 kB | Adobe PDF | Request a copy |