Sparse and Efficient Transfer Learning via Winning Lottery Tickets

Mehta, Rahul

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01k0698b331

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Arora, Sanjeev	-
dc.contributor.author	Mehta, Rahul	-
dc.date.accessioned	2019-07-24T18:23:51Z	-
dc.date.available	2019-07-24T18:23:51Z	-
dc.date.created	2019-05-06	-
dc.date.issued	2019-07-24	-
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/dsp01k0698b331	-
dc.description.abstract	In this thesis, we extend the Lottery Ticket Hypothesis of Frankle & Carbin (ICLR '19) to a variety of transfer learning problems. We identify sparse, trainable sub-networks that can be found on a source dataset and transferred to a range of downstream tasks. Our results show that sparse sub-networks with approximately 85-95% of weights removed exceed the accuracy of the original network when transferred to other tasks. We experimentally show that a sparse representation learned by a deep convolutional network trained on CIFAR-10 can be transferred to SmallNORB and FashionMNIST in a number of realistic settings. In addition, we show the existence of the first sparse, trainable sub-networks for natural language tasks; in particular, we show that BERT with up to 81.5% of parameters removed can reach the original test accuracy for the CoNLL-2003 Named Entity Recognition task.	en_US
dc.format.mimetype	application/pdf	-
dc.language.iso	en	en_US
dc.title	Sparse and Efficient Transfer Learning via Winning Lottery Tickets	en_US
dc.type	Princeton University Senior Theses	-
pu.date.classyear	2019	en_US
pu.department	Computer Science	en_US
pu.pdf.coverpage	SeniorThesisCoverPage	-
pu.contributor.authorid	960960895	-
pu.certificate	Center for Statistics and Machine Learning	en_US
Appears in Collections:	Computer Science, 1988-2020

Files in This Item:

File	Description	Size	Format
MEHTA-RAHUL-THESIS.pdf		1.1 MB	Adobe PDF
