Moving from Recognition to Reasoning in Image Captioning

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01pz50gz967

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Russakovsky, Olga	-
dc.contributor.advisor	Narasimhan, Karthik	-
dc.contributor.author	Feng, Berthy	-
dc.date.accessioned	2019-09-04T17:42:34Z	-
dc.date.available	2019-09-04T17:42:34Z	-
dc.date.created	2019-05-03	-
dc.date.issued	2019-09-04	-
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/dsp01pz50gz967	-
dc.description.abstract	Image captioning is an artificial intelligence (AI) task at the intersection of vision and language. Current approaches to the task are recognition-based, leading to models that struggle to reason about image content and context. We explore the current state of image captioning and offer solutions for advancing the task from recognition to reasoning, specifically through the use of unpaired training data and evaluation based on a novel discriminativeness metric.	en_US
dc.format.mimetype	application/pdf	-
dc.language.iso	en	en_US
dc.title	Moving from Recognition to Reasoning in Image Captioning	en_US
dc.type	Princeton University Senior Theses	-
pu.date.classyear	2019	en_US
pu.department	Computer Science	en_US
pu.pdf.coverpage	SeniorThesisCoverPage	-
pu.contributor.authorid	960932079	-
pu.certificate	Center for Statistics and Machine Learning	en_US
Appears in Collections:	Computer Science, 1988-2020

Files in This Item:

File	Description	Size	Format
FENG-BERTHY-THESIS.pdf		2.36 MB	Adobe PDF
