Moving from Recognition to Reasoning in Image Captioning

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01pz50gz967

Title:	Moving from Recognition to Reasoning in Image Captioning
Authors:	Feng, Berthy
Advisors:	Russakovsky, Olga; Narasimhan, Karthik
Department:	Computer Science
Certificate Program:	Center for Statistics and Machine Learning
Class Year:	2019
Abstract:	Image captioning is an artificial intelligence (AI) task at the intersection of vision and language. Current approaches to the task are recognition-based, leading to models that struggle to reason about image content and context. We explore the current state of image captioning and offer solutions for advancing the task from recognition to reasoning, specifically through the use of unpaired training data and evaluation based on a novel discriminativeness metric.
URI:	http://arks.princeton.edu/ark:/88435/dsp01pz50gz967
Type of Material:	Princeton University Senior Theses
Language:	en
Appears in Collections:	Computer Science, 1988-2020

Files in This Item:

File	Description	Size	Format
FENG-BERTHY-THESIS.pdf		2.36 MB	Adobe PDF
