Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01pz50gz967
Title: Moving from Recognition to Reasoning in Image Captioning
Authors: Feng, Berthy
Advisors: Russakovsky, Olga; Narasimhan, Karthik
Department: Computer Science
Certificate Program: Center for Statistics and Machine Learning
Class Year: 2019
Abstract: Image captioning is an artificial intelligence (AI) task at the intersection of vision and language. Current approaches to the task are recognition-based, leading to models that struggle to reason about image content and context. We explore the current state of image captioning and offer solutions for advancing the task from recognition to reasoning, specifically through the use of unpaired training data and evaluation based on a novel discriminativeness metric.
URI: http://arks.princeton.edu/ark:/88435/dsp01pz50gz967
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Computer Science, 1988-2020

Files in This Item:
File: FENG-BERTHY-THESIS.pdf
Size: 2.36 MB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.