Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01rj430716r
Title: StaTIC: A Dataset of Question-Driven Inference Chains
Authors: Demszky, Dora
Advisors: Fellbaum, Christiane D.
Contributors: Katz, Joshua T.
Department: Independent Concentration
Certificate Program: Applications of Computing Program
Class Year: 2017
Abstract: Modeling logical inference has become a key building block in the improvement of many NLP tasks, including summarization, question answering and information extraction. However, since the rules that underlie inference are rarely made explicit in natural language, there is a need for specialized datasets from which these rules can be learned. We introduce Stanford Textual Inference Chains (StaTIC), a dataset of sentence pairs in an entailment relation that are also “minimal pairs”, differing from each other by a small syntactic or lexical change. In this thesis, we describe and evaluate our data collection methods and analyze the lexical and syntactic properties of the results, focusing on the ways in which they make the dataset suitable for informing systems of natural language understanding.
URI: http://arks.princeton.edu/ark:/88435/dsp01rj430716r
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Independent Concentration, 1972-2020

Files in This Item:
File                  Size      Format
ddemszky_thesis.pdf   1.18 MB   Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.