Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01xp68kg386
Title: Applications of Rhetorical Structure Theory in Text Generation
Authors: Nakos, Constantine
Advisors: Hug, Josh
Department: Computer Science
Class Year: 2014
Abstract: Natural Language Generation (NLG), also known as text generation, deals with the use of computers to convey information in human language. The problem is a sizable one and touches on many aspects of computer science and linguistics, ranging from Information Retrieval to the rules of grammar. One of the necessary components of the NLG process is a detailed, automated version of the outlining performed by human authors: document planning. To plan a document, a text generation program must have a model of the document's internal structure. Rhetorical Structure Theory (RST) offers such a model, as well as a soft guarantee that the resulting text will be coherent. In this paper I will discuss some of the challenges of applying Rhetorical Structure Theory to Natural Language Generation. Section 2 will contain background on NLG, RST, and the problems and deficiencies that stem from mixing the two. Section 3 will outline the structure of a text generation program, where RST fits into the pipeline, and how other theories can shore up its deficiencies. Section 4 will address the compromises necessary to apply RST to NLG, the problems that remain, and the approaches available to handle them. Finally, I will conclude that using RST as the basis of a text generation system leaves much to be desired, and that future efforts are better spent improving existing systems.
Extent: 28 pages
URI: http://arks.princeton.edu/ark:/88435/dsp01xp68kg386
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1988-2020

Files in This Item:
File: Nakos_Constantine_Thesis.pdf
Size: 203.11 kB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.