Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp010z708w54n
Title: | A Computational Tool for Enhancing the Quality of Translated Documents Constructed from Individual Human-Translated Sentences |
Authors: | Yu, Keunwoo |
Advisors: | Mimno, David |
Department: | Computer Science |
Class Year: | 2013 |
Abstract: | The aim of this thesis is to develop a computational tool that would help curate documents that have been translated via crowd-sourced translation. The computational tool specifically focuses on a problem called “proper noun translation disagreement” or “named entity translation disagreement” in which translators use different translations for the same proper noun or named entity. In the process, the thesis presents various ways to preprocess training corpora and algorithms to improve the quality of Korean-English word-level alignment. A user interface is also introduced, which allows users to identify grammatical problems and correct them with ease. In trying to address issues that arise during the implementation of the computational tool, this thesis draws from previous research in statistical natural language processing for Western languages, as well as Korean and Chinese. There has not been much research in statistical natural language processing specifically geared towards crowd-sourced translation. This thesis employs various techniques and algorithms developed by researchers focusing on the problem of bilingual alignment, and attempts to shed light on the various issues of crowd-sourced translation. |
Extent: | 56 pages |
URI: | http://arks.princeton.edu/ark:/88435/dsp010z708w54n |
Access Restrictions: | Walk-in Access. This thesis can only be viewed on computer terminals at the Mudd Manuscript Library. |
Type of Material: | Princeton University Senior Theses |
Language: | en_US |
Appears in Collections: | Computer Science, 1988-2020 |
Files in This Item:
File | Size | Format |
---|---|---|
Keunwoo Peter Yu.pdf | 943.04 kB | Adobe PDF |