Can Computers Learn from Emoticons? Sentiment Analysis Through Supervised Machine Learning Techniques

Morin, Valerie

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01z029p7345

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Kpotufe, Samory K.	-
dc.contributor.author	Morin, Valerie	-
dc.date.accessioned	2017-07-19T16:27:46Z	-
dc.date.available	2017-07-19T16:27:46Z	-
dc.date.created	2017-04-16	-
dc.date.issued	2017-04-16	-
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/dsp01z029p7345	-
dc.description.abstract	With the rise of social media have come innovative breakthroughs in sentiment analysis, owing to the wealth of textual data now readily available. However, sentiment analysis has long suffered from a limited amount of labeled data, because manual labeling is costly. This has led much research to focus on unsupervised techniques instead, while research that continues to focus on supervised techniques accepts the loss in potential accuracy that comes from training on only a small fraction of the data. There are, however, emotional signals already embedded in social media text that most sentiment analysis methods have overlooked. Xia Hu, Jiliang Tang, Huiji Gao, and Huan Liu set out to answer the question of whether these signals could be helpful in sentiment analysis, and they were able to statistically verify that emotional signals such as emoticons were indeed representative of manually labeled sentiment [1]. In this thesis, I used text containing emoticons as a novel labeled dataset on which to train a sentiment analyzer through supervised machine learning techniques. By treating emoticons as sentiment labels for the text, I greatly expanded the dataset, a process known to improve classification accuracy [2]. I employed different feature selection methods, including n-grams, term frequency-inverse document frequency, and doc2vec, to vectorize the Stanford Twitter Sentiment140 dataset expanded by my additional emoticon-labeled data. Using these features, I trained an array of supervised learning classifiers and a convolutional neural network to determine sentiment polarity, with the aim of enhancing accuracy in a widely applicable field.	en_US
dc.language.iso	en_US	en_US
dc.title	Can Computers Learn from Emoticons? Sentiment Analysis Through Supervised Machine Learning Techniques	en_US
dc.type	Princeton University Senior Theses	-
pu.date.classyear	2017	en_US
pu.department	Operations Research and Financial Engineering	en_US
pu.pdf.coverpage	SeniorThesisCoverPage	-
pu.contributor.authorid	960861667	-
pu.contributor.advisorid	961116620	-
pu.certificate	Applications of Computing Program	en_US
Appears in Collections:	Operations Research and Financial Engineering, 2000-2019
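
The abstract describes treating emoticons as sentiment labels for the text that contains them. A minimal sketch of that labeling step follows, using only the Python standard library. The emoticon lists, the function name `label_by_emoticon`, and the rule of discarding texts with mixed or no emoticons are all assumptions made for illustration; this record does not state which emoticons or rules the thesis actually used.

```python
import re

# Hypothetical emoticon sets chosen for illustration only.
POSITIVE = (":)", ":-)", ":D", "=)")
NEGATIVE = (":(", ":-(", "=(")

# Match longer emoticons first so that ":-)" is tried before ":)".
_EMOTICON_RE = re.compile(
    "|".join(
        re.escape(e)
        for e in sorted(POSITIVE + NEGATIVE, key=len, reverse=True)
    )
)


def label_by_emoticon(text):
    """Return (cleaned_text, label), or None when the text is unusable.

    A text is unusable here when it contains no known emoticon or when it
    contains both positive and negative ones (an assumed filtering rule).
    """
    found = set(_EMOTICON_RE.findall(text))
    has_pos = any(e in POSITIVE for e in found)
    has_neg = any(e in NEGATIVE for e in found)
    if has_pos == has_neg:  # neither kind present, or both kinds present
        return None
    # Strip the emoticons so a classifier cannot simply memorize the label.
    cleaned = " ".join(_EMOTICON_RE.sub(" ", text).split())
    return cleaned, ("positive" if has_pos else "negative")


if __name__ == "__main__":
    for sample in ["great game today :)", "missed my flight :-(", "no signal"]:
        print(sample, "->", label_by_emoticon(sample))
```

The emoticons are removed from the cleaned text so that the label is not leaked into the features. A simple pattern like this can also match inside unrelated strings (for example ":D" within a URL), so a real pipeline would likely need stricter tokenization.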

Files in This Item:

File	Size	Format
Thesis_Valerie_Morin.pdf	2.78 MB	Adobe PDF
