Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp018049g730x
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Chiang, Mung | en_US |
dc.contributor.author | Wong, Ming Fai Felix | en_US |
dc.contributor.other | Electrical Engineering Department | en_US |
dc.date.accessioned | 2015-03-26T14:29:45Z | - |
dc.date.available | 2015-03-26T14:29:45Z | - |
dc.date.issued | 2015 | en_US |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp018049g730x | - |
dc.description.abstract | As new sources of large-scale data with increasing volume and complexity are being created, finding scalable ways to gain insights from unstructured big data has become a big challenge. Furthermore, our big data challenge is exacerbated by the data often being noisy, sparse and heterogeneous. This dissertation illustrates, through three studies, the benefits of using an optimization framework to devise methods for big data analytics. We study three problems in computational political science, finance and recommender systems, analyzing a wide range of data, including time series, text, ratings and social networks. First, we propose an inference technique to quantify the political leaning of Twitter users based on the patterns of how they get retweeted. We apply the technique to Twitter data collected during the U.S. presidential election of 2012. Second, we propose a joint latent space model for stock price movements and word usage patterns in newspapers. We apply the model and develop an algorithm to predict stock closing prices of a given day using the full text of The Wall Street Journal of the morning. Finally, we study the fundamental question of evaluating the quality of a social recommender network. We propose a pair of metrics to quantify (a) a network's efficiency in disseminating recommendations and (b) the quality of a user's neighbors in the network. Then we empirically study their tradeoff on Yelp data, and devise an algorithm to improve a network's quality through friend recommendation and news feed curation. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Princeton, NJ : Princeton University | en_US |
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog (http://catalog.princeton.edu). | en_US |
dc.subject.classification | Electrical engineering | en_US |
dc.subject.classification | Computer science | en_US |
dc.title | Optimization Techniques for Data Analytics | en_US |
dc.type | Academic dissertations (Ph.D.) | en_US |
pu.projectgrantnumber | 690-2143 | en_US |
Appears in Collections: Electrical Engineering
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Wong_princeton_0181D_11235.pdf | | 1.99 MB | Adobe PDF |
Items in DataSpace are protected by copyright, with all rights reserved, unless otherwise indicated.