Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp018p58pd125
Full metadata record
DC Field | Value | Language
dc.contributor | Singer, Amit | -
dc.contributor.advisor | Arora, Sanjeev | -
dc.contributor.author | Zhu, Michael | -
dc.date.accessioned | 2014-07-22T20:13:53Z | -
dc.date.available | 2014-07-22T20:13:53Z | -
dc.date.created | 2014-05-05 | -
dc.date.issued | 2014-07-22 | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp018p58pd125 | -
dc.description.abstract | Latent factor models and matrix factorization algorithms were some of the most successful stand-alone algorithms used for predicting movie ratings in the Netflix Prize. To address the sparsity in the movie rating training set, many matrix factorization algorithms train only on the observed ratings and use regularization to avoid overfitting. Topic modeling algorithms must also be able to handle high sparsity. Given a collection of documents, the purpose of topic modeling is to discover the high-level thematic structure that best explains the collection of documents as a whole. In the same way, we might hope that given a collection of movie ratings, we can uncover the high-level movie genres that best explain the collection of movie ratings as a whole. Mathematically, topic modeling can be interpreted as recovering the first factor in a matrix factorization, subject to some constraints. By this view, perhaps a topic modeling algorithm can be the first step in a matrix factorization algorithm that predicts Netflix movie ratings. In this thesis, we develop a three-step algorithm for predicting movie ratings using a matrix factorization of the form M = AW: first we obtain a collection of genres using a topic modeling algorithm, then we generate a suitable A matrix from the collection of genres, and finally we use the A matrix to get the W matrix. | en_US
dc.format.extent | 23 pages | en_US
dc.language.iso | en_US | en_US
dc.title | Predicting Netflix Movie Ratings using a Topic Modeling Algorithm | en_US
dc.type | Princeton University Senior Theses | -
pu.date.classyear | 2014 | en_US
pu.department | Mathematics | en_US
pu.pdf.coverpage | SeniorThesisCoverPage | -
Appears in Collections: Mathematics, 1934-2020
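
The abstract describes a factorization M = AW in which A is fixed first and W is then derived from it. The sketch below illustrates only that last step, and only under assumptions that are not taken from the thesis: M is laid out as movies by users, A is a movies-by-genres matrix that is already available, and W is fitted one user column at a time by ridge regression restricted to the observed ratings. The function name `solve_W`, the `mask` argument, and the penalty `lam` are hypothetical names chosen for illustration; the thesis may well obtain W differently.

```python
import numpy as np


def solve_W(M, mask, A, lam=0.1):
    """Fit W in M ~ A @ W, one user column at a time, on observed ratings only.

    M    : (n_movies, n_users) rating matrix; entries where mask is False are ignored
    mask : (n_movies, n_users) boolean array, True where a rating was observed
    A    : (n_movies, k) fixed movie-by-genre matrix (assumed given)
    lam  : ridge penalty (hypothetical default, not from the thesis)
    """
    k = A.shape[1]
    n_users = M.shape[1]
    W = np.zeros((k, n_users))
    for j in range(n_users):
        rows = mask[:, j]
        if not rows.any():
            continue  # a user with no observed ratings keeps a zero column
        A_obs = A[rows]        # genre loadings of the movies this user rated
        m_obs = M[rows, j]     # the ratings this user actually gave
        # Regularized normal equations: (A_obs^T A_obs + lam I) w = A_obs^T m_obs
        gram = A_obs.T @ A_obs + lam * np.eye(k)
        W[:, j] = np.linalg.solve(gram, A_obs.T @ m_obs)
    return W


# Toy usage: 4 movies, 2 genres, 3 users, with two ratings hidden.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
W_true = np.array([[1.0, 2.0, 0.5], [3.0, 1.0, 2.0]])
M = A @ W_true
mask = np.ones(M.shape, dtype=bool)
mask[0, 0] = False
mask[3, 2] = False
W = solve_W(M, mask, A, lam=1e-8)
M_hat = A @ W  # predictions for every (movie, user) pair, including hidden ones
```

Restricting each solve to the observed rows mirrors the abstract's remark that many matrix factorization algorithms train only on observed ratings and regularize to avoid overfitting; the ridge term also keeps each small linear system solvable when a user has rated fewer movies than there are genres.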

Files in This Item:
File | Size | Format
Michael Zhu thesis.pdf | 607.5 kB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.