Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01tx31qm55v
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Narayanan, Arvind | -
dc.contributor.author | Li, Frank | -
dc.date.accessioned | 2019-08-19T12:18:27Z | -
dc.date.available | 2019-08-19T12:18:27Z | -
dc.date.created | 2019-05-02 | -
dc.date.issued | 2019-08-19 | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01tx31qm55v | -
dc.description.abstract | Internet users can accidentally expose their own private information in a myriad of ways. This paper describes our approach to a large-scale measurement study on one case of online privacy leakage wherein users upload files for publication and sharing, files that can contain users’ private information hidden within them. We analyze comments in TeX source files of arXiv publications using various natural language processing techniques to identify specific attributes of comments that may represent privacy violations. We also perform near-duplicate detection and clustering on a large data set of privacy policy texts to understand how online privacy policy is communicated to users. We find that arXiv publications contain many interesting comments despite the ease with which authors can strip out all comments. We find that many privacy policy texts are duplicates or near-duplicates of one another. | en_US
dc.format.mimetype | application/pdf | -
dc.language.iso | en | en_US
dc.title | Privacy Implications of Not-So-Hidden Comments in arXiv Files and Analysis of Online Privacy Policies | en_US
dc.type | Princeton University Senior Theses | -
pu.date.classyear | 2019 | en_US
pu.department | Electrical Engineering | en_US
pu.pdf.coverpage | SeniorThesisCoverPage | -
pu.contributor.authorid | 961167173 | -
pu.certificate | Applications of Computing Program | en_US
Appears in Collections: Electrical Engineering, 1932-2020

Files in This Item:
File | Description | Size | Format
LI-FRANK-THESIS.pdf | | 717.91 kB | Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.