Title: Rethinking the Science of Statistical Privacy
Authors: Liu, Changchang
Advisors: Mittal, Prateek
Contributors: Electrical Engineering Department
Keywords: Auxiliary Information; Differential Privacy; Statistical Privacy
Subjects: Computer science; Statistics
Issue Date: 2019
Publisher: Princeton, NJ : Princeton University
Abstract: Increasing amounts of data, such as social network data, mobility data, business data, and medical data, are shared or made public to enable real-world applications. Such data is likely to contain sensitive information and thus needs to be obfuscated prior to release to protect privacy. However, existing statistical data privacy mechanisms in the security community have several weaknesses: (1) they are limited to protecting sensitive information in static scenarios and cannot be generally applied to accommodate temporal dynamics; with the growth of data science, large amounts of sensitive data such as personal social relationships are becoming public, making the privacy of time series data increasingly challenging; (2) they do not explicitly capture correlations, leaving open the possibility of inference attacks; in many real-world scenarios, tuple dependence/correlation occurs naturally in datasets due to social, behavioral, and genetic interactions between users; and (3) there are very few practical guidelines on applying existing statistical privacy notions in practice, and a key challenge is how to set appropriate values for the privacy parameters.

In this thesis, we aim to overcome these weaknesses by providing privacy guarantees for dynamic and dependent (correlated) data structures. We also aim to discover useful and interpretable guidelines for selecting proper parameter values in state-of-the-art privacy-preserving frameworks, and we investigate how auxiliary information, in the form of the prior distribution of the database and correlation across records and time, influences the proper choice of the privacy parameters. Specifically, we (1) propose the design of a privacy-preserving system, LinkMirage, that mediates access to dynamic social relationships in social networks while effectively supporting social-graph-based data analytics; (2) explicitly incorporate structural properties of data into current differential privacy metrics and mechanisms, to enable privacy-preserving data analytics for dependent/correlated data; and (3) provide a quantitative analysis of how hypothesis testing can guide the choice of the privacy parameters in an interpretable manner, for differential privacy as well as other statistical privacy frameworks. Overall, our work aims to place the field of statistical data privacy on a firm analytic foundation, coupled with the design of practical systems.
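To make the second and third contributions concrete, the sketch below illustrates two standard facts about ε-differential privacy that the abstract alludes to; it is a minimal illustration, not code from the dissertation. The `dependence_factor` parameter is a hypothetical stand-in for how tuple correlation inflates a query's effective sensitivity (the general idea behind dependent differential privacy), and `epsilon_from_error_target` inverts the hypothesis-testing bound FPR + FNR ≥ 2 / (1 + e^ε), which holds for any adversary trying to distinguish two neighboring databases under ε-DP.

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, dependence_factor=1.0):
    """Release true_value with Laplace noise of scale sensitivity / epsilon.

    dependence_factor is a hypothetical knob standing in for the effect of
    tuple correlation: correlated records enlarge the effective sensitivity
    of a query, so the noise scale must grow accordingly (the general idea
    behind dependent differential privacy).
    """
    scale = dependence_factor * sensitivity / epsilon
    # The difference of two i.i.d. Exp(1) draws is Laplace(0, 1); rescale it.
    noise = scale * (random.expovariate(1.0) - random.expovariate(1.0))
    return true_value + noise

def epsilon_from_error_target(total_error):
    """Pick epsilon from a hypothesis-testing target.

    Under epsilon-DP, any test distinguishing two neighboring databases has
    FPR + FNR >= 2 / (1 + exp(epsilon)). Inverting gives the largest epsilon
    that still forces the adversary's combined error to stay >= total_error
    (for 0 < total_error < 1).
    """
    return math.log(2.0 / total_error - 1.0)

if __name__ == "__main__":
    eps = epsilon_from_error_target(0.5)  # log(3) ~ 1.10
    print(f"to force FPR + FNR >= 0.5, use epsilon <= {eps:.3f}")
    # A counting query (sensitivity 1); a hypothetical dependence factor of 2
    # doubles the noise scale to account for correlated records.
    print(laplace_mechanism(100, sensitivity=1.0, epsilon=eps,
                            dependence_factor=2.0))
```

Read this way, ε stops being an abstract knob: a practitioner fixes a floor on the adversary's combined false positive and false negative rates, and the bound yields the largest ε that still enforces it.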
URI: http://arks.princeton.edu/ark:/88435/dsp017d278w81g
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering
Files in This Item:
File | Description | Size | Format
---|---|---|---
Liu_princeton_0181D_12887.pdf | | 4.04 MB | Adobe PDF