Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01p2676z32x
Title: Privacy-Preserving Machine Learning via Data Compression & Differential Privacy
Authors: Chanyaswad, Thee
Advisors: Kung, Sun-Yuan; Mittal, Prateek
Contributors: Electrical Engineering Department
Keywords: Compressive privacy; Differential privacy; Discriminant analysis; Kernel methods; Machine learning; Privacy-preserving machine learning
Subjects: Electrical engineering; Artificial intelligence; Computer science
Issue Date: 2018
Publisher: Princeton, NJ : Princeton University
Abstract: In the current world of ubiquitous information, data are generated at every moment. These data have been used to improve our daily lives in many respects, such as image recognition, speech recognition, and medical diagnosis. The successful utilization of the data can largely be attributed to progress in machine learning, which has given rise to potent tools such as kernel machines and artificial neural networks. Despite the benefits provided by machine learning, the ubiquitous nature of the data raises an important concern, namely the privacy of the individuals with whom the data are associated. While machine learning tries to learn as much as possible from the data, the privacy concern creates the desire to conceal as much information as possible; the two goals therefore inevitably intersect.

The body of work presented in this dissertation explores this intersection between data privacy and machine learning, drawing on linear systems theory, signal processing, statistics, and information theory. It considers two prominent privacy protection regimes: compressive privacy and differential privacy. The former protects privacy by publishing or using only the information minimally required for a specific task. The latter protects privacy by ensuring that the presence of any record in the database cannot be inferred from the released output. For compressive privacy, the dissertation develops techniques based on discriminant analysis and applies them to data compression and desensitization, multi-kernel learning, kernel and feature-map selection, and outlier removal. Under differential privacy, data compression and machine learning techniques also prove impactful: the dissertation shows how coupling dimensionality reduction with generative models can address the challenging problem of non-interactive differentially private data release, and how linear systems theory and statistics can be used to analyze and design a novel mechanism for differential privacy under matrix-valued queries.

These results have spawned many promising research directions spanning signal processing, matrix analysis, and information theory. Despite its rapid development in recent years, privacy-preserving machine learning is still in its infancy compared with other fields. Nevertheless, its importance cannot be overstated, and future work in the area will have a meaningful impact on our quality of life in this increasingly technology-oriented world.
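To make the abstract's notion of "noise calibrated to a query" concrete, the snippet below is a minimal sketch of the classical (epsilon, delta) Gaussian mechanism applied to a matrix-valued query (here, a covariance matrix of norm-clipped records). It is an illustrative toy under stated assumptions, not the matrix-valued mechanism developed in the dissertation; the function name `gaussian_mechanism_matrix`, the row-clipping step, and the sensitivity bound are assumptions introduced for this example.

```python
import numpy as np

def gaussian_mechanism_matrix(query_output, l2_sensitivity, epsilon, delta, rng=None):
    """Release a matrix-valued query under (epsilon, delta)-differential privacy
    by adding i.i.d. Gaussian noise to every entry.

    This is the textbook Gaussian mechanism, shown only to illustrate calibrating
    noise to query sensitivity; it is NOT the mechanism proposed in the dissertation.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Noise scale calibrated to the query's L2 sensitivity:
    # sigma >= sqrt(2 ln(1.25/delta)) * sensitivity / epsilon (valid for epsilon < 1).
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * l2_sensitivity / epsilon
    noise = rng.normal(scale=sigma, size=query_output.shape)
    return query_output + noise

# Example (hypothetical data): privately release the covariance matrix of a toy
# dataset whose rows are clipped to L2 norm <= 1, so replacing one record changes
# the covariance by at most ~2/n in Frobenius norm.
X = np.random.default_rng(0).normal(size=(500, 5))
X = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1.0)  # clip rows to norm <= 1
cov = X.T @ X / X.shape[0]
private_cov = gaussian_mechanism_matrix(cov, l2_sensitivity=2.0 / X.shape[0],
                                        epsilon=1.0, delta=1e-5)
```

Adding independent noise entry-wise is the simplest baseline for matrix-valued outputs; the dissertation's contribution is a mechanism designed specifically for such queries, which this sketch does not attempt to reproduce.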
URI: http://arks.princeton.edu/ark:/88435/dsp01p2676z32x
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering
Files in This Item:
File | Description | Size | Format
---|---|---|---
Chanyaswad_princeton_0181D_12796.pdf | | 4.86 MB | Adobe PDF
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.