Multi-scale adaptive representation of signals: models and algorithms

Tai, Cheng

Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01jm214r56n

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	E, Weinan	-
dc.contributor.author	Tai, Cheng	-
dc.contributor.other	Applied and Computational Mathematics Department	-
dc.date.accessioned	2016-06-08T18:42:04Z	-
dc.date.available	2016-06-08T18:42:04Z	-
dc.date.issued	2016	-
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/dsp01jm214r56n	-
dc.description.abstract	Representations of data play a key role in many signal processing and machine learning applications. For low-level signal processing tasks, dictionary learning is a very popular approach for representing signals and has been successfully used in image/video denoising, compression and inpainting tasks. For high-level signal processing tasks such as object recognition, deep learning models, especially convolutional neural networks (CNNs), are becoming an increasingly popular approach to provide abstract representations of signals. Dictionary learning and deep learning are two motivating sources of this thesis. In this thesis, we study the modeling and algorithmic aspects of building multi-scale adaptive representations of signals. Special attention is given to the computational efficiency of such representations. In the first part of this thesis, we provide a framework for constructing adaptive wavelet frames and bi-frames (abbreviated as AdaFrame). This framework gives multi-scale, sparse representations of the signal, with an efficiency comparable to that of wavelets at inference time. Similar to dictionary learning, it is also adapted to data. These features make AdaFrame an attractive alternative to dictionary learning and the more conventional wavelet frames. The proposed framework is formally similar to the first few layers of a convolutional network. As a byproduct, we show that the proposed framework gives a better way of visualizing the activations of the intermediate layers of a neural net in terms of reconstruction error. Some examples are given to demonstrate the wide applicability of AdaFrame, including image denoising, image compression, object recognition and video super-resolution. In the second part of this thesis, we consider accelerating CNNs by low-rank approximations. CNNs are typical examples of adaptive representations of signals, and have delivered impressive performance in various computer vision applications. However, the storage and computation requirements make it problematic to deploy these models on mobile devices. We propose a new tensor decomposition technique for accelerating CNNs. As a result, typical modern CNNs can be accelerated by a factor of up to 3, and the number of parameters is also reduced by a significant factor. In the third part of the thesis, we consider the dynamics of stochastic gradient algorithms (SGAs). SGAs have become the algorithm of choice for large-scale machine learning applications. They are used for training AdaFrame and CNNs on large datasets. As we focus on computational efficiency, we want to understand the dynamics of SGAs as well as ways to improve them. We propose the method of stochastic modified equations (SME) to analyze the dynamics of the SGA. Using this technique, we can give precise characterizations of both the initial convergence speed and the eventual oscillations, at least in some special cases. Furthermore, the SME formalism allows us to characterize various speed-up techniques, such as introducing momentum, adjusting the learning rate and adjusting the mini-batch sizes. Previously, these techniques relied mostly on heuristics. Besides introducing simple examples to illustrate the SME formalism, we also apply the framework to improve the relaxed randomized Kaczmarz method for solving linear equations. The SME framework is a precise and unifying approach to understanding and improving the SGA, and has the potential to be applied to many more stochastic algorithms.	-
dc.language.iso	en	-
dc.publisher	Princeton, NJ : Princeton University	-
dc.relation.isformatof	The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: http://catalog.princeton.edu/	-
dc.subject	computer vision	-
dc.subject	machine learning	-
dc.subject	signal processing	-
dc.subject.classification	Applied mathematics	-
dc.title	Multi-scale adaptive representation of signals: models and algorithms	-
dc.type	Academic dissertations (Ph.D.)	-
pu.projectgrantnumber	690-2143	-
Appears in Collections:	Applied and Computational Mathematics

Files in This Item:

File	Description	Size	Format
Tai_princeton_0181D_11706.pdf		8.09 MB	Adobe PDF
