Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01jm214r56n
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | E, Weinan | - |
dc.contributor.author | Tai, Cheng | - |
dc.contributor.other | Applied and Computational Mathematics Department | - |
dc.date.accessioned | 2016-06-08T18:42:04Z | - |
dc.date.available | 2016-06-08T18:42:04Z | - |
dc.date.issued | 2016 | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01jm214r56n | - |
dc.description.abstract | Representations of data play a key role in many signal processing and machine learning applications. For low-level signal processing tasks, dictionary learning is a very popular approach for representing signals and has been successfully used in image/video denoising, compression and inpainting tasks. For high-level signal processing tasks such as object recognition, deep learning models, especially convolutional neural networks (CNNs), are becoming an increasingly popular approach to provide abstract representations of signals. Dictionary learning and deep learning are two motivating sources of this thesis. In this thesis, we study the modeling and algorithmic aspects of building multi-scale adaptive representations of signals. Special attention is given to the computational efficiency of such representations. In the first part of this thesis, we provide a framework for constructing adaptive wavelet frames and bi-frames (abbreviated as AdaFrame). This framework gives multi-scale, sparse representations of the signal, with an efficiency comparable to that of the wavelets at inference time. Similar to dictionary learning, it is also adapted to data. These features make AdaFrame an attractive alternative to dictionary learning and the more conventional wavelet frames. The proposed framework is formally similar to the first few layers of a convolutional network. As a byproduct, we show that the proposed framework gives a better way of visualizing the activations of the intermediate layers of a neural net in terms of reconstruction error. Some examples are given to demonstrate the wide applicability of AdaFrame, including image denoising, image compression, object recognition and video super-resolution. In the second part of this thesis, we consider accelerating CNNs by low-rank approximations. CNNs are typical examples of adaptive representations of signals, and have delivered impressive performance in various computer vision applications. However, the storage and computation requirements make it problematic to deploy these models on mobile devices. We propose a new tensor decomposition technique for accelerating CNNs. As a result, typical modern CNNs can be accelerated by a factor of up to 3 and the number of parameters is also reduced by a significant factor. In the third part of the thesis, we consider the dynamics of stochastic gradient algorithms (SGAs). SGAs have become the algorithm of choice for large scale machine learning applications. They are used for training AdaFrame and CNNs on large datasets. As we focus on the computational efficiency, we want to understand the dynamics of SGAs as well as ways to improve them. We propose the method of stochastic modified equations (SME) to analyze the dynamics of the SGA. Using this technique, we can give precise characterizations for both the initial convergence speed and the eventual oscillations, at least in some special cases. Furthermore, the SME formalism allows us to characterize various speed-up techniques, such as introducing momentum, adjusting the learning rate and adjusting the mini-batch sizes. Previously, these techniques relied mostly on heuristics. Besides introducing simple examples to illustrate the SME formalism, we also apply the framework to improve the relaxed randomized Kaczmarz method for solving linear equations. The SME framework is a precise and unifying approach to understanding and improving the SGA, and has the potential to be applied to many more stochastic algorithms. | - |
dc.language.iso | en | - |
dc.publisher | Princeton, NJ : Princeton University | - |
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: http://catalog.princeton.edu/ | - |
dc.subject | computer vision | - |
dc.subject | machine learning | - |
dc.subject | signal processing | - |
dc.subject.classification | Applied mathematics | - |
dc.title | Multi-scale adaptive representation of signals: models and algorithms | - |
dc.type | Academic dissertations (Ph.D.) | - |
pu.projectgrantnumber | 690-2143 | - |
Appears in Collections: | Applied and Computational Mathematics |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Tai_princeton_0181D_11706.pdf | | 8.09 MB | Adobe PDF | View/Download |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.