Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01ff365817r
Title: Synthesis of Efficient Neural Networks
Authors: Dai, Xiaoliang
Advisors: Jha, Niraj K
Contributors: Electrical Engineering Department
Keywords: Computer vision; Deep learning; Neural network
Subjects: Electrical engineering; Computer science
Issue Date: 2019
Publisher: Princeton, NJ : Princeton University
Abstract: Over the last decade, deep neural networks (DNNs) have begun to revolutionize numerous research areas. Their ability to distill knowledge through multi-level abstraction has yielded state-of-the-art performance for myriad applications. However, the design and deployment of DNNs remain challenging and suffer from four problems: (1) Lengthy search: searching for an appropriate DNN architecture through trial-and-error or reinforcement learning is inefficient and computationally intensive. (2) Vast computation cost: with millions of parameters and floating-point operations, DNN deployment consumes substantial resources. (3) Heterogeneous platform characteristics: real-world applications run on platforms with divergent hardware characteristics, so a single DNN architecture rarely runs optimally across all of them. (4) Model update difficulty: in many real-world scenarios, training data arrive continuously, yet conventional DNNs with fixed architectures cannot adaptively adjust their capacity to accommodate new data. To address these problems, we propose an efficient DNN synthesis framework that automatically and efficiently generates accurate and compact DNN models from dynamic data.

We first introduce a network growth algorithm that complements network pruning to learn both weights and compact DNN architectures during training. We combine the two methods in a DNN synthesis tool (NeST) that automates the synthesis of compact and accurate DNNs. NeST starts with a randomly initialized sparse network called the seed architecture and iteratively tunes it with gradient-based growth and magnitude-based pruning. Our experimental results show that NeST yields accurate and very compact DNNs; for example, it reduces the number of parameters of VGG-16 on the ImageNet dataset by 33.2×.

Next, we develop a synthesis method for training efficient long short-term memories (LSTMs). We propose the hidden-layer LSTM (H-LSTM), which adds hidden layers to the LSTM's original one-level control gates. An H-LSTM increases accuracy while requiring fewer external stacked layers, thus reducing the number of parameters. We extend the grow-and-prune synthesis paradigm to train the hidden layers, learning both the weights and a compact architecture. The generated architectures are compact, fast, and accurate. For example, for the NeuralTalk architecture on the MSCOCO dataset, our three models reduce the number of parameters by 38.7×, reduce latency by 4.5×, and improve the CIDEr-D score by 2.8%, respectively.

To update an existing DNN to accommodate new data, we propose an efficient incremental learning framework. When new data arrive, the network first grows new connections to increase its capacity to absorb the new information. It then prunes connections based on weight magnitude to restore compactness, and hence efficiency. The result is a lightweight DNN that is both ready for inference and suitable for future grow-and-prune updates. The proposed framework simultaneously improves accuracy, shrinks model size, and reduces the additional training cost.

For efficient DNN deployment, we propose a framework called Chameleon that adapts DNNs to target latency and/or energy constraints of real-world applications. At the core of our algorithm lies an accuracy predictor built atop a Gaussian process, with Bayesian optimization for iterative sampling. With a one-time cost to build the accuracy predictor and two additional predictors (latency and energy), our algorithm produces state-of-the-art model architectures on different platforms under given constraints in just minutes. The generated ChamNet models achieve significant accuracy improvements over state-of-the-art handcrafted and automatically designed architectures. At reduced latency, our models achieve up to 8.2% and 6.7% absolute top-1 accuracy improvements over MobileNetV2 and MnasNet, respectively, on a mobile CPU.
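The grow-and-prune paradigm described above (also reused in the incremental learning framework) can be illustrated with a short sketch. The following is a minimal, illustrative PyTorch example of one gradient-based growth step and one magnitude-based pruning step on a single masked fully connected layer; the function names, mask representation, and growth/pruning fractions are assumptions for illustration, not the dissertation's implementation.

```python
# Minimal grow-and-prune sketch on one masked linear layer (illustrative only).
import torch
import torch.nn as nn

def grow_connections(mask, grad, grow_fraction=0.05):
    """Gradient-based growth: activate dormant connections whose loss-gradient
    magnitude is largest, i.e. those most likely to reduce the loss."""
    n_grow = int(grow_fraction * mask.numel())
    if n_grow == 0:
        return mask
    scores = grad.abs() * (mask == 0).float()      # ignore already-active weights
    top = torch.topk(scores.view(-1), n_grow).indices
    mask.view(-1)[top] = 1.0
    return mask

def prune_connections(weight, mask, prune_fraction=0.05):
    """Magnitude-based pruning: deactivate the smallest-magnitude active weights."""
    n_prune = int(prune_fraction * mask.numel())
    if n_prune == 0:
        return mask
    magnitudes = weight.abs().masked_fill(mask == 0, float("inf"))
    bottom = torch.topk(magnitudes.view(-1), n_prune, largest=False).indices
    mask.view(-1)[bottom] = 0.0
    return mask

# Toy usage: a sparse "seed" layer, one growth step, then one pruning step.
# Newly grown weights simply keep their random initialization in this sketch.
layer = nn.Linear(64, 32)
mask = (torch.rand_like(layer.weight) < 0.1).float()   # sparse seed architecture
x, y = torch.randn(8, 64), torch.randn(8, 32)
loss = ((x @ (layer.weight * mask).t() + layer.bias - y) ** 2).mean()
loss.backward()
mask = grow_connections(mask, layer.weight.grad)
mask = prune_connections(layer.weight.data, mask)
```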
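The H-LSTM idea of deepening the control gates can likewise be sketched. In the example below, each gate of an LSTM cell applies a small multi-layer network to the concatenated input and hidden state instead of a single affine transform; the class name, gate depth, and widths are illustrative assumptions rather than the architecture reported in the dissertation.

```python
# Sketch of an LSTM cell with hidden layers inside each control gate (illustrative).
import torch
import torch.nn as nn

class HLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size, gate_hidden=64):
        super().__init__()
        def gate():
            # one hidden layer per gate; depth and width are illustrative choices
            return nn.Sequential(
                nn.Linear(input_size + hidden_size, gate_hidden),
                nn.ReLU(),
                nn.Linear(gate_hidden, hidden_size),
            )
        self.i_gate, self.f_gate = gate(), gate()
        self.o_gate, self.g_gate = gate(), gate()

    def forward(self, x, state):
        h, c = state
        xh = torch.cat([x, h], dim=-1)
        i = torch.sigmoid(self.i_gate(xh))   # input gate
        f = torch.sigmoid(self.f_gate(xh))   # forget gate
        o = torch.sigmoid(self.o_gate(xh))   # output gate
        g = torch.tanh(self.g_gate(xh))      # candidate cell update
        c_next = f * c + i * g
        h_next = o * torch.tanh(c_next)
        return h_next, c_next

# Toy usage on a single time step.
cell = HLSTMCell(input_size=32, hidden_size=128)
x = torch.randn(4, 32)
h0, c0 = torch.zeros(4, 128), torch.zeros(4, 128)
h1, c1 = cell(x, (h0, c0))
```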
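The accuracy predictor at the core of Chameleon can be approximated conceptually with an off-the-shelf Gaussian-process regressor. The sketch below fits a GP to hypothetical (architecture encoding, measured accuracy) pairs and scores candidate architectures with a simple uncertainty-aware acquisition in the spirit of Bayesian optimization; the encodings, accuracy values, kernel, and acquisition weight are all assumptions for illustration, and constraint checking against the latency/energy predictors is omitted.

```python
# Illustrative Gaussian-process accuracy predictor with a simple acquisition score.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Hypothetical architecture encodings (e.g. width, depth, input resolution)
# and their measured top-1 accuracies; real data would come from trained models.
X_train = np.array([[16, 4, 96], [32, 6, 128], [64, 8, 160], [96, 12, 224]], float)
y_train = np.array([0.58, 0.66, 0.71, 0.75])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
gp.fit(X_train, y_train)

# Candidate architectures assumed to satisfy the latency/energy constraints.
candidates = np.array([[48, 6, 128], [80, 10, 192], [64, 12, 224]], float)
mean, std = gp.predict(candidates, return_std=True)
acquisition = mean + 0.5 * std          # upper-confidence-bound style score
best = candidates[np.argmax(acquisition)]
print("next architecture to evaluate:", best)
```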
URI: http://arks.princeton.edu/ark:/88435/dsp01ff365817r
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering
Files in This Item:
File | Description | Size | Format
---|---|---|---
Dai_princeton_0181D_13118.pdf | | 5.28 MB | Adobe PDF