Regularization and Compression of Deep Neural Networks

Najeeb Khan, Ph.D. Candidate

 

Abstract: Deep neural networks (DNNs) are state-of-the-art machine learning models that outperform traditional machine learning methods in a number of domains, from vision and speech to natural language understanding and autonomous control. As large amounts of data become available, the task performance of DNNs in these domains scales predictably with model size. However, in data-scarce scenarios, large DNNs overfit to the training dataset, resulting in inferior performance. Additionally, in scenarios where enormous amounts of data are available, large DNNs incur high inference latency and memory costs. Thus, while imperative for achieving state-of-the-art performance, large DNNs require large amounts of data for training and large computational resources during inference. These two problems could be addressed by training large DNNs with sparsity constraints. Imposing sparsity constraints during training limits the capacity of the model to overfit to the training set while still allowing good generalization. Sparse DNNs have most of their weights close to zero after training, so most of the weights could be removed, reducing inference costs.

In this talk, two new sparse stochastic regularization techniques, Bridgeout and Sparseout, will be presented. Sparsity control of the weights and activations of DNNs will be demonstrated using these techniques. Evaluation of the proposed techniques on fully connected, convolutional, and recurrent neural networks in computer vision and language modeling will be presented. Furthermore, the talk will demonstrate the use of Bridgeout for pruning filters in convolutional neural networks for low-cost inference.

 

Biography: Najeeb Khan is a Ph.D. Candidate in the Department of Computer Science, supervised by Prof. Ian Stavness. His research interests broadly include fundamental problems in machine learning as well as novel applications of machine learning to real-world problems in computer vision, satellite communications, and biomechanics. Najeeb's MS thesis involved speech signal processing, and his BS thesis involved the DSP implementation of adaptive filters.

March 22, 2021 at 2:00 PM via Zoom