Differential Equation Interpretation of Deep Neural Networks
Zhang, Qihang (Spring 2019)
Abstract
Deep neural networks have become state-of-the-art tools for supervised machine learning, able to extract features and find patterns in complicated input data. Although there are no general guidelines for designing architectures that generalize well to unseen data, researchers have recently proposed a connection between deep neural network structures and ordinary differential equations, an approach that has attracted considerable attention. In this thesis, we first illustrate the continuous interpretation of deep neural networks on a concrete example. To do so, we use a modified Residual Neural Network (ResNet) structure that allows us to discretize the network and the weights separately. When the step size used to discretize the network is large, the network cannot classify the labels well; as the step size decreases, its classification performance improves. Next, based on the continuous interpretation, we examine the stability of the forward propagation of a deep neural network by computing the eigenvalues of the Jacobian matrix at each time point. Furthermore, we visualize the objective function of a deep neural network to explore how the network structure can be modified to make training more efficient. Last, we propose a new forward propagation architecture based on the fourth-order Runge-Kutta method and compare it with the forward propagation of a standard Residual Neural Network.
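To make the continuous interpretation concrete, here is a brief sketch of the correspondence the abstract describes (the notation is illustrative and assumed, not taken from the thesis). A residual block with step size $h$,

$$\mathbf{y}_{j+1} = \mathbf{y}_j + h\,\sigma\!\left(K_j \mathbf{y}_j + b_j\right), \qquad j = 0, \dots, N-1,$$

is one forward Euler step for the ordinary differential equation

$$\dot{\mathbf{y}}(t) = \sigma\!\left(K(t)\,\mathbf{y}(t) + b(t)\right), \qquad \mathbf{y}(0) = \mathbf{y}_0, \qquad t \in [0, T],$$

so shrinking $h$ while keeping the final time $T = Nh$ fixed refines the discretization (Section 3.1). The stability question of Section 3.2 is then the standard ODE one: forward propagation is well behaved when the eigenvalues $\lambda_i$ of the Jacobian $J(t) = \partial f / \partial \mathbf{y}$ of the right-hand side satisfy $\mathrm{Re}\!\left(\lambda_i(J(t))\right) \le 0$ for all $t$.

The RK4 architecture of Section 3.4 replaces the Euler update with the classical fourth-order Runge-Kutta step. A minimal MATLAB sketch, assuming a tanh activation and small illustrative weights (none of the names or sizes below come from the thesis code):

```matlab
% One forward-propagation step through a layer with dynamics
% f(y) = tanh(K*y + b): standard ResNet (forward Euler) vs. RK4.
f = @(y, K, b) tanh(K*y + b);   % layer right-hand side (assumed activation)
h = 0.1;                        % step size of the discretization
K = randn(2); b = randn(2, 1);  % illustrative layer weights
y = randn(2, 1);                % incoming features

% Standard residual block: one forward Euler step.
yEuler = y + h * f(y, K, b);

% Classical fourth-order Runge-Kutta block.
k1 = f(y,              K, b);
k2 = f(y + (h/2) * k1, K, b);
k3 = f(y + (h/2) * k2, K, b);
k4 = f(y + h     * k3, K, b);
yRK4 = y + (h/6) * (k1 + 2*k2 + 2*k3 + k4);
```

The RK4 block costs four evaluations of $f$ per step instead of one, trading extra computation for a higher-order approximation of the underlying ODE trajectory.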
Table of Contents
1 Introduction
1.1 Contributions and Outline
2 Background
2.1 Theoretical Background of Deep Neural Networks
2.2 Network Architecture
2.2.1 Loss Function and Optimization Methods
2.2.2 The Residual Neural Network
2.3 Differential Equation Interpretation
2.4 Stability of Deep Neural Networks
2.4.1 Stability of ODEs
2.5 Runge-Kutta Methods
3 Experiments and Results
3.1 ResNets with Different Step Sizes
3.2 Stability of the ResNet Forward Propagation
3.3 Visualizing the Loss Function
3.4 RK4 Forward Propagation
4 Summary and Conclusion
5 MATLAB® Code
About this Honors Thesis
School |
---|---
Department |
Degree |
Submission |
Language |
Research Field |
Keywords |
Committee Chair / Thesis Advisor |
Committee Members |
Primary PDF
Title | Date Uploaded
---|---
Differential Equation Interpretation of Deep Neural Networks | 2019-04-09 12:17:51 -0400