Differential Equation Interpretation of Deep Neural Networks

Zhang, Qihang (Spring 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/t435gf14j?locale=en

Abstract

Deep neural networks have become the state-of-the-art tools for supervised machine learning, owing to their ability to extract features and find patterns in complicated input data. Although there are no general guidelines for designing architectures that generalize well to unseen data, researchers have recently proposed a connection between deep neural network architectures and ordinary differential equations. This interpretation has gained much attention and has been widely adopted. In this thesis, we first illustrate the continuous interpretation of deep neural networks on a concrete example. To do so, we use a modified Residual Neural Network (ResNet) structure that allows us to discretize the network and the weights separately. When the step size used to discretize the network is large, the network cannot classify the labels well; as the step size decreases, the network's classification performance improves. Next, based on the continuous interpretation, we examine the stability of the forward propagation of a deep neural network by computing the eigenvalues of the Jacobian matrix at each time point. Furthermore, we visualize the objective function of a deep neural network to explore how the network structure can be modified to make training more efficient. Last, we propose a new forward propagation architecture based on the fourth-order Runge-Kutta method and compare it with the forward propagation of a standard Residual Neural Network.
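The central ideas of the abstract can be sketched in a few lines of code. The thesis itself uses MATLAB; the following is an illustrative Python sketch, not the author's implementation. It assumes a simple residual block f(y) = tanh(K y + b) with weights K, b held fixed across steps, and shows (i) the standard ResNet update read as a forward Euler step of the ODE y'(t) = f(y(t)), (ii) the proposed replacement of that update with a classical fourth-order Runge-Kutta (RK4) step, and (iii) the Jacobian-eigenvalue computation used in the stability analysis. All function names here are hypothetical.

```python
import numpy as np

def layer(y, K, b):
    """One residual-block transformation: f(y) = tanh(K y + b)."""
    return np.tanh(K @ y + b)

def resnet_forward(y0, K, b, h, n_steps):
    """Standard ResNet forward propagation, read as the forward Euler
    discretization y_{j+1} = y_j + h * f(y_j) of y'(t) = f(y(t))."""
    y = y0.copy()
    for _ in range(n_steps):
        y = y + h * layer(y, K, b)
    return y

def rk4_forward(y0, K, b, h, n_steps):
    """Forward propagation with a classical RK4 step replacing the Euler
    update (weights held fixed across the four stages in this sketch)."""
    y = y0.copy()
    for _ in range(n_steps):
        k1 = layer(y, K, b)
        k2 = layer(y + 0.5 * h * k1, K, b)
        k3 = layer(y + 0.5 * h * k2, K, b)
        k4 = layer(y + h * k3, K, b)
        y = y + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

def jacobian_eigenvalues(y, K, b):
    """Eigenvalues of the Jacobian of f at y. For f(y) = tanh(K y + b)
    the Jacobian is diag(1 - tanh(K y + b)^2) @ K; eigenvalues with
    large positive real part signal unstable forward propagation."""
    s = 1.0 - np.tanh(K @ y + b) ** 2
    return np.linalg.eigvals(s[:, None] * K)
```

As the step size h shrinks (with the total time h * n_steps fixed), both propagations converge to the same continuous trajectory, which is the discretization experiment described above.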

Table of Contents

1 Introduction

  1.1 Contributions and Outline

2 Background

  2.1 Theoretical Background of Deep Neural Networks

  2.2 Network Architecture

    2.2.1 Loss Function and Optimization Methods

    2.2.2 The Residual Neural Network

  2.3 Differential Equation Interpretation

  2.4 Stability of Deep Neural Networks

    2.4.1 Stability of ODEs

  2.5 Runge-Kutta Methods

3 Experiments and Results

  3.1 ResNets with Different Step Sizes

  3.2 Stability of the ResNet Forward Propagation

  3.3 Visualizing the Loss Function

  3.4 RK4 Forward Propagation

4 Summary and Conclusion

5 MATLAB® Code

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English