Differential Equation Interpretation of Deep Neural Networks

Zhang, Qihang (Spring 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/t435gf14j?locale=zh

Abstract

Deep Neural Networks have become state-of-the-art tools for supervised machine learning, able to extract features and find patterns in complicated input data. Although there are no general guidelines for designing architectures that generalize well to unseen data, researchers have recently proposed a connection between deep neural network structures and ordinary differential equations. This approach has gained much attention and has been widely adopted. In this thesis, we first illustrate the continuous interpretation of deep neural networks on a concrete example. To do so, we use a modified Residual Neural Network structure that allows us to discretize the network and the weights separately. When the step size used to discretize the network is large, the network cannot classify the labels well; as the step size gets smaller, the network's classification performance improves. Next, based on the continuous interpretation, we examine the stability of the forward propagation of a deep neural network by computing the eigenvalues of the Jacobian matrix at each time point. Furthermore, we visualize the objective function of a deep neural network to explore how to modify the network structure to make training more efficient. Last, we propose a new forward propagation architecture based on the fourth-order Runge-Kutta method and compare it with the forward propagation of a standard Residual Neural Network.
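To make the continuous interpretation concrete, the sketch below contrasts one standard ResNet layer, viewed as a forward-Euler step of the ODE dY/dt = sigma(K(t)Y + b(t)), with one layer of fourth-order Runge-Kutta forward propagation, and inspects stability through the eigenvalues of the Jacobian of the layer dynamics. This is a minimal illustration, not the thesis's MATLAB code from Chapter 5; the tanh activation, the random weights K and b, the step size h, and all variable names are assumptions made here for the example.

    % Sketch only (not the thesis code): layer dynamics f(Y) = sigma(K*Y + b)
    % with an assumed tanh activation; in the ODE view, one ResNet layer is a
    % forward-Euler step of dY/dt = f(Y).
    rng(0);
    n = 2;                          % feature dimension (illustrative)
    Y = randn(n, 5);                % five example points, stored column-wise
    K = randn(n);  b = randn(n, 1); % layer weights, held fixed over one step
    h = 0.1;                        % step size of the discretization
    f = @(Y) tanh(K*Y + b);         % b expands across columns (R2016b+)

    % Standard ResNet layer = one forward-Euler step
    Y_euler = Y + h * f(Y);

    % RK4 layer: four stage evaluations with the same weights, weighted average
    k1 = f(Y);
    k2 = f(Y + (h/2)*k1);
    k3 = f(Y + (h/2)*k2);
    k4 = f(Y + h*k3);
    Y_rk4 = Y + (h/6)*(k1 + 2*k2 + 2*k3 + k4);

    % Stability sketch: the Jacobian of the dynamics at a single point y is
    % diag(1 - tanh(K*y + b).^2) * K; eigenvalues with real parts near or
    % below zero are associated with a stable, non-expanding forward propagation.
    y      = Y(:, 1);
    J      = diag(1 - tanh(K*y + b).^2) * K;
    lambda = eig(J);

Each forward-Euler layer requires one evaluation of f, while an RK4 layer requires four, in exchange for a higher-order and typically more accurate approximation of the underlying ODE.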

Table of Contents

1 Introduction 1
  1.1 Contributions and Outline 3
2 Background 5
  2.1 Theoretical Background of Deep Neural Networks 6
  2.2 Network Architecture 6
    2.2.1 Loss Function and Optimization Methods 9
    2.2.2 The Residual Neural Network 10
  2.3 Differential Equation Interpretation 11
  2.4 Stability of Deep Neural Networks 13
    2.4.1 Stability of ODEs 13
  2.5 Runge Kutta Methods 15
3 Experiments and Results 17
  3.1 ResNets with different step sizes 21
  3.2 Stability of the ResNet Forward Propagation 25
  3.3 Visualizing the Loss Function 29
  3.4 RK4 Forward Propagation 33
4 Summary and Conclusion 36
5 MATLAB® Code 39

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
