Acceleration Algorithms for Machine Learning Models

He, Huan (Spring 2022)

Permanent URL: https://etd.library.emory.edu/concern/etds/x633f2274?locale=fr
Published

Abstract

Machine learning has revolutionized computer vision, natural language understanding, speech recognition, information retrieval, and other fields. However, as deep learning models have progressively improved, their parameter counts, latency, and training resource requirements have all increased significantly. The literature offers many recent examples of the tremendous growth in scientific data generation. For instance, it is estimated that thousands of wireless sensors are currently in place, each generating about a gigabyte of data per day. Consequently, it has become important to pay attention to a model's footprint metrics, not just its quality. There is now a greater need to develop efficient machine learning models that can cope with future demands, in line with similar energy-related initiatives. Algorithms that are efficient in training or in inference are important for a number of data-intensive areas, as they affect many related industries. Although advanced and powerful machine learning models continue to be proposed, there remains a huge demand, and ample room, for efficient and fast machine learning methods for large and complex data-intensive fields.

Table of Contents

1 Introduction
1.1 Motivation
1.2 Research Contributions
1.2.1 SGranite
1.2.2 Fast-CP
1.2.3 GDA-AM
1.2.4 MedDiff
1.3 Organization

2 Background
2.1 Parallel Processing
2.1.1 The Map-Reduce algorithm
2.2 Acceleration via mathematical methods
2.2.1 Stochastic Gradient Descent
2.2.2 Nonlinear Acceleration Techniques

3 Accelerating Tensor Decomposition via Parallel Algorithms
3.1 Background and Notations
3.2 SGranite
3.2.1 Example of Useful Regularization Terms
3.2.2 Parallel Algorithm using Spark
3.3 Experimental Results

4 Acceleration Tensor Decomposition via numerical methods
4.1 Generalized CP decomposition using SGD
4.2 Extrapolated Stochastic Gradient Descent
4.3 Theoretical Analysis
4.4 Experimental Results

5 Accelerating general minimax optimization via Anderson Acceleration
5.1 Minimax Optimization
5.2 Our Method: GDA-AM
5.2.1 Fixed-Point Iteration and Anderson Mixing (AM)
5.2.2 AM and Generalized Minimal Residual (GMRES)
5.2.3 GDA-AM
5.3 Convergence results for GDA-AM
5.4 Related Work
5.5 Experiments
5.6 Theoretical Results
5.6.1 Discussion of obtained rates

6 Accelerating Sampling Procedure for diffusion based generative models
6.1 Motivation of Synthetic EHRs
6.2 Background in Diffusion Models
6.3 Related Work
6.4 Proposed Method: MedDiff
6.5 Experiments

7 Conclusion and Future Work
7.1 Conclusion
7.2 Future Work

Bibliography

About this Dissertation

Rights statement

Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.

School	Laney Graduate School
Department	Computer Science and Informatics
Degree	Ph.D.
Submission	Dissertation
Language	English
Research Field	Mathematics; Computer Science
Keyword	Numerical Analysis; Machine Learning; Parallel Processing; Acceleration
Committee Chair / Thesis Advisor	Ho, Joyce C., Emory University; Xi, Yuanzhe, Emory University
Committee Members	Zhao, Liang, Emory University; Saad, Yousef, University of Minnesota

Last modified

Primary PDF

Title	Date Uploaded
Acceleration Algorithms for Machine Learning Models	2022-03-31 14:52:27 -0400
