Acceleration Algorithms for Machine Learning Models

He, Huan (Spring 2022)

Permanent URL: https://etd.library.emory.edu/concern/etds/x633f2274?locale=fr
Published

Abstract

Machine learning has revolutionized computer vision, natural language understanding, speech recognition, information retrieval, and more. However, as deep learning models have progressively improved, their parameter counts, latency, and training resource requirements have all increased significantly. The literature offers many recent examples of the tremendous growth in scientific data generation. For instance, it is estimated that thousands of wireless sensors are currently in place, each generating about a gigabyte of data per day. Consequently, it has become important to pay attention to a model's footprint metrics, not just its quality. There is now a greater need to develop efficient machine learning models that can cope with future demands and that are in line with similar energy-related initiatives. Algorithms that are efficient in training or in inference matter for a number of data-intensive areas, as they affect many related industries. Yet even as advanced and powerful machine learning models continue to be proposed, there remains a huge demand, and considerable room, for efficient and fast machine learning methods in large and complex data-intensive fields.

Table of Contents

1 Introduction

1.1 Motivation

1.2 Research Contributions

1.2.1 SGranite

1.2.2 Fast-CP

1.2.3 GDA-AM

1.2.4 MedDiff

1.3 Organization

2 Background

2.1 Parallel Processing

2.1.1 The Map-Reduce algorithm

2.2 Acceleration via mathematical methods

2.2.1 Stochastic Gradient Descent

2.2.2 Nonlinear Acceleration Techniques

3 Accelerating Tensor Decomposition via Parallel Algorithms

3.1 Background and Notations

3.2 SGranite

3.2.1 Example of Useful Regularization Terms

3.2.2 Parallel Algorithm using Spark

3.3 Experimental Results

4 Accelerating Tensor Decomposition via numerical methods

4.1 Generalized CP decomposition using SGD

4.2 Extrapolated Stochastic Gradient Descent

4.3 Theoretical Analysis

4.4 Experimental Results

5 Accelerating general minimax optimization via Anderson Acceleration

5.1 Minimax Optimization

5.2 Our Method: GDA-AM

5.2.1 Fixed-Point Iteration and Anderson Mixing (AM)

5.2.2 AM and Generalized Minimal Residual (GMRES)

5.2.3 GDA-AM

5.3 Convergence results for GDA-AM

5.4 Related Work

5.5 Experiments

5.6 Theoretical Results

5.6.1 Discussion of obtained rates

6 Accelerating Sampling Procedure for diffusion based generative models

6.1 Motivation of Synthetic EHRs

6.2 Background in Diffusion Models

6.3 Related Work

6.4 Proposed Method: MedDiff

6.5 Experiments

7 Conclusion and Future Work

7.1 Conclusion

7.2 Future Work

Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified
