Improving Multigrid Methods with Deep Neural Networks

Huang, Ru (Spring 2022)

Permanent URL: https://etd.library.emory.edu/concern/etds/hh63sx21w?locale=pt-BR
Published

Abstract

Multigrid methods are among the most efficient techniques for solving the large sparse linear systems that arise from partial differential equations (PDEs) and from graph Laplacians in machine learning applications. Multigrid has two key components: smoothing, which aims at reducing high-frequency errors on each grid level, and coarse-grid correction, which interpolates the solution at the coarse grid. However, finding optimal smoothing algorithms is problem-dependent and can pose challenges for many problems. Meanwhile, as the multigrid hierarchy is formed, coarse-grid operators acquire significantly more nonzeros per row than the original fine-grid operator, which generates high parallel communication costs on coarse levels. In this thesis, I first propose an efficient adaptive framework for learning optimal smoothers from operator stencils in the form of convolutional neural networks (CNNs). The CNNs are trained on small-scale problems from a given class of PDEs, using a supervised loss function derived from multigrid convergence theory, and can be applied to large-scale problems of the same class. I also propose a deep learning framework for sparsifying coarse-grid operators. Two neural networks are constructed to learn the sparsity pattern and the corresponding values, respectively. The learned sparser operator has the same interpolation accuracy on the algebraically smooth basis. Numerical results on challenging anisotropic rotated Laplacian problems, variable-coefficient diffusion problems, and linear elasticity problems demonstrate the superior performance of the proposed framework over classical hand-crafted methods.
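The two components named in the abstract, smoothing and coarse-grid correction, can be illustrated with a minimal two-grid sketch for the 1-D Poisson problem. This is a textbook construction and not the method proposed in the thesis: the smoother is classical weighted Jacobi (a hand-crafted smoother of the kind the learned CNN smoothers are meant to replace), and the model problem, grid size, damping factor, and all function names are assumptions chosen for illustration.

```python
import math

def apply_A(u, h):
    """Apply (1/h^2) * tridiag(-1, 2, -1) with zero Dirichlet boundaries."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / (h * h))
    return out

def residual(u, f, h):
    return [fi - ai for fi, ai in zip(f, apply_A(u, h))]

def weighted_jacobi(u, f, h, sweeps=1, omega=2.0 / 3.0):
    """Smoothing: damp the high-frequency error components."""
    diag = 2.0 / (h * h)
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * ri / diag for ui, ri in zip(u, r)]
    return u

def restrict(r):
    """Full-weighting restriction from 2*nc + 1 fine points to nc coarse points."""
    nc = (len(r) - 1) // 2
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
            for j in range(nc)]

def prolong(ec):
    """Linear interpolation from nc coarse points to 2*nc + 1 fine points."""
    nc = len(ec)
    padded = [0.0] + list(ec) + [0.0]   # zero boundary values
    e = [0.0] * (2 * nc + 1)
    for j in range(nc):
        e[2 * j + 1] = ec[j]            # fine points that coincide with coarse points
    for j in range(nc + 1):
        e[2 * j] = 0.5 * (padded[j] + padded[j + 1])
    return e

def solve_coarse(b, H):
    """Exact coarse solve of (1/H^2) * tridiag(-1, 2, -1) x = b (Thomas algorithm)."""
    n = len(b)
    off, diag = -1.0 / (H * H), 2.0 / (H * H)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = off / diag, b[0] / diag
    for i in range(1, n):
        denom = diag - off * cp[i - 1]
        cp[i] = off / denom
        dp[i] = (b[i] - off * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def two_grid_cycle(u, f, h, pre=1, post=1):
    u = weighted_jacobi(u, f, h, sweeps=pre)       # pre-smoothing
    rc = restrict(residual(u, f, h))               # move residual to the coarse grid
    e = prolong(solve_coarse(rc, 2.0 * h))         # coarse-grid correction
    u = [ui + ei for ui, ei in zip(u, e)]
    return weighted_jacobi(u, f, h, sweeps=post)   # post-smoothing

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# Model problem: -u'' = pi^2 sin(pi x) on (0, 1), exact solution u = sin(pi x).
n = 63
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
u = [0.0] * n
history = [norm(residual(u, f, h))]
for _ in range(8):
    u = two_grid_cycle(u, f, h)
    history.append(norm(residual(u, f, h)))
print("relative residual after 8 cycles:", history[-1] / history[0])
```

In this sketch the smoother is fixed by hand. The thesis instead learns the smoother from the operator stencil, which matters for problems such as anisotropic or variable-coefficient operators, where a simple choice like weighted Jacobi is known to smooth poorly.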

Table of Contents

1 Introduction
  1.1 Related work
  1.2 Contributions of Work
  1.3 Outline of Thesis
2 Background on PDEs
  2.1 Poisson’s equation
  2.2 Finite difference discretization
3 Iterative methods for PDEs
  3.1 Relaxation methods
  3.2 Polynomial based methods
  3.3 GMRES
  3.4 Multigrid methods
    3.4.1 Prolongation
    3.4.2 Restriction
    3.4.3 Multigrid
  3.5 Relationship between PDEs and CNNs
4 Learning deep neural smoothers
  4.1 Learning deep neural smoothers for constant coefficient PDEs
    4.1.1 Formulation
    4.1.2 Training and generalization
  4.2 Interpretation of learned smoothers
  4.3 Learning deep neural smoothers for variable coefficient PDEs
    4.3.1 Parameterization with fully connected layers
    4.3.2 Parameterization with convolutional layers
  4.4 Numerical experiments
    4.4.1 Constant coefficient PDEs
    4.4.2 Training details
    4.4.3 Variable coefficient PDEs
    4.4.4 Incorporation with FGMRES
    4.4.5 Comparison with Chebyshev smoothers
    4.4.6 Comparison with GMRES smoothers
  4.5 Conclusion
5 Learning sparsified coarse-grid operator
  5.1 Motivation
    5.1.1 Theoretical considerations
  5.2 Sparsification with machine learning
  5.3 Numerical Experiments
    5.3.1 Circulant stencil
    5.3.2 Rotated Laplacian
    5.3.3 2-D elasticity problem
  5.4 Conclusion
6 Conclusions and Future Work
Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified
