Low Synchronization Gram-Schmidt and GMRES Algorithms

Stephen Thomas, National Renewable Energy Laboratory
SCAIM Seminar
October 30, 2018, 12:30 pm, ESB 4133
Communication-avoiding and pipelined variants of Krylov solvers are critical for the scalability of linear system solvers on future exascale architectures. We present low synchronization variants of the iterated classical Gram-Schmidt (CGS) and modified Gram-Schmidt (MGS) algorithms that require one and two global reduction communication steps. The derivations of these low synchronization algorithms are based on observations by Ruhe. Our main contribution is to introduce a backward normalization lag into the compact WY form of MGS, resulting in an $\mathcal{O}(\varepsilon)\kappa(A)$ stable GMRES algorithm that requires only one global synchronization per iteration. The reduction operations are overlapped with computations and pipelined to increase speed. Further performance improvements are achieved by accelerating the GMRES BLAS-2 operations on GPUs. Extensions to recycled Krylov iterations are explored.
Co-authors/collaborators: Kasia Swirydowicz (NREL), Julien Langou (CU Denver), Shreyas Ananthan (NREL), Ulrike Yang (LLNL).
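
To make the synchronization counts in the abstract concrete, the following is a minimal sketch of the two standard baselines, textbook MGS and iterated classical Gram-Schmidt (CGS2). It is not the low synchronization algorithm presented in the talk. It assumes that every inner product, or every batch of inner products computed together, would cost one global reduction on a distributed-memory machine. The function names (dot, mgs, cgs2, orth_error) and the small test matrix are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Inner product; on a distributed machine this would be a global reduction."""
    return sum(a * b for a, b in zip(u, v))


def mgs(vectors):
    """Textbook modified Gram-Schmidt. Returns (Q, number of reductions)."""
    q, syncs = [], 0
    for a in vectors:
        w = list(a)
        for qi in q:
            r = dot(qi, w)  # each projection depends on the updated w
            syncs += 1      # so each inner product is its own reduction
            w = [wk - r * qk for wk, qk in zip(w, qi)]
        nrm = math.sqrt(dot(w, w))
        syncs += 1          # normalization needs one more reduction
        q.append([wk / nrm for wk in w])
    return q, syncs


def cgs2(vectors):
    """Classical Gram-Schmidt with one reorthogonalization pass (CGS2)."""
    q, syncs = [], 0
    for a in vectors:
        w = list(a)
        if q:
            for _ in range(2):  # two projection passes
                # All coefficients use the same w, so they can be batched
                # into a single reduction (assumption of this sketch).
                coeffs = [dot(qi, w) for qi in q]
                syncs += 1
                for r, qi in zip(coeffs, q):
                    w = [wk - r * qk for wk, qk in zip(w, qi)]
        nrm = math.sqrt(dot(w, w))
        syncs += 1
        q.append([wk / nrm for wk in w])
    return q, syncs


def orth_error(q):
    """Largest entry of |Q^T Q - I|, a simple loss-of-orthogonality measure."""
    return max(
        abs(dot(qi, qj) - (1.0 if i == j else 0.0))
        for i, qi in enumerate(q)
        for j, qj in enumerate(q)
    )


if __name__ == "__main__":
    n = 6
    # Hypothetical well-conditioned test vectors: 1 on the diagonal, 0.5 elsewhere.
    vecs = [[1.0 if i == j else 0.5 for i in range(n)] for j in range(n)]
    for name, fn in (("MGS", mgs), ("CGS2", cgs2)):
        q, syncs = fn(vecs)
        print(name, "reductions:", syncs, "orthogonality error:", orth_error(q))
```

In this model the number of reductions per step grows with the number of previous vectors for MGS, while CGS2 needs a fixed three per step (two projection passes plus the norm). The variants described in the abstract go further, reducing this to one or two reductions per step.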