Geometric multigrid solvers within adaptive mesh
refinement (AMR) applications often reach a point where further coarsening of the grid becomes impractical as individual
subdomain sizes approach unity. At this point the most common
solution is to use a bottom solver, such as BiCGStab, to reduce
the residual by a fixed factor at the coarsest level. Each
iteration of BiCGStab requires multiple global reductions (MPI
collectives). As the number of BiCGStab iterations required
for convergence grows with problem size, and the time for
each collective operation increases with machine scale, bottom
solves in large-scale applications can constitute a significant
fraction of the overall multigrid solve time. In this paper, we
implement, evaluate, and optimize a communication-avoiding
s-step formulation of BiCGStab (CABiCGStab for short) as a
high-performance, distributed-memory bottom solver for geometric
multigrid solvers. This is the first time s-step Krylov subspace
methods have been leveraged to improve multigrid bottom solver
performance. We use a synthetic benchmark for detailed analysis
and integrate the best implementation into BoxLib in order to
evaluate the benefit of an s-step Krylov subspace method on the
multigrid solves found in the applications LMC and Nyx on up to
32,768 cores on the Cray XE6 at NERSC. Overall, we see bottom
solver improvements of up to 4.2× on synthetic problems and up to 2.7×
in real applications. This results in as much as a 1.5×
improvement in solver performance in real applications.
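
To make the communication cost concrete, the following is a minimal sketch of a textbook, unpreconditioned BiCGStab iteration, not the implementation evaluated in the paper. Every inner product goes through a counting wrapper, since in a distributed-memory code each one would typically be a global reduction (for example an MPI all-reduce). The names `CountingDot`, `laplacian_1d`, and `bicgstab` are hypothetical, the 1D Laplacian stands in for a real coarse-grid operator, and the count of five inner products per iteration is a property of this particular formulation (including its convergence check) rather than a figure taken from the paper.

```python
class CountingDot:
    """Inner product that counts its calls; each call stands in for one
    global reduction in a distributed-memory code (an assumption of this sketch)."""

    def __init__(self):
        self.calls = 0

    def __call__(self, u, v):
        self.calls += 1
        return sum(a * b for a, b in zip(u, v))


def laplacian_1d(x):
    """Apply a 1D Laplacian stencil (2 on the diagonal, -1 off it) with
    zero boundary values. A toy stand-in for a coarse-grid operator."""
    n = len(x)
    return [
        2.0 * x[i]
        - (x[i - 1] if i > 0 else 0.0)
        - (x[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


def bicgstab(matvec, b, dot, tol=1e-8, max_iters=200):
    """Textbook unpreconditioned BiCGStab, starting from x = 0.
    Returns (x, iterations performed)."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A*x for x = 0
    r_hat = list(r)      # fixed shadow residual
    p = list(r)
    rho = dot(r_hat, r)                               # setup reduction
    b_norm2 = dot(b, b)                               # setup reduction
    for it in range(1, max_iters + 1):
        v = matvec(p)
        alpha = rho / dot(r_hat, v)                   # reduction 1
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        t = matvec(s)
        tt = dot(t, t)                                # reduction 2
        omega = dot(t, s) / tt if tt > 0.0 else 0.0   # reduction 3
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if dot(r, r) <= tol * tol * b_norm2:          # reduction 4
            return x, it
        rho_new = dot(r_hat, r)                       # reduction 5
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        rho = rho_new
    return x, max_iters


dot = CountingDot()
solution, iterations = bicgstab(laplacian_1d, [1.0] * 16, dot)
print(f"{iterations} iterations, {dot.calls} inner products")
```

In this sketch the reductions are interleaved with the matrix-vector products, so they cannot simply be merged into one message. An s-step formulation restructures the recurrence so that the reductions for several iterations are grouped together, which is the latency saving the abstract refers to.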