Precise calculation of molecular electronic wavefunctions by methods such as coupled cluster requires the computation of tensor
contractions, whose cost scales polynomially
with the system and basis-set sizes. Each contraction may be executed via matrix multiplication on a properly ordered and structured tensor. However, data transpositions are often needed to
reorder the tensors for each contraction. Writing and optimizing distributed-memory kernels for each transposition and contraction is tedious since the number of contractions scales combinatorially with the
number of tensor indices. We present a distributed-memory numerical library, Cyclops Tensor Framework (CTF), that automatically manages tensor blocking and redistribution to perform any user-specified
contractions. CTF serves as the distributed-memory contraction engine in Aquarius, a new program designed for high-accuracy,
massively parallel quantum chemical computations. Aquarius implements a range of coupled-cluster and related methods, such as CCSD and CCSDT, by expressing the equations in
a templated C++ domain-specific language (DSL). This DSL calls CTF directly to manage the data and
perform the contractions. Our CCSD and CCSDT implementations achieve high parallel scalability on
the BlueGene/Q and Cray XC30 supercomputer architectures, showing that accurate electronic structure
calculations can be carried out effectively on top of general distributed-memory tensor primitives.
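
The idea that a contraction reduces to a transposition followed by a matrix multiplication can be illustrated with a minimal single-process sketch. This is not CTF's API or implementation: the function names (`transpose`, `matmul`, `contract`, `contract_reference`), the flat row-major storage, and the example contraction C[i,j,k] = sum_l A[i,l,j] * B[k,l] are all assumptions chosen for illustration, and nothing here is distributed or blocked.

```python
from itertools import product


def transpose(data, shape, perm):
    """Reorder axes of a flat row-major tensor.

    New axis n is old axis perm[n]. Returns (new_data, new_shape).
    """
    new_shape = tuple(shape[p] for p in perm)
    # Row-major strides of the original layout.
    strides = [1] * len(shape)
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    new_data = []
    # product() varies the last index fastest, so the output is row-major.
    for idx in product(*(range(n) for n in new_shape)):
        offset = sum(idx[n] * strides[perm[n]] for n in range(len(perm)))
        new_data.append(data[offset])
    return new_data, new_shape


def matmul(a, b, m, k, n):
    """Multiply a flat (m x k) matrix by a flat (k x n) matrix."""
    c = [0.0] * (m * n)
    for row in range(m):
        for p in range(k):
            a_rp = a[row * k + p]
            for col in range(n):
                c[row * n + col] += a_rp * b[p * n + col]
    return c


def contract(a, a_shape, b, b_shape):
    """Compute C[i,j,k] = sum_l A[i,l,j] * B[k,l] via transpose + matmul."""
    I, L, J = a_shape
    K, L_b = b_shape
    assert L == L_b, "contracted index must have the same extent in A and B"
    # Move the contracted index l to the end of A: A[i,l,j] -> A[i,j,l].
    a_t, _ = transpose(a, a_shape, (0, 2, 1))
    # Move the contracted index l to the front of B: B[k,l] -> B[l,k].
    b_t, _ = transpose(b, b_shape, (1, 0))
    # Fold (i,j) into one row index: (I*J x L) times (L x K).
    c = matmul(a_t, b_t, I * J, L, K)
    return c, (I, J, K)


def contract_reference(a, a_shape, b, b_shape):
    """Same contraction written as explicit nested loops, for checking."""
    I, L, J = a_shape
    K, _ = b_shape
    c = [0.0] * (I * J * K)
    for i, j, k, l in product(range(I), range(J), range(K), range(L)):
        c[(i * J + j) * K + k] += a[(i * L + l) * J + j] * b[k * L + l]
    return c, (I, J, K)
```

Even in this small case, each distinct index ordering needs its own pair of permutations, which hints at why hand-writing a kernel per contraction becomes tedious as the number of tensor indices grows.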