The 2020 Association for Computing Machinery (ACM) Gordon Bell Prize was awarded to a team of researchers from institutions in the USA and China for their project titled "Pushing the limit of molecular dynamics with ab initio accuracy to 100 million atoms with machine learning". The team introduced Deep Potential Molecular Dynamics (DPMD), a new machine learning-based protocol that can simulate more than one nanosecond of trajectory per day for a system of over 100 million atoms.
Molecular dynamics is a computer simulation methodology that analyzes the motion and interactions of atoms over a fixed period of time. For systems as small as single cells and as large and complex as clouds of gas, scientists use molecular dynamics simulations to understand how the atoms and molecules in them behave over time. For over thirty-five years, researchers have used the ab initio method for molecular dynamics because it has proven to be the most accurate. While the ab initio method (Latin for "from the beginning", used here in the sense of "from first principles") achieves high accuracy in its simulations, it requires significant computational resources, limiting its application to smaller systems containing thousands of atoms at most.
The team behind DPMD detailed the limitations of the ab initio method in their paper, noting that its cost scales cubically with the number of electronic degrees of freedom. The typical spatial and temporal scales achievable with the ab initio method are ~100 atoms and ~10 picoseconds. Because ab initio obeys the cubic-scaling law almost perfectly, simulations of complex chemical reactions, electrochemical cells, nanocrystalline materials and radiation damage remain out of reach even for the world’s largest supercomputers.
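To get a feel for what cubic scaling implies, the short Python sketch below compares the cost of larger systems with that of a 100-atom reference. It is only an illustration: it assumes the cost follows an exact N^3 law and that the number of electronic degrees of freedom grows in proportion to the number of atoms. The function name `cubic_cost_ratio` is hypothetical.

```python
def cubic_cost_ratio(n_atoms: int, n_ref: int = 100) -> float:
    """Cost of an n_atoms system relative to an n_ref system.

    Assumes an exact cubic law, cost ~ N^3, which is an idealization.
    """
    return (n_atoms / n_ref) ** 3


# Compare a few system sizes with the ~100-atom scale quoted above.
for n in (100, 1_000, 100_000_000):
    ratio = cubic_cost_ratio(n)
    print(f"{n:>11,d} atoms: {ratio:.1e} times the 100-atom cost")
```

Under these assumptions, moving from 100 atoms to 100 million atoms multiplies the cost by a factor of about 10^18, which is why a different approach is needed at that scale.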
The accuracy of the DP [Deep Potential] model stems from the distinctive ability of deep neural networks (DNN) to approximate high-dimensional functions, the proper treatment of physical requirements like symmetry constraints, and the concurrent learning scheme that generates a compact training dataset with a guarantee of uniform accuracy within the relevant configuration space.
The DPMD team chose to leverage the GPUs on IBM's Summit system, the world’s second fastest supercomputer, to run almost all of the computational and communications tasks. Because the computational granularity of the Deep Potential model is small, the team found that simply offloading the work to the GPUs would be inefficient. The team addressed these GPU-related inefficiencies with several algorithmic innovations: a new data layout for the neighbor list that avoids branching in the computation of the embedding matrix; compression of the elements in the new data structure into 64-bit integers, which makes the custom TensorFlow operations easier to optimize for GPUs; and mixed-precision computation for the Deep Potential model. With these improvements, the researchers have opened the door to simulating unprecedented size and time scales with the same accuracy as ab initio.
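The idea of compressing neighbor-list elements into 64-bit integers can be sketched in a few lines of Python. This is a minimal illustration and not the team's implementation: the bit layout (8 bits for atom type, 32 bits for a quantized distance, 24 bits for the atom index), the quantization step `DIST_SCALE`, and the helper names `pack` and `unpack` are all assumptions. One plausible benefit, shown here, is that a single integer sort then orders neighbors by type and by distance without per-field comparison branches.

```python
# Hypothetical bit layout (an assumption, not the layout used by the DPMD team):
# [ 8 bits atom type | 32 bits quantized distance | 24 bits atom index ]
TYPE_BITS, DIST_BITS, INDEX_BITS = 8, 32, 24
DIST_SCALE = 1e-6  # assumed quantization step, in the distance unit


def pack(atom_type: int, distance: float, index: int) -> int:
    """Pack one neighbor entry into a single integer below 2**64."""
    q = round(distance / DIST_SCALE)
    assert 0 <= atom_type < (1 << TYPE_BITS)
    assert 0 <= q < (1 << DIST_BITS)
    assert 0 <= index < (1 << INDEX_BITS)
    return (atom_type << (DIST_BITS + INDEX_BITS)) | (q << INDEX_BITS) | index


def unpack(key: int) -> tuple[int, float, int]:
    """Recover (atom_type, distance, index) from a packed key."""
    index = key & ((1 << INDEX_BITS) - 1)
    q = (key >> INDEX_BITS) & ((1 << DIST_BITS) - 1)
    atom_type = key >> (DIST_BITS + INDEX_BITS)
    return atom_type, q * DIST_SCALE, index


# Made-up neighbor entries: (atom type, distance, atom index).
neighbors = [(1, 2.5, 7), (0, 3.1, 2), (1, 1.2, 9), (0, 0.9, 4)]

# Sorting the packed keys orders by type first, then by distance.
keys = sorted(pack(t, d, i) for t, d, i in neighbors)
ordered = [unpack(k) for k in keys]
print([(t, i) for t, _, i in ordered])
```

Because the type occupies the highest bits and the distance the next ones, plain integer ordering is enough to group neighbors of the same type and rank them by distance, and every entry fits in one fixed-width word.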
The Gordon Bell Prize recognizes achievement in high performance computing, and finalists must demonstrate that their algorithms can scale on the world’s most powerful supercomputers. The GPU DeePMD-kit efficiently scaled up to the entire Summit supercomputer, attaining 91 PFLOPS in double precision and 162/275 PFLOPS in mixed-single/half precision. This achievement challenges next-generation supercomputers to better integrate machine learning and physical modeling.