Shashank Pant, Zachary Smith, Yihang Wang, Emad Tajkhorshid, and Pratyush
Tiwary.
Confronting pitfalls of AI-augmented molecular dynamics using
statistical physics.
Journal of Chemical Physics, 153, 2020.
PANT2020-ET
Artificial intelligence (AI)-based approaches have had indubitable impact across the sciences through the
ability to extract relevant information from raw data. Recently, AI has also been used to enhance the
efficiency of molecular simulations, wherein AI-derived slow modes are used to accelerate the simulation
in targeted ways. However, while typical fields where AI is used are characterized by a plethora of data,
molecular simulations by construction suffer from limited sampling and thus limited data. As such, the use
of AI in molecular simulations can lead to a dangerous situation where the AI optimization could get
stuck in spurious regimes, leading to incorrect characterization of the reaction coordinate (RC) for the
problem at hand. When such an incorrect RC is then used to perform additional simulations, one could start
to deviate progressively from the ground truth. To deal with this problem of spurious AI-solutions, here
we report a novel and automated algorithm using ideas from statistical mechanics. It is based on the
notion that a more reliable AI-solution will be one that maximizes the time-scale separation between slow
and fast processes. To learn this time-scale separation even from limited data, we use a maximum
caliber-based framework. We show the applicability of this automatic protocol for three classic benchmark problems,
namely the conformational dynamics of a model peptide, ligand-unbinding from a protein, and
folding/unfolding energy landscape of the C-terminal domain of protein G. We believe our work will lead to
increased and robust use of trustworthy AI in molecular simulations of complex systems.
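
The selection criterion described in the abstract (prefer the AI solution with the largest time-scale separation) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper estimates the dynamics with a maximum caliber-based framework, whereas the sketch below substitutes a plain symmetrized count matrix from a discretized trajectory. The function names, the choice of lag, and the definition of the gap as the difference between eigenvalues n_slow and n_slow + 1 (sorted in descending order, with eigenvalue 0 being the stationary one) are all assumptions made for illustration.

```python
import numpy as np


def transition_matrix(labels, n_states, lag=1):
    """Row-normalized transition matrix from a discretized trajectory.

    Assumption: plain counting with symmetrized counts stands in for the
    maximum caliber estimate used in the paper.
    """
    labels = np.asarray(labels)
    counts = np.zeros((n_states, n_states))
    for i, j in zip(labels[:-lag], labels[lag:]):
        counts[i, j] += 1.0
    counts = counts + counts.T  # enforce detailed balance in the counts
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0.0] = 1.0  # leave unvisited states as zero rows
    return counts / row_sums


def spectral_gap(T, n_slow=1):
    """Gap between the last 'slow' and the first 'fast' eigenvalue.

    Eigenvalues are sorted in descending order; index 0 is the stationary
    eigenvalue (1.0). n_slow is a hypothetical user-supplied count of slow
    processes.
    """
    evals = np.sort(np.real(np.linalg.eigvals(T)))[::-1]
    return float(evals[n_slow] - evals[n_slow + 1])


def most_reliable(candidates, n_states, n_slow=1, lag=1):
    """Name of the candidate RC whose discretized trajectory has the largest gap.

    candidates maps a (hypothetical) candidate name to the trajectory
    projected on that candidate RC and binned into integer state labels.
    """
    gaps = {
        name: spectral_gap(transition_matrix(lab, n_states, lag), n_slow)
        for name, lab in candidates.items()
    }
    return max(gaps, key=gaps.get)
```

In such a sketch, each candidate RC returned by the AI optimization would be used to bin the same trajectory, and the candidate with the largest gap would be carried forward to the next round of biased simulation. The number of bins, the lag, and n_slow would all need to be chosen for the system at hand.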