M. Shekhar, G. Terashi, C. Gupta, D. Sarkar, J. Nguyen, N. J. Sisco, A. Mondal,
J. Vant, P. Fromme, W. D. Van Horn, E. Tajkhorshid, D. Kihara, K. Dill,
A. Perez, and A. Singharoy.
CryoFold: Determining protein structures and data-guided ensembles
from cryo-EM density maps.
Matter, 4:3195-3216, 2021.
SHEK2021-ET
Cryo-electron microscopy (EM) requires molecular modeling to refine
structural details from data.
Ensemble models arrive at low free-energy molecular structures, but
are computationally expensive and
limited to resolving only small proteins that cannot be resolved by
cryo-EM. Here, we introduce
CryoFold - a pipeline of molecular dynamics simulations that
determines ensembles of protein structures
directly from sequence by integrating density data of varying sparsity
at 3–5 Å resolution with coarse-grained
topological knowledge of the protein folds. We present six
examples showing its broad
applicability for folding proteins between 72 and 2000 residues,
including large membrane and multi-domain
systems, and results from two EMDB competitions. Driven by the data
from a single known state, CryoFold
discovers common low-energy models together with rare low-probability
structures that capture the
equilibrium distribution of proteins, and simultaneously reflect in
the quality of multiple density maps.
Many of these conformations, unseen by traditional methods, are
experimentally validated and functionally
relevant. We arrive at a set of best practices for data-guided protein
folding that are controlled using a
Python GUI.