Avoiding False Positive Conclusions in Molecular Simulation: The Importance of Replicas

J Chem Theory Comput. 2018 Dec 11;14(12):6127-6138. doi: 10.1021/acs.jctc.8b00391. Epub 2018 Nov 9.

Abstract

Molecular simulation is a computational technique used to investigate the dynamics of proteins and other molecules. The free energy landscape explored in these simulations is often rugged, and minor differences in the initial velocities, floating-point precision, or underlying hardware can cause identically parametrized simulations (replicas) to take different paths through the landscape. In this study we investigated the magnitude of these effects based on 310 000 ns of simulation time. We performed 100 identically parametrized replicas of 3000 ns each for a small 10-amino-acid system as well as 100 identically parametrized replicas of 100 ns each for an 827-residue T-cell receptor/MHC system. By comparing randomly chosen subgroups within these replica sets, we estimated the reproducibility and reliability that can be achieved with a given number of replicas at a given simulation time. The results demonstrate that conclusions drawn from single simulations are often not reproducible and that conclusions drawn from multiple shorter replicas are more reliable than those from a single longer simulation. The actual number of replicas needed will always depend on the question asked and the level of reliability sought. On the basis of our data, a good rule of thumb appears to be a minimum of 5 to 10 replicas.
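
The subgroup comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name `subgroup_mean_gap`, the choice of the mean absolute difference between subgroup means as the reproducibility measure, and the synthetic Gaussian values are all assumptions made for illustration. The idea is to take one scalar observable per replica, repeatedly draw two disjoint random subgroups of n replicas, and see how far apart their means typically are.

```python
import random
import statistics


def subgroup_mean_gap(values, n, trials=2000, seed=0):
    """Average absolute difference between the means of two disjoint,
    randomly chosen subgroups of n replicas each (hypothetical metric)."""
    if 2 * n > len(values):
        raise ValueError("need at least 2*n replica values")
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        picked = rng.sample(values, 2 * n)  # sampled without replacement
        group_a, group_b = picked[:n], picked[n:]
        gaps.append(abs(statistics.fmean(group_a) - statistics.fmean(group_b)))
    return statistics.fmean(gaps)


# Synthetic stand-in for one observable per replica (NOT data from the paper).
data_rng = random.Random(42)
replica_values = [data_rng.gauss(1.0, 0.3) for _ in range(100)]

gaps = {n: subgroup_mean_gap(replica_values, n) for n in (1, 5, 20)}
for n, gap in gaps.items():
    print(f"n={n:2d}  mean |difference between subgroup means| = {gap:.3f}")
```

Under these assumptions, the gap between two nominally identical subgroups shrinks as n grows, which is the sense in which more replicas give more reproducible conclusions. With real trajectories, `replica_values` would be replaced by a per-replica quantity relevant to the question asked.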