PREMVAL

Benchmarking generative models of protein conformational ensembles against all-atom molecular dynamics.

Leaderboard

Model                     Data             RMWD (Å) ↓  RMSF r ↑  Time/chain ↓  N
AlphaFlow-MD              held-out (weak)  3.81        0.814     n/a           82
AlphaFlow-MD (distilled)  held-out (weak)  4.76        0.783     n/a           82
ESMFlow-MD                held-out (weak)  5.98        0.704     n/a           82
BioEmu                    held-out         6.17        0.751     718s          81
ESMFlow-MD (distilled)    held-out (weak)  6.43        0.699     365s          82
ESMDiff                   uncertain        7.74        0.664     237s          82

RMWD: root-mean Wasserstein distance (Å); lower is better
RMSF r: per-residue flexibility correlation; higher is better
Time/chain: median inference time per chain (s); lower is faster
N: number of chains scored
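
The two accuracy metrics can be sketched in a few lines of NumPy. This is an illustrative sketch, not the benchmark's scoring code, and it rests on several assumptions: ensembles are arrays of shape (n_frames, n_residues, 3) holding one atom per residue (for example Cα), all frames are already superimposed on a common reference, RMSF r is the Pearson correlation between per-residue RMSF profiles, and RMWD approximates each atom's positional distribution by a 3D Gaussian and takes the root mean of the squared 2-Wasserstein distances. The function names are hypothetical.

```python
import numpy as np


def rmsf(ensemble):
    """Per-residue RMSF for an array of shape (n_frames, n_residues, 3).

    Assumes the frames are already superimposed on a common reference.
    """
    mean_pos = ensemble.mean(axis=0)
    sq_dev = ((ensemble - mean_pos) ** 2).sum(axis=-1)  # (n_frames, n_residues)
    return np.sqrt(sq_dev.mean(axis=0))


def rmsf_pearson_r(generated, reference):
    """Pearson correlation between the two per-residue RMSF profiles."""
    return float(np.corrcoef(rmsf(generated), rmsf(reference))[0, 1])


def _sqrtm_psd(mat):
    """Square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_w2_squared(x, y):
    """Squared 2-Wasserstein distance between Gaussians fitted to x and y.

    x and y are (n_frames, 3) position samples for a single atom.
    """
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    root_x = _sqrtm_psd(cov_x)
    # Eigenvalues of cov_x^(1/2) cov_y cov_x^(1/2) give the cross term.
    cross_vals = np.clip(np.linalg.eigvalsh(root_x @ cov_y @ root_x), 0.0, None)
    trace_term = np.trace(cov_x) + np.trace(cov_y) - 2.0 * np.sqrt(cross_vals).sum()
    return float(((mu_x - mu_y) ** 2).sum() + max(trace_term, 0.0))


def rmwd(generated, reference):
    """Root mean (over residues) of the per-atom squared W2 distances."""
    n_res = generated.shape[1]
    w2_sq = [gaussian_w2_squared(generated[:, i], reference[:, i]) for i in range(n_res)]
    return float(np.sqrt(np.mean(w2_sq)))


# Synthetic demonstration data only; these are not benchmark results.
rng = np.random.default_rng(0)
scales = np.linspace(0.5, 2.0, 10)  # residues with different flexibility
md_like = rng.normal(size=(500, 10, 3)) * scales[None, :, None]
model_like = rng.normal(size=(250, 10, 3)) * scales[None, :, None]
print("RMWD:", rmwd(model_like, md_like))
print("RMSF r:", rmsf_pearson_r(model_like, md_like))
```

Under the Gaussian approximation, a rigid shift of every atom by 3 Å with unchanged fluctuations yields an RMWD of 3 Å, which makes the metric's units easy to interpret. The two ensembles may contain different numbers of frames, since each is summarised per residue before comparison.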

Data: how strong the ATLAS held-out guarantee is
held-out: never trained on ATLAS/MD
held-out (weak): trained on ATLAS train, temporal split only
uncertain: unknown


Scored on the ATLAS test split, against all-atom MD trajectories from the ATLAS dataset.

References

  1. Jing, B., Berger, B., & Jaakkola, T. (2024). AlphaFold Meets Flow Matching for Generating Protein Ensembles. International Conference on Machine Learning (ICML). arXiv:2402.04845.
  2. Lewis, S., et al. (2024). Scalable emulation of protein equilibrium ensembles with generative deep learning. bioRxiv 2024.12.05.626885.
  3. Lu, J., Chen, X., Lu, S. Z., Shi, C., Guo, H., Bengio, Y., & Tang, J. (2025). Structure Language Models for Protein Conformation Generation. International Conference on Learning Representations (ICLR). arXiv:2410.18403.
  4. Vander Meersche, Y., Cretin, G., Gheeraert, A., Gelly, J.-C., & Galochkina, T. (2024). ATLAS: protein flexibility description from atomistic molecular dynamics simulations. Nucleic Acids Research, 52(D1), D384-D392.