NMREluBench is a benchmark for evaluating deep learning models on the inverse problem of elucidating molecular structures from experimental 1H and 13C NMR chemical shifts. It addresses a gap in computational chemistry by providing standardized evaluation protocols for NMR-based structure determination.
- 📈 Two Core Tasks: De novo structure generation and library matching
- 🧪 Experimental Data Focus: Real-world NMR chemical shifts from experimental measurements
- 🔄 Comparative Analysis: Performance evaluation against computed NMR datasets
- 📊 Standardized Metrics: Rigorous evaluation protocols for fair model comparison
- 🌐 Open Source: Publicly available for research and development
NMREluBench/
├── nmr_denovo/ # De novo structure generation task
│ ├── ... # Task-specific code
│ └── README.md # Task-specific documentation
├── nmr_retrieval/ # Library matching task
│ ├── ... # Task-specific code
│ └── README.md # Task-specific documentation
└── README.md # NMREluBench documentation
🎨 Task 1: De Novo Structure Generation
Generate molecular structures directly from experimental NMR chemical shifts without prior knowledge of potential candidates.
Input: 1H and 13C NMR chemical shifts
Output: Molecular structure (SMILES or SELFIES)
Evaluation: For the de novo molecular structure generation task, we report the overall molecular validity rate across all generated structures (
🔍 Task 2: Library Matching
Identify the most likely molecular structure by matching experimental NMR data against a curated molecular library.
Input: 1H and 13C NMR chemical shifts
Output: Ranked list of candidate structures from library
Evaluation: For the library matching task, we report the structural match rate and the MCES (maximum common edge subgraph) distance at Top-1, Top-3, and Top-10.
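As a rough illustration of the Top-k match rate, the sketch below counts how often the true structure appears among the first k ranked candidates. It is not the benchmark's evaluation code: the function name `top_k_match_rate` and the toy data are hypothetical, and it assumes structures are compared by exact string equality on identifiers that have already been canonicalized. The MCES distance is not covered here.

```python
from typing import Dict, Iterable, Sequence


def top_k_match_rate(
    ranked_candidates: Sequence[Sequence[str]],
    true_structures: Sequence[str],
    ks: Iterable[int] = (1, 3, 10),
) -> Dict[int, float]:
    """Fraction of queries whose true structure is within the top-k candidates.

    Assumption: identifiers are pre-canonicalized, so string equality
    stands in for a structural match.
    """
    if len(ranked_candidates) != len(true_structures):
        raise ValueError("need exactly one ranked list per query")
    if not true_structures:
        raise ValueError("no queries to evaluate")

    rates: Dict[int, float] = {}
    for k in ks:
        # A query counts as a hit if its true structure is in the first k entries.
        hits = sum(
            truth in ranking[:k]
            for ranking, truth in zip(ranked_candidates, true_structures)
        )
        rates[k] = hits / len(true_structures)
    return rates


# Toy data (made up for illustration): three queries, all with true structure "CCO".
rankings = [
    ["CCO", "CCN", "CCC"],  # truth ranked first
    ["CCN", "CCC", "CCO"],  # truth ranked third
    ["CCC", "CCN", "CO"],   # truth absent from the list
]
truths = ["CCO", "CCO", "CCO"]

rates = top_k_match_rate(rankings, truths)
print(rates)
```

In this toy case one of three queries is a Top-1 hit and two of three are Top-3 hits; Top-10 equals Top-3 because each ranked list holds only three candidates.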
If you use our data or code, please cite our work once it is published.
This project is licensed under the MIT License - see the LICENSE file for details.
This project is built upon the following open-source works, and we deeply appreciate the contributions of their authors:
- NMRNet - Provided the NMR spectral dataset.
- MassSpecGym - Our code is extended from this mass spectrometry toolkit.
- CMGNet - The BART-based model was adapted from this repository.
We also thank the broader open-source community for enabling reproducible research.
