🔬 NMREluBench: Benchmarking Molecular Structure Elucidation from Experimental NMR Chemical Shifts

🎯 Overview

NMREluBench is a benchmark for evaluating deep learning models on inverse structure elucidation: recovering a molecular structure from experimental ¹H and ¹³C NMR chemical shifts. It addresses a gap in computational chemistry by providing standardized evaluation protocols for NMR-based structure determination.

✅ Key Features

📈 Two Core Tasks: De novo structure generation and library matching
🧪 Experimental Data Focus: Real-world NMR chemical shifts from experimental measurements
🔄 Comparative Analysis: Performance evaluation against computed NMR datasets
📊 Standardized Metrics: Rigorous evaluation protocols for fair model comparison
🌐 Open Source: Publicly available for research and development

🚀 Quick Start

📁 Dataset Structure

NMREluBench/
├── nmr_denovo/         # De novo structure generation task
│   ├── ...             # Task-specific code
│   └── README.md       # Task-specific documentation
├── nmr_retrieval/      # Library matching task
│   ├── ...             # Task-specific code
│   └── README.md       # Task-specific documentation
└── README.md           # NMREluBench documentation

📋 Tasks Overview

🎨 Task 1: De Novo Structure Generation

Generate molecular structures directly from experimental NMR chemical shifts without prior knowledge of potential candidates.

Input: ¹H and ¹³C NMR chemical shifts
Output: Molecular structure (SMILES or SELFIES)
Evaluation: For the de novo molecular structure generation task, we report the overall molecular validity rate across all generated structures ($R_{\text{valid}}$), along with Top-1 and Top-10 performance for structural match rate, MCES distance ($D_{\text{mces}}^{(1)}$, $D_{\text{mces}}^{(10)}$), and Tanimoto similarity ($S_{\text{tani}}^{(1)}$, $S_{\text{tani}}^{(10)}$).
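The Top-k metrics above can be illustrated with a minimal sketch. It is not the benchmark's own evaluation code: the `Candidate` record, the function names, and the aggregation rules (a query counts as matched if any of its first k candidates matches, the MCES distance is the minimum and the Tanimoto similarity the maximum over the first k valid candidates, and queries with no valid candidate in the top k are left out of the distance and similarity averages) are all assumptions made for illustration. Fingerprints are represented as plain sets of on-bit indices so that the example needs no chemistry toolkit.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity |A ∩ B| / |A ∪ B| of two on-bit sets."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


@dataclass
class Candidate:
    """One generated structure, already scored against the ground truth (hypothetical record)."""
    valid: bool                   # did the generated string parse into a molecule?
    match: bool = False           # same structure as the ground truth?
    mces: Optional[float] = None  # MCES distance to the ground truth
    tani: Optional[float] = None  # Tanimoto similarity to the ground truth


def validity_rate(queries: List[List[Candidate]]) -> float:
    """R_valid: fraction of valid structures over all generated structures."""
    total = sum(len(cands) for cands in queries)
    valid = sum(c.valid for cands in queries for c in cands)
    return valid / total if total else 0.0


def top_k_metrics(queries: List[List[Candidate]], k: int) -> Dict[str, Optional[float]]:
    """Aggregate match rate, MCES distance, and Tanimoto similarity over the top k."""
    hits, best_mces, best_tani = 0, [], []
    for cands in queries:
        top = [c for c in cands[:k] if c.valid]
        if any(c.match for c in top):
            hits += 1
        if top:  # assumption: skip queries with no valid candidate in the top k
            best_mces.append(min(c.mces for c in top))
            best_tani.append(max(c.tani for c in top))

    def mean(xs: List[float]) -> Optional[float]:
        return sum(xs) / len(xs) if xs else None

    return {
        "match_rate": hits / len(queries) if queries else 0.0,
        "mces": mean(best_mces),
        "tanimoto": mean(best_tani),
    }
```

Calling `top_k_metrics(queries, 1)` and `top_k_metrics(queries, 10)` on the same ranked outputs would yield the Top-1 and Top-10 figures. How invalid or missing candidates should be penalized is a design choice that the task-specific README may define differently.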

🔍 Task 2: Library Matching

Identify the most likely molecular structure by matching experimental NMR data against a curated molecular library.

Input: ¹H and ¹³C NMR chemical shifts
Output: Ranked list of candidate structures from library
Evaluation: For the library matching task, we use Top-1, Top-3, and Top-10 performance for structural match rate and MCES distance.
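As a rough sketch of how Top-1, Top-3, and Top-10 match rates could be computed for library matching, the following ranks library entries by a score and checks whether the true structure appears among the first k. The function names and the scoring callable are hypothetical placeholders, not the benchmark's API, and the sketch assumes that structures are stored as canonical identifier strings so that equality means a structural match.

```python
from typing import Callable, Dict, List, Sequence


def rank_library(
    query: str,
    library: Sequence[str],
    score: Callable[[str, str], float],
) -> List[str]:
    """Return library entries sorted from highest to lowest score for this query.

    `score(query, candidate)` stands in for whatever model compares the
    query's NMR shifts with a candidate structure (hypothetical).
    """
    return sorted(library, key=lambda cand: score(query, cand), reverse=True)


def top_k_match_rate(
    rankings: List[List[str]],
    truths: List[str],
    ks: Sequence[int] = (1, 3, 10),
) -> Dict[int, float]:
    """Fraction of queries whose true structure is within the first k ranked entries."""
    if len(rankings) != len(truths):
        raise ValueError("need exactly one ground-truth structure per ranking")
    rates = {}
    for k in ks:
        hits = sum(truth in ranked[:k] for ranked, truth in zip(rankings, truths))
        rates[k] = hits / len(truths) if truths else 0.0
    return rates
```

Because `sorted` is stable, candidates with equal scores keep their library order; a real evaluation may want an explicit tie-breaking rule so that results do not depend on how the library is stored.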

📚 Citation

If you use our data or code, please cite this work once the paper is published.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

This project is built upon the following open-source works, and we deeply appreciate the contributions of their authors:

✔️ Dataset & Baseline Methods

NMRNet - Provided the NMR spectral dataset.

✔️ Core Development Framework

MassSpecGym - Our code is extended from this mass spectrometry toolkit.

✔️ Model Architecture

CMGNet - The BART-based model was adapted from this repository.

We also thank the broader open-source community for enabling reproducible research.
