GitHub - BIH-CEI/LLM_on_SNOMED: State-of-the-art Large Language Models (LLMs) for mapping clinical terms to SNOMED CT concepts


# LLM ON SNOMED

This project evaluates state-of-the-art Large Language Models (LLMs) for mapping clinical terms to SNOMED CT concepts, covering both the Fully Specified Name (FSN) and the SNOMED CT Identifier (SCTID). We score the correctness of each mapping with the ISO/TS 21564 Equivalence Assessment Score (“MapQual”) to understand how well current LLMs can support human coders in semantic mapping.
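Because the last digit of an SCTID is a Verhoeff check digit, a model-generated identifier can be screened for well-formedness before any MapQual scoring. The following is a minimal sketch of such a pre-check. It is an assumption about how one might filter outputs, not part of this repository's notebook, and the function name `is_wellformed_sctid` is hypothetical. Passing the check only shows that the digits are internally consistent; it does not show that the concept exists or fits the clinical term.

```python
# Verhoeff tables: _D is the dihedral-group D5 multiplication table,
# _P holds the position-dependent digit permutations.
_D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
    [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
    [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
    [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]
_P = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
    [5, 8, 0, 3, 7, 9, 6, 1, 4, 2],
    [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
    [9, 4, 5, 3, 1, 2, 6, 8, 7, 0],
    [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
    [2, 7, 9, 3, 8, 0, 6, 4, 1, 5],
    [7, 0, 4, 6, 9, 1, 3, 2, 5, 8],
]


def is_wellformed_sctid(sctid: str) -> bool:
    """Return True if `sctid` is 6-18 ASCII digits, has no leading zero,
    and its final digit is a valid Verhoeff check digit."""
    if not (sctid.isascii() and sctid.isdigit()):
        return False
    if not 6 <= len(sctid) <= 18 or sctid[0] == "0":
        return False
    check = 0
    # Walk the digits right to left, starting at the check digit.
    for position, char in enumerate(reversed(sctid)):
        check = _D[check][_P[position % 8][int(char)]]
    return check == 0
```

An identifier that fails this check cannot be a valid SCTID, so it can be flagged without a terminology lookup.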

Performance is tested on the German Corona Consensus Dataset (GECCO), a harmonized dataset for COVID-19 research with expert-validated FSNs and SCTIDs. A representative subset of GECCO serves as the benchmark for this project. MapQual is applied to the FSN and SCTID of the benchmark and of every model output.
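As a rough illustration of how scored outputs could be summarized per model, the sketch below tallies exact SCTID matches against the benchmark and counts how often each MapQual score occurs. The record keys (`model`, `sctid_pred`, `sctid_gold`, `mapqual`) and the toy rows are hypothetical; they are not the column names or contents of `LLM_Data.xlsx`, and the score values are placeholders treated as opaque labels.

```python
from collections import Counter, defaultdict


def summarize(records):
    """Per-model exact-match rate on SCTID plus MapQual score counts."""
    per_model = defaultdict(lambda: {"n": 0, "exact": 0, "scores": Counter()})
    for rec in records:
        stats = per_model[rec["model"]]
        stats["n"] += 1
        # A bool adds as 0 or 1, so this counts exact identifier matches.
        stats["exact"] += rec["sctid_pred"] == rec["sctid_gold"]
        stats["scores"][rec["mapqual"]] += 1
    return {
        model: {
            "exact_match_rate": stats["exact"] / stats["n"],
            "mapqual_counts": dict(stats["scores"]),
        }
        for model, stats in per_model.items()
    }


# Made-up rows for illustration only; not taken from the project data.
toy_records = [
    {"model": "model_a", "sctid_pred": "840539006", "sctid_gold": "840539006", "mapqual": 1},
    {"model": "model_a", "sctid_pred": "999999999", "sctid_gold": "840539006", "mapqual": 5},
    {"model": "model_b", "sctid_pred": "840539006", "sctid_gold": "840539006", "mapqual": 1},
]
summary = summarize(toy_records)
```

Keeping the exact-match rate next to the score counts separates two questions: whether the identifier matches the benchmark at all, and how the mappings were graded for equivalence.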

## Project Organization

├── README.md           <- Top-level README for developers using this project
├── Data                <- Folder that stores model outputs with their MapQual scores
├── Figures             <- Empty folder for figures created by running llm_on_snomed.ipynb
├── Tables              <- Empty folder for tables created by running llm_on_snomed.ipynb
├── LLM_Data.xlsx       <- Benchmark dataset with its MapQual scores
├── llm_on_snomed.ipynb <- Notebook for data loading, cleaning, and all analysis
└── environment.yml     <- Conda environment requirements
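Git does not track empty directories, so `Figures` and `Tables` may be missing from a fresh clone. A minimal sketch, assuming the notebook writes to these two folders relative to the repository root, creates them if they are absent. The helper name `ensure_output_dirs` is hypothetical and does not come from the notebook.

```python
from pathlib import Path


def ensure_output_dirs(repo_root=".", names=("Figures", "Tables")):
    """Create the output folders under `repo_root` if they do not exist yet
    and return their paths."""
    created = []
    for name in names:
        path = Path(repo_root) / name
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `ensure_output_dirs()` from the repository root before running the notebook leaves existing folders untouched.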


## Installation

1) Clone
git clone https://github.com/BIH-CEI/LLM_on_SNOMED.git
cd LLM_on_SNOMED  # <- adjust to your actual folder name if different

2) Create the environment
conda env create -f environment.yml

3) Activate it
conda activate llm_on_snomed-env
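To confirm that Python is running inside the activated environment, the sketch below reads the `CONDA_DEFAULT_ENV` variable that `conda activate` sets. The helper names are hypothetical, and the expected name is taken from the activate command above. A Jupyter kernel started from a different shell may not inherit this variable, so treat a mismatch as a hint and not as proof.

```python
import os

# Environment name as used in the activation step of this README.
EXPECTED_ENV = "llm_on_snomed-env"


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None if unset."""
    return environ.get("CONDA_DEFAULT_ENV")


def in_expected_env(environ=os.environ, expected=EXPECTED_ENV):
    """Return True if the active conda environment matches `expected`."""
    return active_conda_env(environ) == expected
```

Running `in_expected_env()` in the first notebook cell gives a quick signal before any analysis starts.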


## License

This project is licensed under the MIT License.