BIH-CEI/LLM_on_SNOMED

# LLM ON SNOMED

This project evaluates state-of-the-art Large Language Models (LLMs) for mapping clinical terms to SNOMED CT concepts, covering both the Fully Specified Name (FSN) and the SNOMED CT Identifier (SCTID). We assess the correctness of each mapping using the ISO/TS 21564 Equivalence Assessment Score (“MapQual”), with the aim of understanding how well current LLMs can support human coders in semantic mapping.

Performance is tested on the German Corona Consensus Dataset (GECCO), a harmonized COVID-19 research dataset with expert-validated FSNs and SCTIDs. A representative subset of GECCO serves as the benchmark for this project. The FSNs and SCTIDs of the benchmark and of every model's output are scored with MapQual.
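As a rough illustration of the comparison described above, the sketch below summarizes one model's output against the benchmark. It is not the analysis in `llm_on_snomed.ipynb`. The function name `summarize_mapping`, the keys `benchmark_sctid`, `model_sctid` and `mapqual_score`, and the toy rows are all assumptions made for this example; the score values are placeholders and say nothing about how the MapQual scale is defined.

```python
from collections import Counter


def summarize_mapping(rows):
    """Summarize one model's mappings against the benchmark.

    `rows` is an iterable of dicts with the hypothetical keys
    'benchmark_sctid', 'model_sctid' and 'mapqual_score'.
    """
    rows = list(rows)
    if not rows:
        raise ValueError("no rows to summarize")

    # SCTIDs are compared as strings so that leading/trailing whitespace
    # or int-vs-str differences do not count as mismatches.
    exact = sum(
        1
        for r in rows
        if str(r["model_sctid"]).strip() == str(r["benchmark_sctid"]).strip()
    )

    # Count how often each assigned score occurs.
    score_counts = Counter(r["mapqual_score"] for r in rows)

    return {
        "n": len(rows),
        "sctid_exact_match_rate": exact / len(rows),
        "score_counts": dict(sorted(score_counts.items())),
    }


# Toy rows for illustration only (not taken from GECCO or any model output).
toy_rows = [
    {"benchmark_sctid": "386661006", "model_sctid": "386661006", "mapqual_score": 1},
    {"benchmark_sctid": "49727002", "model_sctid": 49727002, "mapqual_score": 1},
    {"benchmark_sctid": "267036007", "model_sctid": "000000000", "mapqual_score": 4},
]

summary = summarize_mapping(toy_rows)
print(summary)
```

Comparing identifiers as strings is a deliberate choice here: spreadsheet exports often turn long numeric IDs into integers or floats, and a string comparison after normalization avoids spurious mismatches.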

## Project Organization

├── README.md           <- Top-level README for developers using this project
├── Data                <- Model outputs with their respective MapQual scores
├── Figures             <- Initially empty; filled with figures generated by llm_on_snomed.ipynb
├── Tables              <- Initially empty; filled with tables generated by llm_on_snomed.ipynb
├── LLM_Data.xml        <- Benchmark dataset with its respective MapQual scores
├── llm_on_snomed.ipynb <- Notebook for data loading, cleaning, and all analysis
└── environment.yml     <- Conda environment requirements


## Installation

1) Clone
git clone https://github.com/BIH-CEI/LLM_on_SNOMED.git
cd LLM_on_SNOMED  # <- adjust to your actual folder name if different

2) Create the environment
conda env create -f environment.yml

3) Activate it
conda activate llm_on_snomed-env


## License

This project is licensed under the MIT License.
