This repository contains code and analysis for fine-grained Part-of-Speech (POS) tagging using Large Language Models (LLMs), specifically focused on the Universal Dependencies (UD) framework. The project explores how LLMs can perform complex linguistic annotation tasks and addresses the challenges of tokenization and POS tagging in a unified pipeline.
- Universal Dependencies POS tagging with Google Gemini 2.0 Flash Lite
- Segmentation pipeline that follows UD tokenization guidelines
- Comprehensive error analysis comparing LLM performance against traditional methods
- Evaluation framework for both tokenization and tagging accuracy
- Integrated pipeline that handles both segmentation and tagging in one flow
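The integrated pipeline sends the LLM a single prompt covering both segmentation and tagging. The repository's real prompts live in prompts.py; the sketch below is illustrative only, and the helper name `build_tagging_prompt` is hypothetical:

```python
# The 17 Universal POS tags defined by the UD guidelines.
UPOS_TAGS = ["ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
             "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X"]

def build_tagging_prompt(sentence):
    """Build a prompt asking an LLM to segment and tag a sentence per UD."""
    return (
        "Segment the sentence into Universal Dependencies tokens, then assign "
        f"each token one UPOS tag from: {', '.join(UPOS_TAGS)}.\n"
        "Return one 'token<TAB>TAG' pair per line.\n"
        f"Sentence: {sentence}"
    )
```

The actual prompts in prompts.py include more detailed guideline excerpts and few-shot examples than this minimal version.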
- Python (> 3.11)
- Git
- uv (https://docs.astral.sh/uv/getting-started/)
- Visual Studio Code
- Create a folder for the assignment:

  mkdir hw1; cd hw1
- Retrieve the dataset we will use and the code from this repo:

  git clone https://github.com/UniversalDependencies/UD_English-EWT.git
  git clone https://github.com/melhadad/nlp-with-llms-2025-hw1.git
- Load the required Python libraries:

  cd nlp-with-llms-2025-hw1; uv sync
- Define your API keys in either gemini_key.ini or grok_key.ini

  For Grok:

  # Unix-like
  source grok_key.ini
  export GROK_API_KEY=$GROK_API_KEY

  For Google Gemini:

  export GOOGLE_API_KEY="your-api-key"  # On Windows: set GOOGLE_API_KEY=your-api-key
- Activate the project virtual env:

  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- Open ud_pos_tagger_sklearn.ipynb in VS Code and verify you can execute the cells.
The project uses the Universal Dependencies English-EWT dataset. The code expects the dataset directory structure as follows:
UD_English-EWT/
├── en_ewt-ud-dev.conllu
├── en_ewt-ud-test.conllu
└── en_ewt-ud-train.conllu
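Each .conllu file stores one token per line with tab-separated fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), with blank lines separating sentences. A minimal stdlib-only sketch of extracting (form, UPOS) pairs is shown below; the repository's own loader in utils.py may differ, and the helper name `read_conllu_sentences` is illustrative:

```python
def read_conllu_sentences(text):
    """Parse CoNLL-U text into sentences of (form, upos) pairs."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:  # blank line ends the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):  # sentence-level metadata comment
            continue
        fields = line.split("\t")
        token_id, form, upos = fields[0], fields[1], fields[3]
        if "-" in token_id or "." in token_id:  # skip multiword ranges / empty nodes
            continue
        current.append((form, upos))
    if current:
        sentences.append(current)
    return sentences

sample = ("# text = I left.\n"
          "1\tI\tI\tPRON\tPRP\t_\t2\tnsubj\t_\t_\n"
          "2\tleft\tleave\tVERB\tVBD\t_\t0\troot\t_\t_\n"
          "3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_\n")
```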
# Run the basic POS tagger
uv run ud_pos_tagger_gemini.py

# Run the segmentation model
uv run ud_pos_llm_segmentor_gemini.py

# Run the improved LLM tagger with integrated segmentation
uv run ud_pos_improved_llm_tagger.py

To analyze the results and view the visualizations:

jupyter notebook ud_pos_tagger_gemini.ipynb

- ud_pos_tagger_gemini.py: Main POS tagging implementation using Gemini
- ud_pos_llm_segmentor_gemini.py: Specialized tokenization model for UD guidelines
- ud_pos_improved_llm_tagger.py: Integrated pipeline for segmentation and tagging
- utils.py: Helper functions for data processing and evaluation
- schema.py: Data structures and type definitions
- prompts.py: LLM prompts for tagging and segmentation
- ud_pos_tagger_gemini.ipynb: Jupyter notebook with detailed analysis and visualizations
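The evaluation framework scores both tokenization and tagging. As a reference point, token-level tagging accuracy over aligned gold/predicted sequences can be computed as follows (an illustrative sketch, not the exact code in utils.py):

```python
def tagging_accuracy(gold, pred):
    """Token-level POS accuracy over aligned gold/predicted tag sequences.

    gold and pred are lists of sentences, each a list of UPOS tags.
    """
    pairs = [(g, p) for gs, ps in zip(gold, pred) for g, p in zip(gs, ps)]
    if not pairs:
        return 0.0
    return sum(g == p for g, p in pairs) / len(pairs)
```

Note that this assumes the predicted segmentation already matches the gold tokenization; when segmentations differ, the tokens must be aligned first before tags can be compared.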
The LLM tagger achieves strong performance on Universal Dependencies POS tagging, with the following key findings:
- Strong overall accuracy, with particular strengths in:
- Adpositions (ADP)
- Proper nouns (PROPN)
- Common nouns (NOUN)
- Verbs (VERB)
- Improvement areas compared to traditional machine learning approaches:
- Pronouns (PRON) recognition
- Distinguishing determiners (DET) from pronouns
- Particle (PART) vs. adverb (ADV) disambiguation
Our analysis of tokenization shows:
- 38.7% average error reduction when using proper UD tokenization
- Most significant improvements on sentences with hyphenated compounds and punctuation
- Most challenging segmentation cases: hyphenated terms, contractions, and special punctuation
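To illustrate the contraction cases above: UD segments English contractions into syntactic words, e.g. "don't" becomes "do" + "n't" and "it's" becomes "it" + "'s". A toy regex-based clitic splitter is sketched below; it is a deliberate simplification of the actual UD guidelines (which also cover hyphenated compounds and special punctuation), and `split_contraction` is a hypothetical helper, not repository code:

```python
import re

# Common English clitics that UD treats as separate syntactic words.
CLITIC = re.compile(r"^(?P<host>\w+)(?P<clitic>n't|'s|'re|'ve|'ll|'d|'m)$",
                    re.IGNORECASE)

def split_contraction(word):
    """Split an English contraction into UD-style syntactic words."""
    m = CLITIC.match(word)
    if not m:
        return [word]  # not a recognized contraction; keep whole
    return [m.group("host"), m.group("clitic")]
```

Note the "can't" case: UD splits it as "ca" + "n't", which the regex reproduces because the host consumes only the word characters before the clitic.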
The LLM tagger struggles most with:
- Deictic words that can be pronoun or determiner - "this/that/these/those"
- Discourse pronoun "there" vs. locative adverb
- Subordinating conjunction vs. adposition - words like "for", "in", "to"
- Verb-particle/adverb vs. preposition - "up", "out", "in", "off", "on"
- Possessive pronouns classified as determiners
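Error patterns like these can be surfaced by counting (gold, predicted) tag disagreements; the sketch below shows the idea (a hypothetical helper, not the repository's analysis code):

```python
from collections import Counter

def confusion_pairs(gold, pred):
    """Count (gold_tag, predicted_tag) disagreements, most frequent first."""
    errors = Counter((g, p) for g, p in zip(gold, pred) if g != p)
    return errors.most_common()

gold = ["PRON", "DET", "PART", "PRON"]
pred = ["DET",  "DET", "ADV",  "DET"]
# Two PRON->DET confusions and one PART->ADV confusion.
```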
- Experiment with fine-tuning approaches for the LLM
- Explore parameter-efficient adaptation for specialized linguistic domains
- Implement additional languages from Universal Dependencies
- Create a more robust evaluation framework for cross-linguistic performance
- Develop a web interface for interactive POS tagging demonstrations
- Universal Dependencies project for the dataset and guidelines
- Google for providing access to the Gemini API
- The NLP community for benchmarks and evaluation methodologies
- BGU CS Course 'NLP with LLMs' - Spring 2025 - Michael Elhadad
Authors: Gil Barel and Daniel Ohayon