Description
Problem
Is your proposal tackling an existing problem or limitation?
- No, it's an addition
Proposal
Add the MELO Benchmark datasets [*, **] as ranking tasks in WorkRB. The implementation would be similar to that of the new JobTitleSimilarityRanking task proposed in #24.
Architectural consideration: dataset indexing within each task
In the current WorkRB architecture, each task contains one or more datasets indexed by language, which limits a task to at most one dataset per language. This constraint is baked into the data-loading, evaluation, and result-aggregation code. MELO datasets, however, are identified by (country, query_language, corpus_languages) tuples, so multiple datasets can share the same query language. We propose generalizing the index keys from Language to arbitrary string identifiers, which would let WorkRB fully support MELO and accommodate future tasks that need arbitrary dataset indexing.
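To make the proposed generalization concrete, here is a minimal sketch. All names in it (`Dataset`, `RankingTask`, the `datasets` mapping, `melo_dataset_id`) are hypothetical stand-ins, not WorkRB's actual classes; it only assumes datasets are currently stored in a per-task mapping keyed by `Language`:

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    # Hypothetical placeholder for a task dataset (queries + corpus).
    queries: list[str] = field(default_factory=list)
    corpus: list[str] = field(default_factory=list)


@dataclass
class RankingTask:
    # Before: dict[Language, Dataset] -- at most one dataset per language.
    # After: arbitrary string keys, so MELO's
    # (country, query_language, corpus_languages) tuples fit naturally.
    datasets: dict[str, Dataset] = field(default_factory=dict)


def melo_dataset_id(country: str, query_lang: str, corpus_langs: list[str]) -> str:
    """Flatten a MELO (country, query_language, corpus_languages) tuple
    into a unique string key, e.g. 'de_de_en-de' (naming scheme is illustrative)."""
    return f"{country}_{query_lang}_{'-'.join(corpus_langs)}"


task = RankingTask()
# Two datasets with the same query language, which the Language-keyed
# design cannot represent, become two distinct string keys:
task.datasets[melo_dataset_id("de", "de", ["en", "de"])] = Dataset()
task.datasets[melo_dataset_id("de", "de", ["en"])] = Dataset()
```

The existing single-language tasks would keep working unchanged by using the language code itself as the string key.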
@Mattdl Thanks again for inviting us to contribute! The codebase is clean and well-designed. This proposal does add some complexity, but if you think this makes sense, I would be happy to open a separate issue to discuss the refactor. Once aligned, I can submit a PR for the refactor first, and then implement the MELO task on top of those changes.
Type:
- New Ontology (data source for multiple tasks)
- New Task(s)
- New Model(s)
- New Metric(s)
- Other
Area(s) of code: paths, modules, or APIs you expect to touch
src/workrb/tasks/__init__.py
src/workrb/tasks/abstract/ranking_base.py
src/workrb/tasks/ranking/__init__.py
src/workrb/tasks/ranking/melo.py
tests/test_task_loading.py
Additional Context
Dataset source:
- HuggingFace: https://huggingface.co/datasets/Avature/MELO-Benchmark
- GitHub: https://github.com/avature/melo-benchmark
Publication:
Retyk et al. (2024) introduced the MELO Benchmark in "MELO: Multilingual Entity Linking of Occupations" (RecSys in HR 2024).
https://ceur-ws.org/Vol-3788/RecSysHR2024-paper_2.pdf
Dataset statistics:
Full statistics for all 48 datasets are available in the HuggingFace dataset card.
Task characteristics:
- Task type: Ranking
- Label type: Multi-label (each query maps to one ESCO occupation, but occupations have multiple surface forms)
- Query input type: Job titles
- Target input type: Job titles (ESCO occupation surface forms)
- Evaluation metrics:
MRR, Hit@1, Hit@5, and Hit@10, as used by Retyk et al. (2024).
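For reference, these metrics are simple to compute from ranked result lists. The sketch below is independent of WorkRB's actual metric implementations and uses hypothetical function names:

```python
def mrr(ranked_ids: list[list[str]], relevant: list[set[str]]) -> float:
    """Mean Reciprocal Rank: average over queries of 1/rank of the first
    relevant item retrieved (0 for a query if none is retrieved)."""
    total = 0.0
    for ids, rel in zip(ranked_ids, relevant):
        for rank, doc_id in enumerate(ids, start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break
    return total / len(ranked_ids)


def hit_at_k(ranked_ids: list[list[str]], relevant: list[set[str]], k: int) -> float:
    """Hit@k: fraction of queries with at least one relevant item in the top k."""
    hits = sum(1 for ids, rel in zip(ranked_ids, relevant) if rel & set(ids[:k]))
    return hits / len(ranked_ids)


# Two toy queries: the relevant item is at rank 2 for the first, rank 1 for the second.
rankings = [["a", "b", "c"], ["x", "y", "z"]]
gold = [{"b"}, {"x"}]
print(mrr(rankings, gold))        # (1/2 + 1/1) / 2 = 0.75
print(hit_at_k(rankings, gold, 1))  # only the second query hits at k=1 -> 0.5
```

Multi-label handling (multiple surface forms per ESCO occupation) comes for free here, since `relevant` is a set per query.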
Potential future addition:
We have unpublished equivalent datasets for skill entity linking to the ESCO Skills taxonomy (~8 datasets) [***]. These follow the same structure as MELO. We can discuss adding these as a separate task if you are interested!
Implementation
- I plan to implement this in a PR
- I am proposing the idea and would like someone else to pick it up