
[FEATURE] Add MELO Benchmark datasets as a ranking task for job title normalization #30

@federetyk

Description

Problem

Is your proposal tackling an existing problem or limitation?

  • No, it's an addition

Proposal

Add the MELO Benchmark datasets [*, **] as ranking tasks in WorkRB. The implementation would be similar to that of the new JobTitleSimilarityRanking task proposed in #24.

Architectural consideration: dataset indexing within each task
In the current WorkRB architecture, each task contains one or more datasets, indexed by language. This design limits each task to having at most one dataset per language. This constraint arises from the code in data loading, evaluation, and result aggregation. However, MELO datasets are identified by (country, query_language, corpus_languages) tuples, so multiple datasets share the same language. We propose generalizing the indexing from Language to arbitrary string identifiers. This would allow WorkRB to fully support MELO and accommodate future tasks with arbitrary indexing for datasets.

@Mattdl Thanks again for inviting us to contribute! The codebase is clean and well-designed. This proposal does add some complexity, but if you think this makes sense, I would be happy to open a separate issue to discuss the refactor. Once aligned, I can submit a PR for the refactor first, and then implement the MELO task on top of those changes.

  • Type: New Task(s)
  • Area(s) of code: paths, modules, or APIs you expect to touch
    src/workrb/tasks/__init__.py
    src/workrb/tasks/abstract/ranking_base.py
    src/workrb/tasks/ranking/__init__.py
    src/workrb/tasks/ranking/melo.py  (new module; skeleton sketched after this list)
    tests/test_task_loading.py
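To make the plan concrete, a rough skeleton of the new module follows. It is a sketch under assumptions: the base-class name, method names, and loading API are placeholders for whatever ranking_base.py actually defines (along the lines of the JobTitleSimilarityRanking task in #24):

```python
# src/workrb/tasks/ranking/melo.py -- hypothetical skeleton only.
from typing import Optional


class MELORanking:  # would subclass the abstract ranking base task
    """Ranking task over the 48 MELO Benchmark datasets.

    Each dataset is identified by a (country, query_language,
    corpus_languages) tuple, serialized into the string key described
    in the architectural note above.
    """

    def __init__(self, dataset_ids: Optional[list[str]] = None):
        # Restrict the task to a subset of MELO datasets, or all by default.
        self.dataset_ids = dataset_ids

    def load_datasets(self) -> dict:
        # Would fetch the MELO data and return one dataset object per
        # string identifier.
        raise NotImplementedError
```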

Additional Context

Dataset source:

Publication:
Retyk et al. (2024) introduced the MELO Benchmark in "MELO: Multilingual Entity Linking of Occupations" (RecSys in HR 2024).
https://ceur-ws.org/Vol-3788/RecSysHR2024-paper_2.pdf

Dataset statistics:

Full statistics for all 48 datasets are available in the HuggingFace dataset card.

Task characteristics:

  • Task type: Ranking
  • Label type: Multi-label (each query maps to a single ESCO occupation, but that occupation can have multiple surface forms in the corpus, so several corpus entries may be correct)
  • Query input type: Job titles
  • Target input type: Job titles (ESCO occupation surface forms)
  • Evaluation metrics: MRR, Hit@1, Hit@5, and Hit@10, as used by Retyk et al. (2024); a reference implementation is sketched after this list.
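For reference, a minimal, self-contained implementation of these metrics under the multi-label setup above (a query counts as a hit if any surface form of its gold occupation is retrieved). This sketch only restates the standard definitions; it is not WorkRB's actual metric code:

```python
def mrr_and_hits(ranked_ids: list[list[str]],
                 gold_ids: list[set[str]],
                 ks: tuple[int, ...] = (1, 5, 10)) -> dict[str, float]:
    """MRR and Hit@k for multi-label ranking.

    ranked_ids[i]: corpus item ids retrieved for query i, best first.
    gold_ids[i]:   the set of correct ids (surface forms) for query i.
    """
    n = len(ranked_ids)
    rr_sum = 0.0
    hits = {k: 0 for k in ks}
    for ranking, gold in zip(ranked_ids, gold_ids):
        # 1-based rank of the first relevant item, or None if absent.
        first = next((i + 1 for i, x in enumerate(ranking) if x in gold), None)
        if first is not None:
            rr_sum += 1.0 / first
            for k in ks:
                if first <= k:
                    hits[k] += 1
    return {"MRR": rr_sum / n, **{f"Hit@{k}": hits[k] / n for k in ks}}

# Example: the gold item ranks 2nd for the first query and is missing
# for the second, giving MRR = (1/2 + 0) / 2 = 0.25.
scores = mrr_and_hits([["a", "b", "c"], ["x", "y", "z"]], [{"b"}, {"q"}])
assert scores["MRR"] == 0.25 and scores["Hit@5"] == 0.5
```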

Potential future addition:
We have unpublished equivalent datasets for skill entity linking to the ESCO Skills taxonomy (~8 datasets) [***]. These follow the same structure as MELO. We can discuss adding these as a separate task if you are interested!

Implementation

  • I plan to implement this in a PR
