@nikos-livathinos
Contributor

Support external predictions, as described in #112:

  • Extend all evaluators with the optional parameter external_predictions_path.
  • If the path is provided, the predicted DoclingDocument objects are loaded from the files under it instead of from the parquet dataset.
  • The ground truth (GT) is always taken from the parquet dataset.
  • The path can contain predicted DoclingDocuments in several formats (json, doctags, yaml).
  • Update the unit tests.
  • Extend the CLI of docling-eval evaluate with a new option:
--external-predictions-path        PATH            Path to load existing DoclingDocument predictions. Each filename must follow the pattern [doc_id].[json|dt|yaml|yml]
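The filename pattern above can be illustrated with a minimal sketch. It is not the code from this pull request: the function name find_prediction_file, the PREDICTION_FORMATS mapping, and its format labels are hypothetical, and only the four extensions come from the CLI help text. The sketch only locates the prediction file for a given doc_id; parsing it into a DoclingDocument is left out.

```python
from pathlib import Path
from typing import Optional, Tuple

# Extensions taken from the CLI help text: [doc_id].[json|dt|yaml|yml].
# The format labels on the right are illustrative, not docling-eval identifiers.
PREDICTION_FORMATS = {
    ".json": "json",
    ".dt": "doctags",
    ".yaml": "yaml",
    ".yml": "yaml",
}


def find_prediction_file(
    predictions_dir: Path, doc_id: str
) -> Optional[Tuple[Path, str]]:
    """Return (path, format) for the first existing file named <doc_id><ext>.

    Hypothetical helper: returns None when no prediction exists for doc_id,
    so the caller can decide how to handle the missing document.
    """
    for ext, fmt in PREDICTION_FORMATS.items():
        # Append the extension instead of using with_suffix(), so that a
        # doc_id that itself contains dots is left unchanged.
        candidate = predictions_dir / f"{doc_id}{ext}"
        if candidate.is_file():
            return candidate, fmt
    return None
```

Under these assumptions, an evaluator would call find_prediction_file(external_predictions_path, doc_id) for each ground-truth row of the parquet dataset and fall back to its own handling when the result is None.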

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…mmy entries in all evaluators.

Extend the CLI to support the --external-predictions-path

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…various formats

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…th. Add unit test

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…d unit test.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
@nikos-livathinos marked this pull request as draft on December 4, 2025, 16:20
@github-actions
Contributor

github-actions bot commented Dec 4, 2025

DCO Check Passed

Thanks @nikos-livathinos, all your commits are properly signed off. 🎉

@nikos-livathinos self-assigned this on Dec 4, 2025
@mergify
mergify bot commented Dec 4, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 Require two reviewers for test updates

This rule is failing.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
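The title rule above can be tried locally with a short sketch. The regular expression is copied from the rule; the helper name is_conventional and the sample titles in the usage note are hypothetical, and the sketch assumes the rule behaves like an ordinary regex search on the pull request title.

```python
import re

# Pattern copied verbatim from the merge protection rule.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix.

    Hypothetical helper for local checks; it is not part of mergify.
    """
    return CONVENTIONAL_TITLE.search(title) is not None
```

For example, is_conventional("feat(cli): add option") and is_conventional("fix!: drop field") both pass, while a title without a recognized type prefix, such as "Support external predictions", does not.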

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…it test.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…unit test

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…dd unit test.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…. Add unit test

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…dd unit test

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…ngOrderEvaluator. Fix main

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
…oclingDocument from doctags and the GT image.

- Introduce the static method load_doctags(), which covers all cases of page image loading.
- Refactor the FilePredictionProvider to use load_doctags() from ExternalDoclingDocumentLoader.
- Refactor all evaluators to use the new ExternalDoclingDocumentLoader.

Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>