Move benchmark and ingestion files where they belong #25

Open
AymanL wants to merge 4 commits into main from code_relocation

AymanL (Collaborator) commented Mar 17, 2026

Tidy up the codebase. This only moves files and resolves the affected imports.

cgoudet (Collaborator) left a comment

Thanks for this work.

Two or three small comments, but you can merge right after addressing them.

I noticed that a lot of PDFs have disappeared. Had we committed them to version control by mistake?

 from hierarchical.postprocessor import ResultPostprocessor

-from docling_postprocess import render_docling_output
+from ..docling_postprocess import render_docling_output

Could you use absolute imports everywhere?
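To illustrate the request, here is a minimal sketch of an absolute import inside a package. The package name `demo_pkg`, its layout, and the body of `render_docling_output` are made up for the demo; the real module path in this repository is not shown in the diff, so treat every name here as an assumption.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk. "demo_pkg" and its layout are hypothetical.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
(pkg / "parsing").mkdir(parents=True)
(pkg / "__init__.py").write_text("")
(pkg / "parsing" / "__init__.py").write_text("")
(pkg / "docling_postprocess.py").write_text(
    "def render_docling_output(text):\n"
    "    return text.strip()\n"
)

# Absolute import: spelled from the package root, so the line reads the same
# in every module and survives moving runner.py to another subpackage.
(pkg / "parsing" / "runner.py").write_text(
    "from demo_pkg.docling_postprocess import render_docling_output\n"
    "\n"
    "def run(text):\n"
    "    return render_docling_output(text)\n"
)

# Stand-in for installing the package (e.g. an editable install) in this demo.
sys.path.insert(0, str(root))
runner = importlib.import_module("demo_pkg.parsing.runner")
result = runner.run("  parsed text  ")
print(result)
```

With the relative form (`from ..docling_postprocess import ...`), the same line would need editing whenever the importing module changes depth in the tree, which is exactly what a file relocation does.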

file and yield that path. Downstream (e.g. Docling) expects a path to a
real file on disk.
"""
from django.core.files.storage import default_storage

Unless there is a strong technical reason, imports are better placed at the top of the file.
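As a rough sketch of what that looks like, here is a helper in the spirit of the docstring above (copy content to a real file and yield its path) with every import at module level. The function name `as_local_file` and its parameters are hypothetical, and it uses only the standard library rather than Django's storage API, so it is not the code under review.

```python
# All imports sit at module level, so dependencies are visible at a glance
# and a missing one fails at import time rather than on the first call.
import contextlib
import io
import os
import tempfile


@contextlib.contextmanager
def as_local_file(stream, suffix=".pdf"):
    """Copy a readable binary stream to a temporary file and yield its path."""
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        with tmp:  # closes the handle so other readers can open the path
            tmp.write(stream.read())
        yield tmp.name
    finally:
        os.unlink(tmp.name)  # clean up once the caller is done


with as_local_file(io.BytesIO(b"%PDF-1.4 demo")) as path:
    existed_inside = os.path.exists(path)
    with open(path, "rb") as fh:
        content = fh.read()
existed_after = os.path.exists(path)
```

A function-level import is usually kept only to break a circular import or to defer a heavy optional dependency; if one of those applies here, a short comment next to the import would make that clear.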

-if benchmark_parsing_dir_str not in sys.path:
-    sys.path.insert(0, benchmark_parsing_dir_str)
-parse_docling = importlib.import_module("benchmarking.parsers").parse_docling
+from .benchmarking.parsers import parse_docling

Make this a module-level, absolute import.

ingestion/parsing/output/benchmark_results_extended.csv
ingestion/parsing/output/extraction_scores.csv
ingestion/parsing/output/analysis/
eu_fact_force/ingestion/parsing/output/extracted_texts/

I think it would be better to put the outputs in a subfolder of data/ so that code and data are not mixed.
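One possible way to do this, sketched under assumptions: resolve the output directory from the project root in a single helper, so parsers never write next to their own source files. The helper name `parsing_output_dir` and the subfolder name `parsing_output` are hypothetical; only `data/` and the CSV file name come from this thread.

```python
import tempfile
from pathlib import Path


def parsing_output_dir(project_root, name="parsing_output"):
    """Return <project_root>/data/<name>, creating it if needed."""
    path = Path(project_root) / "data" / name
    path.mkdir(parents=True, exist_ok=True)
    return path


# A temporary directory stands in for the repository root in this demo.
project_root = Path(tempfile.mkdtemp())
out_dir = parsing_output_dir(project_root)
scores_csv = out_dir / "extraction_scores.csv"
scores_csv.write_text("document,score\n")
print(scores_csv.relative_to(project_root))
```

With the outputs under one data/ subfolder, the ignore rules above could likely collapse into a single entry for that folder.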
