Phases #13

@ChakshuGautam

Dimensions

  1. Complexity of PDF structure => Single PDF, 2 Delinked PDFs, Linked PDFs, PDFs with Images, PDFs with Tables.
  2. Complexity of Questions => Fact Parrot, Neural Coref, Reasoning, Calculations.
  3. Complexity of PDF Ingestion => Reading, Chunking in the simplest way.
  4. Complexity of Retrieval => Embeddings only, Re-ranking, Hybrid, ColBERT, ...
  5. Automation of 1 and 2
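
For dimension 3, "chunking in the simplest way" could look something like the sketch below: fixed-size character windows with a small overlap. This is only one possible reading of "simplest"; the function name `chunk_text` and the `chunk_size` / `overlap` defaults are hypothetical and not taken from this issue, and the text is assumed to be already extracted from the PDF.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk (when it is
    short enough). Sizes are in characters, not tokens (an assumption).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks


# Tiny usage example on a 10-character string.
print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

Starting with a character-based splitter keeps ingestion easy to inspect by hand, which fits a "manual first, then automated" approach; sentence- or layout-aware splitting can replace it later for PDFs with tables or images.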

Evaluation

  1. User Feedback
  2. Maximum Content => Breadth

Plan

  1. Start with one PDF and then add more
  2. Ingestion - manual first and then automated
  3. Stages to production

    • PM Testing
    • User Testing
    • Production
  4. Simpler PDF getting ingested => Working for a
