pjcodes404/Autonomous-Knowledge-Extractor

🧠 Autonomous Knowledge Extractor & Quiz Builder

Turn educational text into a validated, difficulty-ranked quiz using an agentic pipeline.
Deterministic by default, with optional free LLM-based refinement.


🚀 The Problem

Students and educators often struggle to convert raw notes into effective assessment material. Existing tools either require manual effort or rely entirely on opaque large language models that can hallucinate, behave inconsistently, and are difficult to validate.


💡 The Solution

Autonomous Knowledge Extractor is an agentic, modular pipeline that processes educational text end-to-end to:

  1. Extract key concepts and definitions
  2. Organize concepts into a hierarchical knowledge graph
  3. Generate quiz questions from grounded definitions
  4. Rank questions by difficulty
  5. Validate difficulty logic and consistency

The system runs locally using deterministic logic, with an optional LLM refinement step that improves wording without affecting correctness.


⚡ Key Features

  • Agentic architecture with clearly separated stages
  • Explicit knowledge graph (IS_A, PART_OF, CONTAINS relationships)
  • Deterministic core logic (same input produces same output)
  • Automatic difficulty ranking based on graph structure
  • Built-in self-validation of quiz difficulty
  • Optional LLM refinement with safe fallback (no dependency on LLM availability)

🏗️ High-Level Architecture

Raw Text → Preprocessor → Concept Extractor → Hierarchy Builder
→ Knowledge Graph → Quiz Generator → Difficulty Ranker
→ Difficulty Validator → Final Quiz
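
The flow above can be read as function composition: each stage takes the previous stage's artifact and returns a new one. The following is a minimal sketch of that idea, not the project's actual code. `run_pipeline`, `preprocess`, and `count_sentences` are hypothetical names, and the two stages are stand-ins for the real ones.

```python
from functools import reduce
from typing import Any, Callable

# A stage is any callable that turns one artifact into the next one.
Stage = Callable[[Any], Any]


def run_pipeline(stages: list[Stage], raw_text: str) -> Any:
    """Feed the output of each stage into the next, left to right."""
    return reduce(lambda artifact, stage: stage(artifact), stages, raw_text)


# Hypothetical stand-in stages, used only to show the data flow.
def preprocess(text: str) -> list[str]:
    """Split raw text into non-empty, trimmed sentences."""
    return [s.strip() for s in text.split(".") if s.strip()]


def count_sentences(sentences: list[str]) -> int:
    """Toy final stage: report how many sentences were found."""
    return len(sentences)


result = run_pipeline(
    [preprocess, count_sentences],
    "A cell is a unit. A nucleus is part of a cell.",
)
print(result)
```

Because the stages only share the artifact passed between them, any stage can be swapped or tested in isolation.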


🛠️ Installation

Python 3.10 or newer is required.

Clone the repository and move into the project directory:

git clone https://github.com/pjcodes404/Autonomous-Knowledge-Extractor.git
cd Autonomous-Knowledge-Extractor

There are no mandatory external dependencies.


🏃 Usage

Demo Mode

Run the built-in demo text:

python -m src.main --demo

Process Your Own Notes

Create a text file (for example, notes.txt) containing educational content and run:

python -m src.main --input-file notes.txt

Optional LLM Refinement

Enable optional LLM-based wording refinement:

python -m src.main --input-file notes.txt --use-llm

The system will automatically fall back to deterministic logic if no API token is available.
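
A safe fallback of this kind can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the function name `refine_wording`, the environment variable `LLM_API_TOKEN`, and the `llm_refine` callable are all hypothetical.

```python
import os
from typing import Callable, Optional


def refine_wording(
    question: str,
    llm_refine: Optional[Callable[[str], str]] = None,
    token_env: str = "LLM_API_TOKEN",  # hypothetical variable name
) -> str:
    """Return an LLM-refined question, or the original text on any problem."""
    # No refiner configured or no token available: keep deterministic output.
    if llm_refine is None or not os.environ.get(token_env):
        return question
    try:
        refined = llm_refine(question)
    except Exception:
        # Network errors, rate limits, etc. must never break the pipeline.
        return question
    # An empty response is treated as a failure as well.
    return refined.strip() or question
```

The deterministic question is always the default return value, so the refinement step can only change wording when the call succeeds.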


🧩 How It Works

Concept Extraction

Heuristic NLP identifies noun phrases and definition-style sentences to extract candidate concepts.
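
As a rough illustration, definition-style sentences of the form "X is Y" can be picked up with a regular expression. The sketch below covers only that one pattern and does not do noun-phrase detection. The `DEFINITION` pattern and `extract_definitions` are hypothetical and may differ from the heuristics the project uses.

```python
import re

# Matches "<optional article> <term> is/are <definition>".
DEFINITION = re.compile(
    r"^(?:(?:an?|the)\s+)?(?P<term>[A-Za-z][A-Za-z \-]*?)\s+(?:is|are)\s+(?P<definition>.+)$",
    re.IGNORECASE,
)


def extract_definitions(text: str) -> dict[str, str]:
    """Map each defined term (lowercased) to its definition text."""
    concepts: dict[str, str] = {}
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = DEFINITION.match(sentence.strip().rstrip(".!?"))
        if match:
            concepts[match["term"].strip().lower()] = match["definition"].strip()
    return concepts


notes = "A cell is the basic unit of life. The nucleus is part of a cell. Study hard!"
print(extract_definitions(notes))
```

Sentences that do not fit the pattern, such as "Study hard!", are simply skipped instead of being guessed at.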

Hierarchy Construction

Semantic relationships such as IS_A and PART_OF are inferred to build a knowledge graph.
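
One simple way to infer such relationships is to look at how each definition begins. The sketch below assumes definitions are already available as a term-to-text mapping. The `Edge` type, the `PATTERNS` list, and `infer_edges` are hypothetical, and the two phrase patterns are examples only.

```python
import re
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    relation: str
    target: str


# Each pattern is tried in order against the start of a definition.
PATTERNS = [
    ("PART_OF", re.compile(r"^(?:a\s+)?part of (?:an?|the)\s+(?P<target>[a-z][a-z\-]*)", re.IGNORECASE)),
    ("IS_A", re.compile(r"^(?:an?|the)\s+(?:type|kind) of (?P<target>[a-z][a-z\-]*)", re.IGNORECASE)),
]


def infer_edges(definitions: dict[str, str]) -> list[Edge]:
    """Emit at most one typed edge per term, based on its definition's opening words."""
    edges: list[Edge] = []
    for term, definition in definitions.items():
        for relation, pattern in PATTERNS:
            match = pattern.match(definition)
            if match:
                edges.append(Edge(term, relation, match["target"].lower()))
                break  # first matching relation wins
    return edges


definitions = {
    "nucleus": "part of a cell",
    "neuron": "a type of cell that transmits signals",
    "cell": "the basic unit of life",
}
for edge in infer_edges(definitions):
    print(edge)
```

Terms whose definitions match no pattern, like "cell" here, get no outgoing edge and act as roots of the hierarchy.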

Quiz Generation

Questions are generated only from grounded definitions to avoid noise and hallucinations.
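
A minimal sketch of grounded generation: every question is built directly from an extracted definition, and the answer is the term that definition belongs to, so nothing is asked that the source text does not state. The `Question` type, the prompt wording, and `generate_questions` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    prompt: str
    answer: str


def generate_questions(definitions: dict[str, str]) -> list[Question]:
    """Create one question per term that has a non-empty definition."""
    return [
        Question(prompt=f"Which concept is defined as: {definition.strip()}?", answer=term)
        for term, definition in definitions.items()
        if definition.strip()  # skip terms without a grounded definition
    ]


quiz = generate_questions({"cell": "the basic unit of life", "organelle": ""})
for question in quiz:
    print(question.prompt, "->", question.answer)
```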

Difficulty Ranking

Difficulty is computed using graph depth, concept specificity, and relationship complexity.
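
The sketch below illustrates only the graph-depth signal: concepts deeper in the hierarchy are treated as harder. It assumes a simplified graph in which each concept has at most one parent. The `depth` and `rank_by_difficulty` functions are hypothetical, and specificity and relationship complexity are left out.

```python
def depth(concept: str, parents: dict[str, str]) -> int:
    """Count parent hops from a concept up to a root (a concept with no parent)."""
    hops, seen = 0, {concept}
    while concept in parents:
        concept = parents[concept]
        if concept in seen:  # guard against accidental cycles
            break
        seen.add(concept)
        hops += 1
    return hops


def rank_by_difficulty(concepts: list[str], parents: dict[str, str]) -> list[str]:
    """Order concepts from easiest (shallowest) to hardest (deepest)."""
    # The concept name is a tie-breaker so the ordering stays deterministic.
    return sorted(concepts, key=lambda c: (depth(c, parents), c))


parents = {"nucleus": "cell", "nucleolus": "nucleus"}
print(rank_by_difficulty(["nucleolus", "cell", "nucleus"], parents))
```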

Validation

A validation stage checks that difficulty ordering is logically consistent before output is shown.
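
One possible consistency check is that difficulty scores never decrease along the final question order. The sketch below assumes the quiz is a list of `(question, difficulty)` pairs. `validate_ordering` is a hypothetical name, and the real validator may apply other rules.

```python
def validate_ordering(ranked: list[tuple[str, int]]) -> list[str]:
    """Return a description of every place where difficulty decreases."""
    problems: list[str] = []
    # Compare each question with the one that follows it.
    for (prev_q, prev_d), (next_q, next_d) in zip(ranked, ranked[1:]):
        if next_d < prev_d:
            problems.append(
                f"{next_q!r} (difficulty {next_d}) is listed after "
                f"{prev_q!r} (difficulty {prev_d})"
            )
    return problems


issues = validate_ordering([("What is a cell?", 0), ("What is a nucleolus?", 2), ("What is a nucleus?", 1)])
print(issues or "ordering is consistent")
```

An empty list means the quiz can be shown. A non-empty list gives the caller concrete reasons to re-rank or reject the output.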


🤖 Why This Is Agentic

The system follows an agentic workflow:

  • Decomposition: each stage is handled by a specialized agent
  • State passing: agents communicate via structured artifacts such as graphs and quizzes
  • Reflection: the validator critiques and approves the output before release

This mirrors modern agentic system design without relying entirely on LLM reasoning.


📜 License

MIT License.
Built for Hackathon 2025.
