Sripaadpatel/SkillSync-AI

🚀 SkillSync: AI-Powered Career Matchmaker

Privacy-First • Local Inference • Hybrid Search • Gap Analysis

📖 Executive Summary

SkillSync is an intelligent microservice designed to bridge the gap between talent and opportunity. Unlike traditional Applicant Tracking Systems (ATS) that rely on rigid keyword matching, SkillSync uses Hybrid Semantic Search (Vector + Metadata) to understand the context of a candidate's profile.

Powered by Llama 3.2 and ChromaDB, it runs entirely locally, so resume data never leaves the machine. It addresses the "Cold Start" problem (a new user with no interaction history) by extracting hard constraints (Location, Job Type) from the resume in a single pass.

✨ Key Features

  • 🚀 Hybrid Search Engine: Combines Vector Similarity (Semantic) with Metadata Filtering (Hard Constraints) for precise matching.

  • ❄️ Cold Start Solver: Instantly extracts User ID, Location, and Role preferences from a PDF resume using a custom ETL pipeline.

  • 🧠 Intelligent Parsing: Uses Llama 3.2 (Single-Pass Extraction) to understand resumes, not just regex them.

  • 📊 AI Gap Analysis: Provides a detailed breakdown of Matching Skills, Missing Skills, and a Match Score (%) with actionable advice.

  • 🔒 Privacy First: Zero data leakage. All inference runs on local hardware using Ollama and local Vector Stores.
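
The hybrid search idea can be illustrated without any external dependency. The following is a minimal sketch, not SkillSync's actual implementation: it applies the hard constraints as an exact-match metadata filter first, then ranks the surviving jobs by cosine similarity. The `Job` class, the `hybrid_search` function, and the three-dimensional toy vectors are hypothetical; in the real service the embeddings come from nomic-embed-text and ChromaDB handles storage and retrieval.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    title: str
    metadata: dict      # hard-constraint fields, e.g. location, work type
    embedding: list     # toy vector standing in for a real embedding


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, filters, jobs, k=3):
    """Filter on metadata, then rank the remaining jobs by similarity."""
    candidates = [
        job for job in jobs
        if all(job.metadata.get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda job: cosine(query_vec, job.embedding),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical sample data, for illustration only.
jobs = [
    Job("Backend Engineer", {"location": "New York, NY", "formatted_work_type": "Full-time"}, [0.9, 0.1, 0.0]),
    Job("Data Analyst", {"location": "New York, NY", "formatted_work_type": "Full-time"}, [0.2, 0.9, 0.1]),
    Job("Backend Engineer", {"location": "Austin, TX", "formatted_work_type": "Full-time"}, [0.95, 0.05, 0.0]),
]

results = hybrid_search(
    query_vec=[1.0, 0.0, 0.0],
    filters={"location": "New York, NY", "formatted_work_type": "Full-time"},
    jobs=jobs,
)
for job in results:
    print(job.title, job.metadata["location"])
```

In this toy data the Austin posting has the highest similarity to the query but is excluded by the location filter, which is the behavior a hard constraint is meant to guarantee.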

🏗️ System Architecture

graph TD
User([User / Client]) -->|Upload PDF| API[FastAPI Server]

subgraph "SkillSync Core"
    API -->|Raw File| Parser[PDF Parser]
    Parser -->|Text| ETL[Llama 3.2 ETL Agent]
    ETL -->|Filters + Summary| Engine[Recommendation Engine]
    
    Engine -->|Query Vector| Chroma[(ChromaDB)]
    Engine -->|Apply Filters| Chroma
    
    Chroma -->|Top K Candidates| Engine
    
    Engine -->|Gap Analysis| Analyst[Llama 3.2 Reasoning Agent]
    Analyst -->|Structured JSON| API
end

API -->|Final Response| User
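
The flow in the diagram can be read as four stages chained together. The sketch below is only an illustration under stated assumptions: the `Profile` class, the `recommend` function, and the stage callables are hypothetical names, and the empty-result branch is a guess at reasonable behavior, not something the project documents. The stages are passed in as plain functions so the example runs without an LLM or a vector store.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Profile:
    summary: str    # text used to build the query vector
    filters: dict   # hard constraints such as location or job type


def recommend(
    pdf_bytes: bytes,
    parse_pdf: Callable[[bytes], str],
    extract_profile: Callable[[str], Profile],
    search_jobs: Callable[[str, dict], list],
    analyze_gap: Callable[[str, dict], dict],
) -> dict:
    """Run the four stages in the order shown in the diagram."""
    text = parse_pdf(pdf_bytes)                              # PDF Parser
    profile = extract_profile(text)                          # ETL agent
    matches = search_jobs(profile.summary, profile.filters)  # engine + vector store
    if not matches:
        # Assumed behavior when the filters leave no candidates.
        return {"filters_applied": profile.filters, "top_recommendation": None}
    top = matches[0]
    return {
        "filters_applied": profile.filters,
        "top_recommendation": top,
        "ai_analysis": analyze_gap(profile.summary, top),    # reasoning agent
        "other_matches": matches[1:],
    }


# Stub stages stand in for the real parser, LLM calls, and database.
response = recommend(
    b"%PDF-stub",
    parse_pdf=lambda data: "Python developer based in New York, NY",
    extract_profile=lambda text: Profile(text, {"location": "New York, NY"}),
    search_jobs=lambda summary, filters: [
        {"title": "Senior Software Engineer"},
        {"title": "Backend Engineer"},
    ],
    analyze_gap=lambda summary, job: {"matching_skills": ["Python"], "missing_skills": []},
)
print(response["top_recommendation"]["title"])
```

Injecting the stages as callables keeps each box in the diagram independently testable; the real service may wire them together differently.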

🛠️ Tech Stack

  • Backend: Python, FastAPI, Uvicorn

  • LLM Orchestration: LangChain

  • Local Inference: Ollama (Llama 3.2 3B Model)

  • Vector Database: ChromaDB

  • Embedding Model: nomic-embed-text

  • Data Processing: DuckDB, Pandas, PyPDF

📊 Dataset

The job postings data used to power the recommendation engine is sourced from the LinkedIn Job Postings (2023-2024) dataset.

  • Download Link: LinkedIn Job Postings (Kaggle)

  • Content: Contains over 100,000 real-world job postings including titles, descriptions, and metadata.

  • Setup: To rebuild the database, download postings.csv from the link above and place it in a folder named Dataset in the project root.
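
Before rebuilding the database it can help to confirm that the downloaded CSV has the columns the pipeline is likely to need. This is a small sketch using only the standard library; the `missing_columns` helper is hypothetical, and the column names in `EXPECTED_COLUMNS` are an assumption based on the fields shown in the API response, so adjust them to whatever data_processor.py actually reads.

```python
import csv
import tempfile
from pathlib import Path

# Assumed column names; not confirmed against the project's cleaning script.
EXPECTED_COLUMNS = {"title", "description", "location", "formatted_work_type"}


def missing_columns(csv_path, expected=EXPECTED_COLUMNS):
    """Return the expected columns that are absent from the CSV header."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    return sorted(set(expected) - set(header))


# Demonstration on a throwaway file; point this at Dataset/postings.csv in practice.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "postings.csv"
    sample.write_text(
        "title,description,location\nEngineer,Builds things,Remote\n",
        encoding="utf-8",
    )
    missing = missing_columns(sample)
    print(missing)
```

Only the header row is read, so the check stays fast even on a file with more than 100,000 postings.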

🚀 Getting Started

Prerequisites

  • Python 3.10+ installed.

  • Ollama installed and running.

  • Pull the required models:

      ollama pull llama3.2
      ollama pull nomic-embed-text
    

Installation

Clone the repository

git clone https://github.com/yourusername/skillsync.git
cd skillsync

Install Dependencies

pip install -r requirements.txt

Initialize the Database

Note: You must run the indexer from the data directory.

cd data
python indexer.py
cd ..


Run the Server

python main.py

🔌 API Documentation

Once the server is running, access the interactive Swagger UI at: 👉 http://localhost:8000/docs

Primary Endpoint: Recommend Jobs

<details>
<summary>Click to view Request/Response details</summary>

`POST /recommend`

`Request: multipart/form-data`

`file: PDF Resume (Binary)`

Response (200 OK):

{
  "user_id": "candidate@example.com",
  "filters_applied": {
    "location": "New York, NY",
    "formatted_work_type": "Full-time"
  },
  "top_recommendation": {
    "title": "Senior Software Engineer",
    "company_name": "TechGlobal Inc.",
    "location": "New York, NY",
    "similarity_score": 0.89
  },
  "ai_analysis": {
    "match_score": "85%",
    "matching_skills": ["Python", "AWS", "FastAPI"],
    "missing_skills": ["Kubernetes", "GraphQL"],
    "advice": "Your backend experience is strong. To improve your match score, consider highlighting containerization projects."
  },
  "other_matches": [...]
}


</details>
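
For callers that prefer a script over the Swagger UI, the request can be assembled with the standard library alone. The sketch below assumes the server is reachable at http://localhost:8000 and that the upload field is named `file`, as listed in the request details; the helper names `build_multipart` and `recommend_jobs` are hypothetical and not part of the project.

```python
import json
import urllib.request
import uuid


def build_multipart(field, filename, payload, content_type="application/pdf"):
    """Encode one file as a multipart/form-data body; returns (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def recommend_jobs(pdf_path, url="http://localhost:8000/recommend"):
    """POST a resume to the running server and return the decoded JSON response."""
    with open(pdf_path, "rb") as handle:
        body, header = build_multipart("file", "resume.pdf", handle.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the body needs no server, so it can be inspected offline.
body, header = build_multipart("file", "resume.pdf", b"%PDF-1.4 stub")
print(header.split(";")[0], len(body))
```

With the server running, `recommend_jobs("resume.pdf")["ai_analysis"]["missing_skills"]` would return the skills list from the response shown above, assuming the response keeps that shape.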

📂 Project Structure

SkillSync/
├── data/
│   ├── data_processor.py    # Cleaning logic (DuckDB)
│   ├── indexer.py           # Vector embedding logic
│   ├── jobs_clean.csv       # Processed dataset
│   └── jobs_db/             # ChromaDB Persistent Store
├── main.py                  # FastAPI Application Entry Point
├── recsys_engine.py         # Core Business Logic & LLM Chains
├── pdf_parser.py            # PDF Extraction Utility
├── requirements.txt         # Dependencies
└── README.md                # Documentation

🤝 Contribution

Contributions are welcome! Please follow these steps:

Fork the repository.

Create a feature branch (git checkout -b feature/AmazingFeature).

Commit your changes (git commit -m 'Add some AmazingFeature').

Push to the branch (git push origin feature/AmazingFeature).

Open a Pull Request.

Built with ❤️ by Me

About

SkillSync is an AI-powered career matchmaker that uses hybrid vector search and local LLMs to provide instant, context-aware job recommendations and personalized skill gap analysis.
