Feat/add webservice api #36

Open
nhorlava wants to merge 20 commits into main from feat/add_webservice_api

Conversation

@nhorlava
Collaborator

Closes #31

@nhorlava self-assigned this May 15, 2026
Copilot AI review requested due to automatic review settings May 15, 2026 07:43
Contributor

Copilot AI left a comment

Pull request overview

This PR exposes llamore as a FastAPI-based web service (closes #31). It adds a new api/ package with the FastAPI app, a Dockerfile and HuggingFace Spaces metadata, optional [api] extras in pyproject.toml/uv.lock, and CI jobs that deploy the app to a HF Space. It also makes a small fix in GeminiExtractor to delete uploaded PDFs and includes some unrelated notebook regeneration.

Changes:

  • New api/api.py exposing /extract/{openai,gemini}/{text,pdf} endpoints with API-key auth, request models, and PDF size validation.
  • Packaging and CI: optional api extra in pyproject.toml/uv.lock, plus deploy-api and rebuild-hf workflow jobs targeting a HF Space.
  • Minor: delete uploaded Gemini files after generation, .gitignore for notebooks/data, and regenerated quick_start.ipynb cells that include an unrelated schema change.
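The PDF size validation mentioned in the first bullet can be pictured with a small standard-library sketch. The limit `MAX_PDF_BYTES`, the exception `PdfTooLargeError`, and the helper `validate_pdf_size` are hypothetical names chosen for illustration; the PR summary does not state the actual limit or how `api/api.py` implements the check.

```python
# Hypothetical limit: the PR summary does not state the real value.
MAX_PDF_BYTES = 10 * 1024 * 1024


class PdfTooLargeError(ValueError):
    """Raised when an uploaded PDF exceeds the configured size limit."""


def validate_pdf_size(data: bytes, max_bytes: int = MAX_PDF_BYTES) -> bytes:
    """Return the payload unchanged if it is non-empty and within the limit."""
    if not data:
        raise ValueError("uploaded PDF is empty")
    if len(data) > max_bytes:
        raise PdfTooLargeError(
            f"PDF is {len(data)} bytes; the limit is {max_bytes} bytes"
        )
    return data
```

In a FastAPI handler, errors like these would typically be translated into an `HTTPException` (for example status 413 for an oversized upload), though whether this PR does so is not visible in the summary.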

Reviewed changes

Copilot reviewed 7 out of 10 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| api/api.py | New FastAPI service with auth, schemas, and extract endpoints. |
| api/Dockerfile | Image for HF Space; pins install to a feature branch. |
| api/README.md | HF Space front-matter. |
| pyproject.toml | Adds [api] optional dependency group. |
| uv.lock | Lock updates for FastAPI/uvicorn stack; also tightens lxml to ==5.3.1. |
| .github/workflows/ci.yml | Adds deploy-api and rebuild-hf jobs and lxml system deps. |
| src/llamore/extractors.py | Deletes uploaded PDF from Gemini after generate_content. |
| notebooks/quick_start.ipynb | Regenerated outputs reflect a Person/Reference schema change not present in this PR; kernel metadata changed. |
| .gitignore | Excludes notebooks/data. |

Comment thread src/llamore/extractors.py
Comment on lines +269 to +271

if pdf:
    self._client.files.delete(name=file.name)
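Since the delete call runs after `generate_content`, an exception during generation would presumably skip it and leave the uploaded file behind. A minimal sketch of a `try`/`finally` variant follows; `FakeFiles` and `generate_with_cleanup` are stand-ins written for this example and are not part of the Gemini SDK or of llamore.

```python
class FakeFiles:
    """Stand-in for a remote file store; NOT the real Gemini client."""

    def __init__(self):
        self.stored = set()

    def upload(self, name):
        self.stored.add(name)
        return name

    def delete(self, name):
        self.stored.discard(name)


def generate_with_cleanup(files, name, generate):
    """Upload a file, run generate(), and always delete the upload afterwards."""
    uploaded = files.upload(name)
    try:
        return generate(uploaded)
    finally:
        # Runs on success and on failure, so no upload is left behind.
        files.delete(name=uploaded)
```

The same shape would apply to the real client: wrap the generation call in `try` and move the existing delete into `finally`.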
Comment thread api/Dockerfile
RUN pip install uv

# RUN uv pip install --system git+https://github.com/mpilhlt/llamore.git[api]
RUN uv pip install --system "git+https://github.com/mpilhlt/llamore.git@feat/add_webservice_api#egg=llamore[api]"
Comment thread api/Dockerfile
Comment on lines +3 to +18
RUN useradd -m -u 1000 user

WORKDIR /app

RUN mkdir -p logs

RUN pip install uv

# RUN uv pip install --system git+https://github.com/mpilhlt/llamore.git[api]
RUN uv pip install --system "git+https://github.com/mpilhlt/llamore.git@feat/add_webservice_api#egg=llamore[api]"

COPY --chown=user api.py .

EXPOSE 7860

CMD ["uvicorn", "api:app", "--host", "0.0.0.0", "--port", "7860"]
Comment thread .github/workflows/ci.yml
Comment on lines +75 to +86
cd api
git init -b main
git config user.email "ci@github"
git config user.name "CI"
git remote add hf https://user:${HF_TOKEN}@huggingface.co/spaces/Llamore/api
git add .
if git diff --cached --quiet; then
  echo "No changes to deploy — skipping push."
  exit 0
fi
git commit -m "deploy from ${GITHUB_SHA}"
git push hf main --force
Comment thread .github/workflows/ci.yml
Comment on lines +88 to +104
  rebuild-hf:
    runs-on: ubuntu-latest
    environment: hf
    needs: [tests, deploy-api]
    if: |
      always()
      && (needs.tests.result == 'success'
          || needs.deploy-api.result == 'success'
          || github.event_name == 'workflow_dispatch')
    steps:
      - name: Trigger HF Space rebuild
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          curl --fail -X POST \
            "https://huggingface.co/api/spaces/Llamore/api/restart?factory=true" \
            -H "Authorization: Bearer ${HF_TOKEN}"
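To see when the `rebuild-hf` job would run, the `if:` expression can be modelled as a plain boolean function. This is only a sketch of the logic as written above; `rebuild_should_run` is a hypothetical helper, and the result strings follow the values GitHub Actions reports for `needs.<job>.result` (`success`, `failure`, `cancelled`, `skipped`).

```python
def rebuild_should_run(tests: str, deploy_api: str, event_name: str) -> bool:
    """Mirror the job's `if:` expression.

    always() evaluates to true and only disables the implicit success()
    check, so the outcome is decided by the parenthesised OR.
    """
    return (
        tests == "success"
        or deploy_api == "success"
        or event_name == "workflow_dispatch"
    )


# Tests passed but the deploy failed: the OR still lets the rebuild run.
print(rebuild_should_run("success", "failure", "push"))
# Tests failed and the deploy was skipped on a normal push: no rebuild.
print(rebuild_should_run("failure", "skipped", "push"))
```

If the intent is to rebuild only after a successful deploy, requiring `needs.deploy-api.result == 'success'` on its own may be closer to that; the sketch just makes the current behaviour easy to check.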
Comment thread api/api.py
Comment on lines +56 to +57
api_key_header = APIKeyHeader(name="X-Llamore-API-Key", scheme_name="Llamore API Key", auto_error=False)
provider_key_header = APIKeyHeader(name="X-LLM-Provider-Key", scheme_name="LLM Provider Key", auto_error=False)
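With `auto_error=False`, FastAPI's `APIKeyHeader` hands the dependency `None` when the header is missing instead of rejecting the request itself, so the check has to cover that case explicitly. A minimal standard-library sketch of such a check, assuming the expected key is available as a string; `api_key_is_valid` is a hypothetical helper, not a function from this PR.

```python
import secrets
from typing import Optional


def api_key_is_valid(provided: Optional[str], expected: str) -> bool:
    """Return True only for a present key that matches the expected one."""
    if not provided or not expected:
        # A missing header arrives as None; an unset expected key must not match.
        return False
    # Compare bytes in constant time; str inputs would fail on non-ASCII keys.
    return secrets.compare_digest(
        provided.encode("utf-8"), expected.encode("utf-8")
    )
```

A dependency built on this would raise `HTTPException` with a 401 or 403 status when the function returns `False`.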
Comment thread notebooks/quick_start.ipynb
Comment on lines 481 to +495
@@ -441,7 +492,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.10.16"
+"version": "3.9.20"
Comment thread notebooks/quick_start.ipynb
Comment on lines 103 to +232
@@ -166,7 +166,7 @@
 " \"Person\": {\n",
 " \"description\": \"Contains a proper noun or proper-noun phrase referring to a person, possibly including one or more of the person's forenames, surnames, honorifics, added names, etc.\",\n",
 " \"properties\": {\n",
-" \"forename\": {\n",
+" \"first_name\": {\n",
 " \"anyOf\": [\n",
 " {\n",
 " \"type\": \"string\"\n",
@@ -176,8 +176,21 @@
 " }\n",
 " ],\n",
 " \"default\": null,\n",
-" \"description\": \"Contains a forename, given or baptismal name.\",\n",
-" \"title\": \"Forename\"\n",
+" \"description\": \"Contains a first name, given or baptismal name.\",\n",
+" \"title\": \"First Name\"\n",
 " },\n",
+" \"middle_name\": {\n",
+" \"anyOf\": [\n",
+" {\n",
+" \"type\": \"string\"\n",
+" },\n",
+" {\n",
+" \"type\": \"null\"\n",
+" }\n",
+" ],\n",
+" \"default\": null,\n",
+" \"description\": \"Contains a middle name, written between a person's first and surname. It is often abbreviated.\",\n",
+" \"title\": \"Middle Name\"\n",
+" },\n",
" \"surname\": {\n",
" \"anyOf\": [\n",
@@ -191,6 +204,32 @@
" \"default\": null,\n",
" \"description\": \"Contains a family (inherited) name of a person, as opposed to a given, baptismal, or nick name.\",\n",
" \"title\": \"Surname\"\n",
" },\n",
" \"name_link\": {\n",
" \"anyOf\": [\n",
" {\n",
" \"type\": \"string\"\n",
" },\n",
" {\n",
" \"type\": \"null\"\n",
" }\n",
" ],\n",
" \"default\": null,\n",
" \"description\": \"Contains a connecting phrase or link used within a name but not regarded as part of it, such as 'van der' or 'of'.\",\n",
" \"title\": \"Name Link\"\n",
" },\n",
" \"role_name\": {\n",
" \"anyOf\": [\n",
" {\n",
" \"type\": \"string\"\n",
" },\n",
" {\n",
" \"type\": \"null\"\n",
" }\n",
" ],\n",
" \"default\": null,\n",
" \"description\": \"Contains a name component which indicates that the referent has a particular role or position in society, such as an official title or rank.\",\n",
" \"title\": \"Role Name\"\n",
Comment thread api/api.py
Comment on lines +1 to +30
import logging
import os
import secrets
import tempfile
import traceback
from contextlib import asynccontextmanager
from pathlib import Path
from typing import Annotated, Any, Dict, List, Literal, Optional

from fastapi import Depends, FastAPI, Form, HTTPException, Request, Security, UploadFile
from fastapi.responses import JSONResponse
from fastapi.security import APIKeyHeader

from llamore import (
    GeminiExtractor,
    LineByLinePrompter,
    OpenaiExtractor,
    References,
    SchemaPrompter,
)
from pydantic import BaseModel, BeforeValidator, Field

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    handlers=[logging.StreamHandler()],
)
logger = logging.getLogger(__name__)


Comment thread api/api.py
Comment on lines +196 to +205
    text: str = Field(..., min_length=1, description="Raw text to extract references from.")



class GeminiExtractTextRequest(GeminiExtractionConfig):
    """Request body for Gemini text extraction."""

    text: str = Field(..., min_length=1, description="Raw text to extract references from.")
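From a client's point of view, a text-extraction call could be assembled roughly as below. This is a sketch under several assumptions: `BASE_URL` is a placeholder, the endpoints are assumed to accept `POST` with a JSON body, and any other fields inherited from the config classes are assumed to be optional. Only the `text` field, the `/extract/{openai,gemini}/text` paths, and the two header names come from the PR itself. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid"  # placeholder; not the real Space URL


def build_text_request(
    provider: str, text: str, api_key: str, provider_key: str
) -> urllib.request.Request:
    """Build (without sending) a request for /extract/{provider}/text."""
    if provider not in ("openai", "gemini"):
        raise ValueError(f"unknown provider: {provider!r}")
    if not text:
        # Mirrors the min_length=1 constraint on the `text` field.
        raise ValueError("text must be non-empty")
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/extract/{provider}/text",
        data=body,
        method="POST",  # assumed HTTP method
        headers={
            "Content-Type": "application/json",
            "X-Llamore-API-Key": api_key,
            "X-LLM-Provider-Key": provider_key,
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it once `BASE_URL` points at a running instance.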


Development

Successfully merging this pull request may close these issues.

LLamore as a web service

3 participants