Convert one or more documents/images to Markdown using Mistral OCR or GLM OCR.
Install:
uv tool install https://github.com/sssxks/pdf2md_cli.git --reinstall
Python API:
from pdf2md_cli import ConvertOptions, convert_file, convert_files
# single file
res = convert_file("docs/sample.pdf", api_key="...", outdir="out")
print(res.markdown_path)
# batch (supports globs)
batch = convert_files(["docs/*.pdf", "docs/*.png"], api_key="...", outdir="out", workers=4)
print(len(batch.succeeded), len(batch.failed))
Set your API key:
setx MISTRAL_API_KEY "YOUR_KEY"
setx BIGMODEL_API_KEY "YOUR_KEY"
Run:
2md path\to\file.pdf
2md path\to\image.png
2md path\to\file.docx
2md path\to\slides.pptx
2md docs\*.pdf --workers 4
2md docs\*.png --workers 4
2md path\to\file.pdf -o out
2md path\to\file.pdf --model mistral-ocr-latest
2md path\to\file.pdf --backend glm --model glm-ocr
2md path\to\file.pdf --keep-remote-file
2md path\to\file.pdf # default: --table-format html (extract + inline; no tbl-*.html files)
2md path\to\file.pdf --table-format markdown
2md path\to\file.pdf --extract-table --table-format html
2md path\to\file.pdf --extract-table --table-format markdown
2md path\to\file.pdf --no-front-matter --no-page-markers
2md docs\*.pdf --workers 4 --retries 8 --backoff-max-ms 60000
Help:
2md -h
2md help tables
2md help advanced
Table handling behavior:
| Flags | Mistral table_format | Markdown output | Sidecar files |
|---|---|---|---|
| (default) `--table-format html` | `html` | HTML tables are inlined | none |
| `--table-format markdown` | (not sent) | Tables stay inline as markdown | none |
| `--extract-table --table-format html` | `html` | Links to `tbl-*.html` | writes `tbl-*.html` |
| `--extract-table --table-format markdown` | `markdown` | Links to `tbl-*.md` | writes `tbl-*.md` |
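The table above can be read as a small decision function. The sketch below only mirrors the four documented rows; `TablePlan` and `plan_tables` are hypothetical names for illustration and are not part of the `pdf2md_cli` API.

```python
from typing import NamedTuple, Optional


class TablePlan(NamedTuple):
    """Hypothetical summary of one row of the table-handling table."""
    api_table_format: Optional[str]  # value sent to the OCR API; None means "not sent"
    markdown_output: str             # what ends up in the main markdown file
    sidecar_glob: Optional[str]      # pattern of sidecar files written, if any


def plan_tables(extract_table: bool = False, table_format: str = "html") -> TablePlan:
    """Map the two CLI flags to the documented behavior."""
    if table_format not in ("html", "markdown"):
        raise ValueError(f"unknown table format: {table_format!r}")
    if extract_table:
        # Extracted tables become sidecar files linked from the markdown.
        ext = "html" if table_format == "html" else "md"
        return TablePlan(table_format, f"links to tbl-*.{ext}", f"tbl-*.{ext}")
    if table_format == "html":
        # Default: tables are extracted as HTML and inlined again.
        return TablePlan("html", "HTML tables inlined", None)
    # Markdown without extraction: no table_format is sent at all.
    return TablePlan(None, "tables stay inline as markdown", None)
```

For example, `plan_tables(extract_table=True, table_format="markdown")` describes the last row: the API receives `markdown` and `tbl-*.md` files are written.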
Header/footer handling (advanced):
- `--header {inline,discard,extract,comment}` and `--footer {inline,discard,extract,comment}`
  - `comment` (default): extract and add them back as HTML comments in the markdown
  - `inline`: keep headers/footers in the main markdown
  - `discard`: extract headers/footers but drop them
  - `extract`: extract and write `<stem>_headers_footers.md`
Supported document formats:
- PDF (`.pdf`)
- Word (`.docx`)
- PowerPoint (`.pptx`)
- Text (`.txt`)
- EPUB (`.epub`)
- XML / DocBook / JATS XML (`.xml`)
- RTF (`.rtf`)
- OpenDocument Text (`.odt`)
- BibTeX / BibLaTeX (`.bib`)
- FictionBook (`.fb2`)
- Jupyter Notebooks (`.ipynb`)
- LaTeX (`.tex`)
- OPML (`.opml`)
- Troff (`.1`, `.man`)
Supported image formats:
- JPEG (`.jpg`, `.jpeg`)
- PNG (`.png`)
- AVIF (`.avif`)
- TIFF (`.tif`, `.tiff`)
- GIF (`.gif`)
- HEIC/HEIF (`.heic`, `.heif`)
- BMP (`.bmp`)
- WebP (`.webp`)
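To check inputs before starting a batch, the two lists above can be turned into a small extension lookup. This is a minimal sketch: `classify_input` is a hypothetical helper, not a `pdf2md_cli` function, and the case-insensitive matching is an assumption about how the CLI treats extensions.

```python
from pathlib import Path

# Extensions copied from the two lists above.
DOCUMENT_EXTS = {
    ".pdf", ".docx", ".pptx", ".txt", ".epub", ".xml", ".rtf", ".odt",
    ".bib", ".fb2", ".ipynb", ".tex", ".opml", ".1", ".man",
}
IMAGE_EXTS = {
    ".jpg", ".jpeg", ".png", ".avif", ".tif", ".tiff",
    ".gif", ".heic", ".heif", ".bmp", ".webp",
}


def classify_input(path: str) -> str:
    """Return "document" or "image" based on the file extension."""
    ext = Path(path).suffix.lower()  # assumption: matching ignores case
    if ext in DOCUMENT_EXTS:
        return "document"
    if ext in IMAGE_EXTS:
        return "image"
    raise ValueError(f"unsupported extension: {ext!r}")
```

Running this over a glob result before calling `convert_files` makes unsupported files fail early instead of partway through a batch.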
Backend notes:
- `--backend mistral` (default): set `MISTRAL_API_KEY`.
- `--backend glm`: set `BIGMODEL_API_KEY` (or `ZHIPUAI_API_KEY`).
- `--backend mock`: hidden test backend, only available when `PDF2MD_ENABLE_MOCK=1`.
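When calling the Python API, the `api_key` argument has to come from somewhere. The sketch below looks up the environment variables named above for each backend. `resolve_api_key` and `KEY_VARS` are hypothetical names, and checking `BIGMODEL_API_KEY` before `ZHIPUAI_API_KEY` is an assumption; the CLI's real precedence may differ.

```python
import os
from typing import Mapping, Optional

# Environment variables per backend, as listed in the backend notes.
KEY_VARS = {
    "mistral": ("MISTRAL_API_KEY",),
    "glm": ("BIGMODEL_API_KEY", "ZHIPUAI_API_KEY"),  # order is an assumption
}


def resolve_api_key(backend: str = "mistral",
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty API key found for the given backend."""
    env = os.environ if env is None else env
    if backend not in KEY_VARS:
        raise ValueError(f"unknown backend: {backend!r}")
    for name in KEY_VARS[backend]:
        value = env.get(name, "").strip()
        if value:
            return value
    wanted = " or ".join(KEY_VARS[backend])
    raise RuntimeError(f"no API key found; set {wanted}")
```

The result can then be passed on, for example `convert_file("docs/sample.pdf", api_key=resolve_api_key(), outdir="out")`.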
Retries and backoff:
- By default the CLI retries transient API failures (e.g. HTTP 429/5xx) with exponential backoff + jitter.
- Tweak with `--retries`, `--backoff-initial-ms`, `--backoff-max-ms`, `--backoff-multiplier`, `--backoff-jitter`.

(For contributors: set `PDF2MD_ENABLE_MOCK=1` to enable the hidden mock backend for UX testing.)
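To show how the retry flags interact, here is a generic sketch of exponential backoff with jitter. It is not the CLI's implementation: `TransientError`, `backoff_delays`, and `call_with_retries` are hypothetical names, the default values are placeholders chosen for the example (the CLI's actual defaults are not stated here), and treating jitter as a symmetric fraction of the delay is an assumption.

```python
import random
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a retryable failure such as HTTP 429 or 5xx."""


def backoff_delays(retries: int, initial_ms: float, max_ms: float,
                   multiplier: float, jitter: float,
                   rng: Callable[[], float] = random.random) -> Iterator[float]:
    """Yield one delay in milliseconds per allowed retry."""
    delay = float(initial_ms)
    for _ in range(retries):
        capped = min(delay, max_ms)
        # Spread the delay by +/- `jitter` (a fraction) around the capped value.
        yield capped * (1 + jitter * (2 * rng() - 1))
        delay *= multiplier


def call_with_retries(fn: Callable[[], T], *, retries: int = 5,
                      initial_ms: float = 500, max_ms: float = 20000,
                      multiplier: float = 2.0, jitter: float = 0.2,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on TransientError until the retry budget is spent."""
    delays = backoff_delays(retries, initial_ms, max_ms, multiplier, jitter)
    while True:
        try:
            return fn()
        except TransientError:
            delay_ms = next(delays, None)
            if delay_ms is None:
                raise  # retries exhausted: surface the last error
            sleep(delay_ms / 1000)
```

With jitter set to 0, an initial delay of 500 ms, a multiplier of 2, and a 2000 ms cap, the waits are 500, 1000, 2000, 2000 ms. Raising the cap (as in `--backoff-max-ms 60000`) only matters once the uncapped delay would exceed it.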
For document inputs (e.g. .pdf, .docx, .pptx), this CLI uploads the file to Mistral's Files API with purpose="ocr" and deletes the uploaded file after the OCR call completes (best-effort).
For image inputs, the CLI sends a data: URL to the OCR API (no file upload), so there is no remote file to delete.
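For reference, a `data:` URL for an image can be built with the standard library alone. This is a sketch of the general technique, not the CLI's code: `image_to_data_url` is a hypothetical helper, and guessing the MIME type from the file name is an assumption about how the type is chosen.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Encode an image file as a base64 data: URL."""
    # Assumption: the MIME type is inferred from the file extension.
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot determine an image MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Because the image bytes travel inside the request itself, nothing is stored remotely, which is why no cleanup step applies to image inputs.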
If the CLI's best-effort cleanup fails, you can delete the uploaded file manually:
import os

from mistralai import Mistral

# Replace file_id with the ID of the uploaded file you want to remove.
with Mistral(api_key=os.getenv("MISTRAL_API_KEY", "")) as mistral:
    res = mistral.files.delete(file_id="3b6d45eb-e30b-416f-8019-f47e2e93d930")
    print(res)