State-of-the-art accuracy
The 1.2B MinerU VLM tops OmniDocBench for text, tables, formulas, and reading order. Combined with the pipeline backend, it also covers OCR in 109 languages, handwriting included.
This is an open-source repo and RunPod Hub template — not a hosted service. Fork it, deploy from the Hub, and the worker runs on your RunPod account against your wallet.
Under the hood it’s one Docker image wrapping MinerU (3.2.x runtime, MinerU2.5-Pro-2605-1.2B VLM by default). Send a PDF as a URL, base64 blob, or path on a mounted volume. Get back Markdown, a structured content_list, the raw middle.json, and extracted images. When traffic stops, RunPod tears the worker down in seconds.
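As a rough illustration of the three input modes, here is a minimal sketch of how a request to a RunPod serverless endpoint could be assembled by hand. The /runsync URL and the {"input": ...} envelope follow RunPod's serverless API. The file_url and end_page names match the client example on this page, while file_base64 and file_path are assumed field names for illustration and may differ from what the worker actually accepts.

```python
import base64
import json

# RunPod serverless synchronous endpoint pattern.
RUNSYNC_URL = "https://api.runpod.ai/v2/{endpoint_id}/runsync"


def build_request(endpoint_id, api_key, *, file_url=None, file_bytes=None,
                  file_path=None, end_page=None):
    """Return (url, headers, body) for one parse job.

    Exactly one of file_url, file_bytes, file_path must be given.
    NOTE: "file_base64" and "file_path" are assumed input keys, not confirmed
    by the worker's schema.
    """
    sources = {
        "file_url": file_url,
        "file_base64": (
            base64.b64encode(file_bytes).decode("ascii")
            if file_bytes is not None else None
        ),
        "file_path": file_path,  # path on a mounted network volume
    }
    chosen = {key: value for key, value in sources.items() if value is not None}
    if len(chosen) != 1:
        raise ValueError("pass exactly one of file_url, file_bytes, file_path")

    job_input = dict(chosen)
    if end_page is not None:
        job_input["end_page"] = end_page

    url = RUNSYNC_URL.format(endpoint_id=endpoint_id)
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"input": job_input})
    return url, headers, body


url, headers, body = build_request(
    "<your-endpoint-id>", "<your-api-key>",
    file_url="https://example.com/report.pdf", end_page=4,
)
print(url)
print(body)
```

The returned triple can be handed to any HTTP client. Building the payload separately from sending it keeps the sketch free of network calls and makes the input shape easy to inspect.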
Pay per second, not per hour
With idle_timeout=10 and FlashBoot, a 100-page doc parses for about $0.03 on a 24 GB RTX 4090. No worker running? No bill.
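To make the per-second arithmetic concrete, here is a small sketch of how such a figure could be estimated. It assumes the worker is billed for its busy time plus the idle timeout window. The 90 seconds of processing and the $0.0003 per second rate are illustrative placeholders, not measured or quoted values, so substitute your own timings and the current price for your GPU.

```python
def job_cost(busy_seconds, idle_timeout_seconds, usd_per_second):
    """Estimate the cost of one isolated job on a per-second-billed worker.

    Assumes billing covers the time spent processing plus the idle window
    the worker stays up before it is torn down.
    """
    billed_seconds = busy_seconds + idle_timeout_seconds
    return billed_seconds * usd_per_second


# Hypothetical inputs: 90 s to parse the document, idle_timeout=10,
# and an assumed rate of $0.0003 per second.
cost = job_cost(busy_seconds=90, idle_timeout_seconds=10, usd_per_second=0.0003)
print(f"estimated cost: ${cost:.3f}")
```

When several documents arrive back to back, the idle window is paid once at the end of the burst rather than once per document, so the per-document cost drops accordingly.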
No glue code
Deploy from the Hub in one click, or fork and let RunPod auto-build your copy. Either way: ten minutes from sign-up to first parse.
Safe to ship
Apache 2.0 with clear commercial thresholds (100M MAU, $20M revenue). The cleanest open license in the GPU-PDF space.
from mineru_client import MineruClient

client = MineruClient(endpoint_id="<your-endpoint-id>")
result = client.parse_document(file_url="https://example.com/report.pdf", end_page=4)
client.save_tarball(result, "./out/doc")
# → markdown + content_list + middle.json + images
How it's built
Architecture, what’s in the repo, supported workloads, and how it compares to Marker, GROBID, and Nougat.
Deploy in 10 minutes
Sign up for RunPod, deploy mineru-runpod from the Hub, paste the endpoint id, parse your first PDF.
Pick the right GPU
When 24 GB is enough, when to jump to 48 GB, and which RunPod pool IDs map to which workload shapes. Includes the official MinerU hardware compatibility table.
When something breaks
Hub build flakes, Blackwell crashes, Latin-on-Cyrillic, OOM, cold starts. How to read the debug block to diagnose anything else.
The source
MIT-licensed. ~30 files. Issues and PRs welcome.
Blog
Project notes and announcements.