Troubleshooting

When something doesn’t work, the worker tries hard to fail loudly. Every response (success and failure) includes a debug block with what backend ran, which model loaded, what GPU the worker landed on, and per-phase timings. Start there.

How to read the `debug` block

{
  "debug": {
    "backend": "vlm-auto-engine",
    "input_format": "pdf",
    "model_dir": "/root/.cache/huggingface/hub/models--opendatalab--MinerU2.5-Pro-2605-1.2B/snapshots/<hash>",
    "gpu": {
      "available": true,
      "name": "NVIDIA RTX 4090",
      "compute_capability": "8.9",
      "total_memory_gb": 23.99
    },
    "phase_ms": {
      "fetch_input": 12,
      "mineru_parse": 18420,
      "package": 95
    }
  }
}

What to look at:

Field	What’s wrong if it’s surprising
`backend`	The string passed to MinerU. If you set `pipeline` but see `vlm-auto-engine`, your caller isn’t sending what you think it is
`input_format`	Auto-detected from bytes. If you uploaded a PDF and see `unknown`, your transport returned an error page (HTML), not the file body
`model_dir`	Filesystem path of the snapshot that actually loaded. Both VLM and pipeline models are baked into the image at `/root/.cache/huggingface/`; if this is `null` on a successful job, the image build skipped the model bake step (unusual — file a bug)
`gpu.compute_capability`	8.6 = Ampere (3090, A5000, A6000), 8.9 = Ada (4090, RTX 6000 Ada), 9.0 = Hopper (H100), 12.0 = Blackwell. On 12.0 the VLM backend will crash (see the Blackwell section below)
`phase_ms.fetch_input`	If hundreds of seconds on a `file_url` job, the source URL is slow / failing
`phase_ms.mineru_parse`	Per-page guidance for warm workers, highly GPU- and content-dependent: MinerU upstream cites ~0.5 s/page (2.12 fps) on an A100 for the VLM backend; on an A5000 24 GB under the default `gpu_memory_utilization=0.5` we measured from ~1 s/page on uniform multi-page reports up to ~10 s/page on dense financial forms (≈3.5 s/page is a reasonable single-number estimate). Pipeline backend is ~3–5 s/page across GPUs (CPU-bound for layout, GPU-bound only for OCR). First call on a fresh worker is much higher (model load + vLLM warmup adds ~90–130 s for the VLM backend). If a warm-worker call takes 5× the expected per-page number for your GPU and content type, the worker is most likely memory-bound and vLLM is swapping
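
The checks in the table above can be automated on the caller side. The following is a minimal sketch, not part of the template: `check_debug` is a hypothetical helper name, and it only encodes the rules stated in the table (backend mismatch, `unknown` input format, `null` model_dir, compute capability 12.0 with a non-pipeline backend).

```python
from typing import Any


def check_debug(debug: dict[str, Any], expected_backend: str) -> list[str]:
    """Return human-readable warnings for surprising values in a debug block."""
    warnings = []
    backend = debug.get("backend")
    if backend != expected_backend:
        warnings.append(f"backend is {backend!r}, expected {expected_backend!r}")
    if debug.get("input_format") == "unknown":
        warnings.append("input_format is 'unknown': fetched bytes may be an HTML error page")
    if debug.get("model_dir") is None:
        warnings.append("model_dir is null: the image may be missing the baked models")
    gpu = debug.get("gpu") or {}
    # compute_capability arrives as a string such as "8.9".
    cc = float(gpu.get("compute_capability") or 0)
    # Per the table, only the pipeline backend is safe on Blackwell (12.0).
    if cc >= 12.0 and backend != "pipeline":
        warnings.append(f"compute capability {cc} is Blackwell: VLM backends crash here")
    return warnings
```

Pass it `response["debug"]` and the backend you believe you requested; an empty list means nothing in the block looks surprising.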

The worker also emits structured log lines visible in RunPod’s worker log viewer — see reading worker logs below.

Hub build fails on the validator test pod

After every push, RunPod’s Hub builds the image and then spins up a real GPU pod to execute .runpod/tests.json. Most failures at this stage come from the test pod or the build window rather than from a broken image. Three failure modes account for almost everything we’ve seen here:

“Pod could not be created”

Pod could not be created: This machine does not have the resources to deploy your pod. Please try a different machine.

Cause: RunPod can’t allocate the gpuTypeId declared in .runpod/tests.json during the build window. The Docker image is fine — RunPod just couldn’t find a free host of that type.

Fix: switch gpuTypeId to a higher-availability pool. The template currently uses "NVIDIA GeForce RTX 4090" because it has the best pool availability across RunPod’s regions; "NVIDIA RTX A5000" works too but tends to be scarcer. Re-trigger the build after editing.

`nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.9`

Error response from daemon: failed to create task for container: ...
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.9,
please update your driver to a newer version, or use an earlier cuda container

Cause: the container’s CUDA floor (12.9, inherited from vllm/vllm-openai:v0.11.2) is higher than the CUDA version the host driver exposes. RunPod scheduled the test pod on a host that satisfied allowedCudaVersions on paper but doesn’t actually meet the container’s prestart-hook requirement.

The trap: allowedCudaVersions tells RunPod “the worker accepts these driver CUDA versions.” If older versions are listed there, RunPod is free to schedule on older-driver hosts, and the container’s own requirement labels then reject the host at prestart. Result: intermittent failures (depends which host got picked).

Fix: keep allowedCudaVersions in both .runpod/tests.json and .runpod/hub.json aligned with the actual minimum the container needs. For the current vLLM v0.11.x base, that’s ["13.0", "12.9"]. Don’t pad the list with older versions just because they look harmless — every entry that the container can’t actually run on is a future flake.

If you bump vLLM, re-check the CUDA floor from upstream’s release notes (vLLM v0.11.0 was the bump to CUDA 13).
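
To catch padded lists before they turn into flakes, a small check like the one below can run in CI. This is a sketch under assumptions: `below_floor` is a hypothetical helper, the 12.9 floor is the one quoted in this section, and loading the list out of .runpod/tests.json and .runpod/hub.json is left to you because the exact JSON layout is not shown here.

```python
def below_floor(allowed: list[str], floor: str = "12.9") -> list[str]:
    """Return the allowedCudaVersions entries that are older than the container floor."""

    def key(version: str) -> tuple[int, ...]:
        # "12.9" -> (12, 9), so versions compare numerically rather than as text.
        return tuple(int(part) for part in version.split("."))

    return [v for v in allowed if key(v) < key(floor)]
```

An empty result means every listed version can actually run the container; any returned entry is a candidate for removal.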

Build timeout (30 minutes)

Build exceeded maximum time limit of 1800 seconds (30.0 minutes). Build terminated.

Cause: RunPod’s build pipeline has a hard 30-minute ceiling. The image bakes ~4 GB of model weights (MinerU VLM + PDF-Extract-Kit) and installs vLLM + Torch on top; on a slow build-region day, those steps alone can blow past the cap.

Fix: the Dockerfile already uses hf-xet with HF_XET_HIGH_PERFORMANCE=1 for fast model bakes, and the two model downloads are split into separate RUN layers so a partial cache survives retries. If you still time out:

Re-trigger the build (often a transient HF-egress slowdown)
Pinning a smaller VLM model via MINERU_VL_MODEL_NAME (an option for http-client backends) does not help here, since the bake is unconditional
As a last resort, drop one of the baked models and rely on RunPod’s per-endpoint model cache (one model only — see the model caching docs)

Escape hatch: skip the validator entirely

If a release is urgent and the Hub validator is the only thing blocking it, rename .runpod/tests.json to .runpod/tests_.json in the repo (underscore suffix). The Hub validator looks for the exact filename tests.json; the renamed file is invisible to it and no test pod is scheduled. Rename it back when the underlying issue is resolved.

This loses all CI signal — only use it as a temporary unblock, not a default.

VLM backend crashes on Blackwell GPUs (cc=12.0)

Symptom: worker logs show successful model load followed by:

compute_capability: 12.0 >= 8.0
INFO Starting to load model .../MinerU2.5-Pro-2605-1.2B/...
INFO Model loading took 2.16 GiB and 0.36 seconds
CUDA error (...flash-attention/hopper/flash_fwd_launch_template.h:188): invalid argument

debug.gpu.compute_capability in the response is 12.0 (e.g. NVIDIA RTX PRO 6000 Blackwell Server Edition MIG 1g.24gb).

Cause: xformers / flash-attention in vllm 0.11.2 (our base image) ships kernels for Ampere (8.0/8.6), Ada (8.9), and Hopper (9.0), with no Blackwell SM120 code path. On Blackwell, xformers misroutes to the Hopper kernel and crashes during VLM model init.

Why we can’t just bump vllm: MinerU 3.2.x’s pyproject pins vllm>=0.10.1.1,<0.12. The first vllm release with any Blackwell mention in its notes is v0.13.0 (2025-12-19, SM103 / GB300 “Blackwell Ultra”); broader Blackwell coverage follows in v0.14+. All Blackwell-aware vllm versions sit above MinerU’s <0.12 ceiling, so until MinerU loosens that pin we’re stuck on v0.11.x, which has no SM120 kernel path. The v0.6.6 → v0.11.2 bump did not change this because v0.11.2 only adds SM100 (data-center Blackwell B200/GB200) MoE-prep code, not the SM120 (consumer 5090 / PRO 6000 Blackwell) flash-attn paths the VLM uses.

Fix: keep the default gpuIds: "ADA_24,AMPERE_24,AMPERE_48" — all unambiguously pre-Blackwell. If you’ve manually opted into ADA_48_PRO, that pool can mix in Blackwell SKUs since RunPod groups them under the same pool name — remove it from your endpoint’s GPU pool list. For workloads that need a 48 GB Ada-or-newer card, the pipeline backend doesn’t use xformers/flash-attn and is unaffected, so it runs on Blackwell fine; only the VLM/hybrid backends crash. See Choosing a GPU for the GPU-pool background.

“Pod scaled to zero” but the next job has a noticeable cold start

Symptom: spiky workload. First job after a quiet period takes a long time; subsequent jobs in the same window are fast.

Cause: RunPod tears down the worker after idle_timeout seconds of inactivity. The next request spins a fresh worker — the cost is unpacking the image, loading the model into VRAM, and (for the VLM backend) JIT-compiling vLLM kernels.

Expected magnitudes (measured on an RTX A5000 24 GB unless the row says otherwise):

Scenario	Latency
Warm worker, VLM, on A100 (per MinerU upstream)	~0.5 s/page (2.12 fps)
Warm worker, VLM, on A5000 24 GB (our measurement)	~1 s/page (uniform reports) – ~10 s/page (dense forms)
Warm worker, pipeline (any GPU ≥ 4 GB)	~3–5 s/page
FlashBoot happy path — host reuse, snapshot restored	~7–8 s wall-clock — model + engine restored from snapshot
FlashBoot cold path — new host, image cached	~110 s — fresh boot, warmup runs (1× per host)
Cold worker, no warmup (`MINERU_SKIP_WARMUP=1`)	~110–130 s per request after every scale-from-zero (no per-host amortization)
Cold worker, pipeline backend, no warmup	~10–15 s per request (lighter; no vLLM warmup)
Brand-new worker host (no image cached)	+5–7 min for the initial image pull, on top of whichever path above applies

Per-phase cold-start breakdown (VLM, A5000 24 GB)

If you’re tracking why a cold start takes ~110 s, here’s the live measurement we captured against the deployed template. Times are wall-clock between consecutive log entries from RunPod’s worker log viewer, totals approximate ±2 s:

Phase	Time	Source log line
Worker boot + 7 fitness checks (CUDA, GPU, network, disk, memory)	~3 s	`--- Starting Serverless Worker ---` → `All fitness checks passed.`
Queue dispatch + RunPod SDK ready	~5 s	`All fitness checks passed.` → `Started.`
MinerU lazy import → `Using vllm-async-engine` selection	<1 s	`Started.` → `mineru.utils.engine_utils:get_vlm_engine — Using vllm-async-engine`
vLLM engine config + model path resolve (`HF_HUB_OFFLINE` lookup, arch detection)	~19 s	→ `arg_utils.py:592 HF_HUB_OFFLINE is True`
Model weight load (1 safetensors shard, 2.16 GiB → VRAM)	~21 s	`gpu_model_runner.py:3338 Model loading took 2.1601 GiB memory and 21.4 seconds`
`torch.compile` (Dynamo + Inductor, dynamic shape)	~25 s	`monitor.py:34 torch.compile takes 25.54 s in total`
KV cache profile + budget allocation	~2 s	`gpu_worker.py:359 Available KV cache memory: 8.17 GiB`
CUDA graph capture (35 mixed prefill-decode + 19 decode-FULL)	~3 s	`gpu_model_runner.py:4244 Graph capturing finished in 2 secs, took 0.27 GiB`
vLLM engine init total (profiling, KV cache creation, and warmup; includes the `torch.compile`, KV cache, and CUDA graph rows above)	34.22 s	`core.py:250 init engine (profile, create kv cache, warmup model) took 34.22 seconds`
MinerU’s wrapper-level total (includes vLLM init plus its own setup)	100.63 s	`mineru.backend.vlm.vlm_analyze:get_model — get vllm-async-engine predictor cost: 100.63s`
Actual page parse (single page)	~6 s	`VLM processing window 1/1` → response delivered
End-to-end cold start (queue → response)	~108 s	`Jobs in queue: 1` → response
Subsequent warm-worker parse, same page count	~6 s	mineru_parse phase_ms

Headline observations from this run:

vLLM engine init is the largest single block (34 s of the ~100 s MinerU wrapper time). Of that, torch.compile is ~25 s, the single biggest cost.
Model weight load is only 21 s despite being 2.16 GiB. The image bakes the model into /root/.cache/huggingface/, so this is a local FS read, not a network download.
Available KV cache memory: 8.17 GiB on a 24 GB A5000. That’s vLLM’s KV budget after model + activations + reserve. Constraining factor for MINERU_MAX_CONCURRENCY if you try to raise it above 1.
Maximum concurrency for 8,192 tokens per request: 87.13x — vLLM’s in-engine batch ceiling on this hardware. Different from our per-worker concurrency knob; this is sequences per single vLLM forward pass.

Fix: not a bug, but levers if it’s a problem:

Bump idle_timeout (template default is 10 s) — workers stay warm longer, you pay for that time
Set workers_min=1 — at least one worker is always warm, you pay 24/7 for it
Enable RunPod’s FlashBoot explicitly in the endpoint config (it’s on by default for templates from the Hub)
Eager warmup is now active by default. The worker runs one throwaway parse against the baked test fixture during boot, before runpod.serverless.start() claims the event loop. This loads the MinerU model into VRAM and JIT-compiles vLLM kernels, so the first real request lands on a warm engine. Look for [mineru-warmup] starting (backend=... lang=... fixture=/worker/test-fixture.pdf) then [mineru-warmup] done in Ns in the worker logs. To disable (e.g., for debugging cold-start ordering), set MINERU_SKIP_WARMUP=1 on the endpoint.

Tune via env vars on the endpoint:
- MINERU_WARMUP_BACKEND (default vlm-auto-engine) — which backend to warm. Must match the backend most callers will use; warming vlm-auto-engine but serving pipeline requests means the first pipeline call still pays cold-start.
- MINERU_WARMUP_LANG (default en) — only meaningful for the pipeline backend; VLM ignores it.
- MINERU_SKIP_WARMUP=1 — bypass entirely (worker falls back to lazy load on first request, ~100s tax).
Expected post-warmup cold-start latency for the first request: ~7–8 s wall-clock on A5000 (measured 2026-05-26 against the deployed template). Breakdown: ~3 s of FlashBoot snapshot restore + ~5 s of parse on a fully-warm engine. FlashBoot empirically captures Python process memory + CUDA VRAM + the vLLM engine subprocess — see FlashBoot mechanism below for the full analysis.

FlashBoot mechanism (confirmed)

We confirmed on 2026-05-26 that FlashBoot is process-snapshot based (CRIU or functional equivalent), and that snapshots are scoped per (host, image-SHA) — not per endpoint. Each worker host maintains its own snapshot store. When RunPod’s scheduler picks a host that has run this image before, you get a fast restore. When it picks a new host, the worker re-runs warmup once.

The four-request investigation that pinned this down. Same short single-page PDF, same parameters every time, worker scaled to zero between every request:

#	Wall-clock	Host	Snapshot?	What the worker log showed
1	456 s	A (post-rebuild, fresh image pull)	none	Full cold path: image pull → fitness checks → `[mineru-warmup] done in 101.0s` → parse 5.6 s
2	7.6 s	A (same as R1)	yes (post-R1)	Zero boot logs. Went straight from `Jobs in queue: 1` to `"starting job"`. No `[mineru-warmup]` line.
3	122 s	B (different host)	none	Image cached on B, but fresh process: fitness checks + `[mineru-warmup] done in 101.5s` + parse 5.6 s
4	7.4 s	B (same as R3)	yes (post-R3)	Same pattern as R2 — snapshot restore, no boot logs

Worker identity is visible in the logs three ways: the EngineCore_DP0 pid=NNN line (different per container), the distributed_init_method=tcp://192.168.X.X pod-internal IP, and the request-id -u1 / -u2 suffix (RunPod’s region/partition identifier). All three agreed: R1+R2 ran on one pod; R3+R4 ran on a second, different pod.

The per-host model:

FlashBoot lookup = (worker host, image SHA)
- match → restore snapshot in ~3 s, parse in ~5 s → ~7-8 s wall-clock
- no match → fresh boot, run fitness checks + warmup → ~110 s wall-clock
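
The per-host model can be expressed as a toy simulation. This is only an illustration of the lookup rule described above, not RunPod’s implementation; `simulate` is a hypothetical name and snapshot eviction is ignored.

```python
def simulate(requests: list[tuple[str, str]]) -> list[str]:
    """Classify scale-from-zero requests as 'cold' or 'restore'.

    Each request is a (host, image_sha) pair chosen by the scheduler.
    """
    snapshots: set[tuple[str, str]] = set()
    outcomes = []
    for key in requests:
        if key in snapshots:
            outcomes.append("restore")  # snapshot hit: ~7-8 s wall-clock
        else:
            outcomes.append("cold")  # fresh boot + warmup: ~110 s wall-clock
            snapshots.add(key)  # the host keeps a snapshot after the first boot
    return outcomes
```

Feeding it the four-request sequence (host A twice, then host B twice, same image) yields cold, restore, cold, restore, which matches the observed pattern. Changing the image SHA on a known host produces a cold start again.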

What gets preserved on a successful restore: Python interpreter state, MinerU’s in-memory engine handle, vLLM’s AsyncLLMEngine subprocess (PID persists), CUDA VRAM (model weights + KV cache + CUDA graphs), torch.compile cache, and the boot-time signal handlers.

Practical implications:

The boot-time warmup pays off per host that the worker visits, not once per endpoint or once forever. Each new host pays the warmup tax once; every subsequent restore on that same host is fast.
Snapshot invalidation: the obvious triggers are image rebuild (new SHA), MINERU_SKIP_WARMUP=1, and presumably eventual eviction after long idle. RunPod doesn’t document the eviction policy.
Either way, the per-request cold tax is gone — even the slow case (~110 s) is the worker boot paying it once, not every request paying it.

What controls which path you’ll see:

Scenario	Likely outcome
`workers_min ≥ 1`	Worker stays on its host — every request is on a fully warm worker (~5 s parse, no cold start at all)
High-frequency endpoint, workers scale up and down fast	Same hosts get re-selected — most cold starts are happy-path restores (~7 s)
Quiet endpoint, infrequent requests, long idle gaps	RunPod’s scheduler may pick a different host — some cold starts will be on new hosts (~110 s)
First request after a rebuild	Always cold path — every endpoint’s first request after a fresh image pays ~5-7 min (image pull) + ~110 s (warmup). One-time cost per worker host.
`MINERU_SKIP_WARMUP=1`	Every cold start is ~110-130 s; no per-host amortization. Don’t do this in production.

CUDA out of memory

Symptom: handler errors with CUDA out of memory mid-parse. debug.gpu.total_memory_gb shows your card has fewer GB than the workload needs.

Cause: vLLM sizes its memory budget upfront from gpu_memory_utilization (default 0.5). On a 24 GB card this targets ~12 GB, which has to hold the model weights (~2.2 GB), activations (~0.75 GB), and the KV cache. If concurrency or document size pushes KV usage past what is left of that budget, you OOM.
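
The budget arithmetic can be sketched as follows. It is a rough estimate that uses only the figures quoted in this section; `kv_budget_gb` is a hypothetical helper, and real headroom will be somewhat lower because CUDA graphs and allocator overhead also come out of the same budget (the cold-start run above reported 8.17 GiB available).

```python
def kv_budget_gb(
    total_gb: float,
    utilization: float = 0.5,
    weights_gb: float = 2.2,
    activations_gb: float = 0.75,
) -> float:
    """Rough KV-cache headroom left inside vLLM's memory budget, in GB."""
    return total_gb * utilization - weights_gb - activations_gb


print(round(kv_budget_gb(24.0), 2))  # 9.05
print(round(kv_budget_gb(48.0), 2))  # 21.05
```

The jump from roughly 9 GB to roughly 21 GB is why moving to a 48 GB pool is the first lever to try.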

Fix:

Bump to a 48 GB pool for the affected workload (AMPERE_48)
Switch to the pipeline backend which doesn’t use vLLM and is documented at 4 GB minimum VRAM (per MinerU’s hardware compatibility table), regardless of doc length
Reduce concurrency (concurrencyModifier on the endpoint) so fewer pages are in flight at once
For one-off huge docs, set backend: "pipeline" per-job — same worker can handle small docs with VLM and giant docs with pipeline

See Choosing a GPU for the VRAM math.

`unexpected handler return type: <class 'NoneType'>` after a successful parse

Symptom: the client raises MineruClientError: unexpected handler return type: <class 'NoneType'> (or run_sync returns None directly). Worker logs show the handler completed successfully — [mineru-worker] done: elapsed=Xs phase_ms={...} — immediately followed by:

"Failed to return job results. | 400, message='Bad Request',
 url='https://api.runpod.ai/v2/<endpoint>/job-done/<worker>/<request>?gpu=...&isStream=false'"

Cause: RunPod’s /runsync gateway caps the response payload at ~20 MB. The worker built a valid result; when it tried to POST it back via /job-done, the gateway returned HTTP 400 and discarded it. The SDK then sees no output → None → our client raises this error.

Triggers (measured on a real 82-page PDF):

transport: "inline" — markdown + content_list + middle.json + base64 images add up fast; ~80 pages with embedded images was enough to exceed the cap. If you only need the markdown, narrow with formats: ["markdown"] first.
transport: "tarball_b64": gzip compresses the JSON, but the images inside the tarball are already compressed image bytes that gzip cannot shrink much further, so the payload often doesn’t fit either. Confirmed same failure on the same doc.

Fix: use transport: "s3" for large outputs. The worker uploads the .tar.gz to an S3-compatible bucket and returns only a small presigned URL (~1 h TTL) — no gateway cap in the path.

{
  "input": {
    "file_url": "https://example.com/big.pdf",
    "transport": "s3"
  }
}

Configure the bucket via these env vars on the endpoint (not the template):

Env var	Cloudflare R2 example
`BUCKET_ENDPOINT_URL`	`https://<account-id>.r2.cloudflarestorage.com`
`BUCKET_NAME`	your bucket name
`BUCKET_ACCESS_KEY_ID`	R2 API token access key
`BUCKET_SECRET_ACCESS_KEY`	R2 API token secret
`BUCKET_REGION` (optional)	`auto` for R2

The Python client handles the rest via client.save_s3_tarball(result, dest_dir) — it follows result["results"][0]["tarball_url"], downloads the .tar.gz, and extracts. From curl, the entry inside results[] includes tarball_url, tarball_url_expires_in, and bucket_key; download within the TTL.
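
If you are not using the Python client, the download-and-extract step can be approximated with the standard library. This is a sketch, not the client’s actual `save_s3_tarball` implementation; `extract_tarball` is a hypothetical helper, and the path check is an extra precaution added here.

```python
import io
import tarfile
from pathlib import Path


def extract_tarball(data: bytes, dest_dir: str) -> list[str]:
    """Extract an in-memory .tar.gz into dest_dir and return the member names."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        members = tar.getmembers()
        for member in members:
            # Refuse entries that would be written outside dest_dir.
            target = (dest / member.name).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in tarball: {member.name}")
        tar.extractall(dest)
    return [member.name for member in members]
```

To use it, fetch the bytes within the TTL, for example with `urllib.request.urlopen(result["results"][0]["tarball_url"]).read()`, and pass them in together with a destination directory.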

If you can’t wire S3, the fallback is page chunking: split with start_page / end_page into segments small enough that each tarball fits under the cap, then concatenate the .md files client-side. Slower (two cold starts if workers go cold between calls) but no infra changes.
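
The chunk boundaries for that fallback can be computed up front. A minimal sketch: `page_chunks` is a hypothetical helper, and it assumes start_page / end_page are zero-based and inclusive, which you should confirm against the API reference before relying on it.

```python
def page_chunks(total_pages: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split a document into (start_page, end_page) ranges.

    Assumes zero-based, inclusive page indices.
    """
    if total_pages < 0 or chunk_size <= 0:
        raise ValueError("total_pages must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size, total_pages) - 1)
        for start in range(0, total_pages, chunk_size)
    ]
```

For the 82-page document mentioned above, `page_chunks(82, 30)` gives three jobs covering pages 0 to 29, 30 to 59, and 60 to 81. Pick the chunk size so each segment’s output stays well under the gateway cap.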

Worker returns `ValueError: input bytes do not match any supported format`

Symptom: the response has ok: false and the above error message.

Cause: the worker’s _detect_format checked the first ~16 bytes against known magic numbers (%PDF, \x89PNG, PK\x03\x04 for OOXML, etc.) and didn’t match anything.

Most common reasons:

file_url returned an HTML error page instead of the file. The URL is wrong, expired, or behind auth. The response body starts with <!DOCT or <html.
file_b64 was double-encoded or not base64. The decoded bytes are random.
volume_path points at a file that exists but isn’t a supported format (e.g. a .csv or .txt). MinerU doesn’t accept plain text — convert to PDF first.

Fix: verify the bytes. Download from your file_url directly with curl, run file on the result, or check that base64 -d < input.b64 | xxd | head shows the right magic bytes.
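
The same check can be done in Python before submitting a job. This is an illustrative sketch of magic-byte sniffing, not the worker’s actual `_detect_format`; `sniff` is a hypothetical name and it only covers the signatures named in this section plus the HTML error-page case.

```python
def sniff(data: bytes) -> str:
    """Guess what a payload is from its first bytes."""
    head = data[:16]
    if head.startswith(b"%PDF"):
        return "pdf"
    if head.startswith(b"\x89PNG"):
        return "png"
    if head.startswith(b"PK\x03\x04"):
        return "zip/ooxml"
    # An HTML body usually means the URL returned an error or login page.
    if head.lstrip().lower().startswith((b"<!doct", b"<html")):
        return "html"
    return "unknown"
```

Run it on the bytes you downloaded from your file_url, or on the result of `base64.b64decode(...)` for file_b64 input. A result of `html` or `unknown` means the worker would reject the payload too.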

Job times out before the parse finishes

Symptom: MineruClientError: endpoint transport failed: timeout after some number of seconds (default 900 s in the client, configurable via timeout=).

Cause: large documents take longer than your client-side timeout, or longer than the endpoint’s executionTimeoutMs.

Fix:

Client-side: pass a larger timeout to parse_document(timeout=3600, ...)
Endpoint-side: bump --execution-timeout in deploy.py (defaults to 900 s). Use --execution-timeout 3600 for full books.
Per-page math (warm worker, GPU- and content-dependent): MinerU upstream cites ~0.5 s/page for the VLM backend on an A100. We measured ~1–10 s/page on an A5000 24 GB depending on content density (uniform multi-page reports run fast; dense financial forms run slow). Pipeline ≈ 3–5 s/page across GPUs. A 1000-page book on pipeline = ~3000–5000 s; on VLM-on-A5000 ≈ ~1000–10000 s depending on content; on VLM-on-A100 ≈ ~500 s. Add 90–130 s for the first call on a cold worker if VLM (model load + vLLM warmup) — the cold-start tax is paid once per worker, not per page.
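
The per-page math translates into a timeout estimate along these lines. A sketch only: `suggest_timeout_s` is a hypothetical helper, the 130 s cold-start default is the upper end of the range quoted above, and the 1.5× safety margin is an arbitrary assumption you should tune.

```python
import math


def suggest_timeout_s(
    pages: int,
    seconds_per_page: float,
    cold_start_s: float = 130.0,
    margin: float = 1.5,
) -> int:
    """Estimate a timeout: per-page cost plus one cold start, times a safety margin."""
    return math.ceil((pages * seconds_per_page + cold_start_s) * margin)
```

For a 1000-page book on the pipeline backend at 5 s/page, `suggest_timeout_s(1000, 5.0)` returns 7695, so both the client timeout and the endpoint’s execution timeout would need to exceed the 3600 s suggested for typical books.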

Reading worker logs

The worker emits one JSON object per line by default (set LOG_FORMAT=text for human-readable output during local development). RunPod’s log viewer shows them as-is. To filter in CloudWatch, Loki, Axiom, or any other JSON log sink, key off the level, msg, and any of the structured fields.

Typical lines on a successful job:

{"ts":"2026-05-25T18:30:42.103Z","level":"info","logger":"mineru-worker","msg":"starting job","job_id":"queued-uuid-abc","backend":"vlm-auto-engine","lang":"en","start_page":0,"end_page":4,"gpu_name":"NVIDIA RTX 4090","compute_capability":"8.9"}
{"ts":"2026-05-25T18:30:48.612Z","level":"info","logger":"mineru-worker","msg":"done","job_id":"queued-uuid-abc","elapsed_seconds":6.51,"phase_ms":{"fetch_input":12,"mineru_parse":6420,"package":79},"model_dir":"/root/.cache/huggingface/hub/.../snapshots/<hash>","refresh_worker":false}

On failure:

{"ts":"2026-05-25T18:30:42.789Z","level":"error","logger":"mineru-worker","msg":"job failed","job_id":"queued-uuid-abc","error_type":"ValueError","error_message":"input bytes do not match any supported format","phase_ms":{"fetch_input":8}}

Key fields:

Field	Meaning
`level`	`debug`, `info`, `warning`, `error`
`logger`	Always `mineru-worker` for handler emissions
`msg`	Stable identifier for the event — safe to alert on
`job_id`	RunPod’s job UUID. Use this to correlate all lines from one request, especially when a worker handles multiple jobs in sequence. `<unknown>` when a sync caller submitted without a queued ID
`phase_ms`	Per-phase timings (`fetch_input`, `mineru_parse`, `package`)
`backend`, `lang`, `start_page`, `end_page`	Echoed from the job input — handy for correlating with the request
`refresh_worker`	`true` if the worker is recycling after this job (see the scaling guide)

Cancellation and worker recycling

Symptom: a job appears in RunPod’s queue, then disappears from the worker logs mid-parse with no done line.

Cause: RunPod sent SIGTERM to the worker — either because the endpoint scaled down due to idle timeout, you triggered a recycle from the dashboard, or the worker requested a refresh via the refresh_worker response flag.

When SIGTERM arrives, the worker logs:

{"ts":"...","level":"warning","logger":"mineru-worker","msg":"sigterm received, draining current job"}

What gets honored:

Cancellations between request phases (fetch_input → parse → package). The next phase check raises RuntimeError: worker shutting down and the job returns ok: false with that error.
The graceful drain handled by RunPod’s SDK on currently-in-flight jobs.

What does NOT get honored:

Cancellation mid-aio_do_parse. The vLLM forward pass is a blocking GPU call from asyncio’s point of view; interrupting it would corrupt the engine state. The worker finishes the current document even after SIGTERM, then exits cleanly.

If you need hard cancellation guarantees mid-parse, that’s an upstream MinerU feature request, not a template-level fix.

Getting more help

If your symptom isn’t here:

Pull the worker logs from RunPod’s dashboard — look for [mineru-worker] lines and any tracebacks
Check the debug block in the response
Open a bug report. The issue template asks for the response and the GPU pool, which together are enough to diagnose about 90% of issues at a glance
Parsing accuracy issues (output is structurally fine but wrong content) belong upstream at opendatalab/MinerU — they’re MinerU’s responsibility, not this template’s