
Network volumes

A RunPod network volume is a persistent block-storage device that gets mounted into every worker of an endpoint at the same path (typically /runpod-volume). It survives scale-to-zero and is billed per GB-month, not per second like compute.

For mineru-runpod, network volumes solve three concrete problems:

  1. Avoid re-downloading large input corpora on each job
  2. Cache MinerU weights so cold starts skip the HuggingFace fetch
  3. Persist intermediate or final outputs between jobs in a multi-step pipeline

Problem 1: pre-staging large input corpora


If you’re parsing a fixed collection of documents (a regulatory archive, a books library, customer-supplied PDFs you’ve already uploaded), uploading each one over file_url or file_b64 on every job wastes time and money:

  • file_url: the worker downloads on each request — costs cold-start time and RunPod’s egress
  • file_b64: capped at 20 MB total request size
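
Because the 20 MB cap applies to the whole request and base64 inflates data by about a third, the largest raw file that fits inline is smaller than 20 MB. The sketch below estimates whether a file fits. It assumes "20 MB" means 20 MiB and reserves a made-up 1 KiB for the rest of the JSON body; the function names are hypothetical and not part of mineru-runpod.

```python
import math

# Assumption: the 20 MB request cap is 20 MiB; adjust if the platform counts decimal MB.
REQUEST_LIMIT_BYTES = 20 * 1024 * 1024


def base64_len(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def fits_inline(raw_bytes: int, overhead_bytes: int = 1024) -> bool:
    """True if a file of raw_bytes, base64-encoded, stays under the request cap.

    overhead_bytes is an illustrative allowance for the surrounding JSON fields.
    """
    return base64_len(raw_bytes) + overhead_bytes <= REQUEST_LIMIT_BYTES
```

Under these assumptions a 10 MiB PDF fits, while a 15 MiB PDF already exceeds the cap once encoded.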

Mount a network volume once, upload the corpus to /runpod-volume/inputs/, and reference each file via volume_path:

{
  "input": {
    "volume_path": "/runpod-volume/inputs/2024-q3-report.pdf",
    "backend": "vlm-auto-engine"
  }
}

No download. No size limit. Same volume is visible across all warm and cold workers of the endpoint.
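
A minimal sketch of submitting such a job from Python, using only the standard library. It assumes the endpoint is reachable through RunPod's synchronous serverless route (https://api.runpod.ai/v2/<endpoint-id>/runsync) with a bearer API key; the helper names, the timeout, and the environment variable names in the usage comment are illustrative, not part of mineru-runpod.

```python
import json
import urllib.request


def build_job(volume_path: str, backend: str = "vlm-auto-engine") -> dict:
    """Build the request body for a file that already sits on the volume."""
    return {"input": {"volume_path": volume_path, "backend": backend}}


def submit_job(endpoint_id: str, api_key: str, job: dict, timeout: float = 300.0) -> dict:
    """POST the job to the endpoint's runsync route and return the decoded JSON reply.

    Assumption: the endpoint is called through RunPod's standard serverless API.
    """
    request = urllib.request.Request(
        f"https://api.runpod.ai/v2/{endpoint_id}/runsync",
        data=json.dumps(job).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (hypothetical environment variable names):
#   import os
#   job = build_job("/runpod-volume/inputs/2024-q3-report.pdf")
#   reply = submit_job(os.environ["RUNPOD_ENDPOINT_ID"], os.environ["RUNPOD_API_KEY"], job)
```

The request body stays a few hundred bytes regardless of how large the PDF is, since only the path travels over the wire.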

RunPod gives you two ways to get files onto the volume:

  1. CLI: runpodctl create volume then runpodctl push <local-path> <volume-name>:/<path>
  2. Filesystem pod: spin up a cheap CPU pod with the volume mounted, then scp / rsync from your machine. Tear it down when done.

I usually use the filesystem pod approach for anything over a few hundred MB — it’s faster and you can use familiar tools.

Network volume storage on RunPod is roughly $0.07/GB-month (check current pricing, as this number drifts). For a 50 GB corpus that’s ~$3.50/month for storage, versus paying re-download time plus egress for every job that touches it. The volume pays for itself quickly if you parse the same docs more than ~10 times.
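
The arithmetic can be sketched as follows. The $0.07 rate is the approximate figure quoted above; the per-job download cost is a placeholder you would have to measure yourself (worker seconds spent downloading plus any egress fee), and both function names are hypothetical.

```python
import math


def monthly_storage_cost(size_gb: float, usd_per_gb_month: float = 0.07) -> float:
    """Monthly storage rent for a volume of size_gb at the given rate."""
    return size_gb * usd_per_gb_month


def break_even_jobs(size_gb: float, per_job_download_cost_usd: float,
                    usd_per_gb_month: float = 0.07) -> int:
    """Jobs per month at which the storage rent equals the avoided download cost.

    per_job_download_cost_usd is an assumption supplied by the caller.
    """
    rent = monthly_storage_cost(size_gb, usd_per_gb_month)
    return math.ceil(rent / per_job_download_cost_usd)


# Example with a made-up download cost of $0.30 per job for a 50 GB corpus:
print(f"storage: ${monthly_storage_cost(50):.2f}/month")
print(f"break-even: {break_even_jobs(50, 0.30)} jobs/month")
```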

For problem 2 (caching MinerU weights) you don’t need to do anything: both model dependencies (opendatalab/MinerU2.5-Pro-2605-1.2B for the VLM backend and opendatalab/PDF-Extract-Kit-1.0 for the pipeline backend) are baked into the worker image at build time, at /root/.cache/huggingface/. That means no Cached Models setup, no network volume, and no runtime download.

If you’d rather slim the image and rely on RunPod’s Cached Models feature for the VLM only (which has a single-model dashboard limit), fork the Dockerfile to skip the RUN snapshot_download step. The pipeline backend would then need either a Network Volume or HF_HUB_OFFLINE=0 for runtime downloads.

Problem 3: persisting outputs between jobs


If a downstream pipeline (a follow-up job that does embeddings, indexing, classification) needs to read the parse output, you have three options:

  1. return: "s3" + S3 bucket — see Output modes. Best for cross-region or multi-tenant pipelines.
  2. return: "tarball_b64" + caller persists to its own store — fine if the caller already has a place to put things.
  3. Network volume + writes from handler — requires forking the handler to add a write step. Useful for tight self-contained pipelines but adds complexity.

Most users should use option 1. The third option is only worth it when both the producer and consumer of the parse output are RunPod endpoints sharing the same volume.

To create a volume and attach it to your endpoint, in the RunPod dashboard:

  1. Storage → Network Volumes → New — pick a region (must match the GPU pool your endpoint uses) and a size
  2. Serverless → your endpoint → Edit → Storage — attach the volume; default mount path is /runpod-volume

The volume is read-write by default. Every worker in the endpoint sees the same filesystem state — concurrent writes need standard filesystem-level coordination (lock files, atomic renames, etc.).
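
One common form of that coordination is the write-to-temp-then-rename pattern, sketched below for a forked handler that writes outputs to the volume. The function name is hypothetical, and whether a rename is truly atomic depends on the filesystem backing the volume, so treat this as a starting point rather than a guarantee.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(dest: Path, data: bytes) -> None:
    """Write data to dest so readers see either the old file or the complete new one."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, prefix=dest.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to storage before publishing the name
        os.replace(tmp_name, dest)  # swap the finished file into place
    except BaseException:
        # Clean up the partial temp file, then re-raise the original error.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A consumer polling the output directory can then ignore names ending in .tmp and trust that any other file is complete.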

The volume_path field of the job input is just a filesystem path inside the worker container. It works for:

  • Files on a mounted network volume (/runpod-volume/<anything>)
  • Files baked into the Docker image at build time (e.g. /worker/test-fixture.pdf — the path we use in the disabled Hub test)
  • Files written by an earlier handler step (advanced — only if you fork the handler)

The worker validates the path is a real file and raises ValueError: volume_path not found inside container: <path> if not.
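
The check can be pictured roughly as below. This is an illustrative sketch, not the worker's actual code: only the error message is taken from the text above, and the function name is hypothetical.

```python
from pathlib import Path


def check_volume_path(volume_path: str) -> Path:
    """Return the path if it points at an existing regular file, else raise ValueError."""
    path = Path(volume_path)
    # is_file() is False for missing paths and for directories alike.
    if not path.is_file():
        raise ValueError(f"volume_path not found inside container: {volume_path}")
    return path
```

Because a directory fails the same check, volume_path has to name a single document, not a folder of inputs.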

Network volumes are regional — a volume in us-east-1 is invisible to a worker in eu-central-1. When you create the volume, pick the region with the most GPU availability for your gpuIds list. For the template defaults (ADA_24,AMPERE_24,AMPERE_48), any US or EU region works.

If your endpoint has multiple regions enabled and you’re running on a network volume, pin the endpoint to the volume’s region (RunPod dashboard → endpoint → GPU configuration) — otherwise some workers may land in regions without the volume and fail with volume_path not found.

A network volume is not worth it when:

  • You parse each document once and discard it. Just use file_url; there is no point paying storage rent.
  • Your input set is small and changes constantly. Inline base64 (file_b64) is fine for ≤20 MB inputs.
  • You only care about cold-start latency. Both MinerU models are baked into the image; a volume for model weights duplicates that work.