modal-batch-processing

Use this skill for Modal job orchestration with `.map`, `.starmap`, `.spawn`, `.spawn_map`, or `@modal.batched`. Trigger when the user needs to fan out work across Modal containers, collect results in-process, return a pollable job ID from a web server, cap concurrency, recover from partial failures, or batch many small requests into fewer GPU calls. Do not use it for vLLM or SGLang serving, model fine-tuning, or sandbox lifecycle questions.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Installation

Copy the command below into your AI assistant, or run it directly, to install the skill "modal-batch-processing":

npx skills add .

Modal Batch Processing

Quick Start

  1. Verify the actual local Modal environment before writing code.
modal --version
python -c "import modal,sys; print(modal.__version__); print(sys.executable)"
modal profile current
  • Do not assume the default Python interpreter matches the environment behind the modal CLI.
  • Switch to the project virtualenv, or to the interpreter behind the installed modal CLI, before writing examples or running scripts.
  • Wrap `with app.run():` in `with modal.enable_output():` when you need local provisioning logs or remote prints for debugging.
  2. Classify the request before writing code.
  • Caller waits and needs the results back in the same process: use .map or .starmap.
  • Caller should return immediately and poll later: use .spawn.
  • Detached fan-out writes results somewhere durable: use .spawn_map.
  • Many small homogeneous requests should share one execution: use @modal.batched.
  3. Read exactly one primary reference before drafting code.

Choose the Workflow

  • Use .map or .starmap when the caller can wait for the full fan-out to finish and the results must come back to the same local process. Read references/map-and-gather.md.
  • Use .spawn when the caller should return immediately and keep a stable FunctionCall handle or job ID for later polling and collection. Deploy the function first if another service submits the work. Read references/job-queues-and-detached-runs.md.
  • Use .spawn_map only when each detached task writes its own durable output to a Volume, CloudBucketMount, database, or another external sink. Do not choose it when the caller expects programmatic result retrieval later. Read references/job-queues-and-detached-runs.md.
  • Use @modal.batched when many individual requests can be coalesced into fewer container or GPU executions. Keep the function contract list-in and list-out. Read references/dynamic-batching.md.

Default Rules

  • Start with plain @app.function functions for stateless work. Move to @app.cls only when the container must reuse loaded state or expensive initialization.
  • Keep orchestration local with @app.local_entrypoint or a plain Python script plus with app.run(): when the entire workflow can stay within one session.
  • Deploy with modal deploy and use modal.Function.from_name(...) when another service must submit jobs or look up a stable remote function later.
  • Set timeout= intentionally on remote work. Add retries= only when the work is idempotent and safe to re-run.
  • Set max_containers= when upstream systems, GPU quotas, or external APIs need a hard concurrency cap.
  • Persist outputs externally whenever detached work may outlive the caller or when using .spawn_map.
  • Use Volumes or CloudBucketMounts for durable caches, model weights, and shared intermediates; do not rely on ephemeral container disk.
  • Prefer .map or .starmap over .spawn when the caller genuinely needs results immediately and no durable job handle is required.
  • Prefer .spawn over .map when the caller needs a stable job ID or should return before the remote work finishes.
  • Treat .spawn_map() as detached fire-and-forget in Modal 1.3.4. The installed SDK docstring says programmatic result retrieval is not supported, so only use it when each task writes its output elsewhere.
  • If the task is really about OpenAI-compatible vLLM or SGLang serving, stop and use modal-llm-serving.
  • If the task is really about training model weights, stop and use modal-finetuning.
  • If the task is really about isolated interactive execution, tunnels, or sandbox restore flows, stop and use modal-sandbox.

Validate

  • Run npx skills add . --list after editing the package metadata or skill descriptions.
  • Keep evals/evals.json and evals/trigger-evals.json aligned with the actual workflow boundary of the skill.
  • Run scripts/smoke_test.py with a Python interpreter that can import modal when changing the workflow guidance or runnable artifact.

References

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

modal-llm-serving

No summary provided by upstream source.

Repository Source · Needs Review
General

modal-finetuning

No summary provided by upstream source.

Repository Source · Needs Review
General

modal-sandbox

No summary provided by upstream source.

Repository Source · Needs Review
General

51mee Resume Parse

Resume parsing. Trigger scenario: the user uploads a resume file and asks to parse it and extract structured information.

Registry Source · Recently Updated