python-errors-reliability

Use when designing error handling, retry policies, timeout behavior, or failure classification in Python. Also use when code swallows exceptions, loses error context across boundaries, has unbounded retries, silent failures, or lacks idempotency guarantees on retried writes.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "python-errors-reliability" with this command: npx skills add ahgraber/skills/ahgraber-skills-python-errors-reliability

Python Errors and Reliability

Overview

Design errors so callers can act and operators can diagnose quickly. Treat these recommendations as preferred defaults — when a default conflicts with project constraints, suggest a better-fit alternative and call out tradeoffs and compensating controls.

When to Use

  • Implementing or reviewing timeout, deadline, or retry logic.
  • Translating exceptions across layer boundaries (e.g., infra → domain).
  • Classifying failures as retryable vs. permanent.
  • Adding idempotency guarantees to retried writes.
  • Diagnosing swallowed exceptions or silent failures in existing code.

When NOT to use:

  • Pure data-transformation code with no I/O or failure modes.
  • Simple validation that raises immediately with no translation needed.
  • Performance tuning unrelated to failure handling (see python-concurrency-performance).

Quick Reference

  • Preserve cause chains at translation boundaries.
  • Catch only what can be handled.
  • Keep timeout/deadline policy explicit.
  • Keep retry policy explicit and bounded.
  • Classify retryable vs. permanent failures with explicit policy data.
  • Keep idempotency expectations explicit for retried writes.

Common Mistakes

  • Swallowing exceptions — bare except: pass or except Exception with no logging loses diagnostic context.
  • Unbounded retries — retrying without a maximum count or total deadline leads to cascading failures and resource exhaustion.
  • Dropping the cause chain — raising a new exception without from original discards the root cause operators need for diagnosis.
  • Treating all errors as retryable — retrying a 400 Bad Request or a validation error wastes resources and delays the real fix.
  • Implicit timeouts — relying on library defaults (or no timeout at all) produces unpredictable latency under failure conditions.

Scope Note

  • Treat these recommendations as preferred defaults for common cases, not universal rules.
  • If a default conflicts with project constraints or worsens the outcome, suggest a better-fit alternative and explain why it is better for this case.
  • When deviating, call out tradeoffs and compensating controls (tests, observability, migration, rollback).

Invocation Notice

  • Inform the user when this skill is being invoked by name: python-design-modularity.

References

  • references/error-strategy.md
  • references/retryability-classification.md
  • references/retries-timeouts.md

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

python-design-modularity

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

python-runtime-operations

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

python

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

python-workflow-delivery

No summary provided by upstream source.

Repository SourceNeeds Review