distributed-systems-basics

Distributed-systems workflow for failure-mode analysis, consistency choices, and reliability primitive selection across networked components. Use when correctness depends on partitions, retries, timeouts, ordering, or partial failures; do not use for single-process implementation details only.

Install skill "distributed-systems-basics" with this command: npx skills add kentoshimizu/sw-agent-skills/kentoshimizu-sw-agent-skills-distributed-systems-basics

Distributed Systems Basics

Overview

Use this skill to reason about correctness and reliability in systems where network faults and partial failures are normal.

Scope Boundaries

Use this skill when any of the following applies:
Multi-service workflows require explicit consistency and ordering guarantees.
Retry, timeout, or duplicate-message behavior can change business correctness.
Teams need to define reliability primitives before implementation or rollout.

Shared References

Failure mode and consistency rules:
- references/failure-mode-consistency-rules.md

Templates And Assets

Distributed flow risk template:
- assets/distributed-flow-risk-template.md

Inputs To Gather

Component boundaries and communication patterns.
Consistency and ordering requirements per workflow.
Failure scenarios (partition, timeout, duplicate, out-of-order, stale read).
Recovery and observability capabilities.

Deliverables

Failure-mode map and risk ranking.
Consistency decision record per critical flow.
Reliability mechanism selection (retry, idempotency, backoff, timeout).
Validation plan (fault injection and invariant checks).
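
As one illustration of the retry, backoff, and timeout mechanisms named in the deliverables, here is a minimal Python sketch; it is not part of the upstream skill. The names `call_with_retry` and `RetryBudgetExceeded`, the default limits, and the choice of full jitter are assumptions made for this example. It also assumes the wrapped operation is idempotent, enforces its own per-call timeout, and signals transient failure with `TimeoutError` or `ConnectionError`.

```python
import random
import time


class RetryBudgetExceeded(Exception):
    """Raised when the attempt cap or the overall deadline is exhausted."""


def call_with_retry(op, *, max_attempts=4, base_delay=0.1, max_delay=2.0,
                    deadline=5.0, sleep=time.sleep, clock=time.monotonic,
                    rng=random.random):
    """Call op() until it succeeds, bounded by attempts and a total deadline.

    Assumption: op is idempotent and raises TimeoutError or ConnectionError
    on transient failure. Any other exception propagates immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    start = clock()
    last_error = None
    for attempt in range(max_attempts):
        try:
            return op()
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
        if attempt == max_attempts - 1:
            break  # attempt cap reached
        # Full jitter: wait a random amount up to the capped exponential delay.
        cap = min(max_delay, base_delay * (2 ** attempt))
        delay = rng() * cap
        if clock() - start + delay > deadline:
            break  # the next wait would overrun the overall deadline
        sleep(delay)
    raise RetryBudgetExceeded(
        f"gave up after {attempt + 1} attempt(s)"
    ) from last_error
```

Both bounds matter in this sketch: the attempt cap limits the extra load placed on the callee, and the deadline limits how long the caller keeps waiting between attempts. The `sleep`, `clock`, and `rng` parameters exist only so tests can replace real time and randomness.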

Workflow

Capture critical flows with assets/distributed-flow-risk-template.md.
Map failure assumptions and consistency requirements per flow.
Select reliability primitives using references/failure-mode-consistency-rules.md.
Define observability and recovery behavior.
Validate assumptions with targeted failure tests and invariant checks.
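
The final validation step can be sketched as a small fault-injection harness. This is a hedged example, not the skill's own tooling: `VersionedRegister`, `inject_faults`, and `check_invariant` are hypothetical names, and the simulated faults are limited to duplicate and out-of-order delivery. The invariant checked is that the register always ends at the newest update, whatever the delivery schedule.

```python
import random


class VersionedRegister:
    """Keeps the value with the highest version seen so far."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        # Ignore stale or duplicate updates so delivery order does not matter.
        if version > self.version:
            self.version = version
            self.value = value


def inject_faults(messages, rng, dup_rate=0.3):
    """Return a delivery schedule with random duplicates and reordering."""
    delivered = []
    for msg in messages:
        delivered.append(msg)
        if rng.random() < dup_rate:
            delivered.append(msg)  # simulated redelivery
    rng.shuffle(delivered)  # simulated out-of-order arrival
    return delivered


def check_invariant(trials=200, seed=7):
    """Invariant: the register ends at the newest update under any schedule."""
    rng = random.Random(seed)  # seeded so a failing schedule can be replayed
    updates = [(v, f"value-{v}") for v in range(1, 11)]
    for _ in range(trials):
        reg = VersionedRegister()
        for version, value in inject_faults(updates, rng):
            reg.apply(version, value)
        assert (reg.version, reg.value) == (10, "value-10")
    return trials
```

Message loss is deliberately left out: if the newest update can be dropped, this invariant no longer holds, and the flow needs a different guarantee (for example, redelivery until acknowledged) before the same check applies.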

Quality Standard

Critical flows have explicit consistency and ordering rules.
Retry/timeout semantics are bounded and intentional.
Idempotency strategy exists where at-least-once delivery is possible.
Failure handling is observable and testable.
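
To make the idempotency requirement concrete, the sketch below shows one common strategy under at-least-once delivery: the caller attaches an idempotency key, and the handler stores the result per key so a redelivered request is not applied twice. `PaymentService` and `charge` are hypothetical names, and the in-memory dictionary stands in for what would need to be durable storage updated atomically with the state change.

```python
class PaymentService:
    """Hypothetical handler that applies each keyed request at most once."""

    def __init__(self):
        self.balance = 0
        self._results = {}  # idempotency key -> stored result

    def charge(self, idempotency_key, amount):
        # A redelivered request returns the stored result without reapplying.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.balance += amount
        result = {"status": "applied", "balance": self.balance}
        # Assumption: in a real system this write and the balance update
        # happen in one transaction, or a crash between them breaks dedup.
        self._results[idempotency_key] = result
        return result


service = PaymentService()
first = service.charge("order-42", 25)
retry = service.charge("order-42", 25)  # duplicate delivery of the same request
```

In this example the duplicate call returns the same stored result and `service.balance` stays at 25. How long keys are retained is a separate decision that depends on how late a duplicate can arrive.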

Failure Conditions

Stop when consistency assumptions are implicit or contradictory.
Stop when retries/timeouts can amplify failure unboundedly.
Escalate when critical failure modes have no mitigation path.
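
The stop condition on retry amplification can be checked with simple arithmetic. As a rough sketch, assuming every layer in a call chain retries independently and every attempt fails, the worst-case number of calls reaching the lowest layer is the product of the per-layer attempt counts. The function name `worst_case_calls` is hypothetical.

```python
import math


def worst_case_calls(attempts_per_layer):
    """Worst-case calls at the bottom layer when every layer retries."""
    return math.prod(attempts_per_layer)


# Three layers, each allowing 3 attempts: one user request can become 27 calls.
print(worst_case_calls([3, 3, 3]))  # prints 27
```

Under these assumptions, capping retries at a single layer, or applying a shared retry budget, keeps the product from growing with the depth of the call chain.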
