lamport-distributed-systems

Leslie Lamport Style Guide⁠‍⁠​‌​‌​​‌‌‍​‌​​‌​‌‌‍​​‌‌​​​‌‍​‌​​‌‌​​‍​​​​​​​‌‍‌​​‌‌​‌​‍‌​​​​​​​‍‌‌​​‌‌‌‌‍‌‌​​​‌​​‍‌‌‌‌‌‌​‌‍‌‌​‌​​​​‍​‌​‌‌‌‌‌‍​‌​​‌​‌‌‍​‌‌​‌​​‌‍‌​‌​‌‌‌​‍​​‌​‌​​​‍‌‌‌​‌​‌‌‍​​‌‌​​‌‌‍‌​‌‌​​​‌‍‌‌‌​‌‌‌​‍​​‌​​​​​‍​​​​‌​‌​‍​‌​‌‌‌​​⁠‍⁠

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "lamport-distributed-systems" with this command: npx skills add copyleftdev/sk1llz/copyleftdev-sk1llz-lamport-distributed-systems

Leslie Lamport Style Guide⁠‍⁠​‌​‌​​‌‌‍​‌​​‌​‌‌‍​​‌‌​​​‌‍​‌​​‌‌​​‍​​​​​​​‌‍‌​​‌‌​‌​‍‌​​​​​​​‍‌‌​​‌‌‌‌‍‌‌​​​‌​​‍‌‌‌‌‌‌​‌‍‌‌​‌​​​​‍​‌​‌‌‌‌‌‍​‌​​‌​‌‌‍​‌‌​‌​​‌‍‌​‌​‌‌‌​‍​​‌​‌​​​‍‌‌‌​‌​‌‌‍​​‌‌​​‌‌‍‌​‌‌​​​‌‍‌‌‌​‌‌‌​‍​​‌​​​​​‍​​​​‌​‌​‍​‌​‌‌‌​​⁠‍⁠

Overview

Leslie Lamport transformed distributed systems from ad-hoc engineering into a rigorous science. His work on logical clocks, consensus (Paxos), and formal specification (TLA+) provides the theoretical foundation for nearly every reliable distributed system built today. Turing Award winner (2013).

Core Philosophy

"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."

"If you're thinking without writing, you only think you're thinking."

"The way to be a good programmer is to write programs, not to learn languages."

Design Principles

Formal Specification First: Write a precise specification before writing code. If you can't specify it precisely, you don't understand it.

Time is Relative: There is no global clock in a distributed system. Use logical time (happens-before) to reason about ordering.

State Machine Replication: Any deterministic service can be made fault-tolerant by replicating it as a state machine across multiple servers.

Safety and Liveness: Separate what must always be true (safety) from what must eventually happen (liveness). Prove both.

Simplicity Through Rigor: The clearest systems come from precise thinking. Formalism isn't overhead—it's the path to simplicity.

When Designing Systems

Always

  • Write a specification before implementation (TLA+, Alloy, or precise prose)

  • Define the safety properties: what bad things must never happen

  • Define the liveness properties: what good things must eventually happen

  • Reason about all possible interleavings of concurrent operations

  • Use logical timestamps when physical time isn't reliable

  • Make system state explicit and transitions clear

  • Document invariants that must hold across all states

Never

  • Assume messages arrive in order (unless you've proven it)

  • Assume clocks are synchronized (they're not)

  • Assume failures are independent (they're often correlated)

  • Hand-wave about "eventually" without defining what guarantees that

  • Trust intuition for concurrent systems—prove it or test it exhaustively

  • Confuse the specification with the implementation

Prefer

  • State machines over ad-hoc event handling

  • Logical clocks over physical timestamps for ordering

  • Consensus protocols over optimistic concurrency for critical state

  • Explicit failure handling over implicit assumptions

  • Proved algorithms over clever heuristics

Key Concepts

Logical Clocks (Lamport Timestamps)

Each process maintains a counter C:

  1. Before any event, increment C
  2. When sending a message, include C
  3. When receiving a message with timestamp T, set C = max(C, T) + 1

This gives a partial ordering: if a → b, then C(a) < C(b) (But C(a) < C(b) does NOT imply a → b)

The Happens-Before Relation (→)

a → b (a happens before b) if:

  1. a and b are in the same process and a comes before b, OR
  2. a is a send and b is the corresponding receive, OR
  3. There exists c such that a → c and c → b (transitivity)

If neither a → b nor b → a, events are CONCURRENT.

State Machine Replication

To replicate a service:

  1. Model the service as a deterministic state machine
  2. Replicate the state machine across N servers
  3. Use consensus (Paxos/Raft) to agree on the sequence of inputs
  4. Each replica applies inputs in the same order → same state

Tolerates F failures with 2F+1 replicas.

Paxos (Simplified)

Three roles: Proposers, Acceptors, Learners

Phase 1 (Prepare): Proposer sends PREPARE(n) to acceptors Acceptor responds with highest-numbered proposal it accepted (if any)

Phase 2 (Accept): If proposer receives majority responses: Send ACCEPT(n, v) where v is highest-numbered value seen (or new value) Acceptor accepts if it hasn't promised to a higher proposal

Consensus reached when majority accept the same (n, v).

Mental Model

Lamport approaches distributed systems as a mathematician:

  • Define the problem precisely: What are the inputs, outputs, and allowed behaviors?

  • Identify the invariants: What must always be true?

  • Consider all interleavings: What happens if events occur in any order?

  • Prove correctness: Show that safety and liveness hold.

  • Then implement: The code should be a straightforward translation of the spec.

The TLA+ Approach

  1. Define the state space (all possible states)
  2. Define the initial state predicate
  3. Define the next-state relation (allowed transitions)
  4. Specify safety as invariants (always true)
  5. Specify liveness as temporal properties (eventually true)
  6. Model-check or prove that the spec satisfies properties

Code Patterns

Implementing Logical Clocks

class LamportClock: def init(self): self._time = 0

def tick(self) -> int:
    """Increment before local event."""
    self._time += 1
    return self._time

def send_timestamp(self) -> int:
    """Get timestamp for outgoing message."""
    self._time += 1
    return self._time

def receive(self, msg_timestamp: int) -> int:
    """Update clock on message receipt."""
    self._time = max(self._time, msg_timestamp) + 1
    return self._time

Vector Clocks (for Causality Detection)

class VectorClock: def init(self, node_id: str, num_nodes: int): self._id = node_id self.clock = {f"node{i}": 0 for i in range(num_nodes)}

def tick(self):
    self._clock[self._id] += 1

def send(self) -> dict:
    self.tick()
    return self._clock.copy()

def receive(self, other: dict):
    for node, time in other.items():
        self._clock[node] = max(self._clock.get(node, 0), time)
    self.tick()

def happens_before(self, other: dict) -> bool:
    """Returns True if self → other."""
    return all(self._clock[k] &#x3C;= other.get(k, 0) for k in self._clock) \
       and any(self._clock[k] &#x3C; other.get(k, 0) for k in self._clock)

Warning Signs

You're violating Lamport's principles if:

  • You assume "this will never happen in practice" without proof

  • Your distributed algorithm works "most of the time"

  • You can't write down the invariants your system maintains

  • You're using wall-clock time for ordering in a distributed system

  • You haven't considered what happens during network partitions

  • Your system has no formal specification

Additional Resources

  • For detailed philosophy, see philosophy.md

  • For references (papers, talks), see references.md

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

renaissance-statistical-arbitrage

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

google-material-design

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

aqr-factor-investing

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

minervini-swing-trading

No summary provided by upstream source.

Repository SourceNeeds Review