Module leader_election

Leader election recipe for distributed consensus

§Leader Election for FoundationDB

A distributed leader election recipe using FoundationDB as the coordination backend. Similar to Apache Curator’s LeaderLatch for ZooKeeper, but leveraging FDB’s serializable transactions for stronger guarantees.

§When to Use This

Good use cases:

  • Singleton services (only one instance should be active)
  • Job schedulers (one coordinator assigns work)
  • Primary/backup failover
  • Exclusive access to external resources

Consider alternatives if:

  • You need mutex/lock semantics for short critical sections (use FoundationDB transactions directly)
  • You need fair queuing (this uses priority-based preemption)

§API Overview

The main entry point is LeaderElection. Typical usage follows this pattern:

| Step | Method | Frequency |
|------|--------|-----------|
| 1. Setup | new | Once per process |
| 2. Initialize | initialize | Once globally (idempotent) |
| 3. Register | register_candidate | Once per process |
| 4. Election loop | run_election_cycle | Every heartbeat interval |
| 5. Shutdown | resign_leadership + unregister_candidate | On graceful exit |
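
A minimal end-to-end sketch of this lifecycle. The constructor arguments, candidate id and priority, the tokio-based timing, and the ElectionResult variant shapes are assumptions for illustration, not the crate’s confirmed signatures:

```rust
use std::time::Duration;

use foundationdb::Database;
use leader_election::{ElectionResult, LeaderElection};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Start the FDB client network; the guard stops it when dropped.
    let _network = unsafe { foundationdb::boot() };
    let db = Database::default()?;

    // 1. Setup: once per process (constructor arguments are assumed here).
    let election = LeaderElection::new(db, "billing-scheduler");

    // 2. Initialize global election state; idempotent, so every process
    //    can call it safely.
    election.initialize().await?;

    // 3. Register this process as a candidate (id and priority illustrative).
    election.register_candidate("node-1", 10).await?;

    // 4. Election loop: one cycle per heartbeat interval.
    let mut heartbeat = tokio::time::interval(Duration::from_secs(3));
    for _ in 0..100 {
        heartbeat.tick().await;
        match election.run_election_cycle().await? {
            // Variant shapes are illustrative, not the crate's exact enum.
            ElectionResult::Leader(state) => {
                println!("leading with ballot {}", state.ballot);
            }
            _ => { /* follower: stay registered and keep cycling */ }
        }
    }

    // 5. Graceful shutdown: give up the lease, then deregister.
    election.resign_leadership().await?;
    election.unregister_candidate().await?;
    Ok(())
}
```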

For advanced use cases, lower-level methods such as refresh_lease are also available.

§Key Concepts

§Ballots

Ballot numbers work like Raft’s term: a monotonically increasing counter that establishes ordering, where the higher ballot always wins. Each leadership claim or lease refresh increments the ballot. This prevents split-brain scenarios after network partitions heal.

The ballot is returned in LeaderState::ballot and can be used as a fencing token when accessing external resources.
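
One way to apply that fencing token: have the external resource remember the highest ballot it has seen and reject anything older. The store below is hypothetical; only the compare-and-reject pattern matters:

```rust
/// A hypothetical external store that remembers the highest ballot it has
/// accepted a write under.
struct ExternalStore {
    highest_ballot_seen: u64,
}

impl ExternalStore {
    /// Reject writes carrying a ballot older than one already observed: a
    /// deposed leader that hasn't noticed its lease expired cannot clobber
    /// writes made under a newer ballot.
    fn write(&mut self, ballot: u64, payload: &[u8]) -> Result<(), &'static str> {
        if ballot < self.highest_ballot_seen {
            return Err("stale ballot: writer lost leadership");
        }
        self.highest_ballot_seen = ballot;
        // ... apply `payload` to the store ...
        let _ = payload;
        Ok(())
    }
}
```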

§Leases

Leaders hold time-bounded leases configured via lease_duration. A leader must call run_election_cycle (or refresh_lease) before the lease expires to maintain leadership.

If a leader fails to refresh (crash, network partition), other candidates can claim leadership after the lease expires.
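
A defensive refresh loop might look like the sketch below, assuming refresh_lease returns the renewed state on success; the essential point is that leader-only work must stop as soon as a refresh fails:

```rust
use std::time::Duration;

use leader_election::LeaderElection;

/// Hold leadership by refreshing ahead of lease expiry; on any failure,
/// stop leader-only work immediately rather than acting on a stale lease.
/// (refresh_lease's exact signature is an assumption.)
async fn hold_leadership(election: &LeaderElection) {
    let mut heartbeat = tokio::time::interval(Duration::from_secs(3));
    loop {
        heartbeat.tick().await;
        match election.refresh_lease().await {
            Ok(_state) => { /* still leader: continue leader-only work */ }
            Err(_) => {
                // Crash recovery, partition, or expiry: assume leadership is
                // lost and return to the run_election_cycle loop as a candidate.
                return;
            }
        }
    }
}
```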

§Preemption

When allow_preemption is true, higher-priority candidates can preempt lower-priority leaders. Priority is set via the priority parameter in register_candidate. This enables graceful leadership migration to new machines during rolling deployments or infrastructure upgrades.
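
For instance, during a rolling deployment a replacement node can register above the old fleet’s priority so leadership migrates to it on the next cycle. The priority values and signature below are assumptions:

```rust
use leader_election::LeaderElection;

/// Run on the replacement node during a rolling deployment. Priorities are
/// illustrative: any value above the old fleet's registration wins.
async fn join_as_preferred_leader(
    election: &LeaderElection,
) -> leader_election::Result<()> {
    // The old fleet registered at priority 10; with allow_preemption enabled,
    // registering at 20 lets this node take over on the next election cycle.
    election.register_candidate("node-2-replacement", 20).await
}
```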

§Configuration

Configure via ElectionConfig passed to initialize_with_config:

| Field | Default | Description |
|-------|---------|-------------|
| lease_duration | 10s | How long leadership is valid without refresh |
| heartbeat_interval | 3s | Recommended interval for calling run_election_cycle |
| candidate_timeout | 15s | How long before an unresponsive candidate is considered dead |
| election_enabled | true | Enable/disable elections globally |
| allow_preemption | true | Allow priority-based preemption |

Rule of thumb: heartbeat_interval should be less than lease_duration / 3, so that several refresh attempts can fail transiently before the lease expires.
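
Putting the defaults and the rule of thumb together, a configuration sketch might look like this; the Duration-typed field shapes and struct-literal construction are assumptions:

```rust
use std::time::Duration;

use leader_election::{ElectionConfig, LeaderElection};

async fn init(election: &LeaderElection) -> leader_election::Result<()> {
    // Field names follow the table above; the exact struct shape is assumed.
    let config = ElectionConfig {
        lease_duration: Duration::from_secs(12),
        // Rule of thumb: below lease_duration / 3, so a couple of failed
        // refresh attempts still leave time before the lease expires.
        heartbeat_interval: Duration::from_secs(4),
        candidate_timeout: Duration::from_secs(18),
        election_enabled: true,
        allow_preemption: true,
    };
    election.initialize_with_config(config).await
}
```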

§Return Types

Operations return this module’s Result alias. An election cycle yields an ElectionResult describing its outcome, and leadership claims are described by LeaderState, including the ballot used for fencing.

§Safety Properties

  • Mutual Exclusion: At most one leader at any time (guaranteed by FDB serializable transactions)
  • Liveness: A correct process eventually becomes leader
  • Consistency: Ballot numbers provide total ordering of leadership changes

§Simulation Testing

This implementation is validated through FoundationDB’s deterministic simulation framework under extreme conditions including network partitions, process failures, and clock skew up to ±2 seconds.

Key invariants verified:

  • No overlapping leadership (mutual exclusion)
  • Ballot monotonicity (ballots never regress)
  • Fencing token validity (each claim increments ballot)
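
As a flavor of such a check, here is a standalone sketch of the ballot-monotonicity assertion, not the actual simulation harness:

```rust
/// Assert ballot monotonicity over a recorded history of leadership claims:
/// each successful claim must carry a strictly larger ballot than the last.
fn check_ballot_monotonicity(observed_ballots: &[u64]) -> Result<(), String> {
    for pair in observed_ballots.windows(2) {
        if pair[1] <= pair[0] {
            return Err(format!("ballot regressed: {} after {}", pair[1], pair[0]));
        }
    }
    Ok(())
}
```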

See the foundationdb-recipes-simulation crate for test configurations.

Structs§

CandidateInfo
Information about a registered candidate
ElectionConfig
Global configuration for the leader election system
LeaderElection
Coordinator for distributed leader election
LeaderState
The core leader state, stored at a single key

Enums§

ElectionResult
Result of an election cycle
LeaderElectionError
Leader election specific errors

Type Aliases§

Result
Result type for leader election operations