Safekeeper

Stable

How Safekeeper maintains WAL quorum and why durability in SPG99 does not depend on the lifetime of a single Compute instance.

Updated: March 5, 2026

Safekeeper is the internal WAL service of SPG99 responsible for write durability. Put simply, Safekeeper is the place where the platform first stores PostgreSQL change history reliably; only once the history is there does the platform consider it safe.

Safekeeper does not execute SQL, does not accept user DSN connections, and does not replace PostgreSQL. Its task is different:

  • accept WAL from Compute;
  • hold it on several nodes;
  • confirm the durable boundary;
  • provide that history to Pageserver and to the next generation of Compute.

For the user, this means one very important thing: data in SPG99 is protected not by the “permanent” local disk of a specific Compute node, but by WAL quorum in Safekeeper.

That is exactly why a database in SPG99 can:

  • start quickly on demand;
  • run on ephemeral Compute;
  • stop safely when idle;
  • survive the loss of a specific worker or a move to another Compute;
  • avoid losing already confirmed changes simply because the old worker disappeared.

Where Safekeeper sits in the chain

The typical SPG99 chain looks like this:

user / Console / API
    -> Control Plane
    -> Gateway
    -> Compute
    -> Safekeeper + Pageserver + Object Storage

If you follow the write lifecycle step by step:

  1. The user connects to the database through Gateway.
  2. Gateway brings the client to a live Compute.
  3. Compute executes SQL inside PostgreSQL.
  4. When PostgreSQL generates WAL, Compute sends that WAL to several Safekeepers at once.
  5. Safekeeper accepts the WAL, writes it into its local journal, and confirms the write boundary.
  6. After quorum flush, that WAL boundary is considered durable.
  7. Pageserver subscribes to WAL from Safekeeper and materializes the database history into storage state.
  8. During the next cold start, a new Compute takes basebackup from Pageserver and continues from the WAL history preserved by Safekeeper.
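
The write lifecycle above can be sketched in a few lines. This is a simplified illustration, not the real SPG99 API: the class and function names are assumptions, and network, retries, and Pageserver are left out.

```python
# Sketch of steps 4-6: Compute fans WAL out to three Safekeepers,
# each writes it to its local journal, and the boundary becomes
# durable once a quorum (2 of 3) has flushed it.

class Safekeeper:
    def __init__(self):
        self.flush_lsn = 0  # highest LSN flushed to the local journal

    def append_and_flush(self, lsn, record):
        # Step 5: accept the WAL, write it locally, confirm the boundary.
        self.flush_lsn = max(self.flush_lsn, lsn)
        return self.flush_lsn

def write_path(safekeepers, lsn, record, quorum=2):
    # Step 4: Compute sends the record to all Safekeepers at once.
    acks = [sk.append_and_flush(lsn, record) for sk in safekeepers]
    # Step 6: durable up to the quorum-th highest confirmed flush.
    return sorted(acks, reverse=True)[quorum - 1]

sks = [Safekeeper() for _ in range(3)]
print(write_path(sks, lsn=100, record=b"INSERT ..."))  # 100
```

Note that the write path waits only for the quorum-th confirmation, which is why a slow third node does not stall commits.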

This is the main architectural point about Safekeeper: it is not “an archive off to the side,” but the central durability layer for writes in SPG99.

How Safekeeper differs from other services

To avoid confusion:

  • Control Plane creates tenants and databases, stores the catalog, and bumps the writer term.
  • Gateway accepts user connections and routes them to Compute.
  • Compute executes queries and generates WAL.
  • Pageserver turns WAL history into storage state and serves basebackup.
  • Safekeeper keeps WAL quorum and the durable boundary of writes.

Very short version:

  • Compute creates WAL;
  • Safekeeper makes WAL durable;
  • Pageserver turns that history into a startable storage state;
  • Control Plane and Gateway tie all of this together into a single product.

The main role of Safekeeper for the user

Safekeeper solves four practical tasks:

1. It makes commits durable in the serverless model

Already confirmed WAL history is preserved in a quorum even if Compute itself is stopped or replaced.

2. It separates durability from the Compute lifecycle

Data is not tied to one container, VM, or local PGDATA.

3. It provides Pageserver with the source of truth for WAL

Pageserver does not guess the history; it gets it from Safekeeper.

4. It allows a new Compute to start without manual recovery

A new writer relies on the WAL already held by Safekeeper quorum.

Core Safekeeper entities

Tenant, database, timeline

For Safekeeper, WAL history is addressed not by user or DSN, but by the triple:

tenant + db + timeline

This is how the service understands which database a WAL stream belongs to.
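
A toy illustration of that addressing, with purely illustrative identifiers (these are not the real SPG99 names):

```python
# WAL streams are keyed by (tenant, db, timeline), not by user or DSN.
streams = {}

def append_wal(tenant, db, timeline, record):
    key = (tenant, db, timeline)
    streams.setdefault(key, []).append(record)

append_wal("tenant-a", "orders", "main", b"wal-record-1")
append_wal("tenant-a", "orders", "main", b"wal-record-2")
print(len(streams[("tenant-a", "orders", "main")]))  # 2
```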

WAL and LSN

Safekeeper stores not “tables,” but a sequence of WAL changes. For the user, the key idea is this: if an SQL operation changed data, its trace first goes into WAL, and Safekeeper is what makes that WAL reliable.

Important LSN boundaries are:

  • next_lsn — the position up to which WAL has been written locally;
  • durable_lsn — the position up to which WAL is considered durable;
  • flush_lsn / commit_lsn — practical reflections of that same safe boundary.
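
These boundaries obey a simple invariant: a node can only consider durable what it has already written locally. A toy per-node model, with names following the text rather than a real API:

```python
# durable_lsn can only trail next_lsn: WAL is written locally first,
# and only then can a boundary be declared quorum-durable.

class NodeLsns:
    def __init__(self):
        self.next_lsn = 0     # written locally up to here
        self.durable_lsn = 0  # considered durable up to here

    def write_local(self, lsn):
        self.next_lsn = max(self.next_lsn, lsn)

    def advance_durable(self, lsn):
        # Never declare durable beyond what exists locally.
        self.durable_lsn = max(self.durable_lsn, min(lsn, self.next_lsn))

n = NodeLsns()
n.write_local(500)
n.advance_durable(450)
assert n.durable_lsn <= n.next_lsn
```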

WAL segments

WAL is split into segments. In the current scheme, the default is 16 MiB. On local disk, there are two main states:

  • .partial — a segment still being written;
  • .ready — a completed segment that can be uploaded to object storage and used for backfill.
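
With 16 MiB segments, an LSN maps to a segment number by simple integer division. A sketch of that arithmetic; the file-name format here is illustrative, not the platform's actual naming:

```python
SEG_SIZE = 16 * 1024 * 1024  # 16 MiB, the default segment size

def segment_no(lsn):
    # Which segment a given byte position in the WAL falls into.
    return lsn // SEG_SIZE

def segment_name(lsn, complete):
    # .partial while being written, .ready once the segment is closed.
    suffix = ".ready" if complete else ".partial"
    return f"{segment_no(lsn):08X}{suffix}"

print(segment_name(SEG_SIZE + 1, complete=False))  # "00000001.partial"
```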

Writer term

Writer term is the generation number of the active writer. The Control Plane increments it when a new Compute generation starts, and Safekeeper accepts WAL only from the current term. This is one of the key split-brain protection mechanisms.

How the write path works and why it is reliable

Quorum 2 out of 3

In the current SPG99 contract, Safekeeper is used as a set of three addresses, and WAL quorum is 2/3. This means:

  • Compute sends append to all three nodes;
  • as soon as two nodes successfully confirm append, the write path can move on;
  • as soon as two nodes successfully confirm flush, the corresponding WAL boundary is considered quorum-durable;
  • the third node may catch up later and does not need to slow down every write.

This is one of the main trade-offs between speed and reliability.
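
The 2/3 rule reduces to one line of arithmetic: the durable boundary is the highest LSN that at least two of the three nodes have flushed, i.e. the second-highest flush_lsn. A minimal sketch:

```python
def quorum_durable_lsn(flush_lsns, quorum=2):
    # The quorum-th highest confirmed flush position is durable.
    return sorted(flush_lsns, reverse=True)[quorum - 1]

# Two nodes at 200, a third lagging at 120: the lagging node
# does not hold the durable boundary back.
print(quorum_durable_lsn([200, 200, 120]))  # 200

# One node far ahead is not enough on its own: durability still
# requires a second confirmation.
print(quorum_durable_lsn([300, 180, 120]))  # 180
```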

Why this matters to the user

If one of the three nodes is unavailable but the other two are working, the platform can still maintain WAL quorum. This usually means the database can keep writing and commits can still be confirmed. A single failure does not have to stop the service.

If quorum is lost and only one working node remains, write durability can no longer be honestly guaranteed. The normal platform behavior in that case is not to keep writing “silently and unsafely,” but to show degradation and avoid risking history consistency.

Why Pageserver lag is not the same as commit loss

Pageserver materializes WAL asynchronously, and that is normal. A commit can already be durable in Safekeeper even if Pageserver has not yet applied WAL up to the same point. For the user, the important distinction is that durability and materialization are different stages.

How Safekeeper protects against split-brain

Writer term exists so that an old Compute generation cannot continue writing over a new one. If Safekeeper sees an append with a term lower than the current one, that request is rejected as a stale term.

There is also a subtler scenario: a new term may start writing not from the end of the local tail on a specific node, but from an earlier quorum-durable point. In that case, Safekeeper truncates the extra divergent tail and brings the journal to a consistent safe boundary, rather than trying to glue together two conflicting WAL branches.

This is a critically important correctness guarantee for the user: the system must not silently merge old and new write history.
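
Both protections above can be sketched together. This is a simplified, hypothetical structure, not the real Safekeeper journal: a stale term is rejected outright, and a new term that starts from an earlier quorum-durable point truncates the divergent local tail instead of merging two histories.

```python
class Journal:
    def __init__(self):
        self.term = 0
        self.records = []  # list of (lsn, term, payload)

    def append(self, term, start_lsn, payload):
        if term < self.term:
            # An old Compute generation may not keep writing.
            raise ValueError("stale term rejected")
        if term > self.term:
            # New writer generation starting from a quorum-durable
            # point: drop the divergent local tail past that point
            # rather than gluing two conflicting WAL branches.
            self.records = [r for r in self.records if r[0] < start_lsn]
            self.term = term
        self.records.append((start_lsn, term, payload))

j = Journal()
j.append(1, 100, b"a")
j.append(1, 200, b"b")   # old generation wrote ahead to 200
j.append(2, 150, b"c")   # new term starts from durable point 150
assert [r[0] for r in j.records] == [100, 150]  # tail at 200 truncated
```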

Where Safekeeper stores data

In the current architecture, the service combines two storage layers:

1. A local journal on persistent storage

Each Safekeeper has its own local WAL journal on a persistent volume. This means restarting a pod or process is not the same as losing all local history.

2. Object storage

Closed .ready segments are uploaded to S3-compatible object storage. This is needed for:

  • backfilling older WAL;
  • limiting local disk growth;
  • more resilient recovery for history consumers.

But it is crucial not to confuse the roles: object storage is useful and important, but the durable commit path in SPG99 is based on Safekeeper quorum, not on the completion of an S3 upload.

Why Safekeeper helps with fast starts

At first glance, Safekeeper may look like nothing more than a write-durability component. In reality, its role is broader.

During cold start, the new Compute relies on a safe quorum-durable point, and Pageserver builds bootstrap relative to it. This means:

  • Compute can be started on a new worker;
  • Pageserver understands up to which LSN it can safely assemble startup state;
  • stop, crash, or redeploy do not require manual recovery of the database “from scratch.”

So Safekeeper is important not only for write durability, but also for correct and predictable database startup after stop or crash.

What is important for the user to remember

  • Safekeeper is an internal platform service. Users normally do not interact with it directly.
  • Data durability in SPG99 is not the same as “data sits on the Compute disk.”
  • Basebackup and object storage do not replace Safekeeper quorum.
  • Losing one Safekeeper does not necessarily stop the service.
  • Losing quorum is already a serious degradation, and the system should show that clearly.
  • Small Pageserver lag is not the same as commit loss.
  • Creating and deleting timelines is best done through the Control Plane, which coordinates Pageserver, Safekeeper, and the catalog consistently.

Short summary

Safekeeper in SPG99 is the service that makes serverless PostgreSQL truly reliable.

For the user, this means:

  • durability does not depend on the lifetime of one Compute;
  • commits are protected by WAL quorum;
  • Pageserver receives correct change history;
  • new Compute can start from a safe point;
  • one node failure does not have to break writes;
  • the platform can honestly stop an unsafe scenario when quorum is lost, rather than risking your data.

For more about the role of the service in the overall architecture, see Service Roles in the Chain.