Gateway Metrics
Stable
Which Gateway metrics are especially useful for the client entry point, pooling, freeze/drain, and backend-routing problems.
Updated: March 21, 2026
Gateway is the user-facing PostgreSQL entry point of SPG99, so its metrics best answer questions like “why can’t the application connect,” “how is the pool behaving,” and “is client traffic getting in the way of autoscale handoff.”
Basic liveness signals
- /metrics on the metrics port: whether Gateway itself is available;
- /health: a quick liveness/health probe.
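A quick way to combine both signals is to fetch the metrics payload and confirm that the metric families you depend on are actually being exported. The sketch below parses a Prometheus text-format payload; the sample payload and the exact family names it expects are illustrative assumptions, not guaranteed SPG99 output.

```python
# Sketch: check that a /metrics payload (Prometheus text format) exposes
# the metric families this page relies on. Sample payload is an assumption.

EXPECTED = {"spg99_pool_total_conns", "spg99_lease_active"}

def metric_families(payload: str) -> set[str]:
    """Extract metric family names from a Prometheus text-format payload."""
    names = set()
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Lines look like: name{labels} value   or   name value
        names.add(line.split("{")[0].split(" ")[0])
    return names

sample = """\
# HELP spg99_pool_total_conns Open backend connections
spg99_pool_total_conns{db="app"} 12
spg99_lease_active{db="app"} 1
"""

missing = EXPECTED - metric_families(sample)
print("missing families:", missing)  # an empty set means the probe looks healthy
```

In practice you would feed this the body of an HTTP GET against the metrics port instead of the inline sample.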
TLS and backend connection
gw_backend_tls_ok_total
Counts successful backend TLS handshakes between Gateway and Compute.
gw_backend_tls_handshake_errors_total
Shows backend TLS handshake errors. Useful when investigating problems with the CA chain, SNI, and backend certificate re-creation.
Lease and database lifecycle
spg99_lease_active
Shows whether a lease is active for a specific database.
spg99_lease_acquire_total
Counts lease-acquire attempts, including errors.
Practical meaning: these metrics make it easy to see whether Gateway is waking a sleeping database, whether the lease path is getting stuck, and whether the database is being held longer than needed because of client activity.
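One way to see whether the lease path is getting stuck is to compare two scrapes of the acquire counter and compute the share of recent attempts that failed. This assumes spg99_lease_acquire_total carries a result label; the label values below are assumptions for illustration.

```python
# Sketch: error ratio of lease acquires between two scrapes of
# spg99_lease_acquire_total. The "ok"/"error" label values are assumptions.

def error_ratio(prev: dict, curr: dict) -> float:
    """Share of lease-acquire attempts that failed between two scrapes."""
    errors = curr.get("error", 0) - prev.get("error", 0)
    total = sum(curr.values()) - sum(prev.values())
    return errors / total if total else 0.0

prev = {"ok": 100, "error": 2}
curr = {"ok": 140, "error": 12}
print(error_ratio(prev, curr))  # 0.2 -> 20% of recent attempts failed
```

A ratio that climbs while spg99_lease_active stays at 0 is a strong hint that Gateway cannot wake the database at all, rather than waking it slowly.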
Pooling and pressure on the backend
spg99_pool_total_conns
How many backend connections are open in the pool.
spg99_pool_idle_conns
How many backend connections are idle and ready for reuse.
spg99_pool_checkout_wait_seconds
How long clients wait to obtain a backend connection.
spg99_pool_checkout_timeouts_total
How many times checkout hit a timeout or the pool was exhausted.
spg99_pool_backend_connect_total
How many attempts were made to open a backend connection, broken down by result.
spg99_pool_resets_total
How many backend connections were reset before reuse.
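The gauges above combine into a simple saturation figure: how much of the pool's capacity is currently checked out. The sketch below does that arithmetic; the pool limit and the 0.8 alert threshold are illustrative assumptions, not SPG99 defaults.

```python
# Sketch: pool saturation from spg99_pool_total_conns and
# spg99_pool_idle_conns. max_conns and the threshold are assumptions.

def pool_saturation(total_conns: int, idle_conns: int, max_conns: int) -> float:
    """Fraction of the pool's capacity currently checked out."""
    in_use = total_conns - idle_conns
    return in_use / max_conns if max_conns else 0.0

# 18 of 20 slots busy -> 0.9; time to look at checkout waits and timeouts
sat = pool_saturation(total_conns=20, idle_conns=2, max_conns=20)
print(f"saturation={sat:.2f}, alert={sat > 0.8}")
```

Watching saturation alongside spg99_pool_checkout_wait_seconds tells you whether the pool is merely busy or actually making clients queue.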
What is especially important for the new autoscaler
spg99_session_pinned_total
Shows how often sessions enter the pinned state and for what reason.
This is especially useful when:
- the application unexpectedly started using SET, temp tables, cursors, LISTEN, or named prepared statements;
- transaction pooling stopped delivering the expected savings;
- autoscale handoff cannot reach a safe drain point.
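When pinning spikes, the first question is which reason is growing fastest. The sketch below ranks per-reason deltas between two scrapes of spg99_session_pinned_total; the reason label values are assumptions for illustration.

```python
# Sketch: rank pin reasons between two scrapes of spg99_session_pinned_total.
# The reason label values ("set_statement", ...) are assumptions.

def pin_deltas(prev: dict, curr: dict) -> list[tuple[str, int]]:
    """Per-reason increase in pinned sessions, largest first."""
    deltas = {r: curr.get(r, 0) - prev.get(r, 0) for r in curr}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

prev = {"set_statement": 40, "prepared_statement": 10, "listen": 3}
curr = {"set_statement": 41, "prepared_statement": 55, "listen": 3}
print(pin_deltas(prev, curr)[0])  # ('prepared_statement', 45)
```

Here a deploy that switched the driver to named prepared statements would show up immediately as the dominant delta.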
Freeze and cutover
When freeze_new_checkouts=true, Gateway must stop issuing new checkouts to the old writer. At that moment, it is especially useful to watch:
- checkout waits;
- checkout timeouts;
- pinned sessions;
- overall lease duration and the number of active connections.
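Those four signals can be folded into a single drain-readiness check that a cutover controller or an operator dashboard could evaluate. The function below is a minimal sketch under assumed field names and an assumed one-second wait budget; it is not the actual SPG99 handoff logic.

```python
# Sketch: drain-readiness during freeze_new_checkouts=true.
# Field names and the wait budget are illustrative assumptions.

def drain_safe(pinned_sessions: int, active_checkouts: int,
               checkout_wait_p99_s: float, wait_budget_s: float = 1.0) -> bool:
    """The old writer can be cut over once nothing is pinned or checked out
    and frozen clients are not queued past the wait budget."""
    return (pinned_sessions == 0
            and active_checkouts == 0
            and checkout_wait_p99_s <= wait_budget_s)

print(drain_safe(0, 0, 0.2))  # True: safe to hand off
print(drain_safe(3, 1, 0.2))  # False: pinned sessions block the drain
```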
How to interpret the metrics in practice
Scenario 1: connections are slow only after idle
Look at spg99_lease_acquire_total, spg99_lease_active, spg99_pool_backend_connect_total, and checkout timings. If startup is slow only on a sleeping database, this is a normal cold-start path.
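The check in this scenario can be sketched as a small rule: slow connects are expected only when the same window also woke a sleeping database. The input names and the warm-connect budget below are illustrative assumptions.

```python
# Sketch: distinguish normal cold-start latency from a real slowdown.
# Inputs and the 0.5 s warm budget are illustrative assumptions.

def is_cold_start(lease_acquires_delta: int, lease_active: int,
                  connect_p99_s: float, warm_budget_s: float = 0.5) -> bool:
    """Slow connects are expected when this window woke a sleeping database."""
    woke_db = lease_acquires_delta > 0 and lease_active == 1
    return woke_db and connect_p99_s > warm_budget_s

print(is_cold_start(lease_acquires_delta=1, lease_active=1, connect_p99_s=3.2))
```

If connects stay slow with no lease-acquire activity in the window, the cold-start explanation no longer applies and the pool or backend route is the next suspect.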
Scenario 2: handoff is stuck on freeze/drain
Look at spg99_session_pinned_total, checkout wait, and the total number of active/idle client connections. Often the problem is not the platform itself, but the workload keeping too much session state alive.
Scenario 3: the application is bottlenecked by the pool
Look at spg99_pool_checkout_timeouts_total, spg99_pool_total_conns, and spg99_pool_idle_conns. If timeouts are increasing while the idle pool is zero, you either need a larger profile, lower client fan-out, or an explanation of why the workload pins sessions.
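The decision rule in this scenario can be written down directly. The sketch below mirrors the text's logic over the three metrics; the pool limit parameter and the returned messages are illustrative assumptions.

```python
# Sketch: classify a pool bottleneck from the Scenario 3 metrics.
# pool_limit and the message strings are illustrative assumptions.

def classify(timeout_delta: int, total_conns: int, idle_conns: int,
             pool_limit: int) -> str:
    """Map checkout timeouts plus pool gauges to a likely cause."""
    if timeout_delta == 0:
        return "healthy"
    if idle_conns == 0 and total_conns >= pool_limit:
        return "pool exhausted: larger profile or lower client fan-out"
    if idle_conns > 0:
        return "timeouts with idle conns: suspect pinned sessions"
    return "pool not at limit: suspect slow backend connects"

print(classify(timeout_delta=12, total_conns=20, idle_conns=0, pool_limit=20))
```

The third branch is the interesting one: timeouts while idle connections exist usually means the idle connections cannot be reused because sessions are pinned.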
Scenario 4: the route to the backend is unstable
Look at gw_backend_tls_handshake_errors_total and spg99_pool_backend_connect_total{result=...}. This helps quickly separate a routing/TLS issue from client-login errors.
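Splitting the connect counter by its result label makes that separation mechanical. The bucket names and result label values below are assumptions for illustration; substitute whatever values spg99_pool_backend_connect_total actually exposes.

```python
# Sketch: bucket backend connect failures to separate routing/TLS trouble
# from client-login errors. Result label values are assumptions.

TLS_RESULTS = {"tls_handshake_error", "dial_timeout"}
AUTH_RESULTS = {"auth_failed"}

def failure_buckets(connect_totals: dict) -> dict:
    """Group non-ok connect results into route/TLS, auth, and other."""
    buckets = {"tls_or_route": 0, "auth": 0, "other": 0}
    for result, count in connect_totals.items():
        if result == "ok":
            continue
        if result in TLS_RESULTS:
            buckets["tls_or_route"] += count
        elif result in AUTH_RESULTS:
            buckets["auth"] += count
        else:
            buckets["other"] += count
    return buckets

print(failure_buckets({"ok": 900, "tls_handshake_error": 40, "auth_failed": 2}))
```

A large tls_or_route bucket that tracks gw_backend_tls_handshake_errors_total points at the CA chain, SNI, or certificates rather than at anything the client is doing.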
Practical conclusion
For Gateway, metrics are the main way to understand:
- whether autostart works;
- whether pooling is delivering real savings;
- whether the workload has shifted from stateless to session-heavy;
- whether pinned traffic is preventing a safe autoscale handoff.
