
From Monitoring to Action: Using pg_stat_monitor to Optimize Queries

If you'll permit me a frank assessment: pg_stat_statements tells you what happened. pg_stat_monitor tells you what happened, when it happened, to whom, and under what plan.

The Butler of Gold Lapel · March 21, 2026 · 18 min read
The commissioned illustration has yet to arrive. We are told the artist is monitoring progress.

Why pg_stat_statements is not enough, if you'll forgive my candour

Good evening. I see you've arrived with questions about your query monitoring, and I should like to attend to them properly. pg_stat_statements is the standard tool for identifying slow queries in PostgreSQL, and it does that job well. It accumulates statistics — total execution time, call count, rows returned — for every normalized query pattern since the last reset. When you need to answer "which queries cost the most overall," it is the right tool.

But it has a structural limitation that I confess I find genuinely vexing: the statistics are cumulative. Each query gets a single row that aggregates everything from the moment counters were last reset. A query that averaged 5ms for three months and then jumped to 500ms in the last hour still shows a comfortable average in pg_stat_statements. The regression is diluted into the lifetime aggregate, invisible unless you happened to snapshot the statistics before the problem started. Three months of good behaviour providing cover for an hour of catastrophe. One does not run a household this way.

I should be fair: this is not a bug — it is a design choice. pg_stat_statements was built for aggregate analysis, not temporal analysis. If you want to know when a query regressed, you need to either reset statistics manually (losing all historical data in the process, which rather defeats the purpose) or build an external system that snapshots the view periodically and computes deltas. Neither option is what I would call elegant.

pg_stat_monitor, developed by Percona, takes a different — and in my estimation, rather more civilised — approach. It divides statistics into time buckets — configurable windows (default: 5 minutes) that each hold independent counters. When a bucket expires, the extension rotates to the next one. The result is a built-in sliding window of recent query performance: you can compare the last 5 minutes to the previous 5 minutes directly from SQL, without external tooling, without resetting anything. Context is preserved. This is the difference between being told "something happened" and being told "this happened, twelve minutes ago, in the drawing room."

That is merely the beginning. pg_stat_monitor also captures the actual execution plan for each query, records the client IP and application name that originated it, provides response time histograms, and detects when PostgreSQL switches to a different execution plan for the same query. These are capabilities that pg_stat_statements does not offer — not because it was poorly designed, but because they were outside its original brief. Allow me to walk you through each of them.

How do you install pg_stat_monitor?

pg_stat_monitor is not included in PostgreSQL's contrib modules — a minor inconvenience, but one easily resolved. It must be installed separately, from Percona's repositories, compiled from source, or installed via PGXN. Like pg_stat_statements, it requires a spot in shared_preload_libraries, which means a PostgreSQL restart when adding it for the first time. I appreciate that restarts are not to be taken lightly, so let us ensure the configuration is right before you commit to one.

postgresql.conf + SQL
# 1. Add to postgresql.conf (requires restart)
# (note: postgresql.conf comments use '#', not SQL's '--')
shared_preload_libraries = 'pg_stat_monitor'

-- 2. Restart PostgreSQL, then create the extension
CREATE EXTENSION pg_stat_monitor;

-- 3. Verify it is collecting data
SELECT count(*) FROM pg_stat_monitor;

After the restart and extension creation, pg_stat_monitor begins tracking every query immediately. No further configuration is required for basic operation — the defaults (10 buckets at 300 seconds each) give you approximately 50 minutes of rolling history. It begins its duties at once, without further instruction.

Running alongside pg_stat_statements

You do not need to choose between the two. Both extensions can coexist in shared_preload_libraries — but the load order matters. pg_stat_statements must be listed first:

postgresql.conf + SQL
# Running alongside pg_stat_statements — order matters
# pg_stat_statements must be listed first
shared_preload_libraries = 'pg_stat_statements, pg_stat_monitor'

-- After restart, create both extensions
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE EXTENSION IF NOT EXISTS pg_stat_monitor;

Why run both? A reasonable question. Many third-party monitoring tools (pganalyze, Datadog, PMM) expect pg_stat_statements and query it directly. Removing it would break those integrations — and one does not dismiss reliable staff simply because new staff have arrived. Running pg_stat_monitor alongside gives you the time-bucketed analysis and plan capture without disrupting your existing monitoring stack.

One caveat noted in Percona's documentation: memory and WAL statistics may show inconsistencies when both extensions are active, because they intercept the same hooks and account for the same data differently. For most practical purposes, this does not affect query timing analysis — the primary use case for pg_stat_monitor.
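Before relying on either extension, it is worth a moment to confirm that both libraries actually loaded and both extensions exist in the current database. These checks use only standard PostgreSQL catalogs and settings, nothing extension-specific:

```sql
-- Confirm both libraries are present in shared_preload_libraries
SHOW shared_preload_libraries;

-- Confirm both extensions are created in the current database
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('pg_stat_statements', 'pg_stat_monitor');
```

If the second query returns fewer than two rows, the library loaded but CREATE EXTENSION has not yet been run in this database.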

Configuration: setting the household in order

All parameters are prefixed with pg_stat_monitor. and visible in pg_settings. Some require a restart; others take effect after a configuration reload. Allow me to draw your attention to the ones that matter most.

SQL
-- View all pg_stat_monitor configuration parameters
SELECT name, setting, short_desc
FROM pg_settings
WHERE name LIKE 'pg_stat_monitor%'
ORDER BY name;

Here is a recommended starting configuration with commentary on each parameter:

postgresql.conf
# postgresql.conf — recommended starting configuration
# (comments in postgresql.conf use '#', not SQL's '--')

# Time bucketing: 60-second buckets, 30 retained = 30 minutes of history
pg_stat_monitor.pgsm_bucket_time = 60
pg_stat_monitor.pgsm_max_buckets = 30

# Query plan capture: off by default, enable when investigating
pg_stat_monitor.pgsm_enable_query_plan = no

# Histogram: 1ms to 10s range, 20 buckets
pg_stat_monitor.pgsm_histogram_min = 1
pg_stat_monitor.pgsm_histogram_max = 10000
pg_stat_monitor.pgsm_histogram_buckets = 20

# Show actual parameter values instead of placeholders
pg_stat_monitor.pgsm_normalized_query = no

# Shared memory for statistics: 256MB is usually sufficient
pg_stat_monitor.pgsm_max = 256

Tuning bucket time and count

The two most important parameters are pgsm_bucket_time and pgsm_max_buckets. Together, they define your monitoring window:

  • Total history = pgsm_bucket_time × pgsm_max_buckets
  • Defaults (300s × 10) = 50 minutes
  • Recommended for active monitoring (60s × 30) = 30 minutes with finer granularity
  • For longer retention (60s × 60) = 1 hour, but uses more shared memory

Shorter bucket times give finer resolution for detecting regressions — a 60-second bucket can catch a spike that a 300-second bucket would smooth into obscurity. The trade-off is memory: more buckets with more distinct queries means more rows in shared memory. I would start with 60-second buckets and 30 of them. You can always adjust once you understand your workload's character.
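To watch the rotation in action, you can list the buckets currently held in shared memory. A sketch, assuming the bucket identifier column is named bucket as in recent pg_stat_monitor releases:

```sql
-- List the buckets currently retained, oldest first; with 60-second
-- buckets you should see up to pgsm_max_buckets distinct start times
SELECT bucket, bucket_start_time, count(*) AS tracked_entries
FROM pg_stat_monitor
GROUP BY bucket, bucket_start_time
ORDER BY bucket_start_time;
```

Run it twice, a minute apart, and you should see the newest bucket_start_time advance — confirmation that the window is sliding as configured.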

Query plan capture

This is the feature that earns pg_stat_monitor its place in the household. When enabled, the extension stores the text representation of the execution plan that PostgreSQL chose for each query execution. This means you can see whether a query used an index scan or a sequential scan without manually running EXPLAIN — the evidence is already collected and waiting for your inspection.

SQL — toggle plan capture
-- Enable query plan capture for active investigation
ALTER SYSTEM SET pg_stat_monitor.pgsm_enable_query_plan = 'yes';
SELECT pg_reload_conf();

-- ... investigate the issue ...

-- Disable when done — plan capture adds overhead
ALTER SYSTEM SET pg_stat_monitor.pgsm_enable_query_plan = 'no';
SELECT pg_reload_conf();

Let me be forthcoming about the cost: plan serialization runs on every query execution, adding CPU overhead proportional to query throughput. On an OLTP workload doing 10,000 queries per second, you will notice. On an analytical workload doing 50 queries per second, you will not. The recommended pattern — and the one I would insist upon — is to keep it off by default and enable it surgically when you need to investigate a specific problem.

When plan capture is enabled, pg_stat_monitor creates separate rows for each distinct plan used for the same query — identified by the planid column. This is how you detect plan changes: if the same queryid appears with multiple planid values across buckets, PostgreSQL switched plans.

Histogram configuration

The response time histogram is controlled by three parameters:

  • pgsm_histogram_min — the lower bound in milliseconds (default: 0)
  • pgsm_histogram_max — the upper bound in milliseconds (default: 100000, i.e. 100 seconds)
  • pgsm_histogram_buckets — the number of buckets between min and max (default: 10)

The default range of 0-100s with 10 buckets creates 10-second-wide histogram buckets — far too coarse for most OLTP workloads where you care about the difference between 1ms and 100ms. I'm afraid this default is rather like having a thermometer that only reads "cold" and "hot." Setting pgsm_histogram_min = 1, pgsm_histogram_max = 10000, and pgsm_histogram_buckets = 20 gives you 500ms-wide buckets from 1ms to 10s, plus two overflow buckets (below 1ms and above 10s) in pg_stat_monitor 2.0+. Much more informative.

Time-bucketed analysis: what happened in the last five minutes

This is the capability that, in my view, justifies the extension entirely. The core advantage of pg_stat_monitor over pg_stat_statements is the ability to compare query performance across time windows. Instead of a single cumulative row per query — a lifetime average that loses all sense of when — you get independent statistics for each time bucket. You get temporal context. You get the ability to say "this query was fine at 14:05 and catastrophic at 14:10."

Current bucket: what is slow right now

SQL — top queries in the active bucket
-- Top 10 queries by total execution time in the most recent bucket
SELECT
  bucket_start_time,
  substr(query, 1, 80) AS query_preview,
  calls,
  round(total_exec_time::numeric, 2) AS total_ms,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  round(max_exec_time::numeric, 2) AS max_ms
FROM pg_stat_monitor
WHERE bucket_start_time = (
  SELECT max(bucket_start_time) FROM pg_stat_monitor
)
ORDER BY total_exec_time DESC
LIMIT 10;

This is the equivalent of checking pg_stat_statements immediately after a reset — but without actually resetting anything, and without losing your historical data in the bargain. The bucket isolates recent activity automatically. No sacrifice required.

Regression detection: what got slower

The real power, however, lies in comparison. By joining the current bucket against the previous bucket on queryid, you can identify queries whose performance changed — and the specificity of that comparison is what makes this worth the shared memory:

SQL — detect regressions between buckets
-- Compare a query's performance across the two most recent buckets
-- A spike in mean_exec_time between buckets signals a regression
WITH recent_buckets AS (
  SELECT DISTINCT bucket_start_time
  FROM pg_stat_monitor
  ORDER BY bucket_start_time DESC
  LIMIT 2
),
current AS (
  SELECT queryid, query, calls, mean_exec_time, total_exec_time
  FROM pg_stat_monitor
  WHERE bucket_start_time = (SELECT max(bucket_start_time) FROM recent_buckets)
),
previous AS (
  SELECT queryid, calls, mean_exec_time, total_exec_time
  FROM pg_stat_monitor
  WHERE bucket_start_time = (SELECT min(bucket_start_time) FROM recent_buckets)
)
SELECT
  substr(c.query, 1, 60) AS query_preview,
  p.calls AS prev_calls,
  c.calls AS curr_calls,
  round(p.mean_exec_time::numeric, 2) AS prev_mean_ms,
  round(c.mean_exec_time::numeric, 2) AS curr_mean_ms,
  round(
    ((c.mean_exec_time - p.mean_exec_time) / NULLIF(p.mean_exec_time, 0)) * 100
  , 1) AS pct_change
FROM current c
JOIN previous p USING (queryid)
WHERE p.mean_exec_time > 0
  AND c.mean_exec_time > p.mean_exec_time * 1.5  -- 50%+ regression
ORDER BY (c.mean_exec_time - p.mean_exec_time) * c.calls DESC
LIMIT 10;

This query surfaces every query whose mean execution time increased by 50% or more between the two most recent buckets, sorted by total impact (time increase multiplied by call count). A query that went from 2ms to 5ms but runs 100,000 times per bucket appears higher than one that went from 100ms to 10s but runs once: the former adds roughly 3ms × 100,000 = 300 seconds of load to the bucket, the latter only about 9.9 seconds.

With pg_stat_statements, this analysis requires an external system that snapshots the view periodically and computes the deltas — infrastructure you must build, maintain, and trust to have been running before the problem started. With pg_stat_monitor, it is a single SQL query against the built-in view. The data was already being collected. It was waiting for you to ask.

Per-query plan capture: seeing the actual execution plan

Identifying a regression is step one. Understanding why the regression happened is step two — and the more consequential of the pair. This is where plan capture earns its place on the staff.

When pgsm_enable_query_plan is set to yes, the query_plan column contains the text representation of the execution plan PostgreSQL used. This is the same output you would get from EXPLAIN — but captured automatically for every execution, without anyone having to run EXPLAIN manually.

SQL — retrieve captured plans
-- Retrieve stored execution plans for the slowest queries
-- Requires: pgsm_enable_query_plan = yes
SELECT
  substr(query, 1, 60) AS query_preview,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  planid,
  query_plan
FROM pg_stat_monitor
WHERE query_plan IS NOT NULL
  AND bucket_start_time = (
    SELECT max(bucket_start_time) FROM pg_stat_monitor
  )
ORDER BY mean_exec_time DESC
LIMIT 5;

Detecting plan changes

One of the most common causes of sudden query regressions — and one of the most maddening to diagnose without the right tools — is a plan change. PostgreSQL decides to use a different execution strategy, often after an ANALYZE updates table statistics or after data volume crosses a threshold that changes the planner's cost estimates. The query text hasn't changed. The schema hasn't changed. But the plan has, and performance has gone with it.

The planid column in pg_stat_monitor is a hash of the execution plan. When the same queryid appears with different planid values, the query's execution strategy changed:

SQL — detect plan changes
-- Detect queries that changed execution plans between buckets
-- A new planid for the same queryid often explains sudden regressions
SELECT
  queryid,
  substr(query, 1, 60) AS query_preview,
  bucket_start_time,
  planid,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms
FROM pg_stat_monitor
WHERE queryid IN (
  SELECT queryid
  FROM pg_stat_monitor
  GROUP BY queryid
  HAVING count(DISTINCT planid) > 1
)
ORDER BY queryid, bucket_start_time DESC;

This is diagnostic information that pg_stat_statements simply cannot provide. It has no concept of plans — it groups by query text alone. If a query slows down because PostgreSQL switched from an index scan to a sequential scan after an ANALYZE, pg_stat_statements shows only the timing change — the symptom without the cause. pg_stat_monitor shows the timing change and the plan change, linking cause to effect. The difference between knowing that something went wrong and understanding why is the difference between helplessness and a fix.

For a deeper understanding of execution plans and how to read them, see the EXPLAIN ANALYZE guide.

Client and application attribution: who is responsible for this

pg_stat_statements groups statistics by user and database — which is rather like logging complaints by floor of the building without noting which room. pg_stat_monitor goes further: it records the client_ip and application_name for each query, allowing you to attribute database load to specific services, hosts, or connection pools.

SQL — load by client and application
-- Which applications and hosts are generating the most load?
SELECT
  client_ip,
  application_name,
  count(DISTINCT queryid) AS distinct_queries,
  sum(calls) AS total_calls,
  round(sum(total_exec_time)::numeric, 2) AS total_ms,
  round(avg(mean_exec_time)::numeric, 2) AS avg_mean_ms
FROM pg_stat_monitor
WHERE bucket_start_time = (
  SELECT max(bucket_start_time) FROM pg_stat_monitor
)
GROUP BY client_ip, application_name
ORDER BY total_ms DESC;

This answers questions that are difficult to answer with pg_stat_statements alone:

  • Which microservice is generating the most load? — group by application_name (assuming each service sets it in the connection string)
  • Is the regression coming from a specific host? — group by client_ip to see if a single application server is responsible
  • Did a particular deployment cause the regression? — if only one application host shows increased load after a deploy, the problem is likely in the code on that host, not the database

I should note: for this to be useful, your applications must set application_name in their connection strings. If they do not, you will see blank entries — and I will refrain from commenting on what that says about the household's discipline. Most PostgreSQL drivers support this:

  • Python (psycopg/asyncpg): application_name=myapp-api in the connection string
  • Node.js (pg): application_name in the connection config
  • Java (pgjdbc): ApplicationName=myapp-api in the JDBC URL
  • Go (pgx): application_name=myapp-api in the DSN
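If you are unsure whether a given service sets it, the check is trivial from any session, and a session-level fallback exists when a driver offers no DSN option. Both statements are standard PostgreSQL; the name myapp-api is illustrative:

```sql
-- What does the current session report? Blank means the driver set nothing.
SHOW application_name;

-- Set it at the session level if the driver offers no connection option
SET application_name = 'myapp-api';
```

Setting it in the connection string remains preferable: a SET must be issued on every connection, and pooled connections can lose it on reset.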

Histogram analysis: the mean is not telling you the whole truth

I have a particular grievance with mean execution time. It is not that the metric is useless — it is that it is selectively honest. A query with a mean of 10ms might be consistently 10ms — or it might be 1ms 99% of the time and 1,000ms the other 1%. Your users experiencing that 1% do not care about the mean. The operational impact of those two scenarios is profoundly different, and the mean cannot distinguish between them. It presents a tidy number and omits the chaos beneath.

pg_stat_monitor provides the resp_calls column — a PostgreSQL array that represents the response time histogram. Each element is the count of query executions whose response time fell within that histogram bucket.

SQL — response time histograms
-- Response time histogram for high-frequency queries
-- resp_calls is an array of call counts per histogram bucket
SELECT
  substr(query, 1, 60) AS query_preview,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  resp_calls
FROM pg_stat_monitor
WHERE calls > 100
  AND bucket_start_time = (
    SELECT max(bucket_start_time) FROM pg_stat_monitor
  )
ORDER BY calls DESC
LIMIT 5;

Reading a histogram
-- Example resp_calls output (20 histogram buckets, 1ms-10s range):
-- {0, 45, 312, 89, 12, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0}
--  ^                                                              ^
--  |                                                              |
--  <1ms                                                           ~9.5-10s
--
-- This query ran 462 times. 312 calls (67%) fell in the 3rd bucket,
-- but one call landed in bucket 19 — nearly 10 seconds.
-- A mean averages those together, revealing neither the typical
-- call nor the outlier; only the histogram shows both.

The histogram reveals the patterns that a well-mannered average would prefer you not see:

  • Bimodal distribution — most calls are fast, but a minority are very slow. This often indicates lock contention, buffer cache misses, or plan instability.
  • Long tail — the vast majority of calls cluster in the fast buckets, but occasional outliers reach the high end. This affects p99 latency and may indicate periodic table bloat or autovacuum interference.
  • Uniform spread — calls are distributed across multiple buckets. This is unusual and may indicate that the query's performance depends heavily on the specific parameter values.

To be fair, pg_stat_statements provides min_exec_time, max_exec_time, mean_exec_time, and stddev_exec_time — enough to calculate a rough spread. But the histogram shows the full shape of the distribution. The difference is the difference between being told "the average temperature in this room is comfortable" and being shown that one corner is freezing and another is on fire.
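When eyeballing the raw array grows tiresome, it can be expanded into one row per histogram slot with standard array functions. A sketch, assuming resp_calls is an integer array as described above (if your build exposes it as text, cast it first):

```sql
-- Expand resp_calls into one row per histogram slot, with its share of calls
SELECT
  substr(query, 1, 40) AS query_preview,
  slot,
  resp_calls[slot] AS calls_in_slot,
  round(100.0 * resp_calls[slot] / NULLIF(calls, 0), 1) AS pct_of_calls
FROM pg_stat_monitor,
     LATERAL generate_subscripts(resp_calls, 1) AS slot
WHERE calls > 100
  AND resp_calls[slot] > 0
ORDER BY query_preview, slot;
```

The row-per-slot form feeds naturally into dashboards or a percentile estimate, which the raw array does not.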

Additional monitoring capabilities

Error tracking

The elevel and message columns track queries that generated errors or warnings — a feature I find quietly indispensable. This lets you correlate error frequency with query patterns, useful for catching issues like constraint violations that spike after a code deployment:

SQL — error tracking
-- Queries that generated errors in the current bucket
SELECT
  substr(query, 1, 60) AS query_preview,
  elevel,
  message,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms
FROM pg_stat_monitor
WHERE elevel > 0
  AND bucket_start_time = (
    SELECT max(bucket_start_time) FROM pg_stat_monitor
  )
ORDER BY calls DESC;

Command type breakdown

The cmd_type column categorizes each query as SELECT, INSERT, UPDATE, DELETE, or UTILITY — a quick inventory of what your database is actually spending its time on. This gives you a read/write ratio breakdown and helps identify whether load shifts are coming from read or write traffic:

SQL — command type breakdown
-- Break down load by command type
SELECT
  cmd_type,
  count(DISTINCT queryid) AS distinct_queries,
  sum(calls) AS total_calls,
  round(sum(total_exec_time)::numeric, 2) AS total_ms,
  round(avg(mean_exec_time)::numeric, 2) AS avg_mean_ms
FROM pg_stat_monitor
WHERE bucket_start_time = (
  SELECT max(bucket_start_time) FROM pg_stat_monitor
)
GROUP BY cmd_type
ORDER BY total_ms DESC;

The proper procedure: deploy, monitor, detect, fix

If you'll follow me, I should like to walk you through the complete procedure. The individual queries above are useful, but they are most powerful as part of a systematic workflow — a protocol, if you will, for maintaining the household's standards around code deployments, which are the event most likely to introduce query regressions:

The deploy-monitor-fix cycle
-- A complete workflow: deploy, monitor, detect, investigate

-- Step 1: Before deploying, note the current bucket baseline
SELECT max(bucket_start_time) AS pre_deploy_bucket FROM pg_stat_monitor;

-- Step 2: Deploy the code change

-- Step 3: Wait for one full bucket to fill (pgsm_bucket_time seconds)

-- Step 4: Compare post-deploy bucket to pre-deploy bucket
-- (use the regression detection query from the previous section)

-- Step 5: If a regression is found, enable plan capture
ALTER SYSTEM SET pg_stat_monitor.pgsm_enable_query_plan = 'yes';
SELECT pg_reload_conf();

-- Step 6: Wait for the next bucket to fill with plans attached

-- Step 7: Examine the plan for the regressed query
SELECT query, query_plan
FROM pg_stat_monitor
WHERE queryid = '<regressed_queryid>'
  AND query_plan IS NOT NULL
ORDER BY bucket_start_time DESC
LIMIT 1;

-- Step 8: Fix the query or add the missing index

-- Step 9: Disable plan capture
ALTER SYSTEM SET pg_stat_monitor.pgsm_enable_query_plan = 'no';
SELECT pg_reload_conf();

The key insight — and it is one I consider non-negotiable for any team deploying to production regularly — is that pg_stat_monitor turns regression detection from a reactive process ("something is slow, let me check the database") into a proactive one ("I deployed code, let me verify the queries are still in good order"). With 60-second buckets, you can detect a regression within minutes of deploying. Before users report it. Before the on-call page. Before the incident channel fills with questions no one can answer yet.

The complete investigation sequence, step by step

  1. Detect — use the regression detection query to identify queries whose performance changed between buckets
  2. Attribute — use client attribution to confirm which application or host is responsible for the changed query pattern
  3. Diagnose — enable plan capture, wait for the next bucket, and examine the execution plan for the regressed query
  4. Check for plan changes — use the planid comparison to see if PostgreSQL changed execution strategies
  5. Fix — add the missing index, update statistics with ANALYZE, or fix the application code
  6. Verify — wait for the next bucket and confirm the regression is resolved

Each step is a SQL query against the same pg_stat_monitor view. No external tools required, no log parsing, no manual EXPLAIN runs. The data is already there, collected quietly while you were attending to other matters. A well-run household anticipates what will be needed.
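Step 6 — verification — is the same bucket view queried once more. A sketch, with '<regressed_queryid>' standing in for the identifier found in step 1:

```sql
-- Track one query's mean across recent buckets to confirm the fix took hold
SELECT
  bucket_start_time,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms
FROM pg_stat_monitor
WHERE queryid = '<regressed_queryid>'
ORDER BY bucket_start_time DESC
LIMIT 5;
```

A mean that returns to its pre-deploy level in the newest bucket, and stays there for the next one, closes the incident properly.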

A proper comparison: pg_stat_monitor vs. pg_stat_statements

Both extensions intercept the same PostgreSQL hooks and track the same core metrics. The difference is in what additional data they capture and how they organize it. I have prepared an honest summary. Percona also maintains a detailed comparison in their documentation.

Feature                              pg_stat_monitor   pg_stat_statements
Time-bucketed statistics             Yes               No
Cumulative lifetime stats            No                Yes
Execution plan capture               Yes               No
Plan change detection (planid)       Yes               No
Client IP tracking                   Yes               No
Application name grouping            Yes               No
Response time histogram              Yes               No
Error/warning tracking               Yes               No
Command type (SELECT/INSERT/...)     Yes               No
Included in PostgreSQL contrib       No                Yes
Available on all managed providers   No                Yes
Lower memory overhead                No                Yes
Rows/shared blocks per query         Yes               Yes
Min/max/mean/stddev timing           Yes               Yes
WAL statistics                       Yes               Yes

When pg_stat_statements alone will serve you well

  • You need the lightest possible monitoring overhead
  • You are on a managed provider that does not offer pg_stat_monitor (RDS, Aurora, Cloud SQL, Azure)
  • Your monitoring tools (pganalyze, Datadog) integrate with pg_stat_statements and you do not need time-bucketed analysis
  • Your workload is simple enough that "top 10 queries by total time" is sufficient

When pg_stat_monitor becomes essential

  • You need to detect regressions quickly after deployments
  • You want to understand latency distribution, not just averages
  • You need to attribute database load to specific applications or hosts
  • You want automatic execution plan capture without running EXPLAIN manually
  • You need to detect and correlate plan changes with performance regressions
  • You are running self-managed PostgreSQL or Percona Distribution for PostgreSQL

Cloud availability — an honest accounting

I'm afraid the news here is mixed. Because pg_stat_monitor is a third-party extension (not part of PostgreSQL's contrib modules), availability on managed platforms is limited:

Provider                               Status
Self-managed PostgreSQL                Fully supported — install from Percona repos or source
Percona Distribution for PostgreSQL    Included and supported
Supabase                               Available via the extensions dashboard
Amazon RDS / Aurora                    Not available
Google Cloud SQL                       Not available
Azure Database for PostgreSQL          Not available
Neon                                   Not available

If your provider does not support pg_stat_monitor, the situation is not hopeless — merely less convenient. You can snapshot pg_stat_statements periodically for time-based comparison, use auto_explain for plan capture (with log-based collection), and check pg_stat_activity for client attribution. It requires more effort and more infrastructure, but the information is available through other channels. One makes do with the staff one has.
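For the first of those workarounds, a minimal sketch of snapshot-and-delta against pg_stat_statements. The pgss_snapshots table and its schedule are assumptions you would adapt (run the INSERT from cron, pg_cron, or your scheduler of choice, every few minutes):

```sql
-- One-time setup: a table shaped like the columns we want to track
CREATE TABLE IF NOT EXISTS pgss_snapshots AS
SELECT now() AS snapped_at, queryid, query, calls, total_exec_time
FROM pg_stat_statements
WITH NO DATA;

-- Run periodically: append a snapshot of current cumulative stats
INSERT INTO pgss_snapshots
SELECT now(), queryid, query, calls, total_exec_time
FROM pg_stat_statements;

-- Delta between the two most recent snapshots: calls and time added
-- (note: queryid alone can collide across users/databases; a production
-- version would join on userid and dbid as well)
WITH two AS (
  SELECT DISTINCT snapped_at FROM pgss_snapshots
  ORDER BY snapped_at DESC LIMIT 2
)
SELECT
  substr(curr.query, 1, 60) AS query_preview,
  curr.calls - prev.calls AS calls_delta,
  round((curr.total_exec_time - prev.total_exec_time)::numeric, 2) AS ms_delta
FROM pgss_snapshots curr
JOIN pgss_snapshots prev USING (queryid)
WHERE curr.snapped_at = (SELECT max(snapped_at) FROM two)
  AND prev.snapped_at = (SELECT min(snapped_at) FROM two)
ORDER BY ms_delta DESC
LIMIT 10;
```

This reproduces the bucket comparison in spirit, though with the caveat noted earlier: it only helps if the snapshots were being taken before the problem started.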

How Gold Lapel approaches the same problem

I would be remiss if I did not mention that Gold Lapel addresses several of these same concerns from a different vantage point. Gold Lapel monitors every query at the proxy level — normalizing query text, tracking execution statistics across configurable time windows, and acting on what it finds. Time-bucketed analysis, client attribution, and regression detection are built into the proxy itself, without requiring an extension installed inside PostgreSQL.

There are advantages to both approaches. pg_stat_monitor observes from inside the database engine — it sees planning time, internal background operations, and queries from any client regardless of whether they pass through a proxy. Gold Lapel observes from the network layer — it sees application-level patterns like N+1 queries, connection pool behavior, and the gap between what the application intended and what the database received.

The two perspectives are complementary. pg_stat_monitor tells you what happened inside PostgreSQL. Gold Lapel tells you what happened in the space between the application and PostgreSQL — and acts on it. If you can run both, you have two independent sources of truth about the same workload, which is a healthy position to be in when diagnosing production issues at inconvenient hours.

For teams on managed providers where pg_stat_monitor is not available, Gold Lapel's proxy-level monitoring provides the time-bucketed analysis, regression detection, and client attribution that would otherwise require the extension — no database-side installation or restart needed. But I should be clear: everything in this guide works on its own, without Gold Lapel, and is worth implementing regardless. Good monitoring is good monitoring. The database was never the problem. It was the visibility.

Frequently asked questions