Monitoring Integration

Standard Prometheus metrics. Ready for Grafana, Datadog, and anything that speaks the format.

Prometheus /metrics endpoint

Gold Lapel exposes a standard Prometheus text format endpoint at /metrics on the dashboard port (default http://127.0.0.1:7933/metrics). If Prometheus can reach the endpoint, it can scrape Gold Lapel. The same applies to Datadog Agent, Grafana Alloy, Victoria Metrics, or any tool that speaks Prometheus text format.

GET /metrics (excerpt)
# HELP goldlapel_time_saved_seconds_total Cumulative time saved by optimizations
# TYPE goldlapel_time_saved_seconds_total counter
goldlapel_time_saved_seconds_total 345892.471

# HELP goldlapel_queries_observed_total Total queries observed
# TYPE goldlapel_queries_observed_total counter
goldlapel_queries_observed_total 2847193

# HELP goldlapel_queries_rewritten_total Total queries rewritten
# TYPE goldlapel_queries_rewritten_total counter
goldlapel_queries_rewritten_total 1923847

# HELP goldlapel_cache_hits_total Total result cache hits
# TYPE goldlapel_cache_hits_total counter
goldlapel_cache_hits_total 4129381

# HELP goldlapel_active_connections Active client connections
# TYPE goldlapel_active_connections gauge
goldlapel_active_connections 42

# HELP goldlapel_info Gold Lapel version info
# TYPE goldlapel_info gauge
goldlapel_info{version="0.14.0"} 1

The endpoint returns text/plain; version=0.0.4; charset=utf-8 — the standard Prometheus exposition format. Every metric includes # HELP and # TYPE annotations, so your monitoring tool can auto-discover descriptions and types without additional configuration.
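The format is also easy to consume ad hoc, without a client library. As an illustration (plain Python, not something Gold Lapel ships), a minimal parser that turns a payload like the excerpt above into name/value pairs:

```python
import re

def parse_exposition(text):
    """Parse Prometheus text exposition format into {series: float}.

    Handles bare samples and samples with labels; # HELP and # TYPE
    comment lines carry metadata, not samples, so they are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # metric name, optional {label="value"} block, then the sample value
        m = re.match(r'^([a-zA-Z_:][a-zA-Z0-9_:]*(?:\{[^}]*\})?)\s+(\S+)', line)
        if m:
            samples[m.group(1)] = float(m.group(2))
    return samples

payload = """\
# HELP goldlapel_queries_observed_total Total queries observed
# TYPE goldlapel_queries_observed_total counter
goldlapel_queries_observed_total 2847193
goldlapel_info{version="0.14.0"} 1
"""
print(parse_exposition(payload))
```

This is a quick-inspection sketch only; real scrapers (Prometheus, vmagent, Telegraf) implement the full specification, including escaping and timestamps.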

The metrics endpoint is part of the dashboard server and shares its port. If the dashboard is running — and it is by default — you already have the metrics endpoint.

Scrape configuration

Add Gold Lapel to your Prometheus scrape configuration:

# prometheus.yml
scrape_configs:
  - job_name: goldlapel
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:7933']
    metrics_path: /metrics

A 15-second scrape interval works well. Gold Lapel's metrics endpoint is lightweight — it reads from in-memory counters and returns in under a millisecond. You will not notice the overhead.

For multiple Gold Lapel instances, add each target to the static_configs list, or use service discovery if your infrastructure supports it. All instances expose identical metric names, so Prometheus can aggregate them naturally with label selectors.
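For instance, fleet-wide ratios sum the per-instance counters before dividing, and a by (instance) clause recovers the per-instance view:

```promql
# Fleet-wide rewrite ratio across all scraped instances
sum(rate(goldlapel_queries_rewritten_total[5m]))
  / sum(rate(goldlapel_queries_observed_total[5m]))

# Per-instance query throughput
sum by (instance) (rate(goldlapel_queries_observed_total[5m]))
```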

Metrics reference

Gold Lapel exports approximately 35 metrics across five categories. All counters are monotonically increasing totals. Gauges reflect the current value at scrape time.

Headline

Metric                              Type     Description
goldlapel_time_saved_seconds_total  counter  Cumulative wall-clock time saved by all optimizations

This is the metric that answers "is Gold Lapel helping?" For the live view and per-pattern breakdown, see Time saved on the dashboard page.
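Like any Prometheus counter, it is best queried with rate() or increase() rather than graphed as a raw cumulative value. For example:

```promql
# Hours of query time saved per day
increase(goldlapel_time_saved_seconds_total[1d]) / 3600

# Instantaneous savings rate: seconds of query time saved per second of wall clock
rate(goldlapel_time_saved_seconds_total[5m])
```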

Core query metrics

Metric                             Type     Description
goldlapel_queries_observed_total   counter  Total queries observed through the proxy
goldlapel_queries_rewritten_total  counter  Total queries rewritten to use materialized views
goldlapel_active_connections       gauge    Current active client connections
goldlapel_uptime_seconds           gauge    Proxy uptime in seconds
goldlapel_info                     gauge    Version info label (goldlapel_info{version="0.14.0"} 1)

Strategy counters

Metric                                      Type     Description
goldlapel_matviews_created_total            counter  Materialized views created
goldlapel_immvs_created_total               counter  Incremental materialized views created (pg_ivm)
goldlapel_matviews_routed_total             counter  Queries routed to materialized views
goldlapel_matviews_expanded_total           counter  Matview expansions (columns added to existing views)
goldlapel_rewrites_total                    counter  Total query rewrites
goldlapel_btree_indexes_created_total       counter  B-tree indexes created
goldlapel_trigram_indexes_created_total     counter  Trigram GIN indexes created
goldlapel_expression_indexes_created_total  counter  Expression indexes created
goldlapel_partial_indexes_created_total     counter  Partial indexes created
goldlapel_matview_indexes_created_total     counter  Indexes on materialized views created
goldlapel_prepared_hits_total               counter  Prepared statement cache hits
goldlapel_prepared_misses_total             counter  Prepared statement cache misses
goldlapel_cache_hits_total                  counter  Result cache hits
goldlapel_cache_misses_total                counter  Result cache misses
goldlapel_coalesced_total                   counter  Queries coalesced (deduplicated in-flight requests)
goldlapel_shadow_passes_total               counter  Shadow verification passes
goldlapel_shadow_failures_total             counter  Shadow verification failures
goldlapel_deep_pagination_warnings_total    counter  Deep pagination warnings issued

Connection pool

Pool metrics are present when connection pooling is active.

Metric                         Type     Description
goldlapel_pool_hits_total      counter  Connections served from idle pool
goldlapel_pool_misses_total    counter  Connections that required a new upstream connection
goldlapel_pool_timeouts_total  counter  Connection acquire timeouts
goldlapel_pool_pins_total      counter  Connections pinned (session state detected)
goldlapel_pool_active          gauge    Currently active pool connections
goldlapel_pool_idle            gauge    Currently idle pool connections
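Two derived expressions worth graphing from these series:

```promql
# Pool saturation: fraction of pooled upstream connections currently in use
goldlapel_pool_active / (goldlapel_pool_active + goldlapel_pool_idle)

# Pool miss ratio: share of acquisitions that needed a new upstream connection
rate(goldlapel_pool_misses_total[5m])
  / (rate(goldlapel_pool_hits_total[5m]) + rate(goldlapel_pool_misses_total[5m]))
```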

Replica routing

Replica metrics are present when read replicas are configured.

Metric                         Type     Description
goldlapel_replica_reads_total  counter  Reads routed to a replica
goldlapel_primary_reads_total  counter  Reads routed to the primary (read-after-write safety)
goldlapel_writes_total         counter  Writes routed to the primary
goldlapel_dirty_reads_total    counter  Dirty reads sent to primary due to replication lag
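A single ratio summarizes how well read offloading is working:

```promql
# Share of reads served by replicas; the remainder went to the primary
# for read-after-write safety
rate(goldlapel_replica_reads_total[5m])
  / (rate(goldlapel_replica_reads_total[5m]) + rate(goldlapel_primary_reads_total[5m]))
```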

Grafana setup

With Prometheus scraping Gold Lapel, adding a Grafana dashboard is straightforward. The following JSON provides a starting point with the five panels that matter most: time saved, rewrite rate, cache hit rate, active connections, and pool utilization.

Grafana dashboard JSON
{
  "dashboard": {
    "title": "Gold Lapel",
    "panels": [
      {
        "title": "Time Saved",
        "type": "stat",
        "targets": [{
          "expr": "goldlapel_time_saved_seconds_total",
          "legendFormat": "Total seconds saved"
        }],
        "fieldConfig": {
          "defaults": { "unit": "s" }
        }
      },
      {
        "title": "Query Rewrite Rate",
        "type": "gauge",
        "targets": [{
          "expr": "rate(goldlapel_queries_rewritten_total[5m]) / rate(goldlapel_queries_observed_total[5m]) * 100"
        }],
        "fieldConfig": {
          "defaults": { "unit": "percent", "max": 100 }
        }
      },
      {
        "title": "Cache Hit Rate",
        "type": "timeseries",
        "targets": [{
          "expr": "rate(goldlapel_cache_hits_total[5m]) / (rate(goldlapel_cache_hits_total[5m]) + rate(goldlapel_cache_misses_total[5m])) * 100",
          "legendFormat": "Cache hit %"
        }]
      },
      {
        "title": "Active Connections",
        "type": "timeseries",
        "targets": [{
          "expr": "goldlapel_active_connections",
          "legendFormat": "Active"
        }]
      },
      {
        "title": "Pool Utilization",
        "type": "timeseries",
        "targets": [
          { "expr": "goldlapel_pool_active", "legendFormat": "Active" },
          { "expr": "goldlapel_pool_idle", "legendFormat": "Idle" }
        ]
      }
    ]
  }
}

Import this via Grafana's dashboard import (Dashboards > Import > paste JSON). Adjust the Prometheus data source name if yours differs from the default.

Panels worth adding

The starter dashboard covers the essentials. As you get familiar with the metrics, consider adding:

  • Matview creation rate — rate(goldlapel_matviews_created_total[1h]) shows how actively Gold Lapel is finding new optimization opportunities. A burst early on that tapers off is normal — it means the major patterns have been covered.
  • Shadow verification — goldlapel_shadow_passes_total vs goldlapel_shadow_failures_total. Failures are not cause for alarm — they mean Gold Lapel caught a matview whose results did not match the original query and correctly declined to route traffic to it.
  • Prepared statement hit rate — rate(goldlapel_prepared_hits_total[5m]) / (rate(goldlapel_prepared_hits_total[5m]) + rate(goldlapel_prepared_misses_total[5m])). High hit rates indicate the proxy's prepared statement cache is working well for your query patterns.
  • Replica read distribution — if you use read replicas, goldlapel_replica_reads_total vs goldlapel_primary_reads_total shows how effectively reads are being distributed.
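If you reuse these ratio expressions across panels and alerts, Prometheus recording rules avoid repeating them. A sketch (the rule names below are suggestions, not something Gold Lapel ships):

```yaml
# recording-rules.yml, loaded via rule_files in prometheus.yml
groups:
  - name: goldlapel_ratios
    rules:
      - record: goldlapel:cache_hit_ratio:rate5m
        expr: >
          rate(goldlapel_cache_hits_total[5m])
          / (rate(goldlapel_cache_hits_total[5m]) + rate(goldlapel_cache_misses_total[5m]))
      - record: goldlapel:prepared_hit_ratio:rate5m
        expr: >
          rate(goldlapel_prepared_hits_total[5m])
          / (rate(goldlapel_prepared_hits_total[5m]) + rate(goldlapel_prepared_misses_total[5m]))
```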

Alerting

Gold Lapel is designed to fail safely — if something goes wrong, it falls back to proxying queries directly to PostgreSQL. That said, you will want to know if it goes down or if something unusual is happening. Here are alert rules for the conditions worth watching:

Prometheus alerting rules
# Prometheus alert rules for Gold Lapel
groups:
  - name: goldlapel
    rules:
      - alert: GoldLapelDown
        expr: up{job="goldlapel"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Gold Lapel instance is down"

      - alert: CacheHitRateLow
        expr: >
          rate(goldlapel_cache_hits_total[10m])
          / (rate(goldlapel_cache_hits_total[10m]) + rate(goldlapel_cache_misses_total[10m]))
          < 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Cache hit rate below 50% for 5 minutes"

      - alert: PoolTimeouts
        expr: rate(goldlapel_pool_timeouts_total[5m]) > 0
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Connection pool timeouts detected"

      - alert: ShadowVerificationFailures
        expr: increase(goldlapel_shadow_failures_total[1h]) > 0
        labels:
          severity: info
        annotations:
          summary: "Shadow verification failure in the last hour"

A note on the cache hit rate alert: a sustained rate below 50% usually means the workload is write-heavy or the query patterns are highly varied. Neither is a problem — it simply means caching is less effective for this particular workload. Investigate before acting.

The PoolTimeouts alert deserves attention when it fires. Pool timeouts mean clients are waiting for a database connection and giving up. The fix is usually increasing pool_size or investigating whether long-running transactions are holding connections open.

Datadog & other monitoring tools

Any monitoring tool that can scrape a Prometheus endpoint will work with Gold Lapel without modification. The /metrics endpoint speaks the standard exposition format — no proprietary protocol, no agent required beyond what your monitoring stack already provides.

Datadog

Datadog's Agent includes an OpenMetrics check that scrapes Prometheus endpoints natively. Add a configuration file for the check:

conf.d/openmetrics.d/conf.yaml
# conf.d/openmetrics.d/conf.yaml — Datadog Agent OpenMetrics check
instances:
  - openmetrics_endpoint: http://localhost:7933/metrics
    namespace: goldlapel
    metrics:
      - goldlapel_time_saved_seconds_total
      - goldlapel_queries_observed_total
      - goldlapel_queries_rewritten_total
      - goldlapel_cache_hits_total
      - goldlapel_cache_misses_total
      - goldlapel_active_connections
      - goldlapel_pool_active
      - goldlapel_pool_idle
      - goldlapel_pool_timeouts_total
      - goldlapel_shadow_passes_total
      - goldlapel_shadow_failures_total

The Agent will collect all listed metrics and report them to Datadog with the goldlapel. namespace prefix. From there, build dashboards and monitors using Datadog's standard tools.

Other tools

The following tools have been confirmed to work with Gold Lapel's /metrics endpoint out of the box:

  • Grafana Alloy / Grafana Agent — configure a prometheus.scrape component pointed at the metrics endpoint.
  • Victoria Metrics — use vmagent with a standard Prometheus scrape config. Victoria Metrics is a drop-in replacement for Prometheus storage and works identically.
  • New Relic — the Prometheus OpenMetrics integration or the Infrastructure agent's Flex integration can scrape the endpoint. Metrics appear in New Relic One with the goldlapel_ prefix.
  • Elastic / Metricbeat — the prometheus module in Metricbeat scrapes the endpoint and ships metrics to Elasticsearch.
  • InfluxDB / Telegraf — Telegraf's inputs.prometheus plugin scrapes the endpoint and writes to InfluxDB.

If your monitoring tool speaks Prometheus text format — and at this point, nearly all of them do — it will work. No adapter required.