HypoPG in CI/CD: Automated Index Testing for PostgreSQL
Every migration that adds an index, automatically tested for impact before it merges.
The problem — deploying indexes blind
I have observed, in a great many production systems, a pattern that deserves frank discussion. Most teams deploy indexes based on intuition. Someone notices a slow query, examines the query plan, picks a column that appears in the WHERE clause, and adds an index. Perhaps they run EXPLAIN ANALYZE in a staging environment first. Perhaps they do not.
The index lands in production. If it helps, nobody notices — the slow query disappears and life continues. If it hurts, the team discovers it hours or days later through elevated CPU, increased replication lag, or user complaints. If it simply does nothing — the planner ignores it in favor of a better plan — it sits on disk consuming space and slowing writes indefinitely, because nobody goes back to check.
This gap between "proposed an index" and "verified it helps" is the problem. Indexes are not free. On large tables, CREATE INDEX acquires locks, consumes I/O, and can take minutes to hours. Even CREATE INDEX CONCURRENTLY generates sustained write amplification. Every index that exists adds overhead to every INSERT, UPDATE, and DELETE on the table. Deploying a useless or harmful index carries a cost that accrues quietly.
HypoPG addresses this gap with considerable elegance. It creates hypothetical indexes that exist only in the planner's memory. No disk writes, no locks, no maintenance overhead. The PostgreSQL planner evaluates them as if they were real and reports whether it would use them and what cost reduction it estimates.
Running HypoPG interactively is valuable. Integrating it into CI/CD is transformative. Every migration that adds an index is automatically tested for impact before it merges. The pull request includes a report showing exactly which queries improved, which did not, and whether the planner would actually use the proposed index. Deploying a useless index becomes a conscious decision, not an accident.
HypoPG fundamentals — a brief refresher
For those already acquainted with HypoPG, this is a brief refresher. For a comprehensive introduction, see the HypoPG guide.
Create a hypothetical index:
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders (customer_id)');

 indexrelid |            indexname
------------+----------------------------------
     205401 | <205401>btree_orders_customer_id

The hypothetical index is visible only to the current session. It does not touch disk, does not acquire locks, and disappears when the session ends.
Check whether the planner would use the hypothetical index:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

                                QUERY PLAN
---------------------------------------------------------------------------
 Index Scan using <205401>btree_orders_customer_id on orders
   (cost=0.04..8.06 rows=1 width=48)
   Index Cond: (customer_id = 42)

An important note: use EXPLAIN, not EXPLAIN ANALYZE. Hypothetical indexes cannot be used for actual execution — they exist only in the planner's cost estimation. For a deeper understanding of reading plan output, see the EXPLAIN output guide.
List and clean up hypothetical indexes:
SELECT * FROM hypopg_list_indexes();

SELECT hypopg_drop_index(indexrelid) FROM hypopg_list_indexes();

The CI/CD integration architecture
What you need
Let me walk through the arrangement. The integration requires four components:
A PostgreSQL instance accessible from CI. Options include a Docker container spun up in the pipeline (simplest), a dedicated test database on a shared server, or a Neon branch created per pipeline run. The instance must have the current schema loaded — without the proposed index.
HypoPG installed on that instance. HypoPG requires no special permissions beyond CREATE EXTENSION and adds no overhead when not actively in use.
A set of representative queries. These are the queries the proposed index is meant to improve, plus other critical queries that could be affected. They can live in a dedicated file (e.g., queries/critical_queries.sql) or be extracted from your ORM's generated SQL.
A testing script. A script that creates hypothetical indexes, runs EXPLAIN on each query, compares costs, and reports results.
The testing flow
The pipeline follows a straightforward sequence:
- A migration file adds CREATE INDEX ... — this is the change being tested.
- The CI pipeline starts a PostgreSQL instance and loads the current schema (without the new index).
- The testing script parses the migration to extract the index definition.
- The script runs EXPLAIN on the critical query set — capturing baseline estimated costs.
- The script creates hypothetical indexes matching the migration's index definitions via HypoPG.
- The script runs EXPLAIN again on the same queries — capturing estimated costs with hypothetical indexes.
- The script compares costs and generates a report: which queries improved, by how much, and whether the planner would actually use the index.
- The report is posted as a PR comment or saved as a pipeline artifact.
The entire process adds seconds to the pipeline. Seconds. HypoPG creates hypothetical indexes in microseconds, and EXPLAIN (without ANALYZE) returns instantly because it only estimates costs without executing the query.
GitHub Actions implementation
The workflow file
This workflow triggers on pull requests that modify migration files. It spins up a PostgreSQL container with HypoPG, applies the current schema, and runs the index testing script.
# .github/workflows/index-test.yml
name: Index Impact Test

on:
  pull_request:
    paths:
      - 'migrations/**'
      - 'db/migrate/**'

jobs:
  test-indexes:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_DB: index_test
          POSTGRES_USER: ci_user
          POSTGRES_PASSWORD: ci_password
        ports:
          - 5432:5432
        options: >-
          --health-cmd="pg_isready -U ci_user -d index_test"
          --health-interval=10s
          --health-timeout=5s
          --health-retries=5

    env:
      DATABASE_URL: postgresql://ci_user:ci_password@localhost:5432/index_test

    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Install HypoPG
        run: |
          # Install the extension inside the service container — packages
          # installed on the runner host are not visible to the postgres:16 image
          docker exec ${{ job.services.postgres.id }} bash -c \
            "apt-get update && apt-get install -y postgresql-16-hypopg"
          psql "$DATABASE_URL" -c "CREATE EXTENSION IF NOT EXISTS hypopg;"

      - name: Apply current schema
        run: |
          # Apply all existing migrations (adjust for your migration tool)
          for f in migrations/*.sql; do
            psql "$DATABASE_URL" -f "$f"
          done

      - name: Identify new index migrations
        id: find-indexes
        run: |
          # Flatten to one space-separated line so it survives GITHUB_OUTPUT
          CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD -- 'migrations/*.sql' 'db/migrate/*.rb' | tr '\n' ' ')
          echo "changed_files=$CHANGED_FILES" >> "$GITHUB_OUTPUT"

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: pip install psycopg2-binary

      - name: Run index impact test
        id: index-test
        run: |
          python scripts/test_index_impact.py \
            --database-url "$DATABASE_URL" \
            --migration-files ${{ steps.find-indexes.outputs.changed_files }} \
            --query-file queries/critical_queries.sql \
            --output-format markdown \
            --output-file index_report.md

      - name: Post results as PR comment
        if: always()
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const report = fs.readFileSync('index_report.md', 'utf8');
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: report
            });

      - name: Fail on regression
        run: |
          python scripts/test_index_impact.py \
            --database-url "$DATABASE_URL" \
            --migration-files ${{ steps.find-indexes.outputs.changed_files }} \
            --query-file queries/critical_queries.sql \
            --check-regression \
            --regression-threshold 10

The path filter (paths: migrations/**, db/migrate/**) ensures the workflow only runs when migration files change.
The index testing script
This Python script is the core of the pipeline. It accepts a migration file, extracts CREATE INDEX statements, and compares EXPLAIN costs with and without the hypothetical index.
#!/usr/bin/env python3
"""
test_index_impact.py — Tests index impact via HypoPG.
Compares EXPLAIN costs with and without hypothetical indexes.
"""
import argparse
import json
import re
import sys

import psycopg2


def extract_index_statements(migration_file):
    """Pull every CREATE INDEX statement out of a migration file."""
    with open(migration_file, "r") as f:
        content = f.read()
    pattern = (
        r"CREATE\s+(?:UNIQUE\s+)?INDEX\s+(?:CONCURRENTLY\s+)?"
        r"(?:IF\s+NOT\s+EXISTS\s+)?\S+\s+ON\s+[^;]+;"
    )
    return re.findall(pattern, content, re.IGNORECASE | re.MULTILINE)


def get_explain_cost(cursor, query):
    """Run EXPLAIN (FORMAT JSON) and return the top node's cost and type."""
    # Crude placeholder substitution so parameterized queries can be planned
    explain_query = query.replace("$1", "'1'").replace("$2", "'1'")
    cursor.execute(f"EXPLAIN (FORMAT JSON) {explain_query}")
    plan = cursor.fetchone()[0]
    top_node = plan[0]["Plan"]
    return {
        "total_cost": top_node["Total Cost"],
        "node_type": top_node["Node Type"],
        "plan_text": json.dumps(plan, indent=2),
    }


def compare_costs(baseline, with_index):
    """Compare baseline vs. hypothetical-index estimates for one query."""
    if baseline["total_cost"] == 0:
        delta_pct = 0
    else:
        delta_pct = ((with_index["total_cost"] - baseline["total_cost"])
                     / baseline["total_cost"]) * 100
    return {
        "baseline_cost": baseline["total_cost"],
        "hypo_cost": with_index["total_cost"],
        "delta_pct": delta_pct,
        "improvement": delta_pct < 0,
        "regression": delta_pct > 10,
        "uses_index": "Index" in with_index["node_type"],
        "baseline_node": baseline["node_type"],
        "hypo_node": with_index["node_type"],
    }

Save the script as scripts/test_index_impact.py in your repository and make it executable.
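The excerpt above extracts the statements but does not show how they reach hypopg_create_index. One detail worth handling: clauses such as CONCURRENTLY and IF NOT EXISTS are meaningless for a hypothetical index and may not be accepted, so the script can normalize them away first. A minimal sketch (the helper name is mine, not part of the script above):

```python
import re


def normalize_for_hypopg(create_index_stmt: str) -> str:
    """Strip clauses that make no sense for a hypothetical index
    before passing the statement to hypopg_create_index()."""
    stmt = re.sub(r"\bCONCURRENTLY\s+", "", create_index_stmt, flags=re.IGNORECASE)
    stmt = re.sub(r"\bIF\s+NOT\s+EXISTS\s+", "", stmt, flags=re.IGNORECASE)
    return stmt.strip()


# The script would then run, for each extracted statement:
#   cursor.execute("SELECT * FROM hypopg_create_index(%s)", (normalized,))
```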
Posting results as a PR comment
The workflow uses actions/github-script to post the markdown report directly as a PR comment. The report includes a table with query name, baseline cost, hypothetical cost, delta percentage, plan change description, and whether the planner uses the index. A summary line reports how many queries improved and the average cost change. Regression warnings flag any query whose cost increased above the threshold.
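The report generator itself is not shown in the script excerpt; a minimal sketch of one, assuming the result dictionaries produced by compare_costs keyed by query name, might look like this:

```python
def build_markdown_report(results: dict) -> str:
    """Render compare_costs() results as the markdown table described above."""
    lines = [
        "## Index Impact Report",
        "",
        "| Query | Baseline cost | Hypo cost | Delta | Plan change | Uses index |",
        "|---|---|---|---|---|---|",
    ]
    for name, r in results.items():
        plan_change = f"{r['baseline_node']} -> {r['hypo_node']}"
        uses = "yes" if r["uses_index"] else "no"
        lines.append(
            f"| {name} | {r['baseline_cost']:.2f} | {r['hypo_cost']:.2f} "
            f"| {r['delta_pct']:+.1f}% | {plan_change} | {uses} |"
        )
    improved = sum(1 for r in results.values() if r["improvement"])
    avg = sum(r["delta_pct"] for r in results.values()) / len(results)
    lines += ["", f"{improved}/{len(results)} queries improved; "
                  f"average cost change {avg:+.1f}%."]
    # Flag regressions so they stand out in the PR comment
    for name, r in results.items():
        if r["regression"]:
            lines.append(f"WARNING — regression: {name} ({r['delta_pct']:+.1f}%)")
    return "\n".join(lines)
```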
Maintaining the critical query set
Which queries to include
The critical query set is the list of queries tested against every proposed index. Choosing the right queries determines whether the pipeline catches real problems or produces noise — and I have strong views on the distinction.
Start with the queries that define your application's performance.
Top queries by total execution time. Query pg_stat_statements on your staging or production database:
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

These 20 queries represent the largest share of your database's work. Any index change that affects them — positively or negatively — matters.
Queries sensitive to index changes. Look for queries with WHERE clauses on indexed columns, JOINs on foreign keys, and ORDER BY on indexed columns.
Queries that have caused production incidents. If a query has been the subject of a performance investigation, it belongs in the critical set permanently.
A focused set of 10-30 queries is manageable and meaningful. Include an annotation describing the expected access pattern:
-- name: Orders by customer
-- expected: Index Scan on orders(customer_id)
SELECT * FROM orders WHERE customer_id = $1 ORDER BY created_at DESC LIMIT 20;
-- name: Recent high-value orders
-- expected: Index Scan on orders(total, created_at)
SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.total > 1000
  AND o.created_at > NOW() - INTERVAL '30 days';

Keeping queries current
The critical query set drifts as the application evolves. Review the critical query set quarterly, or when pg_stat_statements shows a new query in the top 20 by execution time. Version the query file alongside the application code. It is part of your project's performance contract — as fundamental as the schema itself. A contract that is not maintained is not a contract.
Integrating with pg_qualstats for index recommendations
pg_qualstats tracks which columns appear in WHERE clauses, JOIN conditions, and ORDER BY during actual query execution. The integration creates a closed loop:
- pg_qualstats identifies candidate indexes on staging or production.
- A developer writes a migration to add the recommended index.
- HypoPG in CI validates the candidate. The pipeline creates a hypothetical version and tests it against the critical query set.
- The migration deploys only if validated. If the planner ignores the hypothetical index or if it causes regressions, the PR is flagged for review.
This workflow prevents a failure mode I have seen more often than I would prefer: pg_qualstats or a similar tool recommends an index based on predicate frequency, a developer adds it without testing, and it turns out the planner prefers a different plan entirely.
For the detailed pg_qualstats recommendation workflow, see the pg_qualstats + HypoPG guide.
Regression detection — catching index changes that hurt
What regression looks like
I'm afraid this section contains unwelcome news for those who view indexes as purely beneficial. Adding an index does not always improve performance. In some cases, it makes specific queries slower.
Low-cardinality trap. Adding an index on orders(status) where status has 3 distinct values across millions of rows. The planner might choose an Index Scan when a Sequential Scan with a filter would be faster:
-- Before: Sequential Scan (fast for low selectivity)
EXPLAIN SELECT * FROM orders WHERE status = 'shipped';
-- Seq Scan on orders (cost=0.00..45000.00 rows=1650000 width=48)
-- Filter: (status = 'shipped')
-- After adding index on status: Index Scan (slower)
EXPLAIN SELECT * FROM orders WHERE status = 'shipped';
-- Index Scan using idx_orders_status on orders (cost=0.43..52000.00 rows=1650000 width=48)
-- Index Cond: (status = 'shipped')

The estimated cost increased. The planner chose the index, but the index is slower for this query because it returns too many rows.
For guidance on designing composite indexes and choosing the right index type, see the linked guides.
Automated regression flags
The CI script should flag any query where the estimated cost increases after the hypothetical index is added. Set a threshold — small cost fluctuations (under 10%) can be planner estimation noise:
# In the CI script
REGRESSION_THRESHOLD = 10.0  # percent

for result in results:
    if result["delta_pct"] > REGRESSION_THRESHOLD:
        print(f"REGRESSION: {result['query_name']} "
              f"cost increased {result['delta_pct']:.1f}%")

Block the PR on regression. The CI workflow's final step checks for regressions and exits with a non-zero code if any are found.
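The blocking behavior itself is a small piece of glue; a sketch of how the --check-regression path (the flag name used in the workflow) might end, assuming results keyed by query name:

```python
import sys


def check_regressions(results: dict, threshold: float = 10.0) -> list[str]:
    """Return the names of queries whose estimated cost rose above the threshold."""
    return [name for name, r in results.items() if r["delta_pct"] > threshold]


# In the --check-regression path:
#   regressions = check_regressions(results, threshold=args.regression_threshold)
#   sys.exit(1 if regressions else 0)
```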
For cases where the regression is acceptable — the index helps 9 queries and hurts 1 insignificant one — use a "regression acknowledged" mechanism:
-- regression-acknowledged: query "low-value status check" regresses 15%
-- reason: this query runs <100 times/day and the cost increase is marginal
CREATE INDEX idx_orders_customer_status ON orders (customer_id, status);

The key is making regressions visible and intentional, not silent. A known trade-off is a decision. An unknown trade-off is an accident waiting to present itself.
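The comment convention above is one I am proposing, not a standard; one way to honor it is to have the CI script collect acknowledged query names from the migration file and exclude them from the blocking check. A sketch:

```python
import re


def acknowledged_regressions(migration_sql: str) -> set:
    """Collect query names from '-- regression-acknowledged: query "..."' comments."""
    pattern = r'--\s*regression-acknowledged:\s*query\s+"([^"]+)"'
    return set(re.findall(pattern, migration_sql))


# The blocking check then skips acknowledged queries:
#   blocking = [name for name in regressions
#               if name not in acknowledged_regressions(migration_text)]
```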
Beyond indexes — testing other schema changes
The EXPLAIN comparison technique is not limited to indexes. Any schema change that affects query plans can be tested the same way.
Partitioning. Compare EXPLAIN output before and after applying a partition scheme. Verify that partition pruning works for the critical queries:
-- Before partitioning: full table scan
EXPLAIN SELECT * FROM orders WHERE created_at > '2026-01-01';
-- Seq Scan on orders (cost=0.00..45000.00 rows=500000 width=48)
-- After partitioning by created_at range: partition pruning
EXPLAIN SELECT * FROM orders WHERE created_at > '2026-01-01';
-- Append (cost=0.00..5000.00 rows=500000 width=48)
--   ->  Seq Scan on orders_2026_q1 (cost=0.00..2500.00 rows=250000 width=48)
--   ->  Seq Scan on orders_2026_q2 (cost=0.00..2500.00 rows=250000 width=48)

If the planner does not prune partitions for a critical query, the partition scheme needs adjustment before deployment.
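The pruning check can be automated with the same EXPLAIN (FORMAT JSON) output the index script uses: walk the plan tree, collect every relation scanned, and verify that no partition outside the predicate's range appears. A sketch (function names are mine):

```python
def scanned_relations(plan_node: dict) -> set:
    """Walk an EXPLAIN (FORMAT JSON) plan tree, collecting every relation scanned."""
    rels = set()
    if "Relation Name" in plan_node:
        rels.add(plan_node["Relation Name"])
    for child in plan_node.get("Plans", []):
        rels |= scanned_relations(child)
    return rels


def pruning_works(plan_node: dict, expected_partitions, all_partitions) -> bool:
    """True if the plan touches only the partitions the predicate should reach."""
    scanned = scanned_relations(plan_node)
    return scanned & set(all_partitions) <= set(expected_partitions)
```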
Materialized views. Verify that queries intended to hit a materialized view actually use it.
Column additions. Adding wide columns to frequently scanned tables can affect plan choices by changing the cost of sequential scans.
The principle: schema changes that affect query planning should be tested against the critical query set before deployment. The cost of running EXPLAIN against 20-30 queries is negligible. The cost of discovering a plan regression in production at three in the morning is decidedly not.
Honest counterpoint — limitations of this approach
I should be honest about the boundaries of what this approach can guarantee. EXPLAIN without ANALYZE uses estimated costs, not actual execution times. Estimates can be wrong. The planner relies on table statistics (row counts, value distributions, null fractions) collected by ANALYZE. If the test database has stale statistics or unrepresentative data distribution, the cost estimates may not reflect production behavior.
The test database must have realistic data. An empty test database produces useless EXPLAIN estimates. A table with 100 rows will never trigger an index scan. Populate the test database with representative data: realistic row counts, realistic distributions, and up-to-date statistics. Running ANALYZE after loading data is essential.
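A sketch of a seeding helper that emits the kind of SQL this paragraph calls for: generate_series for volume, randomized columns for distribution, and a closing ANALYZE. The table and column names are illustrative, not prescriptive; adapt them to your schema.

```python
def seed_statements(table: str, n_rows: int) -> list:
    """Emit SQL that loads randomized sample data and refreshes statistics.

    Illustrative only: assumes an orders-like table with these columns.
    """
    insert = f"""INSERT INTO {table} (customer_id, status, total, created_at)
SELECT
    (random() * 10000)::int,
    (ARRAY['pending','shipped','shipped','cancelled'])[1 + (random() * 3)::int],
    (random() * 2000)::numeric(10, 2),
    NOW() - random() * INTERVAL '365 days'
FROM generate_series(1, {n_rows});"""
    # ANALYZE after loading is essential: EXPLAIN costs come from these statistics
    return [insert, f"ANALYZE {table};"]
```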
HypoPG cannot test everything about an index. It validates that the planner would use the index and estimates a cost reduction. It cannot test index build time, index size on disk, write amplification, or impact on autovacuum and I/O.
This is a strong signal, not a proof. CI testing validates that the planner estimates a cost reduction and would use the proposed index. It does not guarantee the index will improve production performance. For high-stakes index changes on large tables, complement CI testing with a staging environment test using EXPLAIN ANALYZE on realistic data. CI catches the obvious mistakes automatically. Staging catches the subtle ones with human review. The two approaches are complementary, not redundant — and together, they represent a standard of care that your production database deserves.