← PostgreSQL Extensions

tablefunc

Crosstab and pivot table functions — because data deserves to be presented properly.

Extension · March 21, 2026 · 5 min read

There is something rather undignified about making a report consumer mentally pivot rows into columns. That is the database's job. tablefunc is a PostgreSQL contrib extension that provides functions for creating pivot tables (crosstabs), generating normally distributed random numbers, and traversing hierarchical data. Its main function, crosstab(), transforms row-oriented query results into columnar output — the kind of row-to-column transformation that reporting and analytics queries frequently need.

What tablefunc does

tablefunc provides three families of functions. The most widely used is crosstab(), which takes a query returning row-name/category/value triples and pivots them into a table where each category becomes a column. This is the standard way to produce pivot tables directly in PostgreSQL without reshaping data in application code.

The extension also includes normal_rand() for generating sets of normally distributed random numbers, and connectby() for traversing parent-child relationships stored in a table. While crosstab() sees regular production use, the other two functions are more niche — normal_rand() is useful for test data generation, and connectby() has largely been superseded by recursive CTEs.

When to use tablefunc

  • Pivot tables for reporting — turn month-over-month, category-by-category, or any row-oriented data into columnar output
  • Dashboard queries — produce results already shaped for display, avoiding application-level pivoting
  • Data exports — reshape normalized data into the wide-format layout that spreadsheets and BI tools expect
  • Test data generation — use normal_rand() to create realistic distributions for performance testing

If your pivot categories are truly dynamic (unknown at query time), you will need to generate the SQL dynamically in application code — crosstab() requires the output columns to be declared in the AS clause.

Installation and setup

tablefunc is a contrib module that ships with PostgreSQL — no separate download or build required. It does not need shared_preload_libraries, so no restart is necessary.

SQL
-- tablefunc ships with PostgreSQL (contrib module)
-- No shared_preload_libraries needed — just create the extension
CREATE EXTENSION tablefunc;

-- Verify it's working
SELECT * FROM normal_rand(5, 0, 1);

crosstab(): single-parameter form

The basic crosstab() takes a single SQL query that must return exactly three columns: a row name, a category, and a value. The results are pivoted so that each distinct category becomes a column.

SQL
-- Sample data: monthly sales by product
CREATE TABLE monthly_sales (
  product text,
  month text,
  revenue numeric
);

INSERT INTO monthly_sales VALUES
  ('Widget', 'Jan', 1200),
  ('Widget', 'Feb', 1500),
  ('Widget', 'Mar', 980),
  ('Gadget', 'Jan', 3200),
  ('Gadget', 'Feb', 2800),
  ('Gadget', 'Mar', 3500);

-- Single-parameter crosstab: pivots rows into columns
-- The source query MUST return (row_name, category, value)
-- ordered by row_name, then category
SELECT * FROM crosstab(
  'SELECT product, month, revenue
   FROM monthly_sales
   ORDER BY 1, 2'
) AS ct(product text, jan numeric, feb numeric, mar numeric);

-- Result:
-- product | jan  | feb  | mar
-- --------+------+------+-----
-- Gadget  | 3200 | 2800 | 3500
-- Widget  | 1200 | 1500 | 980

The source query must be ordered by row name first, then by category. Values are assigned to output columns left to right in the order they appear. This form works correctly when every row has the same set of categories — but breaks when categories are missing. Silently placing March's revenue in February's column is the sort of behaviour that erodes trust in a report.

crosstab(): two-parameter form

The two-parameter form is more robust. The second argument is a query that returns the list of category values in order. PostgreSQL matches each value to the correct output column by category name rather than by position, inserting NULL for any missing categories.

SQL
-- Two-parameter crosstab: handles missing values correctly
-- First arg: source query (row_name, category, value)
-- Second arg: query returning the category values in order

SELECT * FROM crosstab(
  'SELECT product, month, revenue
   FROM monthly_sales
   ORDER BY 1, 2',
  $$SELECT DISTINCT month FROM monthly_sales ORDER BY 1$$
) AS ct(product text, feb numeric, jan numeric, mar numeric);

-- Why the two-parameter form matters:
-- If Gadget has no Feb data, the single-parameter form
-- would shift Mar's value into Feb's column.
-- The two-parameter form matches values to the correct
-- category column, leaving gaps as NULL.

The two-parameter form should be your default choice. It handles sparse data correctly and makes the intent of the query explicit. A NULL in the output is honest; a misplaced value is a lie.

normal_rand()

Generates a set of normally distributed (Gaussian) random numbers. Useful for populating test tables with realistic data distributions.

SQL
-- Generate 10 normally distributed random numbers
-- normal_rand(count, mean, standard_deviation)
SELECT * FROM normal_rand(10, 100, 15);

-- Useful for generating test data with realistic distributions
-- Example: generate 1000 simulated response times (mean 200ms, stddev 50ms)
SELECT round(val::numeric, 1) AS response_ms
FROM normal_rand(1000, 200, 50) AS val
WHERE val > 0;

connectby()

Traverses hierarchical data stored as parent-child references in a single table. While still functional, recursive CTEs (WITH RECURSIVE) are generally preferred for new code. I include it here for completeness — a proper inventory omits nothing.

SQL
-- connectby: traverse hierarchical data
-- connectby(relname, keyid_fld, parent_keyid_fld,
--           start_with, max_depth [, branch_delim])

CREATE TABLE employees (
  id integer PRIMARY KEY,
  name text,
  manager_id integer REFERENCES employees(id)
);

INSERT INTO employees VALUES
  (1, 'Alice', NULL),
  (2, 'Bob', 1),
  (3, 'Carol', 1),
  (4, 'Dave', 2),
  (5, 'Eve', 2);

-- Traverse the org chart starting from Alice (id=1), up to 10 levels deep
SELECT * FROM connectby(
  'employees', 'id', 'manager_id', '1', 10, '~'
) AS ct(id integer, manager_id integer, level integer, branch text);

-- Result:
-- id | manager_id | level | branch
-- ---+------------+-------+--------
--  1 |            |     0 | 1
--  2 |          1 |     1 | 1~2
--  3 |          1 |     1 | 1~3
--  4 |          2 |     2 | 1~2~4
--  5 |          2 |     2 | 1~2~5

Cloud availability

ProviderStatus
Amazon RDS / AuroraAvailable — install with CREATE EXTENSION
Google Cloud SQLAvailable — supported as a standard contrib extension
Azure Database for PostgreSQLAvailable — add to the azure.extensions allowlist, then CREATE EXTENSION
SupabaseAvailable — enable from the Extensions dashboard or via SQL
NeonAvailable — install with CREATE EXTENSION

As a contrib module that ships with PostgreSQL itself, tablefunc is available on virtually every managed PostgreSQL service.

How Gold Lapel relates

A well-formed crosstab query is a pleasure. A well-formed crosstab query that runs forty times an hour against the same underlying data is a concern. The pivot itself — scanning, grouping, redistributing — is honest work, but repeating it identically for every dashboard refresh is the kind of redundancy I find difficult to overlook.

Gold Lapel addresses this through materialized views. When the same crosstab structure appears repeatedly, Gold Lapel can cache the pivoted results and serve subsequent requests from the precomputed output. The full pivot still runs at refresh time, but your dashboard consumers see sub-millisecond responses rather than waiting for the operation to complete from scratch. The data is presented properly, and it is presented promptly. Both matter.

Frequently asked questions