
file_fdw

Flat files on disk, queried with SQL. Allow me to make the introductions.

Extension · March 21, 2026 · 5 min read

There is something quietly satisfying about bringing order to a CSV file that has been sitting on disk, unread and unqueried, gathering dust. file_fdw is a foreign data wrapper that exposes flat files on the server's filesystem as read-only foreign tables. You define a foreign table with column names, types, and parsing options (the same options as COPY), and then query the file with standard SQL. It ships with PostgreSQL as a contrib module.

What file_fdw does

file_fdw maps a file on the PostgreSQL server's filesystem to a foreign table. When you run a SELECT against that table, PostgreSQL opens the file, parses it according to the format options you specified, and returns the rows. The file is never modified — file_fdw is strictly read-only.

The parsing engine is the same one used by COPY FROM. This means every format option available in COPY — CSV, tab-delimited text, binary, custom delimiters, quoting, null strings, encoding — works identically with file_fdw. If you can COPY a file, you can query it as a foreign table.
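The parity is easiest to see side by side. A sketch, with a hypothetical file path and table names — the same file and the same options, loaded permanently in one case and queried in place in the other:

SQL
-- One-time load into a regular table with COPY
COPY staged_rows FROM '/var/data/rows.csv' WITH (FORMAT csv, HEADER true);

-- The same file queried in place via file_fdw, same options
CREATE FOREIGN TABLE staged_rows_fdw (
  id   integer,
  name text
)
SERVER file_server
OPTIONS (filename '/var/data/rows.csv', format 'csv', header 'true');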

When to use file_fdw

file_fdw is useful when you want to query a file with SQL without permanently loading it into the database:

  • CSV data files — query customer exports, product catalogs, or reference data that arrives as CSV without importing it first
  • PostgreSQL log analysis — point a foreign table at the server's CSV log (log_destination = 'csvlog') and query errors, slow queries, and connection patterns with SQL
  • One-time imports and validation — preview and filter file contents with WHERE clauses before loading into a permanent table with INSERT INTO ... SELECT
  • ETL staging — expose landing-zone files as foreign tables so downstream queries can reference them without a separate load step
  • Compressed or piped data — use the PROGRAM option to read from gunzip, curl, or any command that writes to stdout
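The validation workflow in particular is a two-statement affair. A sketch, assuming a foreign table named csv_staging has already been defined over the incoming file and a permanent customers table exists:

SQL
-- Preview suspicious rows before committing to a load
SELECT * FROM csv_staging WHERE email IS NULL LIMIT 10;

-- Load only the rows that pass validation into a permanent table
INSERT INTO customers (id, name, email)
SELECT id, name, email
FROM csv_staging
WHERE email IS NOT NULL;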

Installation and setup

file_fdw is a contrib module that ships with PostgreSQL. No shared library loading or restart is required — a simple CREATE EXTENSION enables it. You then create a foreign server object, which acts as a container for foreign tables. The server itself has no configuration options.

SQL
-- file_fdw is a contrib module — ships with PostgreSQL
CREATE EXTENSION file_fdw;

-- Create a foreign server (file_fdw requires exactly one, but it has no options)
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;

The file referenced in a foreign table must be readable by the operating system user that runs the PostgreSQL process (typically postgres). Relative paths are resolved from the PostgreSQL data directory.

Practical examples

Querying a CSV file

The most common use case. Define columns that match the file's structure and specify the parsing options.

SQL
-- Query a CSV file as a foreign table
CREATE FOREIGN TABLE customer_import (
  id        integer,
  name      text,
  email     text,
  signed_up date
)
SERVER file_server
OPTIONS (
  filename '/var/data/customers.csv',
  format 'csv',
  header 'true'
);

-- Query it like any other table
SELECT name, email FROM customer_import WHERE signed_up >= '2026-01-01';

Tab-separated files

Set the delimiter to a tab character. Note the E'\t' escape syntax for the tab literal.

SQL
-- Query a tab-separated file
CREATE FOREIGN TABLE server_metrics (
  ts         timestamptz,
  hostname   text,
  cpu_pct    numeric,
  mem_mb     integer
)
SERVER file_server
OPTIONS (
  filename '/var/log/metrics.tsv',
  format 'csv',
  header 'true',
  delimiter E'\t'
);

Querying PostgreSQL CSV logs

When log_destination = 'csvlog' is enabled, PostgreSQL writes structured log files that file_fdw can query directly. This turns your database logs into a queryable table — useful for finding error patterns, identifying slow queries, or auditing connections.

SQL
-- Query PostgreSQL's own CSV log as a foreign table
CREATE FOREIGN TABLE pg_csvlog (
  log_time               timestamptz,
  user_name              text,
  database_name          text,
  process_id             integer,
  connection_from        text,
  session_id             text,
  session_line_num       bigint,
  command_tag            text,
  session_start_time     timestamptz,
  virtual_transaction_id text,
  transaction_id         bigint,
  error_severity         text,
  sql_state_code         text,
  message                text,
  detail                 text,
  hint                   text,
  internal_query         text,
  internal_query_pos     integer,
  context                text,
  query                  text,
  query_pos              integer,
  location               text,
  application_name       text,
  backend_type           text,
  leader_pid             integer,
  query_id               bigint
)
SERVER file_server
OPTIONS (
  filename 'log/postgresql.csv',
  format 'csv'
);

-- Find recent errors
SELECT log_time, error_severity, message
FROM pg_csvlog
WHERE error_severity IN ('ERROR', 'FATAL')
ORDER BY log_time DESC
LIMIT 20;
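Beyond listing recent errors, the same foreign table supports aggregate queries. For example, grouping by SQLSTATE code shows which error classes dominate (a sketch, using the column names defined above):

SQL
-- Count errors by SQLSTATE code over the last day
SELECT sql_state_code, count(*) AS occurrences
FROM pg_csvlog
WHERE error_severity = 'ERROR'
  AND log_time > now() - interval '1 day'
GROUP BY sql_state_code
ORDER BY occurrences DESC;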

Reading from a command with PROGRAM

The PROGRAM option runs a shell command and reads its stdout as the data source. This is powerful — you can decompress files, fetch remote data, or transform input on the fly — but it requires superuser privileges or the pg_execute_server_program role. I would counsel restraint here. A shell command that runs with the privileges of the PostgreSQL process deserves the same scrutiny you would give to any other form of elevated access.

SQL
-- Read from a gzipped CSV using the PROGRAM option
CREATE FOREIGN TABLE archived_orders (
  id         bigint,
  product    text,
  quantity   integer,
  total      numeric(10,2),
  order_date date
)
SERVER file_server
OPTIONS (
  program 'gunzip -c /var/data/orders-2025.csv.gz',
  format 'csv',
  header 'true'
);

-- Query it normally — gunzip runs on each SELECT
SELECT product, sum(total)
FROM archived_orders
GROUP BY product
ORDER BY sum(total) DESC;

COPY options reference

file_fdw accepts the same parsing options as COPY FROM. These are specified in the OPTIONS clause of CREATE FOREIGN TABLE.

SQL
-- All COPY-compatible options are available
CREATE FOREIGN TABLE custom_format (
  col1 text,
  col2 text,
  col3 text
)
SERVER file_server
OPTIONS (
  filename '/var/data/export.csv',
  format 'csv',          -- 'csv', 'text', or 'binary'
  header 'true',         -- skip the first line
  delimiter '|',         -- column separator (default: comma for csv, tab for text)
  null 'NULL',           -- string that represents NULL
  quote '"',             -- quoting character for csv format
  escape '"',            -- escape character for csv format
  encoding 'UTF8'        -- file encoding
);

Either filename or program must be specified, but not both. Every format option that COPY understands, file_fdw understands. A reassuringly symmetrical arrangement. Changing these options after table creation requires superuser privileges or the appropriate pg_read_server_files / pg_execute_server_program role.
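Options can also be adjusted after creation with ALTER FOREIGN TABLE. A sketch, reusing the custom_format table above:

SQL
-- Point the foreign table at a new file
-- (requires the privileges noted above)
ALTER FOREIGN TABLE custom_format
  OPTIONS (SET filename '/var/data/export2.csv');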

Cloud availability

Provider                        Status
Amazon RDS / Aurora             Not available — no server filesystem access. Use log_fdw for log queries.
Google Cloud SQL                Available — limited to files in the instance data directory
Azure Database for PostgreSQL   Check provider documentation — filesystem access is restricted on managed instances
Supabase                        Check provider documentation — filesystem access is restricted on managed instances
Neon                            Not available — serverless architecture has no persistent local filesystem

file_fdw requires direct access to the server's filesystem, which most managed PostgreSQL services restrict for security reasons. It is most useful on self-managed PostgreSQL instances where you control the server and its files.

How Gold Lapel relates

I should be candid: file_fdw queries pass through the Gold Lapel proxy without difficulty, and we track their execution patterns as we do any other statement. But there is little I can do for you here. The performance of a file-backed foreign table is bound by filesystem I/O, not by query planning. There are no indexes to recommend, no materialized views to suggest. The data lives outside PostgreSQL's reach, and so it lives outside mine.

If you find yourself querying the same file repeatedly, the more productive arrangement is to load the data into a proper table. Once it resides within PostgreSQL, Gold Lapel can attend to it fully — index recommendations, materialized views, query plan analysis. The file served its purpose as a point of entry. The table is where it belongs.
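That load is a single statement. A sketch, using the customer_import foreign table from earlier:

SQL
-- Materialize the foreign table into a regular table
CREATE TABLE customers AS
SELECT * FROM customer_import;

-- Ordinary tooling now applies: indexes, statistics, query plans
CREATE INDEX ON customers (signed_up);
ANALYZE customers;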

Frequently asked questions