Last updated: May 14, 2026

Iceberg JDBC Catalog

The Iceberg JDBC Catalog uses any JDBC-compatible relational database (PostgreSQL, MySQL, SQLite) as a persistent metadata store for Iceberg table catalog information, making it a popular choice for self-hosted, development, and single-writer production deployments.



The Iceberg JDBC Catalog is a catalog backend that stores Iceberg table metadata in any JDBC-compatible relational database (PostgreSQL, MySQL, MariaDB, SQLite, and others). It provides a simple, persistent, self-hosted alternative to the Hive Metastore (no Thrift server required) and is particularly popular for local development, self-hosted platforms, and single-writer or low-concurrency production deployments.

How the JDBC Catalog Works

The JDBC Catalog creates and maintains two tables in the backing database:

| Table | Purpose |
| --- | --- |
| iceberg_tables | Maps each table identifier to its current metadata file location |
| iceberg_namespace_properties | Stores namespace-level properties |

Table-level properties are not stored in the catalog; they live in the table's metadata file.

These tables are created automatically when the JDBC Catalog is initialized. Catalog operations (create table, rename table, drop table, update metadata pointer) are all JDBC transactions against these tables.

The backing database provides the atomicity guarantee for concurrent writes — all catalog updates are wrapped in database transactions, ensuring that concurrent writers see consistent metadata.
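To make the mechanics concrete, here is a minimal sketch of that backing schema using Python's built-in sqlite3 module. The column set is simplified from the catalog's real DDL, and the bucket paths are made-up placeholders:

```python
import sqlite3

# Simplified versions of the two tables the JDBC Catalog maintains.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE iceberg_tables (
        catalog_name TEXT NOT NULL,
        table_namespace TEXT NOT NULL,
        table_name TEXT NOT NULL,
        metadata_location TEXT,
        previous_metadata_location TEXT,
        PRIMARY KEY (catalog_name, table_namespace, table_name)
    );
    CREATE TABLE iceberg_namespace_properties (
        catalog_name TEXT NOT NULL,
        namespace TEXT NOT NULL,
        property_key TEXT NOT NULL,
        property_value TEXT,
        PRIMARY KEY (catalog_name, namespace, property_key)
    );
""")

# "CREATE TABLE db.test_table" reduces to inserting one row that points
# at the table's first metadata file, inside a database transaction.
with conn:
    conn.execute(
        "INSERT INTO iceberg_tables VALUES (?, ?, ?, ?, ?)",
        ("my_catalog", "db", "test_table",
         "s3://my-bucket/warehouse/db/test_table/metadata/v1.json", None),
    )

# Loading the table is a lookup of its current metadata pointer.
row = conn.execute(
    "SELECT metadata_location FROM iceberg_tables "
    "WHERE catalog_name = 'my_catalog' AND table_namespace = 'db' "
    "AND table_name = 'test_table'"
).fetchone()
print(row[0])
```

Every other catalog operation (rename, drop, pointer update) is a similarly small statement against these rows.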

Configuration (Apache Spark)

PostgreSQL-backed JDBC Catalog

from pyspark.sql import SparkSession

# Requires the Iceberg Spark runtime and the PostgreSQL JDBC driver on the
# classpath (e.g. via spark.jars.packages); match versions to your Spark build.
spark = SparkSession.builder \
    .config("spark.sql.catalog.my_catalog", "org.apache.iceberg.spark.SparkCatalog") \
    .config("spark.sql.catalog.my_catalog.catalog-impl", "org.apache.iceberg.jdbc.JdbcCatalog") \
    .config("spark.sql.catalog.my_catalog.uri",
            "jdbc:postgresql://postgres-host:5432/iceberg_catalog") \
    .config("spark.sql.catalog.my_catalog.jdbc.user", "iceberg_user") \
    .config("spark.sql.catalog.my_catalog.jdbc.password", "secret") \
    .config("spark.sql.catalog.my_catalog.warehouse", "s3://my-bucket/warehouse") \
    .getOrCreate()

SQLite-backed JDBC Catalog (Development)

from pyspark.sql import SparkSession

# SQLite needs no server process, making it ideal for single-process development.
spark = SparkSession.builder \
    .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog") \
    .config("spark.sql.catalog.local.catalog-impl", "org.apache.iceberg.jdbc.JdbcCatalog") \
    .config("spark.sql.catalog.local.uri", "jdbc:sqlite:/tmp/iceberg-catalog.db") \
    .config("spark.sql.catalog.local.warehouse", "file:///tmp/iceberg-warehouse") \
    .getOrCreate()

# Create tables and work with Iceberg locally — no external services needed
spark.sql("CREATE TABLE local.db.test_table (id BIGINT, value STRING) USING iceberg")
spark.sql("INSERT INTO local.db.test_table VALUES (1, 'hello')")
spark.sql("SELECT * FROM local.db.test_table").show()

Configuration (PyIceberg)

from pyiceberg.catalog.sql import SqlCatalog
from pyiceberg.schema import Schema
from pyiceberg.types import DoubleType, LongType, NestedField

# PostgreSQL
catalog = SqlCatalog(
    "my_catalog",
    **{
        "uri": "postgresql+psycopg2://user:password@postgres-host/iceberg_db",
        "warehouse": "s3://my-bucket/warehouse",
    }
)

# SQLite (development)
catalog = SqlCatalog(
    "local",
    **{
        "uri": "sqlite:////tmp/iceberg_catalog.db",
        "warehouse": "file:////tmp/iceberg-warehouse",
    }
)

# Create namespace and table
catalog.create_namespace("analytics")
catalog.create_table(
    identifier="analytics.orders",
    schema=Schema(
        NestedField(1, "order_id", LongType()),
        NestedField(2, "total", DoubleType()),
    )
)

JDBC Catalog vs. Other Catalogs

| Aspect | JDBC | Hive Metastore | REST (Polaris) | AWS Glue |
| --- | --- | --- | --- | --- |
| External service needed | Database only | Thrift server + DB | REST server | AWS managed |
| Multi-engine support | Limited | Yes (Thrift protocol) | Excellent (REST API) | Yes (AWS only) |
| Credential vending | No | No | Yes | Partial |
| RBAC | Database-level only | Limited | Full | Via IAM/LF |
| Best for | Dev, self-hosted prod | Legacy Hive migration | Multi-engine prod | AWS-native prod |

Concurrency Considerations

The JDBC Catalog relies on database-level optimistic locking: a commit updates the table's metadata pointer only if it still holds the value the writer last read (a compare-and-swap UPDATE wrapped in a transaction). When writers race, exactly one update matches; the losing writer sees zero rows affected, refreshes the table metadata, and retries its commit.

For high-concurrency production workloads, the Iceberg REST Catalog (e.g. Apache Polaris) or Project Nessie offers better concurrency characteristics.
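As a sketch of what the compare-and-swap commit looks like in practice (the table and column names below are illustrative, not the catalog's real schema), note that only one racing writer's conditional UPDATE can match:

```python
import sqlite3

def try_commit(conn, table, expected_ptr, new_ptr):
    """Optimistic commit: swap the metadata pointer only if it is unchanged."""
    cur = conn.execute(
        "UPDATE pointers SET ptr = ? WHERE name = ? AND ptr = ?",
        (new_ptr, table, expected_ptr),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows matched -> someone else committed first

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pointers (name TEXT PRIMARY KEY, ptr TEXT)")
conn.execute("INSERT INTO pointers VALUES ('orders', 'v1.json')")

# Writers A and B both read ptr = v1.json, then race to commit.
won_a = try_commit(conn, "orders", "v1.json", "v2-a.json")   # A wins
won_b = try_commit(conn, "orders", "v1.json", "v2-b.json")   # B loses
# B refreshes the current pointer, rebases its change, and commits again.
current = conn.execute("SELECT ptr FROM pointers WHERE name = 'orders'").fetchone()[0]
won_b_retry = try_commit(conn, "orders", current, "v3-b.json")
print(won_a, won_b, won_b_retry)  # True False True
```

Each retry forces the losing writer to re-read the metadata, which is why heavy write contention on a single table degrades throughput on this backend.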

Use Case: Local Development with Docker

A common development setup uses Docker Compose:

# docker-compose.yml
services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: iceberg_catalog
      POSTGRES_USER: iceberg
      POSTGRES_PASSWORD: iceberg_password
    ports: ["5432:5432"]

  spark:
    image: apache/spark:3.5.0
    environment:
      # user-defined variables; wire these into your Spark session config
      SPARK_CATALOG_URI: jdbc:postgresql://postgres:5432/iceberg_catalog
      SPARK_CATALOG_USER: iceberg
      SPARK_CATALOG_PASSWORD: iceberg_password
    depends_on: [postgres]

This gives developers a fully local Iceberg environment with a real, persistent JDBC catalog — no AWS credentials, no cloud services, no Hive Metastore.

📚 Go Deeper on Apache Iceberg

Alex Merced has authored three hands-on books covering Apache Iceberg, the Agentic Lakehouse, and modern data architecture. Pick up a copy to master the full ecosystem.
