File & Metadata Layer Last updated: May 14, 2026

Iceberg Encryption

Apache Iceberg supports column-level and file-level encryption through its encryption specification, enabling sensitive data to be protected at rest within Parquet data files using key management services while maintaining full queryability on authorized clients.

Apache Iceberg supports encryption at the data file level through its encryption specification, which integrates with Apache Parquet’s native encryption capabilities.

The Encryption Model

Iceberg uses a key wrapping model:

  1. A Data Encryption Key (DEK) is generated for each Parquet file (or column group).
  2. The DEK encrypts the actual data within the Parquet file.
  3. The DEK itself is encrypted using a Key Encryption Key (KEK) from the KMS.
  4. The encrypted DEK (wrapped DEK) is stored in the Parquet file footer.

To decrypt a file:

  1. Reader contacts KMS to unwrap (decrypt) the DEK using the KEK.
  2. Reader uses the DEK to decrypt the column data.

The KMS never receives raw data; it only wraps and unwraps keys. Table data therefore never passes through the KMS, so the KMS alone cannot be used to read or exfiltrate it.
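The wrap and unwrap flow described above can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions: `ToyKMS`, `toy_cipher`, `write_file`, and `read_file` are hypothetical names, and the cipher is a SHA-256 keystream XOR used only as a stand-in for the AES-GCM that Parquet encryption actually uses. It is not secure and is not how Iceberg or Parquet implement key wrapping.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Illustrative stand-in only, NOT secure. Applying it twice with the
    same key returns the original bytes, so it both encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyKMS:
    """Hypothetical in-memory KMS: it holds KEKs and only ever sees DEKs."""

    def __init__(self) -> None:
        self._keks: dict[str, bytes] = {}

    def create_key(self, key_id: str) -> None:
        self._keks[key_id] = secrets.token_bytes(32)

    def wrap(self, key_id: str, dek: bytes) -> bytes:
        return toy_cipher(self._keks[key_id], dek)

    def unwrap(self, key_id: str, wrapped_dek: bytes) -> bytes:
        return toy_cipher(self._keks[key_id], wrapped_dek)


def write_file(kms: ToyKMS, key_id: str, plaintext: bytes) -> dict:
    dek = secrets.token_bytes(32)            # 1. generate a DEK for this file
    ciphertext = toy_cipher(dek, plaintext)  # 2. encrypt the data with the DEK
    wrapped_dek = kms.wrap(key_id, dek)      # 3. wrap the DEK with the KEK
    # 4. store the wrapped DEK alongside the data, as in a file footer
    return {"data": ciphertext,
            "footer": {"key_id": key_id, "wrapped_dek": wrapped_dek}}


def read_file(kms: ToyKMS, file: dict) -> bytes:
    footer = file["footer"]
    dek = kms.unwrap(footer["key_id"], footer["wrapped_dek"])  # KMS unwraps the DEK
    return toy_cipher(dek, file["data"])                       # reader decrypts locally
```

Note that `ToyKMS` is only ever handed 32-byte keys; the plaintext is encrypted and decrypted entirely on the writer and reader side.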

Parquet Encryption and Iceberg

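Iceberg encryption is built on top of Parquet Modular Encryption (PME), introduced in parquet-mr (the Java implementation of Parquet) 1.12.0. With PME, each column can be encrypted with its own key while other columns stay in plaintext: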

Parquet File Structure (with encryption):
  Row Group 1:
    Column customer_id [ENCRYPTED with DEK_customer]
    Column total       [PLAINTEXT]
    Column email       [ENCRYPTED with DEK_pii]
  Row Group 2:
    ...
  Footer:
    Schema (optionally encrypted)
    Encrypted DEK_customer (wrapped with KMS key arn:aws:kms:...:key/customer-key-id)
    Encrypted DEK_pii (wrapped with KMS key arn:aws:kms:...:key/pii-key-id)

Iceberg Encryption Configuration

Spark + AWS KMS

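# Spark: configure Parquet encryption with a KMS client.
# InMemoryKMS is Parquet's mock client for local testing only; for AWS KMS,
# substitute a KmsClient implementation that calls AWS KMS.
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .config("spark.hadoop.parquet.encryption.kms.client.class",
            "org.apache.parquet.crypto.keytools.mocks.InMemoryKMS") \
    .getOrCreate()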

# Table with column-level encryption
spark.sql("""
    CREATE TABLE db.customers (
        customer_id BIGINT,
        name        STRING,
        email       STRING,
        total_orders INT
    ) USING iceberg
    TBLPROPERTIES (
        'write.parquet.encryption.enabled' = 'true',
        'write.parquet.encryption.column.email' =
            'arn:aws:kms:us-east-1:123456789:key/pii-encryption-key',
        'write.parquet.encryption.footer.key' =
            'arn:aws:kms:us-east-1:123456789:key/footer-encryption-key'
    )
""")

Encryption Key Metadata in Table Properties

Column-specific encryption is configured via table properties:

ALTER TABLE db.customers SET TBLPROPERTIES (
    'write.parquet.encryption.enabled' = 'true',
    'write.parquet.encryption.column.email' = '<kms-key-id-for-pii>',
    'write.parquet.encryption.column.phone' = '<kms-key-id-for-pii>',
    'write.parquet.encryption.column.ssn' = '<kms-key-id-for-pii-sensitive>',
    'write.parquet.encryption.footer.key' = '<kms-key-id-for-footer>'
);
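When many columns need keys, it can be less error-prone to generate this statement from a mapping than to write it by hand. The helper below is a hypothetical sketch: `encryption_properties_sql` is not part of Iceberg or Spark, and it simply reuses the property names shown above.

```python
def encryption_properties_sql(table: str,
                              column_keys: dict[str, str],
                              footer_key: str) -> str:
    """Build an ALTER TABLE statement from a column -> KMS key ID mapping.

    Hypothetical helper; the property names are taken from the example above.
    """
    props = {"write.parquet.encryption.enabled": "true"}
    for column, key_id in column_keys.items():
        props[f"write.parquet.encryption.column.{column}"] = key_id
    props["write.parquet.encryption.footer.key"] = footer_key

    for name, value in props.items():
        # Reject quotes so a key ID cannot break out of the SQL string literal.
        if "'" in name or "'" in value:
            raise ValueError(f"single quote not allowed in property: {name}")

    body = ",\n".join(f"    '{name}' = '{value}'" for name, value in props.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES (\n{body}\n)"
```

The returned string could then be passed to `spark.sql(...)`, for example with `{"email": "<kms-key-id-for-pii>", "phone": "<kms-key-id-for-pii>"}` as the column mapping.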

Iceberg Encryption vs. Storage Encryption

Encryption Type              | What’s Protected              | Granularity   | Who Manages Keys
Cloud storage (SSE-S3, CMEK) | All objects in bucket         | File level    | Cloud provider / IAM
Parquet column encryption    | Specific columns within files | Column level  | Your KMS
Iceberg encryption spec      | Per-file data encryption      | File / column | Your KMS
TLS (in-transit)             | Data in network transfer      | Connection    | Certificates

Major cloud object stores encrypt data at rest by default. Column-level Iceberg encryption adds a further layer for columns that need fine-grained key management, typically the most sensitive PII columns.

Access Control via Encryption

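Column encryption can enforce access control independently of the catalog RBAC: a reader that cannot get the KMS to unwrap a column’s DEK cannot decrypt that column, whatever the catalog or storage layer permits.

This “defense in depth” approach layers three controls: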

  1. Catalog RBAC: first line of defense.
  2. Object storage IAM: second line of defense.
  3. Column encryption: cryptographic guarantee even if storage access is compromised.
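The third layer can be illustrated with a toy model in which the KMS checks a grant before unwrapping a key. Everything here is hypothetical (`GrantCheckingKMS`, `build_footer`, `readable_columns`, the principal names), and the XOR "wrap" is a placeholder for real key wrapping. The point is only that a principal who can read the file but lacks a KMS grant gets no usable DEK for the protected column.

```python
import secrets


class GrantCheckingKMS:
    """Hypothetical KMS that unwraps a DEK only for granted principals."""

    def __init__(self) -> None:
        self._keks: dict[str, bytes] = {}
        self._grants: dict[str, set[str]] = {}

    def create_key(self, key_id: str, allowed: set[str]) -> None:
        self._keks[key_id] = secrets.token_bytes(32)
        self._grants[key_id] = set(allowed)

    def wrap(self, key_id: str, dek: bytes) -> bytes:
        # Toy wrap: XOR with the KEK. Placeholder only, NOT secure.
        return bytes(a ^ b for a, b in zip(dek, self._keks[key_id]))

    def unwrap(self, principal: str, key_id: str, wrapped: bytes) -> bytes:
        if principal not in self._grants[key_id]:
            raise PermissionError(f"{principal} may not use {key_id}")
        return bytes(a ^ b for a, b in zip(wrapped, self._keks[key_id]))


def build_footer(kms: GrantCheckingKMS, column_keys: dict[str, str]) -> dict:
    """Create one wrapped DEK per encrypted column, as a file footer would hold."""
    footer = {}
    for column, key_id in column_keys.items():
        dek = secrets.token_bytes(32)
        footer[column] = (key_id, kms.wrap(key_id, dek))
    return footer


def readable_columns(kms: GrantCheckingKMS, principal: str, footer: dict) -> list[str]:
    """Return the encrypted columns whose DEK this principal can unwrap."""
    readable = []
    for column, (key_id, wrapped) in footer.items():
        try:
            kms.unwrap(principal, key_id, wrapped)
            readable.append(column)
        except PermissionError:
            pass  # storage access alone does not yield the key
    return readable
```

With a `customer-key` granted to both an analyst and a privacy officer and a `pii-key` granted only to the privacy officer, the analyst can unwrap the key for `customer_id` but not for `email`, even though both principals can read the same file.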

Encryption and Compaction

Compaction must be able to decrypt existing files and re-encrypt output files using the same or new keys. Ensure the compaction job’s service account has KMS access to both decrypt (old files) and encrypt (new files). Key rotation is possible during compaction by changing the KMS key references in table properties before running rewrite_data_files.
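A rotation run along these lines might issue two statements: one to point the column at the new key, one to rewrite the data files. The sketch below only builds the SQL strings. `rotation_statements` and the catalog name are hypothetical, the property name follows the examples in this article, and the `rewrite-all` option is included on the assumption that every file, not just compaction candidates, should be rewritten under the new key.

```python
def rotation_statements(catalog: str, table: str, column: str, new_key_id: str) -> list[str]:
    """Return the SQL for a key rotation via compaction (hypothetical helper).

    1. Point the column at the new KMS key.
    2. Rewrite data files so new output is encrypted under that key.
    """
    for value in (catalog, table, column, new_key_id):
        if "'" in value:
            raise ValueError("single quotes are not allowed in identifiers or key IDs")
    return [
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        f"('write.parquet.encryption.column.{column}' = '{new_key_id}')",
        f"CALL {catalog}.system.rewrite_data_files("
        f"table => '{table}', options => map('rewrite-all', 'true'))",
    ]
```

Each statement could be passed to `spark.sql(...)` in order, running as a service account that can still decrypt with the old key. Files written under the old key remain referenced by earlier snapshots until those snapshots are expired, so the old key should stay available until then.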
