Predicate Pushdown in Apache Iceberg
Predicate pushdown is a query optimization technique that moves filter conditions ("predicates") from the query execution layer down to the data storage layer, eliminating irrelevant data as early as possible, at the lowest possible level, before any bytes are transferred or decompressed.
In Apache Iceberg, predicate pushdown works at several progressive levels: the manifest list (snapshot level), manifest files (file-group level), Parquet row groups (sub-file level), and finally column chunks and pages within a row group. Together, these levels can reduce the data actually read to a tiny fraction of the total table size.
Level 1: Manifest List Pruning (Partition Elimination)
The outermost filter. The manifest list contains a summary of which partitions are covered by each manifest file:
Query: WHERE order_date = '2026-05-14'
Manifest List:
manifest-1.avro → covers partitions: 2026-01-xx, 2026-02-xx → SKIP
manifest-2.avro → covers partitions: 2026-03-xx, 2026-04-xx → SKIP
manifest-3.avro → covers partitions: 2026-05-xx → READ
Only manifest-3.avro is opened. The engine never reads manifests 1 or 2.
This is partition elimination: a single metadata-level check discards entire manifests, and every data file they track, for irrelevant partitions.
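The skip/read decisions above can be sketched in a few lines of Python. The manifest names and partition ranges are taken from the example; the helper itself is illustrative, not Iceberg's API:

```python
# Illustrative sketch: manifest-list pruning for WHERE order_date = '2026-05-14'.
# Each manifest entry carries the range of partition values it covers (months here).

manifest_list = [
    ("manifest-1.avro", ("2026-01", "2026-02")),
    ("manifest-2.avro", ("2026-03", "2026-04")),
    ("manifest-3.avro", ("2026-05", "2026-05")),
]

def prune_manifests(manifests, partition_value):
    """Keep only manifests whose partition range could contain the value."""
    return [name for name, (lo, hi) in manifests if lo <= partition_value <= hi]

# The query's date falls in the 2026-05 partition.
print(prune_manifests(manifest_list, "2026-05"))  # ['manifest-3.avro']
```

Because partition values are compared against coarse per-manifest summaries, two of the three manifests are eliminated without being opened.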
Level 2: Manifest File Pruning (File Elimination)
The manifest file contains per-file column statistics (min/max, null count) for every data file. The engine applies the predicate to these statistics:
manifest-3.avro:
data-file-001.parquet → order_date min=2026-05-01, max=2026-05-07 → SKIP
data-file-002.parquet → order_date min=2026-05-08, max=2026-05-14 → READ
data-file-003.parquet → order_date min=2026-05-15, max=2026-05-21 → SKIP
data-file-004.parquet → order_date min=2026-05-22, max=2026-05-31 → SKIP
Only data-file-002 is opened. 3 files are skipped based on manifest statistics.
This is file-level data skipping: files whose min/max statistics prove they cannot contain matching rows are eliminated without ever being opened.
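The same decision logic can be sketched against the per-file statistics from the example (file names and bounds copied from above; the helper is illustrative, not Iceberg's evaluator):

```python
# Illustrative sketch: file-level skipping using per-file min/max statistics
# recorded in the manifest, for WHERE order_date = '2026-05-14'.

data_files = [
    ("data-file-001.parquet", "2026-05-01", "2026-05-07"),
    ("data-file-002.parquet", "2026-05-08", "2026-05-14"),
    ("data-file-003.parquet", "2026-05-15", "2026-05-21"),
    ("data-file-004.parquet", "2026-05-22", "2026-05-31"),
]

def files_to_read(files, value):
    """A file can be skipped iff the value lies outside its [min, max] range."""
    return [name for name, lo, hi in files if lo <= value <= hi]

print(files_to_read(data_files, "2026-05-14"))  # ['data-file-002.parquet']
```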
Level 3: Row Group Pruning (Sub-File Elimination)
Once a Parquet file is opened, Parquet’s native row group statistics provide a third filter level:
data-file-002.parquet:
Row Group 1 → order_date min=2026-05-08, max=2026-05-10 → SKIP
Row Group 2 → order_date min=2026-05-11, max=2026-05-14 → READ
Row Group 3 → order_date min=2026-05-12, max=2026-05-14 → READ (overlap)
2 row groups read out of 3.
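The row-group decision is the same min/max test, one granularity lower. A sketch using the row-group bounds from the example:

```python
# Illustrative sketch: row-group pruning inside data-file-002.parquet.
# Parquet stores min/max stats per row group; the overlap test is identical
# to the file-level check, just applied to sub-file ranges.

row_groups = [
    (1, "2026-05-08", "2026-05-10"),
    (2, "2026-05-11", "2026-05-14"),
    (3, "2026-05-12", "2026-05-14"),
]

target = "2026-05-14"
read = [rg for rg, lo, hi in row_groups if lo <= target <= hi]
print(read)  # [2, 3]: 2 of 3 row groups are read
```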
Level 4: Column Chunk and Page Filter (Innermost)
Within a row group, Parquet can further skip:
- Column chunks: Only read the columns referenced in SELECT + WHERE (column projection pushdown).
- Pages: With Parquet dictionary pages, skip entire pages where the dictionary doesn’t contain the queried value.
- Bloom filters: Skip row groups or pages where the bloom filter proves absence.
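A toy bloom filter shows why this works as an innermost filter: a negative lookup proves absence, so the row group can be skipped, while a positive lookup only means "might contain". This two-line-of-hashes design is deliberately simplified; real Parquet uses split-block bloom filters, but the guarantee is the same:

```python
import hashlib

class ToyBloomFilter:
    """Toy bloom filter (not Parquet's split-block design): no false
    negatives, small chance of false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, value):
        # Derive several deterministic bit positions from the value.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means definitely absent: the row group can be skipped.
        return all((self.bits >> pos) & 1 for pos in self._positions(value))

bf = ToyBloomFilter()
for customer_id in (10001, 10500, 11042):
    bf.add(customer_id)

print(bf.might_contain(10500))  # True: this row group must be read
print(bf.might_contain(12345))  # almost certainly False: skip the row group
```

The asymmetry is the point: min/max statistics prune ranges, while bloom filters prune point lookups on high-cardinality columns where ranges are too wide to help.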
Pushdown with Complex Predicates
Iceberg’s scan planner evaluates complex predicates for pushdown:
-- Multi-predicate pushdown
SELECT * FROM db.orders
WHERE order_date >= '2026-05-01' -- partition elimination
AND region = 'AMER' -- column statistics pushdown
AND total > 1000.00 -- column statistics pushdown
AND customer_id = 12345; -- bloom filter pushdown (if enabled)
Each predicate is evaluated at the appropriate level:
- order_date >= → manifest list + manifest file statistics.
- region = → manifest file statistics (highly effective if region has low cardinality).
- total > 1000.00 → manifest file statistics (min/max).
- customer_id = 12345 → bloom filter (if enabled) + Parquet row group statistics.
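For AND-ed predicates, each predicate independently gets a chance to eliminate a file: a file is read only if every predicate might match its statistics. A sketch with invented file names and stats (this mimics, but is not, Iceberg's metrics evaluator):

```python
# Illustrative sketch: conjunctive pruning. A file survives only if
# every predicate could be satisfied by rows in its [min, max] ranges.

files = {
    "f1.parquet": {"order_date": ("2026-04-01", "2026-04-30"),
                   "region": ("AMER", "AMER"), "total": (5.0, 900.0)},
    "f2.parquet": {"order_date": ("2026-05-01", "2026-05-31"),
                   "region": ("AMER", "EMEA"), "total": (10.0, 5000.0)},
    "f3.parquet": {"order_date": ("2026-05-01", "2026-05-31"),
                   "region": ("APAC", "APAC"), "total": (10.0, 5000.0)},
}

predicates = [
    ("order_date", "ge", "2026-05-01"),   # order_date >= '2026-05-01'
    ("region", "eq", "AMER"),             # region = 'AMER'
    ("total", "gt", 1000.0),              # total > 1000.00
]

def might_match(stats, col, op, value):
    lo, hi = stats[col]
    if op == "eq":
        return lo <= value <= hi
    if op in ("ge", "gt"):  # could any row exceed the threshold?
        return hi > value if op == "gt" else hi >= value
    raise ValueError(op)

survivors = [f for f, stats in files.items()
             if all(might_match(stats, *p) for p in predicates)]
print(survivors)  # ['f2.parquet']
```

f1 is eliminated by the date range, f3 by the region bounds; only f2 survives all three checks.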
OR Predicates and Pushdown
OR predicates are harder to push down:
WHERE region = 'AMER' OR region = 'EMEA'
Iceberg’s expression evaluator checks each branch against the file’s statistics and keeps the file if either branch might match: a file survives when 'AMER' or 'EMEA' falls within its [region_min, region_max] range. Files whose range can contain neither value are skipped.
-- Complex OR across columns (difficult for min/max)
WHERE status = 'failed' OR total > 10000
This is harder: a file must be kept if it might satisfy either condition independently, so it can only be skipped when its statistics rule out both branches. Where statistics cannot rule anything out, Iceberg falls back to reading the files and applying the predicate at the compute layer.
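Single-column OR pruning can be sketched like this (file names and region bounds invented); a file survives if either branch could match its statistics:

```python
# Illustrative sketch: pruning for WHERE region = 'AMER' OR region = 'EMEA'.
# A file survives if either queried value could fall inside its [min, max] range.

files = [
    ("a.parquet", "AMER", "CALA"),   # 'AMER' in range -> read
    ("b.parquet", "EMEA", "EMEA"),   # 'EMEA' in range -> read
    ("c.parquet", "LATAM", "ZULU"),  # neither value possible -> skip
]

targets = ("AMER", "EMEA")
survivors = [name for name, lo, hi in files
             if any(lo <= t <= hi for t in targets)]
print(survivors)  # ['a.parquet', 'b.parquet']
```

For the harder cross-column case (status = 'failed' OR total > 10000), the same `any(...)` would have to span two different columns' statistics, and a file is skippable only when both branches are individually impossible, which is why pruning degrades there.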
Monitoring Pushdown Effectiveness
-- Check how many files were scanned vs. total
-- (requires query profiling — Spark's EXPLAIN, Dremio's query profile)
EXPLAIN SELECT * FROM db.orders WHERE order_date = '2026-05-14';
-- Look for:
-- "FileScanTask: 1/847 files" → only 1 of 847 files was read (excellent pushdown)
-- vs. "FileScanTask: 847/847 files" → full table scan (no pushdown benefit)
Maximizing Pushdown Effectiveness
| Strategy | Pushdown Benefit |
|---|---|
| Partition by query filter columns | Level 1: manifest list elimination |
| Sort/cluster by filter columns | Level 2: tight file-level min/max |
| Run compaction to reduce file count | Fewer manifest entries to evaluate |
| Enable bloom filters on ID columns | Levels 3-4: hash-based row group/page skipping |
| Use hidden partitioning correctly | Level 1 pruning from filters on raw source columns, no extra WHERE clauses needed |
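The sort/cluster row in the table is worth seeing concretely. The sketch below (synthetic data, invented helper names) shows why writing sorted data tightens file-level min/max ranges: with random writes every file's range spans nearly the whole domain, so a point lookup must read all files; after sorting, the ranges are narrow and disjoint:

```python
# Illustrative sketch: sorting by a filter column tightens per-file
# min/max statistics, which is what makes Level 2 skipping effective.

import random

random.seed(7)
values = [random.randint(0, 999) for _ in range(400)]

def file_ranges(vals, files=4):
    """Split values into equal-size 'files' and report each file's min/max."""
    per_file = len(vals) // files
    chunks = [vals[i * per_file:(i + 1) * per_file] for i in range(files)]
    return [(min(c), max(c)) for c in chunks]

def files_read(ranges, value):
    """How many files survive min/max pruning for a point lookup."""
    return sum(1 for lo, hi in ranges if lo <= value <= hi)

unsorted_ranges = file_ranges(values)          # random write order
sorted_ranges = file_ranges(sorted(values))    # clustered write order

# Unsorted: every file's range spans ~0..999, so all 4 files match.
# Sorted: the lookup value falls into at most one or two narrow ranges.
print(files_read(unsorted_ranges, 500), files_read(sorted_ranges, 500))
```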