Engines & Integrations Last updated: May 14, 2026

StarRocks and Apache Iceberg

StarRocks is a high-performance OLAP query engine with native Apache Iceberg external table support via its Multi-Catalog architecture, enabling sub-second analytics over Iceberg tables stored in S3, HDFS, or other storage without data ingestion.

StarRocks (formerly known as DorisDB, not to be confused with Apache Doris) is a high-performance MPP (massively parallel processing) OLAP database designed for real-time and interactive analytics. Through its Multi-Catalog architecture, StarRocks can query Apache Iceberg tables as external tables directly in object storage, with no data ingestion or ETL step.

StarRocks is widely used in the Asia-Pacific tech ecosystem and has a growing global community, particularly for use cases requiring sub-second multi-dimensional analytics over large Iceberg datasets.

StarRocks Multi-Catalog for Iceberg

StarRocks’ Multi-Catalog feature lets you create catalog connections to external table formats, including Iceberg, Hive, Delta Lake, and Hudi, and query them alongside StarRocks’ native internal tables.

Creating an Iceberg Catalog

-- StarRocks: create an Iceberg catalog using Hive Metastore
CREATE EXTERNAL CATALOG iceberg_hms
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "hive",
    "hive.metastore.uris" = "thrift://hms-host:9083",
    "aws.s3.use_instance_profile" = "true",
    "aws.s3.region" = "us-east-1"
);

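-- Using AWS Glue (the Glue metastore and S3 storage each take their own
-- credential and region settings)
CREATE EXTERNAL CATALOG iceberg_glue
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "glue",
    "aws.glue.use_instance_profile" = "true",
    "aws.glue.region" = "us-east-1",
    "aws.s3.use_instance_profile" = "true",
    "aws.s3.region" = "us-east-1"
);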

-- Using Iceberg REST Catalog (Apache Polaris)
CREATE EXTERNAL CATALOG iceberg_polaris
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "rest",
    "iceberg.catalog.uri" = "https://my-polaris.example.com",
    "iceberg.catalog.credential" = "client-id:client-secret",
    "iceberg.catalog.warehouse" = "my-warehouse"
);
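
After creating a catalog, it can help to confirm that StarRocks registered it and can reach the metastore behind it. The statements below are a minimal sketch, assuming the `iceberg_hms` catalog from the first example exists and its Hive Metastore is reachable. Results depend entirely on your environment, so no output is shown.

```sql
-- List all catalogs, including the built-in default_catalog
SHOW CATALOGS;

-- Show the definition StarRocks stored for one catalog
SHOW CREATE CATALOG iceberg_hms;

-- List the Iceberg namespaces visible through that catalog
SHOW DATABASES FROM iceberg_hms;
```

If `SHOW DATABASES` fails or returns nothing, the metastore URI and storage credentials in the catalog properties are the first things to check.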

Querying Iceberg Tables

-- Set the Iceberg catalog as current
SET CATALOG iceberg_polaris;

-- List namespaces and tables
SHOW DATABASES;
SHOW TABLES FROM analytics;

-- Query an Iceberg table; the filters on order_date and region can be pushed down to the Iceberg scan
SELECT
    date_trunc('month', order_date) AS month,
    region,
    COUNT(*) AS orders,
    SUM(total) AS revenue
FROM analytics.orders
WHERE order_date >= '2026-01-01'
  AND region IN ('AMER', 'EMEA')
GROUP BY 1, 2
ORDER BY 1, 4 DESC;
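
Switching the session catalog is optional. As a sketch, assuming the same `iceberg_polaris` catalog, `analytics` namespace, and `orders` table used above, a table can also be addressed by its three-part name, or the catalog and database can be selected together with `USE`:

```sql
-- Address the table as catalog.database.table without changing session state
SELECT COUNT(*) AS orders
FROM iceberg_polaris.analytics.orders
WHERE order_date >= '2026-01-01';

-- Or select catalog and database in one statement, then use bare table names
USE iceberg_polaris.analytics;

SELECT region, SUM(total) AS revenue
FROM orders
GROUP BY region;
```

Three-part names tend to be the safer choice in scripts that touch more than one catalog, because they do not depend on what an earlier statement set.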

Cross-Catalog Joins

StarRocks can join internal StarRocks tables with external Iceberg tables in a single query:

-- Join StarRocks internal dimension table with Iceberg fact table
SELECT
    d.product_name,
    d.category,
    SUM(f.revenue) AS total_revenue
FROM iceberg_polaris.analytics.fact_orders f
JOIN default_catalog.dim.products d
    ON f.product_id = d.product_id
WHERE f.order_date >= '2026-01-01'
GROUP BY 1, 2;
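
For the join above to run, the internal side has to exist in `default_catalog`. The following is only an illustrative sketch of what that dimension table might look like: the column types, the choice of a Primary Key table, and the hash distribution are assumptions, not something the original example specifies.

```sql
-- Switch to the built-in internal catalog
SET CATALOG default_catalog;

CREATE DATABASE IF NOT EXISTS dim;

-- Hypothetical schema: only the three columns the join references
CREATE TABLE IF NOT EXISTS dim.products (
    product_id   BIGINT NOT NULL,
    product_name VARCHAR(255),
    category     VARCHAR(64)
)
PRIMARY KEY (product_id)
DISTRIBUTED BY HASH (product_id);
-- On a test cluster with fewer than three backends, you may need to add
-- PROPERTIES ("replication_num" = "1") to the CREATE TABLE statement.
```

Keeping a small, frequently joined dimension table internal while the large fact table stays in Iceberg is one plausible reason to use a cross-catalog join in the first place.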

StarRocks Performance Characteristics for Iceberg

StarRocks’ query engine applies the full Iceberg optimization stack:

StarRocks also uses a vectorized execution engine (SIMD instructions, columnar processing) that makes it particularly fast for aggregation-heavy analytical queries.
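
One way to see how StarRocks plans a scan over an Iceberg table is to inspect the query plan. This is a sketch, assuming the `iceberg_polaris.analytics.orders` table from the earlier examples. The plan text varies by StarRocks version, so no output is reproduced here.

```sql
-- Show the query plan without running the query
EXPLAIN
SELECT region, SUM(total) AS revenue
FROM iceberg_polaris.analytics.orders
WHERE order_date >= '2026-01-01'
GROUP BY region;

-- Same plan with additional detail per plan node
EXPLAIN VERBOSE
SELECT region, SUM(total) AS revenue
FROM iceberg_polaris.analytics.orders
WHERE order_date >= '2026-01-01'
GROUP BY region;
```

In the output, look for the scan node on the Iceberg table and the predicates attached to it. Filters that appear on the scan node are applied while reading the table rather than in a later step.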

StarRocks vs. Other Iceberg Query Engines

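Aspect              | StarRocks                | Trino                                  | Dremio
--------------------|--------------------------|----------------------------------------|----------------------------------
Primary strength    | Real-time OLAP           | General SQL                            | AI Analytics + BI
Native table format | StarRocks internal       | None (all external)                    | Iceberg (via Dremio Open Catalog)
AI integration      | No                       | No                                     | Yes (AI Semantic Layer)
Streaming ingest    | Yes (native)             | No                                     | No
Community           | Global (large in APAC)   | Large (open source, Apache 2.0 license) | Enterprise
Best for            | Real-time OLAP + Iceberg | General lakehouse SQL                  | AI, BI, federated analytics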

StarRocks and the Modern Iceberg Ecosystem

StarRocks fits well in architectures where:

For AI analytics, semantic layer, and natural language query capabilities on Iceberg data, Dremio complements StarRocks in the same lakehouse architecture.
