[AURON #2015] Add Native Scan Support for Apache Iceberg Copy-On-Write Tables. by slfan1989 · Pull Request #2016 · apache/auron

slfan1989 · 2026-02-18T01:11:05Z

Which issue does this PR close?

Closes #2015

Rationale for this change

This PR adds native scan support for Apache Iceberg Copy-On-Write (COW) tables to improve query performance. Currently, Auron has no direct integration with Iceberg, so every Iceberg query runs on Spark's built-in execution path and misses the acceleration available from Auron's native engine.

Key Motivations:

Enable Auron's native execution engine to read Iceberg tables directly
Leverage native performance optimizations for Iceberg COW tables
Provide automatic fallback to Spark scan for unsupported scenarios
Lay the foundation for future Iceberg feature enhancements (MOR tables, pruning predicates, etc.)

What changes are included in this PR?

Core Implementation:

IcebergConvertProvider - SPI extension point that detects Iceberg scans and decides whether to use native execution
IcebergScanSupport - Decision logic that validates scan plans and checks for COW table eligibility
NativeIcebergTableScanExec - Native execution node that converts Iceberg FileScanTask to native scan plans
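
As an illustration of how these three pieces could fit together, here is a minimal sketch of a provider-style dispatch with fallback. Every type and method name in it (Plan, ConvertProvider, IcebergProvider, convertOrFallBack) is a hypothetical stand-in and not Auron's actual AuronConvertProvider contract. The sketch also passes providers in explicitly, whereas the PR registers its provider through a META-INF/services file.

```scala
// Hypothetical, simplified model of SPI-style conversion with fallback.
// None of these names are taken from the Auron code base.
object ConvertProviderSketch {

  // Simplified stand-ins for physical plan nodes.
  sealed trait Plan
  final case class IcebergBatchScan(table: String, cowOnly: Boolean) extends Plan
  final case class OtherScan(name: String) extends Plan
  final case class NativeScan(table: String) extends Plan

  // Assumed provider contract: report whether a plan can be converted, then convert it.
  trait ConvertProvider {
    def isEnabled: Boolean
    def isSupported(plan: Plan): Boolean
    def convert(plan: Plan): Plan
  }

  final class IcebergProvider(enabled: Boolean) extends ConvertProvider {
    def isEnabled: Boolean = enabled

    // Only Iceberg scans that pass the (here trivial) COW check are accepted.
    def isSupported(plan: Plan): Boolean = plan match {
      case IcebergBatchScan(_, cowOnly) => cowOnly
      case _ => false
    }

    def convert(plan: Plan): Plan = plan match {
      case IcebergBatchScan(table, _) => NativeScan(table)
      case other => other
    }
  }

  // The first enabled provider that supports the plan converts it;
  // otherwise the original Spark plan is returned unchanged (the fallback).
  def convertOrFallBack(providers: Seq[ConvertProvider], plan: Plan): Plan =
    providers
      .find(p => p.isEnabled && p.isSupported(plan))
      .map(_.convert(plan))
      .getOrElse(plan)
}
```

Returning the untouched plan when no provider accepts it is what makes the fallback automatic in this sketch: the caller never needs to know whether a conversion happened.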

Build & Configuration:

Updated pom.xml with Iceberg version management and Maven enforcer rules
Modified auron-build.sh to support Iceberg build parameters
Added configuration option: spark.auron.enable.iceberg.scan (default: true)

Supported Features:

Iceberg COW tables (Parquet and ORC formats)
Projection pushdown (column pruning)
Partitioned and non-partitioned tables
Automatic fallback for unsupported scenarios
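
The fallback conditions exercised by the tests in this PR (residual filters, metadata columns, decimal types, delete files, and the configuration toggle) can be pictured as a single decision function. The sketch below is only a simplified model under that assumption: ScanInfo, Decision, and decide are hypothetical names, and the real IcebergScanSupport works on Spark and Iceberg plan objects rather than on a flat case class.

```scala
import java.util.Locale

// Hypothetical, simplified model of the eligibility decision.
// None of these names are taken from the Auron or Iceberg code bases.
object IcebergScanEligibilitySketch {

  // What the planner is assumed to know about one Iceberg scan.
  final case class ScanInfo(
      fileFormat: String,            // e.g. "parquet", "orc", "avro"
      hasDeleteFiles: Boolean,       // true for MOR-style scans that carry delete files
      hasResidualFilter: Boolean,    // a filter that still has to be evaluated after the scan
      readsMetadataColumns: Boolean, // the query projects Iceberg metadata columns
      columnTypes: Seq[String])      // simplified type names of the projected columns

  sealed trait Decision
  case object UseNativeScan extends Decision
  final case class FallBackToSpark(reason: String) extends Decision

  private val supportedFormats = Set("parquet", "orc")

  // Checks run in order; the first failing check decides the fallback reason.
  def decide(scanEnabled: Boolean, scan: ScanInfo): Decision =
    if (!scanEnabled) FallBackToSpark("spark.auron.enable.iceberg.scan is false")
    else if (!supportedFormats.contains(scan.fileFormat.toLowerCase(Locale.ROOT)))
      FallBackToSpark(s"unsupported file format: ${scan.fileFormat}")
    else if (scan.hasDeleteFiles) FallBackToSpark("delete files present (MOR)")
    else if (scan.hasResidualFilter) FallBackToSpark("residual filter present")
    else if (scan.readsMetadataColumns) FallBackToSpark("metadata columns requested")
    else if (scan.columnTypes.exists(_.toLowerCase(Locale.ROOT).startsWith("decimal")))
      FallBackToSpark("decimal column in projection")
    else UseNativeScan
}
```

Carrying a reason string in the fallback case is a design choice of this sketch; it makes it easy to log why a particular scan stayed on the Spark path.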

Version Support:

Spark: 3.4, 3.5, 4.0 only
Iceberg: 1.10.1 only (enforced by Maven)
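
The version constraint above is enforced at build time by Maven. For illustration only, the same rule could be expressed as a small check like the one below; VersionGateSketch, majorMinor, and isSupported are hypothetical names and do not describe how IcebergConvertProvider actually performs its compatibility check.

```scala
// Hypothetical sketch of the version rule stated above.
// The names are illustrative and not taken from the Auron code base.
object VersionGateSketch {

  private val supportedSparkLines = Set("3.4", "3.5", "4.0")
  private val supportedIcebergVersion = "1.10.1"

  // Reduce a full version such as "3.5.8" to its "major.minor" line, "3.5".
  def majorMinor(version: String): Option[String] =
    version.split('.') match {
      case Array(major, minor, _*) if major.nonEmpty && minor.nonEmpty =>
        Some(s"$major.$minor")
      case _ => None
    }

  // Supported only when the Spark line is listed and Iceberg matches exactly.
  def isSupported(sparkVersion: String, icebergVersion: String): Boolean =
    majorMinor(sparkVersion).exists(supportedSparkLines.contains) &&
      icebergVersion == supportedIcebergVersion
}
```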

Are there any user-facing changes?

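No breaking changes: existing functionality remains unchanged. Iceberg support is additive, and unsupported scenarios automatically fall back to the Spark scan. The new option spark.auron.enable.iceberg.scan (default: true) can be used to turn the native Iceberg scan off.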

How was this patch tested?

Unit & Integration Tests:

Added 10 integration test cases in AuronIcebergIntegrationSuite:
- Simple COW table scan
- Projection pushdown
- Partitioned table with partition filter
- ORC format support
- Empty table handling
- Residual filters fallback
- Metadata columns fallback
- Decimal type fallback
- Delete files (MOR) fallback
- Configuration toggle functionality

Test Environment:

Spark versions: 3.4.4, 3.5.8, 4.0.2
Iceberg version: 1.10.1
File formats: Parquet, ORC
Scala versions: 2.12, 2.13

slfan1989 · 2026-02-24T01:25:37Z

@cxzl25 @richox I’ve submitted the first version of the Iceberg-support code. It can now basically read COW tables, and I’ve added unit tests that pass in CI. If you have some time, could you please take a look and share any feedback? Thank you very much!

cxzl25 · 2026-02-24T14:57:46Z

dev/reformat

+# Check or format all code, including third-party code, with spark-3.4
 sparkver=spark-3.5


Should the comment be spark-3.5?

Copilot

Pull request overview

This PR adds native scan support for Apache Iceberg Copy-On-Write (COW) tables to the Auron execution engine, enabling direct reads of Iceberg data files through Auron's native path for improved performance. The implementation follows the established SPI (Service Provider Interface) pattern used by other data source integrations like Paimon, with automatic fallback to Spark's execution path for unsupported scenarios.

Changes:

Adds IcebergConvertProvider SPI extension to detect and convert Iceberg BatchScanExec nodes to native execution
Implements validation logic to determine COW table eligibility (no delete files, no metadata columns, supported data types)
Creates NativeIcebergTableScanExec to execute native Iceberg scans with Parquet/ORC format support

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
thirdparty/auron-iceberg/src/main/scala/org/apache/spark/sql/auron/iceberg/IcebergConvertProvider.scala	SPI provider that checks version compatibility and delegates to IcebergScanSupport
thirdparty/auron-iceberg/src/main/scala/org/apache/spark/sql/auron/iceberg/IcebergScanSupport.scala	Core validation logic to determine native scan eligibility and extract FileScanTask metadata via reflection
thirdparty/auron-iceberg/src/main/scala/org/apache/spark/sql/execution/auron/plan/NativeIcebergTableScanExec.scala	Native execution node that converts Iceberg tasks to FilePartitions and generates protobuf scan plans
thirdparty/auron-iceberg/src/test/scala/org/apache/auron/iceberg/AuronIcebergIntegrationSuite.scala	Integration tests covering COW tables, projections, partitioning, ORC format, and fallback scenarios
thirdparty/auron-iceberg/src/test/scala/org/apache/auron/iceberg/BaseAuronIcebergSuite.scala	Test base configuration with Auron and Iceberg extensions enabled
thirdparty/auron-iceberg/src/main/resources/META-INF/services/org.apache.spark.sql.auron.AuronConvertProvider	SPI registration file for IcebergConvertProvider
thirdparty/auron-iceberg/pom.xml	Maven enforcer rules to validate Iceberg version (1.10.1) and Spark version (3.4-4.0) compatibility
spark-extension/src/main/java/org/apache/auron/spark/configuration/SparkAuronConfiguration.java	Adds ENABLE_ICEBERG_SCAN configuration option
spark-extension/src/main/scala/org/apache/spark/sql/auron/AuronConverters.scala	Adds default value handling for shuffle manager configuration
spark-extension/pom.xml	Adds arrow-memory-core and arrow-memory-netty dependencies
pom.xml	Adds Iceberg version properties and enforcer rules for all Spark version profiles
auron-build.sh	Updates Iceberg version support to 1.10.1 and Spark version range to 3.4-4.0
dev/reformat	Updates formatting script to include Iceberg module with version 1.10.1
.github/workflows/iceberg.yml	CI workflow for testing Iceberg integration across Spark 3.4, 3.5, 4.0 with multiple Java versions

github-actions bot added spark build thirdparty-iceberg infra common labels Feb 18, 2026

slfan1989 force-pushed the auron-2015 branch from f954ce5 to 41e9318 Compare February 18, 2026 02:29

github-actions bot added dev-tools and removed common labels Feb 18, 2026

cxzl25 requested a review from Copilot February 24, 2026 14:55

Copilot started reviewing on behalf of cxzl25 February 24, 2026 14:56

cxzl25 reviewed Feb 24, 2026

Copilot AI reviewed Feb 24, 2026

[AURON apache#2015] Add Native Scan Support for Apache Iceberg Copy-O…

6b907b1

…n-Write Tables. Signed-off-by: slfan1989 <slfan1989@apache.org>

slfan1989 force-pushed the auron-2015 branch from 9ad3de5 to 6b907b1 Compare March 4, 2026 13:42

slfan1989 and others added 2 commits March 4, 2026 21:48

[AURON apache#2015] Add Native Scan Support for Apache Iceberg Copy-O…

6ba770e

…n-Write Tables. Signed-off-by: slfan1989 <slfan1989@apache.org>

Merge branch 'apache:master' into auron-2015

5c11f33

slfan1989 wants to merge 3 commits into apache:master from slfan1989:auron-2015

3 participants
