[spark] Support data evolution for bucket table #6649
base: master
Conversation
```java
private static void validateRowTracking(TableSchema schema, CoreOptions options) {
    boolean rowTrackingEnabled = options.rowTrackingEnabled();
    if (rowTrackingEnabled) {
        checkArgument(
```
Remove the whole `checkArgument`.
```diff
 Notice that:
-- Row tracking is only supported for unaware append tables, not for primary key tables. Which means you can't define `bucket` and `bucket-key` for the table.
+- Row tracking is only supported for unaware or hash_fixed bucket append tables, not for primary key tables.
```
Just say "supported for append tables".
And maybe you can add more tests for bucketed tables.
@JingsongLi I've made the requested modifications and consolidated the unit tests for both bucket and non-bucket cases. PTAL
```java
checkArgument(
    entry.file().fileSource().isPresent(),
    "This is a bug, file source field for row-tracking table must present.");
if (entry.file().fileSource().get().equals(FileSource.APPEND)
```
Why modify here?
Because when compaction runs in the writer, the file is rewritten by compaction before it is committed, which is when the row id is generated. In that scenario, the files produced by compaction would end up without row ids.
For row-tracking-only tables, compacted files should not be assigned new row ids; the row ids are already stored in the data files.
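The distinction being debated here can be sketched as follows. This is a toy model, not the PR's actual code; the class, enum, and method names are made up for illustration. The point it demonstrates: files whose source is APPEND receive a fresh contiguous row-id range at commit time, while files produced by compaction keep the row ids already written into them.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of commit-time row-id assignment for a row-tracking table.
// Only APPEND files get new row ids; COMPACT files merely rewrite rows
// that already carry row ids, so they are skipped.
public class RowIdAssignmentSketch {

    enum FileSource { APPEND, COMPACT }

    record DataFile(String name, FileSource source, long rowCount) {}

    /** Assigns a contiguous row-id range to each APPEND file, starting at nextRowId. */
    static long assignRowIds(List<DataFile> committed, long nextRowId, List<String> log) {
        for (DataFile file : committed) {
            if (file.source() == FileSource.APPEND) {
                log.add(file.name() + " -> rowIds [" + nextRowId + ", "
                        + (nextRowId + file.rowCount() - 1) + "]");
                nextRowId += file.rowCount();
            } else {
                // Compacted files keep the row ids already stored in the data files.
                log.add(file.name() + " -> keeps existing row ids");
            }
        }
        return nextRowId;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        long next = assignRowIds(
                List.of(new DataFile("f1", FileSource.APPEND, 100),
                        new DataFile("f2", FileSource.COMPACT, 150),
                        new DataFile("f3", FileSource.APPEND, 50)),
                1000, log);
        log.forEach(System.out::println);
        System.out.println("nextRowId=" + next);
    }
}
```

Under this model, a writer-side compaction that runs before commit never consumes new row ids, which matches the reviewer's point above.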
Maybe we should have a new bucketed append table mode here that ignores sequence and compaction, just like normal append tables.
> Maybe we should have a new bucketed append table mode here. Without considering sequence and compaction. Just like normal Append tables.

Good idea.
Purpose
Support data evolution for append-only fixed hash bucket tables.
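For reference, the table option combination this PR targets might look like the following. This is a sketch shown as a plain options map rather than through Paimon's API; the key name `row-tracking.enabled` is an assumption for illustration, while `bucket` and `bucket-key` are the options discussed in the review above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the option combination enabled by this PR: before the change,
// row tracking required an unaware append table (no bucket / bucket-key);
// with the change, a fixed hash bucket append table may enable it too.
public class DataEvolutionOptionsSketch {

    public static Map<String, String> bucketedRowTrackingOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("bucket", "4");       // fixed hash bucket count
        options.put("bucket-key", "id");  // hash key; no primary key defined
        options.put("row-tracking.enabled", "true"); // assumed key name
        return options;
    }

    public static void main(String[] args) {
        bucketedRowTrackingOptions().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```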
Tests
API and Format
Documentation