Conversation

@leaves12138
Contributor

@leaves12138 changed the title from "[WIP] [spark] Introduce global file index builder on spark" to "[spark] Introduce global file index builder on spark" on Nov 27, 2025
* @param sequenceNumberNullable sequence number is not null for user, but is nullable when read
* and write
*/
public static RowType rowTypeWithRowTracking(
Contributor

Please add a new method: rowTypeWithRowId(rowType)

Contributor Author

OK
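
One possible shape for the suggested helper, as a minimal sketch: it appends only a row-id column to an existing schema and leaves the sequence number out. The `Field` class, the `_ROW_ID` column name, and the `BIGINT NOT NULL` type string are illustrative assumptions, not the project's actual `RowType` API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class RowIdTypeSketch {

    /** Hypothetical stand-in for a schema field: a name plus a type string. */
    static final class Field {
        final String name;
        final String type;

        Field(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // Assumed column name and type for the row id; the real constants may differ.
    static final String ROW_ID_NAME = "_ROW_ID";
    static final String ROW_ID_TYPE = "BIGINT NOT NULL";

    /** Returns a copy of the given fields with a row-id field appended; the input is not modified. */
    static List<Field> rowTypeWithRowId(List<Field> rowType) {
        for (Field f : rowType) {
            if (f.name.equals(ROW_ID_NAME)) {
                throw new IllegalArgumentException("Row type already contains " + ROW_ID_NAME);
            }
        }
        List<Field> result = new ArrayList<>(rowType);
        result.add(new Field(ROW_ID_NAME, ROW_ID_TYPE));
        return Collections.unmodifiableList(result);
    }
}
```

Keeping this separate from `rowTypeWithRowTracking` would let callers that only need the row id avoid reasoning about sequence-number nullability.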

}

/** We combine the previous and new index files by file name. */
static class GlobalIndexCombiner implements IndexManifestFileCombiner {
Contributor

Rename this to GlobalFileNameCombiner

Contributor Author

Sure
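
To illustrate what "combine the previous and new index files by file name" could mean, here is a minimal sketch. It assumes each entry carries a file name and an ADD/DELETE flag, and that a new entry replaces or removes the previous entry with the same name. The `Entry` class and `combine` signature are hypothetical and do not reproduce the real `IndexManifestFileCombiner` interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FileNameCombinerSketch {

    /** Hypothetical stand-in for an index manifest entry: a file name plus an ADD/DELETE flag. */
    static final class Entry {
        final String fileName;
        final boolean isAdd;

        Entry(String fileName, boolean isAdd) {
            this.fileName = fileName;
            this.isAdd = isAdd;
        }
    }

    /** Merges previous and new entries keyed by file name; new entries win over previous ones. */
    static List<Entry> combine(List<Entry> previous, List<Entry> added) {
        // LinkedHashMap keeps first-insertion order, so the result is deterministic.
        Map<String, Entry> byName = new LinkedHashMap<>();
        for (Entry e : previous) {
            byName.put(e.fileName, e);
        }
        for (Entry e : added) {
            if (e.isAdd) {
                byName.put(e.fileName, e); // add a new file or replace one with the same name
            } else {
                byName.remove(e.fileName); // a DELETE entry drops the file from the result
            }
        }
        return new ArrayList<>(byName.values());
    }
}
```

Keying on the file name alone is what the suggested `GlobalFileNameCombiner` name makes explicit.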

this.context = context;
}

public List<IndexManifestEntry> build(Dataset<Row> input) {
Contributor

We could have a new framework that takes a range plus files as input and outputs index files.

Contributor Author

OK
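
A rough sketch of what such a framework could look like, independent of Spark: each task pairs a row range with the data files that cover it, and a pluggable writer returns the index files it produced. All names here (`Task`, `IndexWriter`, `buildAll`) are hypothetical and only illustrate the input/output shape described in the comment.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeIndexFrameworkSketch {

    /** Hypothetical unit of work: an inclusive row range and the data files covering it. */
    static final class Task {
        final long from;
        final long to;
        final List<String> dataFiles;

        Task(long from, long to, List<String> dataFiles) {
            this.from = from;
            this.to = to;
            this.dataFiles = dataFiles;
        }
    }

    /** Hypothetical plug-in point: turns one task into the names of the index files it wrote. */
    interface IndexWriter {
        List<String> write(Task task);
    }

    /** Runs every task through the writer and collects the index file names in task order. */
    static List<String> buildAll(List<Task> tasks, IndexWriter writer) {
        List<String> indexFiles = new ArrayList<>();
        for (Task t : tasks) {
            if (t.from > t.to) {
                throw new IllegalArgumentException("Invalid range: " + t.from + " > " + t.to);
            }
            indexFiles.addAll(writer.write(t));
        }
        return indexFiles;
    }
}
```

With this split, an engine-specific `build` method would only need to plan the tasks and distribute them, while the index format lives behind the writer.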

@JingsongLi
Contributor

+1

@JingsongLi merged commit e14a811 into apache:master on Nov 27, 2025
27 of 31 checks passed
2 participants