@LsomeYeah
Contributor

Purpose

Linked issue: close #xxx

If new columns are added to the table, and new files containing these columns are written while the dedicated compaction job in streaming mode has not been restarted, compaction can cause data loss. Specifically, the data in the new columns may be lost after compaction.

This PR detects schema changes in newly added files for dedicated compaction in streaming mode. If the schema of the files has changed, the writer is refreshed.
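The detection step can be illustrated with a minimal sketch. It is not the code in this PR: `SchemaChangeCheck` and `hasSchemaChanged` are hypothetical names, and the sketch assumes each new file carries the id of the schema it was written with and that schema ids only increase.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of the PR or the Paimon API.
final class SchemaChangeCheck {

    private SchemaChangeCheck() {}

    /**
     * Returns true if any newly added file was written with a schema newer than
     * the one the compaction writer currently uses.
     * Assumption: schema ids increase monotonically with each schema change.
     */
    static boolean hasSchemaChanged(long writerSchemaId, List<Long> newFileSchemaIds) {
        for (long fileSchemaId : newFileSchemaIds) {
            if (fileSchemaId > writerSchemaId) {
                return true;
            }
        }
        return false;
    }
}
```

When the check returns true, the compaction job would refresh its writer before compacting the new files, so that the added columns are read and rewritten instead of being dropped.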

Tests

API and Format

Documentation

}

@Nullable
public static WriterRefresher create(
Contributor

Introduce a CompactRefresher which can wrap WriterRefresher.
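One possible shape for such a wrapper is sketched below. Everything here is an assumption made for illustration: the single-method `WriterRefresher` interface is a stand-in for the real class, the constructor and method signatures of `CompactRefresher` are hypothetical, and schema ids are assumed to increase monotonically. The wrapper tolerates a null delegate because the `create` factory shown in the diff is annotated `@Nullable`.

```java
import java.util.List;

// Stand-in with an assumed single-method shape; the real WriterRefresher may differ.
interface WriterRefresher {
    void tryRefresh();
}

// Hypothetical wrapper: decides when to refresh, delegates the refresh itself.
final class CompactRefresher {

    private final WriterRefresher writerRefresher; // may be null
    private long latestSchemaId;

    CompactRefresher(WriterRefresher writerRefresher, long initialSchemaId) {
        this.writerRefresher = writerRefresher;
        this.latestSchemaId = initialSchemaId;
    }

    /**
     * Triggers the wrapped refresher if any new file carries a newer schema id.
     * Returns true only when a refresh was actually triggered.
     */
    boolean tryRefresh(List<Long> newFileSchemaIds) {
        if (writerRefresher == null) {
            return false; // refreshing is disabled for this job
        }
        long maxSeen = latestSchemaId;
        for (long id : newFileSchemaIds) {
            maxSeen = Math.max(maxSeen, id);
        }
        if (maxSeen <= latestSchemaId) {
            return false; // no file is newer than the schema already in use
        }
        writerRefresher.tryRefresh();
        latestSchemaId = maxSeen;
        return true;
    }
}
```

Keeping the schema comparison in the wrapper leaves `WriterRefresher` unchanged and confines the compaction-specific trigger to one place.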
