Implement comma-separated parsing for chunk key columns #4115

xxntti3n · 2025-09-09T15:55:19Z

This PR adds support for per-table chunk key column configuration in the PostgreSQL CDC connector, enabling fine-grained control over incremental snapshot
chunking. Previously, all tables shared the same chunk key column, which was inefficient for heterogeneous table schemas.

Key Changes

✨ Added chunk key column parsing logic to PostgreSQL DataSourceFactory
🔧 Implemented comma-separated configuration format support

Example Usage

sourceConf:
  scan.incremental.snapshot.chunk.key-column: public.action_logs:created_at,public.service_logs:created_at
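
To illustrate the comma-separated format described above, here is a minimal sketch of how such a value could be split into per-table entries. It is not the code from this PR: the class name `ChunkKeyColumnParser` and the method `parse` are hypothetical, and it uses plain `String` table names as keys rather than the connector's own table-path type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, for illustration only; not part of the connector. */
public class ChunkKeyColumnParser {

    /**
     * Parses a value such as "schema.table:column,schema.table:column"
     * into a map from table name to chunk key column.
     */
    public static Map<String, String> parse(String value) {
        // LinkedHashMap keeps the entries in the order they were configured.
        Map<String, String> result = new LinkedHashMap<>();
        if (value == null || value.trim().isEmpty()) {
            return result;
        }
        for (String entry : value.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing or doubled comma
            }
            // The last ':' separates the table name from the column name.
            int sep = trimmed.lastIndexOf(':');
            if (sep <= 0 || sep == trimmed.length() - 1) {
                throw new IllegalArgumentException(
                        "Expected 'schema.table:column' but got: " + trimmed);
            }
            String table = trimmed.substring(0, sep).trim();
            String column = trimmed.substring(sep + 1).trim();
            result.put(table, column);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> parsed =
                parse("public.action_logs:created_at,public.service_logs:created_at");
        parsed.forEach((table, column) -> System.out.println(table + " -> " + column));
    }
}
```

Rejecting entries without a `:` separator is a design choice of this sketch; a real implementation might instead treat such a value as a single column name that applies to every table, to stay compatible with the existing single-column configuration.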

lvyanquan · 2025-09-11T02:00:06Z

Please add more description about this change.

Mrart · 2025-09-19T01:28:44Z

@xxntti3n I'm trying to understand what you want to do. Do you want to configure a separate chunk key column for each schema.table? What's wrong with the previous chunkKeyColumn configuration?

xxntti3n · 2025-09-25T03:26:04Z

> @xxntti3n I'm trying to understand what you want to do. Do you want to configure a separate chunk key column for each schema.table? What's wrong with the previous chunkKeyColumn configuration?

Yes, I want to configure chunk key columns for multiple tables, as the MySQL dialect allows. Currently, Postgres supports only one table. Example configuration: scan.incremental.snapshot.chunk.key-column: public.table1:created_at,public.table2:created_at


xxntti3n · 2025-09-25T08:10:21Z

Hi @lvyanquan, @Mrart - I've added the PR description and would appreciate your review. Looking forward to your feedback. Thanks!

Mrart · 2025-12-15T02:06:42Z

...cdc/src/main/java/org/apache/flink/cdc/connectors/postgres/source/PostgresSourceBuilder.java

+                new PostgresIncrementalSource<>(
+                        configFactory, checkNotNull(deserializer), offsetFactory, dialect);
+
+        return source;


Why do we need to change this? Was returning the source directly not good enough?

Mrart · 2025-12-15T02:10:46Z

.../org/apache/flink/cdc/connectors/base/source/reader/external/JdbcSourceFetchTaskContext.java

 @Internal
 public abstract class JdbcSourceFetchTaskContext implements FetchTask.Context {
-
+    private static final Logger LOG = LoggerFactory.getLogger(JdbcSourceFetchTaskContext.class);


Is this LOG actually used anywhere?

Mrart · 2025-12-15T02:14:02Z

...cdc-base/src/main/java/org/apache/flink/cdc/connectors/base/source/utils/JdbcChunkUtils.java

+
+                if (tableName != null && !tableName.equalsIgnoreCase(table)) {
+                    return false;
+                }


Can we extract the test functions?

Mrart · 2025-12-15T02:26:54Z

...ink-cdc-base/src/main/java/org/apache/flink/cdc/connectors/base/utils/SourceRecordUtils.java

+        if (keyStruct != null && keyStruct.schema().field(splitFieldName) != null) {
+            return new Object[] {keyStruct.get(splitFieldName)};
+        }
+        LOG.info("Get Split Key From Value {} {}", dataRecord, splitFieldName);


Does this need to be logged here, or should it be a debug-level log?

Mrart · 2025-12-15T02:28:58Z

...ink-cdc-base/src/main/java/org/apache/flink/cdc/connectors/base/utils/SourceRecordUtils.java

+        Struct value = (Struct) dataRecord.value();
+        if (value == null) {
+            return null; // No value struct available
+        }


dataRecord may not be null?

Mrart · 2025-12-15T02:32:12Z

...java/org/apache/flink/cdc/connectors/postgres/source/config/PostgresSourceConfigFactory.java


    private int lsnCommitCheckpointsDelay;

+    private Map<ObjectPath, String> chunkKeyColumns = new HashMap<>();


Is ConcurrentHashMap more appropriate?

github-actions bot added base postgres-cdc-connector labels Sep 9, 2025

github-actions bot added the build label Sep 11, 2025

xxntti3n force-pushed the universal-chunk-column-dev branch from 055ec61 to d8d2087 Compare September 13, 2025 18:10

github-actions bot added mysql-cdc-connector and removed mysql-cdc-connector labels Sep 15, 2025

github-actions bot added the mysql-cdc-connector label Sep 22, 2025

xxntti3n force-pushed the universal-chunk-column-dev branch 4 times, most recently from 5314014 to 4fc6319 Compare September 23, 2025 16:45

xxntti3n force-pushed the universal-chunk-column-dev branch from bae9fcf to c76cf80 Compare September 25, 2025 07:29

github-actions bot removed common runtime mongodb-cdc-connector build mysql-cdc-connector dist e2e-tests mysql-pipeline-connector paimon-pipeline-connector oceanbase-pipeline-connector maxcompute-pipeline-connector debezium postgres-pipeline-connector labels Sep 25, 2025

xxntti3n force-pushed the universal-chunk-column-dev branch from 2f65cc3 to 2742ce7 Compare September 25, 2025 07:42

feat: support per-table chunk key columns for incremental snapshot splitting

956c4e0

xxntti3n force-pushed the universal-chunk-column-dev branch from 2742ce7 to 956c4e0 Compare September 25, 2025 07:44

github-actions bot added the docs Improvements or additions to documentation label Sep 25, 2025

xxntti3n changed the title ~~add handle chunkKeyColumn for Postgres~~ Implement comma-separated parsing for chunk key columns Sep 25, 2025

tien-nguyen6-cake added 2 commits October 18, 2025 13:24

log infor when get split key from value

0802b30

log infor when get split key from value

31397de

Mrart reviewed Dec 15, 2025

4 participants

