Implement a write-stop during failover to ensure data consistency during a planned primary-secondary switch #384

@Paragrf

Description

Motivation

Modify the controller so that master and replica data stay consistent during a planned failover, aligning with the server-side changes in apache/kvrocks#3377.

Solution

  • Step 1 (Pause): Send CLIENT PAUSE WRITE from the controller to the current master.

  • Step 2 (Wait): Monitor the master-replica sequence gap until it reaches zero, so the target replica has received every write before the switch and no data is lost.

  • Step 3 (Metadata): Update the global topology metadata for the switchover.

  • Step 4 (Switch & Unpause): Promote the target and demote the old master, then explicitly call CLIENT UNPAUSE on the old master to lift the write pause.

  • Step 5 (Replicate): Reconfigure all other followers to sync from the new master.

Are you willing to submit a PR?

I'm willing to submit a PR!
