ISC Open Source Projects / Kea / Issues / #1403

Closed
Open
Created Aug 26, 2020 by Marcin Siodelski (@marcin, Developer)

Recover from two servers in partner-down and one-way communication

We want to avoid situations in which two HA-enabled servers both end up in the partner-down state, but such situations may happen. Our state machine doesn't recover well from this when communication is broken in one direction only, i.e. the primary can't communicate with the standby but the standby can communicate with the primary (or vice versa). A server in the partner-down state that sees its partner in the same state will transition through waiting and syncing to the ready state, and will remain there waiting for the other one. The other server will remain in the partner-down state, allocating new leases. Even though the first server was initially in sync, the longer the other one remains in the partner-down state, the more leases get allocated. The partner in the ready state will not get these leases, and if the connection is re-established they won't be re-synced. The only way to get the servers back in sync is a manual restart. We should consider whether this is a problem and, if it is, maybe do something smarter.
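To make the scenario concrete, here is a toy simulation of the sequence described above. It is only a sketch and not Kea's HA code: the `Server` class, the `step` and `simulate` functions, the lease names, and the assumption that each transition takes exactly one tick are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for an HA peer; not a Kea data structure."""
    name: str
    state: str = "partner-down"
    leases: set = field(default_factory=set)


def step(server, partner, can_reach_partner):
    """Advance `server` by one tick of a simplified HA state machine."""
    if not can_reach_partner:
        # Heartbeats fail, so the server keeps believing the partner is down.
        return
    if server.state == "partner-down" and partner.state == "partner-down":
        server.state = "waiting"
    elif server.state == "waiting":
        server.state = "syncing"
    elif server.state == "syncing":
        # One-off lease synchronization, then wait in the ready state.
        server.leases |= partner.leases
        server.state = "ready"
    # In "ready" the server just waits for the partner; nothing re-syncs.


def simulate(ticks):
    """Primary cannot reach standby; standby can reach primary."""
    primary = Server("primary", leases={"lease-0"})
    standby = Server("standby", leases={"lease-0"})
    for tick in range(1, ticks + 1):
        if primary.state == "partner-down":
            # The primary keeps serving clients on its own.
            primary.leases.add(f"lease-{tick}")
        step(primary, standby, can_reach_partner=False)
        step(standby, primary, can_reach_partner=True)
    return primary, standby


if __name__ == "__main__":
    primary, standby = simulate(6)
    print(primary.state, standby.state)
    print("leases missing on standby:", sorted(primary.leases - standby.leases))
```

In this sketch the standby reaches the ready state after three ticks while the primary never leaves partner-down, so every lease the primary allocates after the one-off sync stays unknown to the standby. That growing gap is the one the issue describes.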

Edited Aug 26, 2020 by Marcin Siodelski