Skip to content
GitLab
Projects Groups Snippets
  • /
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in / Register
  • stork stork
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
  • Issues 274
    • Issues 274
    • List
    • Boards
    • Service Desk
    • Milestones
  • Merge requests 25
    • Merge requests 25
  • Deployments
    • Deployments
    • Releases
  • Packages and registries
    • Packages and registries
    • Container Registry
  • Monitor
    • Monitor
    • Incidents
  • Analytics
    • Analytics
    • Value stream
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Commits
  • Issue Boards
Collapse sidebar
  • ISC Open Source ProjectsISC Open Source Projects
  • storkstork
  • Issues
  • #446
Closed
Open
Issue created Nov 06, 2020 by Marcin Siodelski@marcinMaintainer

Race and constraint errors when adding Kea with many subnets

This is a result of sanity checks for Stork 0.13.0 release. See #441 (comment 174606)

Copying over the errors and the debugging result below:

While testing the demo on Ubuntu 20.04 I apparently reproduced the issue earlier mentioned by @godfryd here: !230 (comment 173942)

I've got the following error trace:

server_1                  | ERRO[2020-11-05 10:03:47]      statepuller.go:278   cannot store application state: ERROR #23505 duplicate key value violates unique constraint "access_point_unique_idx"
server_1                  | problem with adding access point to app 11: &{11 control 8 127.0.0.1 8000 }
server_1                  | isc.org/stork/server/database/model.updateAppAccessPoints
server_1                  | 	/repo/build-root/backend/server/database/model/app.go:68
server_1                  | isc.org/stork/server/database/model.AddApp
server_1                  | 	/repo/build-root/backend/server/database/model/app.go:226
server_1                  | isc.org/stork/server/apps/kea.CommitAppIntoDB
server_1                  | 	/repo/build-root/backend/server/apps/kea/appkea.go:462
server_1                  | isc.org/stork/server/apps.GetMachineAndAppsState
server_1                  | 	/repo/build-root/backend/server/apps/statepuller.go:269
server_1                  | isc.org/stork/server/apps.(*StatePuller).pullData
server_1                  | 	/repo/build-root/backend/server/apps/statepuller.go:59
server_1                  | isc.org/stork/server/agentcomm.(*PeriodicPuller).pullerLoop
server_1                  | 	/repo/build-root/backend/server/agentcomm/puller.go:93
server_1                  | runtime.goexit
server_1                  | 	/repo/build-root/tools/1.13.5/go/src/runtime/asm_amd64.s:1357
server_1                  | problem with adding access points to app: &{ID:11 CreatedAt:2020-11-05 10:03:46.320564 +0000 UTC MachineID:8 Machine:0xc000e8e160 Type:kea Active:false Meta:{Version:1.7.3 ExtendedVersion:} AccessPoints:[0xc002c34820] Daemons:[0xc0007c00d0 0xc00023e8f0 0xc00037e0d0 0xc00037ef70]}
server_1                  | isc.org/stork/server/database/model.AddApp
server_1                  | 	/repo/build-root/backend/server/database/model/app.go:228
server_1                  | isc.org/stork/server/apps/kea.CommitAppIntoDB
server_1                  | 	/repo/build-root/backend/server/apps/kea/appkea.go:462
server_1                  | isc.org/stork/server/apps.GetMachineAndAppsState
server_1                  | 	/repo/build-root/backend/server/apps/statepuller.go:269
server_1                  | isc.org/stork/server/apps.(*StatePuller).pullData
server_1                  | 	/repo/build-root/backend/server/apps/statepuller.go:59
server_1                  | isc.org/stork/server/agentcomm.(*PeriodicPuller).pullerLoop
server_1                  | 	/repo/build-root/backend/server/agentcomm/puller.go:93
server_1                  | runtime.goexit
server_1                  | 	/repo/build-root/tools/1.13.5/go/src/runtime/asm_amd64.s:1357 
server_1                  | ERRO[2020-11-05 10:03:47]      statepuller.go:62    error occurred while getting info from machine 8: problem with storing application state in the database 
server_1                  | INFO[2020-11-05 10:03:47]      statepuller.go:67    completed pulling information from machines: 0/1 succeeded 
server_1                  | ERRO[2020-11-05 10:03:47]           puller.go:95    errors were encountered while pulling data from apps: problem with storing application state in the database
server_1                  | isc.org/stork/server/apps.(*StatePuller).pullData
server_1                  | 	/repo/build-root/backend/server/apps/statepuller.go:61
server_1                  | isc.org/stork/server/agentcomm.(*PeriodicPuller).pullerLoop
server_1                  | 	/repo/build-root/backend/server/agentcomm/puller.go:93
server_1                  | runtime.goexit
server_1                  | 	/repo/build-root/tools/1.13.5/go/src/runtime/asm_amd64.s:1357 
server_1                  | INFO[2020-11-05 10:03:48]       middleware.go:42    served request     

Looking further into the code I think that this is what happens:

  • User adds new machine via UI. The machine runs Kea with many subnets.
  • The machine entry is fairly quickly added to the database and then apps state is being fetched.
  • Before the app is added to the db its configuration is being fetched so it takes a while before we create app entry in the db.
  • The state puller monitors this machine already so it tries to get its updated state.
  • The state puller checks if the app is already in the db, but it is not because the first operation of machine creation is still in progress.
  • The state puller assumes it found a new Kea app because there is no such app in the db yet. The state puller will proceed as it was to add a new app.
  • The initial create machine operation finally completes.
  • The state pulling also completes and the state puller attempts to add the "new app" into db.
  • This operation causes no conflict with (app_id, type) because the puller assigned new app_id, treating it as a new machine.
  • This causes an attempt to insert the new app which fails because of the (machine_id, port) constraint.

Overall, the database state should remain consistent because the second instance of the app is not added. But, it produces a lot of work for the server to do parallel updates and also causes a lot of traffic.

The solution in #409 (closed) apparently worked because it fetched existing app by (type) rather than (type, app_id). Overall, the right approach would rather be to not do state pulling before the previous pull has completed. We'd need to implement some signaling for it.

Assignee
Assign to
Time tracking