Fix a race in taskmgr between worker and task pausing/unpausing.

Merged: Witold Krecicki requested to merge wpk/fix-taskmgr-pause-unpause-detach-race into master on Jan 20, 2020

To reproduce the race, create a task and send two events to it; the first one must take some time to process. Then, from outside the task, call pause(), unpause() and detach() on it.

While the task is processing the long-running event it is in the task_state_running state. When we called pause() the state changed to task_state_paused. On unpause() we saw that there were events in the task queue, changed the state to task_state_ready and enqueued the task on the worker's readyq. We then detached the task. Meanwhile dispatch() finishes the first event, processes the second event in the queue, then shuts down the task and frees it, as it is no longer referenced. The dispatcher then takes the already freed task from the queue where it was wrongly put, causing a use-after-free and, subsequently, either an assertion failure or a segmentation fault.
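The sequence above can be sketched with a small single-threaded model. This is only an illustration under stated assumptions: the types and functions (`old_task_t`, `old_pause()`, `old_unpause()`) are hypothetical stand-ins, not the actual taskmgr code, and locking, the real ready queue and detach() are left out. It shows only the step that goes wrong, namely that a task still held by a worker ends up on the ready queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the pre-fix states; not the real code. */
typedef enum { st_idle, st_ready, st_running, st_paused } old_state_t;

typedef struct {
	old_state_t state;
	int nevents;      /* events still waiting in the task's event queue */
	int times_queued; /* entries for this task on the worker's ready queue */
} old_task_t;

/* Pre-fix pause: overwrites the state even if a worker is mid-dispatch. */
static void
old_pause(old_task_t *t) {
	assert(t->state != st_paused);
	t->state = st_paused;
}

/* Pre-fix unpause: pending events always cause the task to be enqueued. */
static void
old_unpause(old_task_t *t) {
	assert(t->state == st_paused);
	t->state = st_idle;
	if (t->nevents > 0) {
		t->state = st_ready;
		t->times_queued++;
	}
}

/*
 * Returns true if pause() followed by unpause() puts the task on the
 * ready queue while a worker is still dispatching it.
 */
static bool
old_sequence_requeues_running_task(void) {
	/* The worker is busy with event 1; event 2 is still pending. */
	old_task_t t = { st_running, 1, 0 };

	old_pause(&t);
	old_unpause(&t);

	/* The worker has not finished, yet the task is queued again. */
	return (t.times_queued == 1);
}
```

In this model `old_sequence_requeues_running_task()` returns true: the extra queue entry is the one that later outlives the task once the worker frees it.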

The probability of this happening is very slim, yet it might happen under very high load, and is more likely on a recursive resolver than on an authoritative server.

The fix introduces a new 'task_state_pausing' state, to which tasks are moved if they are paused while still running. They are moved to the task_state_paused state when the dispatcher is done with them, and if we unpause a task that is still in the pausing state it is moved back to task_state_running and not requeued.
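The transitions described here can be sketched as a small state machine. This is a minimal model under assumptions, not the code from this merge request: the names (`model_task_t`, `model_pause()`, `model_unpause()`, `model_dispatch_done()`) are hypothetical, locking is omitted, and the case of pausing a task that is already sitting on the ready queue is not modelled.

```c
#include <assert.h>

/* Hypothetical model states; ts_pausing mirrors the new state in the fix. */
typedef enum {
	ts_idle, ts_ready, ts_running, ts_pausing, ts_paused
} model_state_t;

typedef struct {
	model_state_t state;
	int nevents;      /* events still waiting in the task's event queue */
	int times_queued; /* entries for this task on the worker's ready queue */
} model_task_t;

/* A running task only becomes "pausing"; the worker still owns it. */
static void
model_pause(model_task_t *t) {
	assert(t->state == ts_idle || t->state == ts_running);
	t->state = (t->state == ts_running) ? ts_pausing : ts_paused;
}

/* Called when the worker has finished its pass over the task. */
static void
model_dispatch_done(model_task_t *t) {
	if (t->state == ts_pausing) {
		t->state = ts_paused; /* the pause takes effect only now */
	} else if (t->state == ts_running) {
		if (t->nevents > 0) {
			t->state = ts_ready;
			t->times_queued++;
		} else {
			t->state = ts_idle;
		}
	}
}

/* Unpausing a "pausing" task must not enqueue it: a worker still has it. */
static void
model_unpause(model_task_t *t) {
	assert(t->state == ts_pausing || t->state == ts_paused);
	if (t->state == ts_pausing) {
		t->state = ts_running;
		return;
	}
	t->state = ts_idle;
	if (t->nevents > 0) {
		t->state = ts_ready;
		t->times_queued++;
	}
}

/* Queue entries after pause()+unpause() on a task that is mid-dispatch. */
static int
model_fixed_sequence_queue_count(void) {
	model_task_t t = { ts_running, 1, 0 };

	model_pause(&t);
	model_unpause(&t);
	return (t.times_queued);
}
```

In this model `model_fixed_sequence_queue_count()` returns 0, so the worker that is still dispatching the task remains the only party holding it. A task that is paused for the whole remainder of the dispatch goes through `model_dispatch_done()` into the paused state and is requeued by `model_unpause()` only then.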

Closes #1571 (closed)

Edited Jan 21, 2020 by Witold Krecicki
Source branch: wpk/fix-taskmgr-pause-unpause-detach-race