Skip to content

GitLab

  • Menu
Projects Groups Snippets
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in / Register
  • BIND BIND
  • Project information
    • Project information
    • Activity
    • Labels
    • Planning hierarchy
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
  • Issues 527
    • Issues 527
    • List
    • Boards
    • Service Desk
    • Milestones
  • Merge requests 103
    • Merge requests 103
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
  • Deployments
    • Deployments
    • Environments
    • Releases
  • Monitor
    • Monitor
    • Incidents
  • Packages & Registries
    • Packages & Registries
    • Container Registry
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • ISC Open Source Projects
  • BINDBIND
  • Issues
  • #1675

Closed
Open
Created Mar 12, 2020 by Michał Kępień@michalOwner

"logfileconfig" system test crashes occasionally

All recent failures of the logfileconfig system test look similar and are caused by segmentation faults. Two examples:

  • https://gitlab.isc.org/isc-projects/bind9/-/jobs/751371
  • https://gitlab.isc.org/isc-private/bind9/-/jobs/753929

I believe !2791 (merged) is the culprit.

The immediate problem is that lcfg is already destroyed when the segfault happens:

(gdb) print /x *lcfg
$26 = {magic = 0xdededede, lctx = 0xdededededededede, channels = {head = 0xdededededededede, 
    tail = 0xdededededededede}, channellists = 0xdededededededede, channellist_count = 0xdededede, 
  duplicate_interval = 0xdededede, highest_level = 0xdededede, tag = 0xdededededededede, dynamic = 0xde}

However, looking at the code, it could not have yet been destroyed only a few lines above the crash location (which is inside the while loop in isc_log_doit() in both cases, though on different lines) as e.g. lcfg->channellists was dereferenced without a segfault. This indicates that another thread must have freed lcfg while the crashing thread was running.

Note that in both crashes there is another thread running that is trying to lock the logging context mutex and in both cases that locking attempt is made from the same call site:

(gdb) frame 3
#3  0x00007f9490d6619c in isc_log_doit (lctx=0x1da1060, category=0x7f9490fadfd0, module=0x677730, 
    level=1, write_once=false, format=0x4649d8 "now using logging configuration from config file", 
    args=0x7f9489296480) at log.c:1537

This call site is inside load_configuration(), right after a call to isc_logconfig_use(), which contains a call to isc_logconfig_destroy().

I believe the crash happens upon startup and the order of events is as follows:

  1. Thread A (the crashing thread) tries to log a message using the default logging configuration. It reaches isc_log_doit(), locks the logging context mutex and fetches the pointer to the default logging configuration from lctx->logconfig.

  2. Thread B is in the process of loading configuration. It calls isc_logconfig_use() to replace the default logging configuration with the configured one. This causes isc_logconfig_destroy() to destroy the default logging configuration in the process.

  3. Thread A continues its attempt to log the requested message, but the default logging configuration is already destroyed and a segmentation fault occurs.

Note that before !2791 (merged), isc_logconfig_use() was holding the logging context mutex while updating lctx->logconfig, so the logging configuration could never be replaced while a message was being logged by another thread. !2791 (merged) replaced the LOCK()/UNLOCK() call in that location with a call to atomic_exchange_acq_rel(), allowing isc_logconfig_destroy() to be called while it was still in use by another thread.

This is a race between threads, it is not exploitable.

Assignee
Assign to
Time tracking