Commit 08e1873a authored by Michal 'vorner' Vaner

Merge branch 'trac213-incremental'

parents d0e0bab2 65f4be2b
- Read msgq configuration from configuration manager (Trac #213)
https://bind10.isc.org/ticket/213
- Provide more administrator options:
- Get process list
- Get information on a process (returns list of times started & stopped,
plus current information such as PID)
- Add a component (not necessary for parking lot, but...)
- Stop a component
- Force-stop a component
- Mechanism to wait for child to start before continuing
- Way to ask a child to die politely
- Start statistics daemon
- Statistics interaction (?)
- Use .spec file to define commands
- Rename "c-channel" stuff to msgq for clarity
- Use logger
- Reply to shutdown message?
- Some sort of group creation so termination signals can be sent to
children of children processes (if any)
......
......@@ -20,18 +20,72 @@ The boss process is starting up and will now check if the message bus
daemon is already running. If so, it will not be able to start, as it
needs a dedicated message bus.
% BIND10_CONFIGURATION_START_AUTH start authoritative server: %1
This message shows whether or not the authoritative server should be
started according to the configuration.
% BIND10_CONFIGURATION_START_RESOLVER start resolver: %1
This message shows whether or not the resolver should be
started according to the configuration.
% BIND10_INVALID_STATISTICS_DATA invalid specification of statistics data specified
An error was encountered when the boss module specified
statistics data which is invalid for the boss specification file.
% BIND10_COMPONENT_FAILED component %1 (pid %2) failed with %3 exit status
The process terminated, but the bind10 boss didn't expect it to, which means
it must have failed.
% BIND10_COMPONENT_RESTART component %1 is about to restart
The named component failed previously and will be restarted to keep the
service as uninterrupted as possible. The cause of the failure should still be
investigated, as it could happen again.
% BIND10_COMPONENT_START component %1 is starting
The named component is about to be started by the boss process.
% BIND10_COMPONENT_START_EXCEPTION component %1 failed to start: %2
An exception (mentioned in the message) happened during the startup of the
named component. The component is not considered started and further actions
will be taken about it.
% BIND10_COMPONENT_STOP component %1 is being stopped
A component is about to be asked to stop willingly by the boss.
% BIND10_COMPONENT_UNSATISFIED component %1 is required to run and failed
A component failed for some reason (see previous messages). It is either a core
component or needed component that was just started. In any case, the system
can't continue without it and will terminate.
% BIND10_CONFIGURATOR_BUILD building plan '%1' -> '%2'
A debug message. This indicates that the configurator is building a plan
for changing the configuration from the older one to the newer one. This does
no real work yet; it only plans what needs to be done.
% BIND10_CONFIGURATOR_PLAN_INTERRUPTED configurator plan interrupted, only %1 of %2 done
There was an exception during some planned task. The plan will not continue and
only some tasks of the plan were completed. The rest is aborted. The exception
will be propagated.
% BIND10_CONFIGURATOR_RECONFIGURE reconfiguring running components
A different configuration of which components should be running is being
installed. All components that are no longer needed will be stopped and
newly introduced ones started. This happens at startup, when the configuration
is read the first time, or when an operator changes configuration of the boss.
% BIND10_CONFIGURATOR_RUN running plan of %1 tasks
A debug message. The configurator is about to execute a plan of actions it
computed previously.
% BIND10_CONFIGURATOR_START bind10 component configurator is starting up
The part that cares about starting and stopping the right component from the
boss process is starting up. This happens only once at the startup of the
boss process. It will start the basic set of processes now (the ones boss
needs to read the configuration), the rest will be started after the
configuration is known.
% BIND10_CONFIGURATOR_STOP bind10 component configurator is shutting down
The part that cares about starting and stopping processes in the boss is
shutting down. All started components will be shut down now (more precisely,
asked to terminate on their own; if they fail to comply, other parts of
the boss process will try to force them).
% BIND10_CONFIGURATOR_TASK performing task %1 on %2
A debug message. The configurator is about to perform one task of the plan it
is currently executing on the named component.
% BIND10_INVALID_USER invalid user: %1
The boss process was started with the -u option, to drop root privileges
and continue running as the specified user, but the user is unknown.
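The BIND10_CONFIGURATOR_* messages above describe a plan that is built from the old and new component sets and then run task by task. The configurator itself is not visible in this diff, so the following is only a minimal sketch of that idea; `build_plan`, `run_plan` and the `perform` callback are hypothetical names, and components whose configuration merely changed are not handled here.

```python
def build_plan(old, new):
    """Return a list of (action, name) tasks that turn `old` into `new`.

    Both arguments map component names to their configuration.
    This only plans; nothing is started or stopped here.
    """
    plan = []
    # Components that are no longer configured get stopped.
    for name in sorted(old):
        if name not in new:
            plan.append(("stop", name))
    # Newly introduced components get started.
    for name in sorted(new):
        if name not in old:
            plan.append(("start", name))
    return plan


def run_plan(plan, perform):
    """Run the tasks in order and return how many were completed.

    If a task raises, report progress and propagate the exception,
    leaving the rest of the plan undone.
    """
    done = 0
    try:
        for action, name in plan:
            perform(action, name)
            done += 1
    except Exception:
        print("plan interrupted, only %d of %d done" % (done, len(plan)))
        raise
    return done


old = {"b10-auth": {}, "b10-xfrin": {}}
new = {"b10-auth": {}, "b10-stats": {}}
plan = build_plan(old, new)
completed = run_plan(plan, lambda action, name: None)
```

Keeping planning separate from execution matches the split between BIND10_CONFIGURATOR_BUILD and BIND10_CONFIGURATOR_RUN: a failure during execution can then be reported as "only %1 of %2 done".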
......@@ -51,27 +105,15 @@ old process was not shut down correctly, and needs to be killed, or
another instance of BIND10, with the same msgq domain socket, is
running, which needs to be stopped.
% BIND10_MSGQ_DAEMON_ENDED b10-msgq process died, shutting down
The message bus daemon has died. This is a fatal error, since it may
leave the system in an inconsistent state. BIND10 will now shut down.
% BIND10_MSGQ_DISAPPEARED msgq channel disappeared
While listening on the message bus channel for messages, it suddenly
disappeared. The msgq daemon may have died. This might lead to an
inconsistent state of the system, and BIND 10 will now shut down.
% BIND10_PROCESS_ENDED_NO_EXIT_STATUS process %1 (PID %2) died: exit status not available
The given process ended unexpectedly, but no exit status is
available. See BIND10_PROCESS_ENDED_WITH_EXIT_STATUS for a longer
description.
% BIND10_PROCESS_ENDED_WITH_EXIT_STATUS process %1 (PID %2) terminated, exit status = %3
The given process ended unexpectedly with the given exit status.
Depending on which module it was, it may simply be restarted, or it
may be a problem that will cause the boss module to shut down too.
The latter happens if it was the message bus daemon, which, if it has
died suddenly, may leave the system in an inconsistent state. BIND10
will also shut down now if it has been run with --brittle.
% BIND10_PROCESS_ENDED process %2 of %1 ended with status %3
This indicates a process started previously terminated. The process id
and component owning the process are indicated, as well as the exit code.
This doesn't distinguish if the process was supposed to terminate or not.
% BIND10_READING_BOSS_CONFIGURATION reading boss configuration
The boss process is starting up, and will now process the initial
......@@ -107,6 +149,9 @@ The boss module is sending a SIGKILL signal to the given process.
% BIND10_SEND_SIGTERM sending SIGTERM to %1 (PID %2)
The boss module is sending a SIGTERM signal to the given process.
% BIND10_SETUID setting UID to %1
The boss switches the user it runs as to the given UID.
% BIND10_SHUTDOWN stopping the server
The boss process received a command or signal telling it to shut down.
It will send a shutdown command to each process. The processes that do
......@@ -125,11 +170,6 @@ which failed is unknown (not one of 'S' for socket or 'B' for bind).
The boss requested a socket from the creator, but the answer is unknown. This
looks like a programmer error.
% BIND10_SOCKCREATOR_CRASHED the socket creator crashed
The socket creator terminated unexpectedly. It is not possible to restart it
(because the boss already gave up root privileges), so the system is going
to terminate.
% BIND10_SOCKCREATOR_EOF eof while expecting data from socket creator
There should be more data from the socket creator, but it closed the socket.
It probably crashed.
......@@ -208,8 +248,15 @@ During the startup process, a number of messages are exchanged between the
Boss process and the processes it starts. This error is output when a
message received by the Boss process is not recognised.
% BIND10_START_AS_NON_ROOT starting %1 as a user, not root. This might fail.
The given module is being started or restarted without root privileges.
% BIND10_START_AS_NON_ROOT_AUTH starting b10-auth as a user, not root. This might fail.
The authoritative server is being started or restarted without root privileges.
If the module needs these privileges, it may have problems starting.
Note that this issue should be resolved by the pending 'socket-creator'
process; once that has been implemented, modules should not need root
privileges anymore. See tickets #800 and #801 for more information.
% BIND10_START_AS_NON_ROOT_RESOLVER starting b10-resolver as a user, not root. This might fail.
The resolver is being started or restarted without root privileges.
If the module needs these privileges, it may have problems starting.
Note that this issue should be resolved by the pending 'socket-creator'
process; once that has been implemented, modules should not need root
......
......@@ -4,16 +4,71 @@
"module_description": "Master process",
"config_data": [
{
"item_name": "start_auth",
"item_type": "boolean",
"item_name": "components",
"item_type": "named_set",
"item_optional": false,
"item_default": true
},
{
"item_name": "start_resolver",
"item_type": "boolean",
"item_optional": false,
"item_default": false
"item_default": {
"b10-auth": { "special": "auth", "kind": "needed", "priority": 10 },
"setuid": {
"special": "setuid",
"priority": 5,
"kind": "dispensable"
},
"b10-xfrin": { "special": "xfrin", "kind": "dispensable" },
"b10-xfrout": { "address": "Xfrout", "kind": "dispensable" },
"b10-zonemgr": { "address": "Zonemgr", "kind": "dispensable" },
"b10-stats": { "address": "Stats", "kind": "dispensable" },
"b10-stats-httpd": {
"address": "StatsHttpd",
"kind": "dispensable"
},
"b10-cmdctl": { "special": "cmdctl", "kind": "needed" }
},
"named_set_item_spec": {
"item_name": "component",
"item_type": "map",
"item_optional": false,
"item_default": { },
"map_item_spec": [
{
"item_name": "special",
"item_optional": true,
"item_type": "string"
},
{
"item_name": "process",
"item_optional": true,
"item_type": "string"
},
{
"item_name": "kind",
"item_optional": false,
"item_type": "string",
"item_default": "dispensable"
},
{
"item_name": "address",
"item_optional": true,
"item_type": "string"
},
{
"item_name": "params",
"item_optional": true,
"item_type": "list",
"list_item_spec": {
"item_name": "param",
"item_optional": false,
"item_type": "string",
"item_default": ""
}
},
{
"item_name": "priority",
"item_optional": true,
"item_type": "integer"
}
]
}
}
],
"commands": [
......
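The new `components` item gives each component an optional integer `priority`. How the boss uses it is not shown in this diff. The sketch below assumes that components with a higher priority are started first and that a missing priority counts as 0; both are assumptions, as is the helper name `start_order`.

```python
import json

# A subset of the default value from the spec above.
components = json.loads("""
{
    "b10-auth": { "special": "auth", "kind": "needed", "priority": 10 },
    "setuid": { "special": "setuid", "priority": 5, "kind": "dispensable" },
    "b10-xfrout": { "address": "Xfrout", "kind": "dispensable" },
    "b10-cmdctl": { "special": "cmdctl", "kind": "needed" }
}
""")


def start_order(components):
    """Return component names ordered by descending priority.

    Assumption: a missing "priority" is treated as 0. sorted() is
    stable, so components with equal priority keep their input order.
    """
    return sorted(components,
                  key=lambda name: -components[name].get("priority", 0))


order = start_order(components)
```

Under these assumptions `b10-auth` (priority 10) would start before the `setuid` pseudo-component (priority 5), which fits the default values: the authoritative server is started before root privileges are dropped.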
SUBDIRS = isc
python_PYTHON = bind10_config.py
nodist_python_PYTHON = bind10_config.py
pythondir = $(pyexecdir)
# Explicitly define DIST_COMMON so ${python_PYTHON} is not included
# as we don't want the generated file included in distributed tarfile.
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in bind10_config.py.in
# When setting DIST_COMMON, then need to add the .in file too.
EXTRA_DIST = bind10_config.py.in
CLEANFILES = bind10_config.pyc
CLEANDIRS = __pycache__
......
......@@ -23,6 +23,10 @@ def reload():
global DATA_PATH
global PLUGIN_PATHS
global PREFIX
global LIBEXECDIR
LIBEXECDIR = ("@libexecdir@/@PACKAGE@"). \
replace("${exec_prefix}", "@exec_prefix@"). \
replace("${prefix}", "@prefix@")
BIND10_MSGQ_SOCKET_FILE = os.path.join("@localstatedir@",
"@PACKAGE_NAME@",
"msgq_socket").replace("${prefix}",
......
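The chained `replace()` calls for `LIBEXECDIR` exist because autoconf directory variables may refer to each other by name. A small sketch of the expansion follows, with typical values assumed for the `@...@` substitutions; the real values depend on how `configure` was run, and `expand_libexecdir` is a hypothetical helper.

```python
def expand_libexecdir(libexecdir, package, exec_prefix, prefix):
    """Mimic the substitution chain used for LIBEXECDIR.

    The order matters: ${exec_prefix} is replaced first because its
    value may itself contain ${prefix}.
    """
    return (libexecdir + "/" + package). \
        replace("${exec_prefix}", exec_prefix). \
        replace("${prefix}", prefix)


# Assumed example values, not taken from the commit.
result = expand_libexecdir("${exec_prefix}/libexec", "bind10",
                           "${prefix}", "/usr/local")
print(result)  # /usr/local/libexec/bind10
```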
SUBDIRS = . tests
python_PYTHON = __init__.py sockcreator.py
python_PYTHON = __init__.py sockcreator.py component.py special_component.py
pythondir = $(pyexecdir)/isc/bind10
......@@ -202,6 +202,9 @@ class WrappedSocket:
class Creator(Parser):
"""
This starts the socket creator and allows asking for the sockets.
Note: __process shouldn't be reset once created. See the note
of the SockCreator class for details.
"""
def __init__(self, path):
(local, remote) = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
......@@ -213,11 +216,20 @@ class Creator(Parser):
env['PATH'] = path
self.__process = subprocess.Popen(['b10-sockcreator'], env=env,
stdin=remote.fileno(),
stdout=remote2.fileno())
stdout=remote2.fileno(),
preexec_fn=self.__preexec_work)
remote.close()
remote2.close()
Parser.__init__(self, WrappedSocket(local))
def __preexec_work(self):
"""Function used before running a program that needs to run as a
different user."""
# Put us into a separate process group so we don't get
# SIGINT signals on Ctrl-C (the boss will shut everything down by
# other means).
os.setpgrp()
def pid(self):
return self.__process.pid
......@@ -225,4 +237,3 @@ class Creator(Parser):
logger.warn(BIND10_SOCKCREATOR_KILL)
if self.__process is not None:
self.__process.kill()
self.__process = None
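The `preexec_fn` change above moves the socket creator into its own process group so that a Ctrl-C on the terminal does not reach it. The effect can be checked with a small standalone sketch (POSIX only). It starts a Python child instead of `b10-sockcreator`, and `child_process_group` is a hypothetical helper name.

```python
import os
import subprocess
import sys


def child_process_group():
    """Start a child in its own process group.

    Returns (parent_pgid, child_pgid). os.setpgrp runs in the child
    between fork() and exec(), just like __preexec_work above.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import os; print(os.getpgrp())"],
        stdout=subprocess.PIPE,
        preexec_fn=os.setpgrp)
    out, _ = child.communicate()
    return os.getpgrp(), int(out)


parent_pgid, child_pgid = child_process_group()
print(parent_pgid != child_pgid)  # True
```

Because the terminal delivers SIGINT to the foreground process group only, a child in a different group is left for the parent to shut down explicitly.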
# Copyright (C) 2011 Internet Systems Consortium, Inc. ("ISC")
#
# Permission to use, copy, modify, and distribute this software for any
# purpose with or without fee is hereby granted, provided that the above
# copyright notice and this permission notice appear in all copies.
#
# THE SOFTWARE IS PROVIDED "AS IS" AND INTERNET SYSTEMS CONSORTIUM
# DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL
# INTERNET SYSTEMS CONSORTIUM BE LIABLE FOR ANY SPECIAL, DIRECT,
# INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING
# FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
# NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION
# WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
from isc.bind10.component import Component, BaseComponent
import isc.bind10.sockcreator
from bind10_config import LIBEXECDIR
import os
import posix
import isc.log
from isc.log_messages.bind10_messages import *

logger = isc.log.Logger("boss")

class SockCreator(BaseComponent):
    """
    The socket creator component. Will start and stop the socket creator
    accordingly.

    Note: _creator shouldn't be reset explicitly once created. The
    underlying Popen object would then wait() the child process internally,
    which breaks the assumption of the boss, who is expecting to see
    the process die in waitpid().
    """
    def __init__(self, process, boss, kind, address=None, params=None):
        BaseComponent.__init__(self, boss, kind)
        self.__creator = None

    def _start_internal(self):
        self._boss.curproc = 'b10-sockcreator'
        self.__creator = isc.bind10.sockcreator.Creator(LIBEXECDIR + ':' +
                                                        os.environ['PATH'])
        self._boss.register_process(self.pid(), self)
        self._boss.log_started(self.pid())

    def _stop_internal(self):
        self.__creator.terminate()

    def name(self):
        return "Socket creator"

    def pid(self):
        """
        Pid of the socket creator. It is provided differently from a usual
        component.
        """
        return self.__creator.pid() if self.__creator else None

    def kill(self, forcefull=False):
        # We don't really care about forcefull here
        if self.__creator:
            self.__creator.kill()

class Msgq(Component):
    """
    The message queue. Starting is passed to boss, stopping is not supported
    and we let the boss kill it by signal.
    """
    def __init__(self, process, boss, kind, address=None, params=None):
        Component.__init__(self, process, boss, kind, None, None,
                           boss.start_msgq)

    def _stop_internal(self):
        """
        We can't really stop the message queue, as many processes may need
        it for their shutdown and it doesn't have a shutdown command anyway.
        But as it is stateless, it's OK to kill it.

        So we disable this method (as the only time it could be called is
        during shutdown) and wait for the boss to kill it in the next shutdown
        step.

        This actually breaks the recommendation in Component that we shouldn't
        override its methods one by one. This is a special case, because
        we don't provide a different implementation, we completely disable
        the method by providing an empty one. This can't hurt the internals.
        """
        pass

class CfgMgr(Component):
    def __init__(self, process, boss, kind, address=None, params=None):
        Component.__init__(self, process, boss, kind, 'ConfigManager',
                           None, boss.start_cfgmgr)

class Auth(Component):
    def __init__(self, process, boss, kind, address=None, params=None):
        Component.__init__(self, process, boss, kind, 'Auth', None,
                           boss.start_auth)

class Resolver(Component):
    def __init__(self, process, boss, kind, address=None, params=None):
        Component.__init__(self, process, boss, kind, 'Resolver', None,
                           boss.start_resolver)

class CmdCtl(Component):
    def __init__(self, process, boss, kind, address=None, params=None):
        Component.__init__(self, process, boss, kind, 'Cmdctl', None,
                           boss.start_cmdctl)

class XfrIn(Component):
    def __init__(self, process, boss, kind, address=None, params=None):
        Component.__init__(self, process, boss, kind, 'Xfrin', None,
                           boss.start_xfrin)

class SetUID(BaseComponent):
    """
    This is a pseudo-component which drops root privileges when started
    and sets the uid stored in boss.

    This component does nothing when stopped.
    """
    def __init__(self, process, boss, kind, address=None, params=None):
        BaseComponent.__init__(self, boss, kind)
        self.uid = boss.uid

    def _start_internal(self):
        if self.uid is not None:
            logger.info(BIND10_SETUID, self.uid)
            posix.setuid(self.uid)

    def _stop_internal(self): pass

    def kill(self, forcefull=False): pass

    def name(self):
        return "Set UID"

    def pid(self):
        return None

def get_specials():
    """
    List of specially started components. Each one should be the class that can
    be created for that component.
    """
    return {
        'sockcreator': SockCreator,
        'msgq': Msgq,
        'cfgmgr': CfgMgr,
        # TODO: Should these be replaced by configuration in config manager only?
        # They should not have any parameters anyway
        'auth': Auth,
        'resolver': Resolver,
        'cmdctl': CmdCtl,
        # FIXME: Temporary workaround before #1292 is done
        'xfrin': XfrIn,
        # TODO: Remove when not needed, workaround before sockcreator works
        'setuid': SetUID
    }
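The code that consumes `get_specials()` lives in a part of the commit that is not shown here. As a rough illustration only, a configuration entry with a `special` key could be mapped to its class as sketched below. The stand-in classes, the `make_component` helper and the fallback of `process` to the component name are all assumptions, not the actual boss implementation.

```python
class Plain:
    """Stand-in for an ordinary component (hypothetical)."""
    def __init__(self, process, boss, kind, address=None, params=None):
        self.process = process
        self.kind = kind
        self.address = address
        self.params = params


class Auth(Plain):
    """Stand-in for a specially started component (hypothetical)."""


def get_specials():
    # Same shape as the real function: special name -> class.
    return {"auth": Auth}


def make_component(name, config, boss=None):
    """Create a component object from one entry of Boss/components.

    Entries with a "special" key use the class registered under that
    name; all others use the ordinary component class.
    """
    if "special" in config:
        cls = get_specials()[config["special"]]
    else:
        cls = Plain
    # Assumption: the process name defaults to the component name.
    process = config.get("process", name)
    return cls(process, boss, config["kind"],
               config.get("address"), config.get("params"))


auth = make_component("b10-auth", {"special": "auth", "kind": "needed"})
xfrout = make_component("b10-xfrout",
                        {"address": "Xfrout", "kind": "dispensable"})
```

All classes share the constructor signature `(process, boss, kind, address=None, params=None)`, which is what lets a single lookup table stand in for a per-component factory.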
PYCOVERAGE_RUN = @PYCOVERAGE_RUN@
#PYTESTS = args_test.py bind10_test.py
# NOTE: this has a generated test found in the builddir
PYTESTS = sockcreator_test.py
PYTESTS = sockcreator_test.py component_test.py
EXTRA_DIST = $(PYTESTS)
......
......@@ -50,7 +50,7 @@ if [ $status != 0 ]; then echo "I:failed"; fi
n=`expr $n + 1`
echo "I:Stopping b10-auth and checking that ($n)"
echo 'config set Boss/start_auth false
echo 'config remove Boss/components b10-auth
config commit
quit
' | $RUN_BINDCTL \
......@@ -61,7 +61,8 @@ if [ $status != 0 ]; then echo "I:failed"; fi
n=`expr $n + 1`
echo "I:Restarting b10-auth and checking that ($n)"
echo 'config set Boss/start_auth true
echo 'config add Boss/components b10-auth
config set Boss/components/b10-auth { "special": "auth", "kind": "needed" }
config commit
quit
' | $RUN_BINDCTL \
......