Commit b3526430 authored by Stephen Morris

[2414] Merge branch 'master' into trac2414

parents f9e9ca67 c8b32f1a
@@ -47,10 +47,15 @@ available. It is issued during server startup is an indication that
the initialization is proceeding normally.
% AUTH_CONFIG_LOAD_FAIL load of configuration failed: %1
An attempt to configure the server with information from the configuration
database during the startup sequence has failed. (The reason for
the failure is given in the message.) The server will continue its
initialization although it may not be configured in the desired way.
An attempt to configure the server with information from the
configuration database during the startup sequence has failed. The
server will continue its initialization although it may not be
configured in the desired way. The reason for the failure is given in
the message. One common reason is that the server failed to acquire a
socket bound to a privileged port (53 for DNS). In that case the
reason in the log message should show something like "permission
denied", and the solution would be to restart BIND 10 as a super
(root) user.
% AUTH_CONFIG_UPDATE_FAIL update of configuration failed: %1
At attempt to update the configuration the server with information
......
@@ -82,6 +82,21 @@ the boss process will try to force them).
A debug message. The configurator is about to perform one task of the plan it
is currently executing on the named component.
% BIND10_CONNECTING_TO_CC_FAIL failed to connect to configuration/command channel; try -v to see output from msgq
The boss process tried to connect to the communication channel for
commands and configuration updates during initialization, but it
failed. This is a fatal startup error, and process will soon
terminate after some cleanup. There can be several reasons for the
failure, but the most likely cause is that the msgq daemon failed to
start, and the most likely cause of the msgq failure is that it
doesn't have a permission to create a socket file for the
communication. To confirm that, you can see debug messages from msgq
by starting BIND 10 with the -v command line option. If it indicates
permission problem for msgq, make sure the directory where the socket
file is to be created is writable for the msgq process. Note that if
you specify the -u option to change process users, the directory must
be writable for that user.
% BIND10_INVALID_STATISTICS_DATA invalid specification of statistics data specified
An error was encountered when the boss module specified
statistics data which is invalid for the boss specification file.
@@ -94,11 +109,6 @@ and continue running as the specified user, but the user is unknown.
The boss module was not able to start every process it needed to start
during startup, and will now kill the processes that did get started.
% BIND10_KILL_PROCESS killing process %1
The boss module is sending a kill signal to process with the given name,
as part of the process of killing all started processes during a failed
startup, as described for BIND10_KILLING_ALL_PROCESSES
% BIND10_LOST_SOCKET_CONSUMER consumer %1 of sockets disconnected, considering all its sockets closed
A connection from one of the applications which requested a socket was
closed. This means the application has terminated, so all the sockets it was
......
@@ -331,11 +331,7 @@ class BoB:
each one. It then clears that list.
"""
logger.info(BIND10_KILLING_ALL_PROCESSES)
for pid in self.components:
logger.info(BIND10_KILL_PROCESS, self.components[pid].name())
self.components[pid].kill(True)
self.components = {}
self.__kill_children(True)
def _read_bind10_config(self):
"""
@@ -427,6 +423,7 @@ class BoB:
while self.cc_session is None:
# if we have been trying for "a while" give up
if (time.time() - cc_connect_start) > 5:
logger.error(BIND10_CONNECTING_TO_CC_FAIL)
raise CChannelConnectError("Unable to connect to c-channel after 5 seconds")
# try to connect, and if we can't wait a short while
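The loop in this hunk retries the connection until a five-second deadline passes, and the commit adds an error log just before giving up. A minimal standalone sketch of that retry-with-deadline pattern follows. The names `connect_with_deadline`, `ConnectTimeout`, and `on_giveup` are hypothetical and are not part of the BIND 10 code; the sketch also attempts the connection before checking the deadline, which differs slightly from the loop above.

```python
import time


class ConnectTimeout(Exception):
    """Raised when no connection could be made before the deadline."""


def connect_with_deadline(connect, timeout=5.0, interval=0.1, on_giveup=None):
    """Call connect() until it succeeds or `timeout` seconds have elapsed.

    `connect` is any callable that returns a session object or raises
    OSError on failure. `on_giveup` is an optional hook called once,
    just before the timeout exception is raised (for example, to log
    an error message that tells the operator where to look next).
    """
    start = time.monotonic()
    while True:
        try:
            return connect()
        except OSError:
            # Give up once the deadline has passed; otherwise wait and retry.
            if time.monotonic() - start > timeout:
                if on_giveup is not None:
                    on_giveup()
                raise ConnectTimeout(
                    "Unable to connect after %g seconds" % timeout)
            time.sleep(interval)
```

Logging in the give-up hook, rather than only raising, mirrors the intent of the added `logger.error` call: the operator sees an actionable message even if the exception is handled far from where the failure occurred.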
@@ -1145,6 +1142,21 @@ def main():
options = parse_args()
# Announce startup. Making this is the first log message.
try:
logger.info(BIND10_STARTING, VERSION)
except RuntimeError as e:
sys.stderr.write('ERROR: failed to write the initial log: %s\n' %
str(e))
sys.stderr.write("""\
TIP: if this is about permission error for a lock file, check if the directory
of the file is writable for the user of the bind10 process; often you need
to start bind10 as a super user. Also, if you specify the -u option to
change the user and group, the directory must be writable for the group,
and the created lock file must be writable for that user.
""")
sys.exit(1)
# Check user ID.
setuid = None
setgid = None
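This hunk moves the first log message to the top of `main()` and falls back to standard error when the logger itself cannot write (for example, because its lock file cannot be created). The sketch below isolates that fallback under some assumptions: `announce_startup`, `LOCK_FILE_TIP`, and the `log_info` callable are hypothetical stand-ins, and the function returns a boolean instead of calling `sys.exit` so that it can be exercised in isolation.

```python
import sys

# Shortened, illustrative tip text; the real message is in the hunk above.
LOCK_FILE_TIP = (
    "TIP: if this is a permission error for a lock file, check that the\n"
    "directory holding the file is writable by the user of the process.\n"
)


def announce_startup(log_info, version, err_stream=sys.stderr):
    """Emit the first log message; on RuntimeError, explain on err_stream.

    Returns True if the message was logged and False otherwise, leaving
    the decision to exit to the caller.
    """
    try:
        log_info("starting version %s" % version)
    except RuntimeError as e:
        # The logger is unusable here, so write directly to the stream.
        err_stream.write("ERROR: failed to write the initial log: %s\n" % e)
        err_stream.write(LOCK_FILE_TIP)
        return False
    return True
```

A caller would typically follow a `False` return with `sys.exit(1)`, as the hunk does.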
@@ -1177,9 +1189,6 @@ def main():
logger.fatal(BIND10_INVALID_USER, options.user)
sys.exit(1)
# Announce startup.
logger.info(BIND10_STARTING, VERSION)
# Create wakeup pipe for signal handlers
wakeup_pipe = os.pipe()
signal.set_wakeup_fd(wakeup_pipe[1])
......
@@ -178,6 +178,8 @@ class MsgQ:
if os.path.exists(self.socket_file):
os.remove(self.socket_file)
self.listen_socket.close()
sys.stderr.write("[b10-msgq] failed to setup listener on %s: %s\n"
% (self.socket_file, str(e)))
raise e
if self.poller:
@@ -543,9 +545,10 @@ if __name__ == "__main__":
msgq = MsgQ(options.msgq_socket_file, options.verbose)
setup_result = msgq.setup()
if setup_result:
sys.stderr.write("[b10-msgq] Error on startup: %s\n" % setup_result)
try:
msgq.setup()
except Exception as e:
sys.stderr.write("[b10-msgq] Error on startup: %s\n" % str(e))
sys.exit(1)
try:
......
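The msgq hunks above replace a "return an error value from `setup()`" convention with "raise from `setup()` and catch at the entry point". A minimal sketch of that shape is shown below, assuming a POSIX system with UNIX-domain sockets. The `Listener` class and `start` function are hypothetical illustrations, not the actual `MsgQ` implementation, and unlike the hunk the sketch removes the socket file only if this call created it.

```python
import os
import socket
import sys


class Listener:
    """Hypothetical stand-in for a daemon listening on a UNIX socket."""

    def __init__(self, socket_file):
        self.socket_file = socket_file
        self.listen_socket = None

    def setup(self):
        """Create and bind the listening socket; raise OSError on failure."""
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        bound = False
        try:
            sock.bind(self.socket_file)
            bound = True
            sock.listen(16)
        except OSError:
            # Clean up before re-raising so no half-built state remains.
            sock.close()
            if bound and os.path.exists(self.socket_file):
                os.remove(self.socket_file)
            raise
        self.listen_socket = sock

    def close(self):
        """Close the socket and remove the socket file, if any."""
        if self.listen_socket is not None:
            self.listen_socket.close()
            self.listen_socket = None
            if os.path.exists(self.socket_file):
                os.remove(self.socket_file)


def start(listener, err_stream=sys.stderr):
    """Return 0 on success or 1 if setup failed (suitable for sys.exit)."""
    try:
        listener.setup()
    except Exception as e:
        err_stream.write("Error on startup: %s\n" % e)
        return 1
    return 0
```

Raising keeps the operating-system error text (such as "Permission denied") attached to the exception, so the entry point can print it without `setup()` having to format and return a message.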
@@ -16,6 +16,7 @@
import socket
import struct
import os
import errno
import copy
import subprocess
import copy
@@ -36,16 +37,16 @@ class CreatorError(Exception):
passed to the __init__ function.
"""
def __init__(self, message, fatal, errno=None):
def __init__(self, message, fatal, error_num=None):
"""
Creates the exception. The message argument is the usual string.
The fatal one tells if the error is fatal (eg. the creator crashed)
and errno is the errno value returned from socket creator, if
and error_num is the errno value returned from socket creator, if
applicable.
"""
Exception.__init__(self, message)
self.fatal = fatal
self.errno = errno
self.errno = error_num
class Parser:
"""
@@ -94,6 +95,13 @@ class Parser:
self.__socket = None
raise CreatorError(str(se), True)
def __addrport_str(self, address, port):
'''Convert a pair of IP address and port to common form for logging.'''
if address.family == socket.AF_INET:
return str(address) + ':' + str(port)
else:
return '[' + str(address) + ']:' + str(port)
def get_socket(self, address, port, socktype):
"""
Asks the socket creator process to create a socket. Pass an address
@@ -136,9 +144,9 @@ class Parser:
elif answer == b'E':
# There was an error, read the error as well
error = self.__socket.recv(1)
errno = struct.unpack('i',
self.__read_all(len(struct.pack('i',
0))))
rcv_errno = struct.unpack('i',
self.__read_all(len(struct.pack('i',
0))))
if error == b'S':
cause = 'socket'
elif error == b'B':
@@ -147,10 +155,22 @@ class Parser:
self.__socket = None
logger.fatal(BIND10_SOCKCREATOR_BAD_CAUSE, error)
raise CreatorError('Unknown error cause' + str(answer), True)
logger.error(BIND10_SOCKET_ERROR, cause, errno[0],
os.strerror(errno[0]))
raise CreatorError('Error creating socket on ' + cause, False,
errno[0])
logger.error(BIND10_SOCKET_ERROR, cause, rcv_errno[0],
os.strerror(rcv_errno[0]))
# Provide as detailed information as possible on the error,
# as error related to socket creation is a common operation
# trouble. In particular, we are intentionally very verbose
# if it fails due to "permission denied" so the administrator
# can easily identify what is wrong and how to fix it.
addrport = self.__addrport_str(address, port)
error_text = 'Error creating socket on ' + cause + \
' to be bound to ' + addrport + ': ' + \
os.strerror(rcv_errno[0])
if rcv_errno[0] == errno.EACCES:
error_text += ' - probably need to restart BIND 10 ' + \
'as a super user'
raise CreatorError(error_text, False, rcv_errno[0])
else:
self.__socket = None
logger.fatal(BIND10_SOCKCREATOR_BAD_RESPONSE, answer)
......
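The socket creator hunks above do three things: decode a native `int` errno from the creator's reply, format the address and port for logging (with brackets around IPv6 addresses), and append a hint when the errno is `EACCES`. The sketch below reproduces that logic outside the `Parser` class. It is an approximation under stated assumptions: the standard-library `ipaddress` module stands in for the address type used in the hunk, and the function names `parse_errno`, `addrport_str`, and `socket_error_text` are hypothetical.

```python
import errno
import ipaddress
import os
import struct


def parse_errno(data):
    """Decode a native-endian C int (as sent by the creator) from bytes."""
    return struct.unpack('i', data[:struct.calcsize('i')])[0]


def addrport_str(address, port):
    """Format an address/port pair; IPv6 addresses are bracketed."""
    addr = ipaddress.ip_address(address)
    if addr.version == 4:
        return "%s:%d" % (addr, port)
    return "[%s]:%d" % (addr, port)


def socket_error_text(cause, address, port, err):
    """Build a detailed, operator-friendly message for a socket failure."""
    text = "Error creating socket on %s to be bound to %s: %s" % (
        cause, addrport_str(address, port), os.strerror(err))
    if err == errno.EACCES:
        # Binding to a privileged port (such as 53) usually needs root.
        text += " - probably need to restart BIND 10 as a super user"
    return text
```

Renaming the local variable that holds the received value (as the hunk does with `rcv_errno`) matters here: a local named `errno` would shadow the `errno` module and make the `errno.EACCES` comparison fail.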
@@ -15,6 +15,8 @@
#include "interprocess_sync_file.h"
#include <string>
#include <cerrno>
#include <cstring>
#include <stdlib.h>
#include <string.h>
@@ -68,8 +70,8 @@ InterprocessSyncFile::do_lock(int cmd, short l_type) {
if (fd_ == -1) {
isc_throw(InterprocessSyncFileError,
"Unable to use interprocess sync lockfile: " +
lockfile_path);
"Unable to use interprocess sync lockfile ("
<< std::strerror(errno) << "): " << lockfile_path);
}
}
......