Commit 6d3e0f4b authored by JINMEI Tatuya

[master] Merge branch 'trac2903'

parents 7ff403a4 6df5353a
......@@ -53,6 +53,78 @@ The asynchronous I/O code encountered an error when trying to send data to
the specified address on the given protocol. The number of the system
error that caused the problem is given in the message.
% ASIODNS_SYNC_UDP_CLOSE_FAIL failed to close a DNS/UDP socket: %1
This is the same as ASIODNS_UDP_CLOSE_FAIL but happens on the
"synchronous UDP server", mainly used for the authoritative DNS server
daemon.
% ASIODNS_TCP_ACCEPT_FAIL failed to accept TCP DNS connection: %1
Accepting a TCP connection from a DNS client failed due to an error
that could happen but should be rare. The reason for the error is
included in the log message. The server still keeps accepting new
connections, so unless it happens often it's probably okay to ignore
this error. If the shown error indicates something like "too many
open files", it's probably because the run time environment is too
restrictive on this limitation, so consider adjusting the limit using
a tool such as ulimit. If you see other types of errors too often,
there may be something overlooked; please file a bug report in that case.
% ASIODNS_TCP_CLEANUP_CLOSE_FAIL failed to close a DNS/TCP socket on port cleanup: %1
A TCP DNS server tried to close a TCP socket (one created on accepting
a new connection, or one that is already unused) as a step of cleaning up the
corresponding listening port, but it failed to do that. This is
generally an unexpected event and so is logged as an error.
See also the description of ASIODNS_TCP_CLOSE_ACCEPTOR_FAIL.
% ASIODNS_TCP_CLOSE_ACCEPTOR_FAIL failed to close listening TCP socket: %1
A TCP DNS server tried to close a listening TCP socket (for accepting
new connections) as a step of cleaning up the corresponding listening
port (e.g., on server shutdown or updating port configuration), but it
failed to do that. This is generally an unexpected event and so is
logged as an error. See ASIODNS_TCP_CLOSE_FAIL for the implications on
related system resources.
% ASIODNS_TCP_CLOSE_FAIL failed to close DNS/TCP socket with a client: %1
A TCP DNS server tried to close a TCP socket used to communicate with
a client, but it failed to do that. While closing a socket should
normally be an error-free operation, there have been known cases where
this happened with a "connection reset by peer" error. This might be
because of some odd client behavior, such as sending a TCP RST after
establishing the connection and before the server closes the socket,
but how exactly this could happen seems to be system dependent (i.e.,
it's not part of the standard socket API), so it's difficult to
provide a general explanation. In any case, it is believed that an
error on closing a socket doesn't mean leaking system resources (the
kernel should clean up any internal resource related to the socket,
just reporting an error detected in the close call), but, again, it
seems to be system dependent. This message is logged at a debug level
as it's known to happen and could be triggered by a remote node and it
would be better to not be too verbose, but you might want to increase
the log level and make sure there's no resource leak or other system
level troubles when it's logged.
% ASIODNS_TCP_GETREMOTE_FAIL failed to get remote address of a DNS TCP connection: %1
A TCP DNS server tried to get the address and port of a remote client
on a connected socket but failed. It's expected to be rare but can
still happen. See also ASIODNS_TCP_READLEN_FAIL.
% ASIODNS_TCP_READDATA_FAIL failed to get DNS data on a TCP socket: %1
A TCP DNS server tried to read a DNS message (that follows a 2-byte
length field) but failed. It's expected to be rare but can still happen.
See also ASIODNS_TCP_READLEN_FAIL.
% ASIODNS_TCP_READLEN_FAIL failed to get DNS data length on a TCP socket: %1
A TCP DNS server tried to get the length field of a DNS message (the first
2 bytes of a new chunk of data) but failed. This is generally expected to
be rare but can still happen, e.g., due to an unexpected reset of the
connection. A specific reason for the failure is included in the log
message.
% ASIODNS_TCP_WRITE_FAIL failed to send DNS message over a TCP socket: %1
A TCP DNS server tried to send a DNS message to a remote client but
failed. It's expected to be rare but can still happen. See also
ASIODNS_TCP_READLEN_FAIL.
% ASIODNS_UDP_ASYNC_SEND_FAIL Error sending UDP packet to %1: %2
The low-level ASIO library reported an error when trying to send a UDP
packet in asynchronous UDP mode. This can be any error reported by
......@@ -64,6 +136,23 @@ If you see a single occurrence of this message, it probably does not
indicate any significant problem, but if it is logged often, it is probably
a good idea to inspect your network traffic.
% ASIODNS_UDP_CLOSE_FAIL failed to close a DNS/UDP socket: %1
A UDP DNS server tried to close its UDP socket, but failed to do that.
This is generally an unexpected event and so is logged as an error.
% ASIODNS_UDP_RECEIVE_FAIL failed to receive UDP DNS packet: %1
Receiving a UDP packet from a DNS client failed due to an error that
could happen but should be very rare. The server still keeps
receiving UDP packets on this socket. The reason for the error is
included in the log message. This log message is basically not
expected to appear at all in practice; if it does, there may be some
system level failure and other system logs may have to be checked.
% ASIODNS_UDP_SYNC_RECEIVE_FAIL failed to receive UDP DNS packet: %1
This is the same as ASIODNS_UDP_RECEIVE_FAIL but happens on the
"synchronous UDP server", mainly used for the authoritative DNS server
daemon.
% ASIODNS_UDP_SYNC_SEND_FAIL Error sending UDP packet to %1: %2
The low-level ASIO library reported an error when trying to send a UDP
packet in synchronous UDP mode. See ASIODNS_UDP_ASYNC_SEND_FAIL for
......
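The ASIODNS_TCP_ACCEPT_FAIL description above suggests checking the open-file limit when accept fails with "too many open files". As a rough illustration, the limit that ulimit adjusts can also be inspected from inside a process with the POSIX getrlimit() call. The helper name getOpenFileSoftLimit is hypothetical and is not part of this commit.

```cpp
#include <sys/resource.h>
#include <cassert>

// Hypothetical helper (not part of the commit): returns the soft limit on
// the number of open file descriptors for the current process, or 0 if the
// limit cannot be queried.  This is the value "ulimit -n" reports in a shell.
rlim_t getOpenFileSoftLimit() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
        return 0;
    }
    // The soft limit can never legitimately exceed the hard limit.
    assert(lim.rlim_cur <= lim.rlim_max);
    return lim.rlim_cur;
}
```

A server that accepts many concurrent TCP connections could log this value at startup so that an operator seeing ASIODNS_TCP_ACCEPT_FAIL can tell quickly whether the limit is the likely cause.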
......@@ -58,9 +58,15 @@ public:
template<class Ptr, class Server> void addServerFromFD(int fd, int af) {
Ptr server(new Server(io_service_.get_io_service(), fd, af, checkin_,
lookup_, answer_));
server->setTCPRecvTimeout(tcp_recv_timeout_);
(*server)();
servers_.push_back(server);
startServer(server);
}
// SyncUDPServer has different constructor signature so it cannot be
// templated.
void addSyncUDPServerFromFD(int fd, int af) {
SyncUDPServerPtr server(new SyncUDPServer(io_service_.get_io_service(),
fd, af, lookup_));
startServer(server);
}
void setTCPRecvTimeout(size_t timeout) {
......@@ -72,6 +78,13 @@ public:
(*it)->setTCPRecvTimeout(timeout);
}
}
private:
void startServer(DNSServerPtr server) {
server->setTCPRecvTimeout(tcp_recv_timeout_);
(*server)();
servers_.push_back(server);
}
};
DNSService::DNSService(IOService& io_service, SimpleCallback* checkin,
......@@ -95,8 +108,7 @@ void DNSService::addServerUDPFromFD(int fd, int af, ServerFlag options) {
<< options);
}
if ((options & SERVER_SYNC_OK) != 0) {
impl_->addServerFromFD<DNSServiceImpl::SyncUDPServerPtr,
SyncUDPServer>(fd, af);
impl_->addSyncUDPServerFromFD(fd, af);
} else {
impl_->addServerFromFD<DNSServiceImpl::UDPServerPtr, UDPServer>(
fd, af);
......
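The change above moves the common "configure, start, store" steps into a private startServer() helper, and gives SyncUDPServer its own non-template factory because its constructor takes fewer arguments. The following is a minimal, self-contained sketch of that pattern only. All class names here (Callback, Server, AsyncServer, SyncServer, Registry) are hypothetical stand-ins, and it uses C++11 std::shared_ptr rather than the smart pointers of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Callback {};  // hypothetical stand-in for the checkin/lookup/answer types

class Server {       // hypothetical stand-in for DNSServer
public:
    virtual ~Server() {}
    virtual void start() = 0;
    void setTimeout(std::size_t t) { timeout_ = t; }
    std::size_t timeout() const { return timeout_; }
    bool started() const { return started_; }
protected:
    bool started_ = false;
private:
    std::size_t timeout_ = 0;
};

class AsyncServer : public Server {  // takes all three callbacks
public:
    AsyncServer(int, Callback*, Callback*, Callback*) {}
    void start() override { started_ = true; }
};

class SyncServer : public Server {   // different signature: lookup only
public:
    SyncServer(int, Callback*) {}
    void start() override { started_ = true; }
};

class Registry {
public:
    // Generic factory for servers sharing the "three callbacks" signature.
    template <class S> void addServer(int fd) {
        startServer(std::make_shared<S>(fd, &checkin_, &lookup_, &answer_));
    }
    // Separate factory because the constructor signature differs.
    void addSyncServer(int fd) {
        startServer(std::make_shared<SyncServer>(fd, &lookup_));
    }
    std::size_t size() const { return servers_.size(); }
    const Server& at(std::size_t i) const { return *servers_.at(i); }
private:
    // Shared post-construction steps, kept in one place.
    void startServer(std::shared_ptr<Server> server) {
        server->setTimeout(timeout_);
        server->start();
        servers_.push_back(server);
    }
    Callback checkin_, lookup_, answer_;
    std::size_t timeout_ = 5000;
    std::vector<std::shared_ptr<Server> > servers_;
};
```

Keeping the shared steps in one helper means a later change to the startup sequence only has to be made once, whichever factory created the server.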
......@@ -39,18 +39,21 @@ namespace isc {
namespace asiodns {
SyncUDPServer::SyncUDPServer(asio::io_service& io_service, const int fd,
const int af, asiolink::SimpleCallback* checkin,
DNSLookup* lookup, DNSAnswer* answer) :
const int af, DNSLookup* lookup) :
output_buffer_(new isc::util::OutputBuffer(0)),
query_(new isc::dns::Message(isc::dns::Message::PARSE)),
answer_(new isc::dns::Message(isc::dns::Message::RENDER)),
checkin_callback_(checkin), lookup_callback_(lookup),
answer_callback_(answer), stopped_(false)
udp_endpoint_(sender_), lookup_callback_(lookup),
resume_called_(false), done_(false), stopped_(false),
recv_callback_(boost::bind(&SyncUDPServer::handleRead, this, _1, _2))
{
if (af != AF_INET && af != AF_INET6) {
isc_throw(InvalidParameter, "Address family must be either AF_INET "
"or AF_INET6, not " << af);
}
if (!lookup) {
isc_throw(InvalidParameter, "null lookup callback given to "
"SyncUDPServer");
}
LOG_DEBUG(logger, DBGLVL_TRACE_BASIC, ASIODNS_FD_ADD_UDP).arg(fd);
try {
socket_.reset(new asio::ip::udp::socket(io_service));
......@@ -61,59 +64,36 @@ SyncUDPServer::SyncUDPServer(asio::io_service& io_service, const int fd,
// convert it
isc_throw(IOError, exception.what());
}
udp_socket_.reset(new UDPSocket<DummyIOCallback>(*socket_));
}
void
SyncUDPServer::scheduleRead() {
socket_->async_receive_from(asio::buffer(data_, MAX_LENGTH), sender_,
boost::bind(&SyncUDPServer::handleRead, this,
_1, _2));
socket_->async_receive_from(asio::mutable_buffers_1(data_, MAX_LENGTH),
sender_, recv_callback_);
}
void
SyncUDPServer::handleRead(const asio::error_code& ec, const size_t length) {
// Abort on fatal errors
if (ec) {
using namespace asio::error;
if (ec.value() != would_block && ec.value() != try_again &&
ec.value() != interrupted) {
const asio::error_code::value_type err_val = ec.value();
// See TCPServer::operator() for details on error handling.
if (err_val == operation_aborted || err_val == bad_descriptor) {
return;
}
if (err_val != would_block && err_val != try_again &&
err_val != interrupted) {
LOG_ERROR(logger, ASIODNS_UDP_SYNC_RECEIVE_FAIL).arg(ec.message());
}
}
// Some kind of interrupt, spurious wakeup, or the like. Just try reading
// again.
if (ec || length == 0) {
scheduleRead();
return;
}
// OK, we have a real packet of data. Let's dig into it!
// XXX: This is taken (and ported) from UDPSocket class. What the hell does
// it really mean?
// The UDP socket class has been extended with asynchronous functions
// and takes as a template parameter a completion callback class. As
// UDPServer does not use these extended functions (only those defined
// in the IOSocket base class) - but needs a UDPSocket to get hold of
// the underlying Boost UDP socket - DummyIOCallback is used. This
// provides the appropriate operator() but is otherwise functionless.
UDPSocket<DummyIOCallback> socket(*socket_);
UDPEndpoint endpoint(sender_);
IOMessage message(data_, length, socket, endpoint);
if (checkin_callback_ != NULL) {
(*checkin_callback_)(message);
if (stopped_) {
return;
}
}
// If we don't have a DNS Lookup provider, there's no point in
// continuing; we exit the coroutine permanently.
if (lookup_callback_ == NULL) {
scheduleRead();
return;
}
// Make sure the buffers are fresh. Note that we don't touch query_
// because it's supposed to be cleared in lookup_callback_. We should
// eventually even remove this member variable (and remove it from
......@@ -121,41 +101,28 @@ SyncUDPServer::handleRead(const asio::error_code& ec, const size_t length) {
// implementation should be careful that it's the responsibility of
// the callback implementation. See also #2239).
output_buffer_->clear();
answer_->clear(isc::dns::Message::RENDER);
// Mark that we don't have an answer yet.
done_ = false;
resume_called_ = false;
// Call the actual lookup
(*lookup_callback_)(message, query_, answer_, output_buffer_, this);
(*lookup_callback_)(IOMessage(data_, length, *udp_socket_, udp_endpoint_),
query_, answer_, output_buffer_, this);
if (!resume_called_) {
isc_throw(isc::Unexpected,
"No resume called from the lookup callback");
}
if (stopped_) {
return;
}
if (done_) {
// Good, there's an answer.
// Call the answer callback to render it.
(*answer_callback_)(message, query_, answer_, output_buffer_);
if (stopped_) {
return;
}
asio::error_code ec;
socket_->send_to(asio::buffer(output_buffer_->getData(),
output_buffer_->getLength()),
sender_, 0, ec);
if (ec) {
socket_->send_to(asio::const_buffers_1(output_buffer_->getData(),
output_buffer_->getLength()),
sender_, 0, ec_);
if (ec_) {
LOG_ERROR(logger, ASIODNS_UDP_SYNC_SEND_FAIL).
arg(sender_.address().to_string()).
arg(ec.message());
arg(sender_.address().to_string()).arg(ec_.message());
}
}
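The lookup handling above relies on a contract: the lookup callback must call resume() before it returns, and the 'done' flag it passes decides whether a response is sent. A minimal sketch of that contract follows, with no ASIO dependency. MiniSyncServer and its handle() method are hypothetical names, and strings stand in for the real message and buffer types.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical miniature of the synchronous lookup contract.
class MiniSyncServer {
public:
    typedef std::function<void(const std::string& query, std::string& answer,
                               MiniSyncServer& server)> Lookup;

    explicit MiniSyncServer(Lookup lookup) :
        lookup_(lookup), resume_called_(false), done_(false)
    {
        // A synchronous server is meaningless without a lookup callback.
        if (!lookup_) {
            throw std::invalid_argument("null lookup callback");
        }
    }

    // Called by the lookup callback to report whether an answer exists.
    void resume(bool done) {
        resume_called_ = true;
        done_ = done;
    }

    // Processes one query.  Returns true if 'answer' should be sent back.
    bool handle(const std::string& query, std::string& answer) {
        answer.clear();
        done_ = false;
        resume_called_ = false;
        lookup_(query, answer, *this);
        if (!resume_called_) {
            // The callback broke the contract; this is a programming error.
            throw std::logic_error("No resume called from the lookup callback");
        }
        return done_;
    }

private:
    Lookup lookup_;
    bool resume_called_;
    bool done_;
};
```

Resetting both flags before every call matters: without it, a stale resume_called_ from the previous query would mask a callback that forgot to call resume().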
......@@ -181,13 +148,13 @@ SyncUDPServer::stop() {
/// for it won't be scheduled by the io service no matter whether it is
/// submitted to the io service before or after the close call. And we will
/// get bad_descriptor error.
socket_->close();
socket_->close(ec_);
stopped_ = true;
if (ec_) {
LOG_ERROR(logger, ASIODNS_SYNC_UDP_CLOSE_FAIL).arg(ec_.message());
}
}
/// Post this coroutine on the ASIO service queue so that it will
/// resume processing where it left off. The 'done' parameter indicates
/// whether there is an answer to return to the client.
void
SyncUDPServer::resume(const bool done) {
resume_called_ = true;
......
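The receive error handling in handleRead() above sorts results into four outcomes: stop silently when the server was stopped, retry quietly on transient conditions, log and retry on anything else, and process the packet otherwise. The sketch below expresses that decision as a pure function so it can be read in isolation. It uses std::error_code and std::errc from the C++11 standard library as an assumed approximation of the asio::error values. ReadAction and classifyReadResult are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <system_error>

// Hypothetical names; the real code acts directly instead of returning a value.
enum class ReadAction { Stop, RetryQuietly, LogAndRetry, Process };

ReadAction classifyReadResult(const std::error_code& ec, std::size_t length) {
    if (ec) {
        // Seen when the server is stopped (before or during the receive):
        // stop handling events without logging.
        if (ec == std::errc::operation_canceled ||
            ec == std::errc::bad_file_descriptor) {
            return ReadAction::Stop;
        }
        // Transient conditions: read again without logging.
        if (ec == std::errc::operation_would_block ||
            ec == std::errc::resource_unavailable_try_again ||
            ec == std::errc::interrupted) {
            return ReadAction::RetryQuietly;
        }
        // Anything else is unexpected: log it, then keep receiving.
        return ReadAction::LogAndRetry;
    }
    // No error but no data either: treat as a spurious wakeup.
    return length == 0 ? ReadAction::RetryQuietly : ReadAction::Process;
}
```

Separating classification from the side effects (logging, rescheduling the read) is a design choice of this sketch, made so the branches can be unit-tested without sockets.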
......@@ -25,10 +25,14 @@
#include <dns/message.h>
#include <asiolink/simple_callback.h>
#include <asiolink/dummy_io_cb.h>
#include <asiolink/udp_socket.h>
#include <util/buffer.h>
#include <exceptions/exceptions.h>
#include <boost/function.hpp>
#include <boost/noncopyable.hpp>
#include <boost/scoped_ptr.hpp>
#include <stdint.h>
......@@ -39,29 +43,39 @@ namespace asiodns {
///
/// That means, the lookup handler must provide the answer right away.
/// This allows for implementation with less overhead, compared with
/// the UDPClass.
/// the \c UDPServer class.
class SyncUDPServer : public DNSServer, public boost::noncopyable {
public:
/// \brief Constructor
///
/// Due to the nature of this server, it's meaningless if the lookup
/// callback is NULL. So the constructor explicitly rejects that case
/// with an exception. Likewise, it doesn't take "checkin" or "answer"
/// callbacks. In fact, calling "checkin" from receive callback does not
/// make sense for any of the DNSServer variants (see Trac #2935);
/// "answer" callback is simply unnecessary for this class because a
/// complete answer is built in the lookup callback (it's the user's
/// responsibility to guarantee that condition).
///
/// \param io_service the asio::io_service to work with
/// \param fd the file descriptor of opened UDP socket
/// \param af address family, either AF_INET or AF_INET6
/// \param checkin the callbackprovider for non-DNS events
/// \param lookup the callbackprovider for DNS lookup events
/// \param answer the callbackprovider for DNS answer events
/// \param lookup the callbackprovider for DNS lookup events (must not be
/// NULL)
///
/// \throw isc::InvalidParameter if af is neither AF_INET nor AF_INET6
/// \throw isc::InvalidParameter if lookup is NULL
/// \throw isc::asiolink::IOError when a low-level error happens, like the
/// fd is not a valid descriptor.
SyncUDPServer(asio::io_service& io_service, const int fd, const int af,
isc::asiolink::SimpleCallback* checkin = NULL,
DNSLookup* lookup = NULL, DNSAnswer* answer = NULL);
DNSLookup* lookup);
/// \brief Start the SyncUDPServer.
///
/// This is the function operator to keep interface with other server
/// classes. They need that because they're coroutines.
virtual void operator()(asio::error_code ec = asio::error_code(),
size_t length = 0);
size_t length = 0);
/// \brief Calls the lookup callback
virtual void asyncLookup() {
......@@ -114,22 +128,39 @@ private:
// If it was OK to have just a buffer, not the wrapper class,
// we could reuse the data_
isc::util::OutputBufferPtr output_buffer_;
// Objects to hold the query message and the answer
// Objects to hold the query message and the answer. The latter isn't
// used and only defined as a placeholder as the callback signature
// requires it.
isc::dns::MessagePtr query_, answer_;
// The socket used for the communication
std::auto_ptr<asio::ip::udp::socket> socket_;
// Wrapper of socket_ in the form of asiolink::IOSocket.
// "DummyIOCallback" is not necessary for this class, but using the
// template is the easiest way to create a UDP instance of IOSocket.
boost::scoped_ptr<asiolink::UDPSocket<asiolink::DummyIOCallback> >
udp_socket_;
// Place where the socket puts the sender of a packet when it is received
asio::ip::udp::endpoint sender_;
// Callbacks
const asiolink::SimpleCallback* checkin_callback_;
// Wrapper of sender_ in the form of asiolink::IOEndpoint. It's set to
// refer to sender_ on initialization, and keeps the reference throughout
// this server class.
asiolink::UDPEndpoint udp_endpoint_;
// Callback
const DNSLookup* lookup_callback_;
const DNSAnswer* answer_callback_;
// Answers from the lookup callback (not sent directly, but signalled
// through resume())
bool resume_called_, done_;
// This turns true when the server stops. Allows for not sending the
// answer after we closed the socket.
bool stopped_;
// Placeholder for error code object. It will be passed to ASIO library
// to have it set in case of error.
asio::error_code ec_;
// The callback functor for internal asynchronous read event. This is
// stateless (and it will be copied in the ASIO library anyway), so
// can be const
const boost::function<void(const asio::error_code&, size_t)>
recv_callback_;
// Auxiliary functions
......@@ -144,3 +175,7 @@ private:
} // namespace asiodns
} // namespace isc
#endif // SYNC_UDP_SERVER_H
// Local Variables:
// mode: c++
// End:
......@@ -14,13 +14,6 @@
#include <config.h>
#include <unistd.h> // for some IPC/network system calls
#include <netinet/in.h>
#include <sys/socket.h>
#include <errno.h>
#include <boost/shared_array.hpp>
#include <log/dummylog.h>
#include <util/buffer.h>
......@@ -32,6 +25,14 @@
#include <asiodns/tcp_server.h>
#include <asiodns/logger.h>
#include <boost/shared_array.hpp>
#include <cassert>
#include <unistd.h> // for some IPC/network system calls
#include <netinet/in.h>
#include <sys/socket.h>
#include <errno.h>
using namespace asio;
using asio::ip::udp;
using asio::ip::tcp;
......@@ -100,41 +101,58 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
CORO_REENTER (this) {
do {
/// Create a socket to listen for connections
/// Create a socket to listen for connections (no-throw operation)
socket_.reset(new tcp::socket(acceptor_->get_io_service()));
/// Wait for new connections. In the event of non-fatal error,
/// try again
do {
CORO_YIELD acceptor_->async_accept(*socket_, *this);
// Abort on fatal errors
// TODO: Log error?
if (ec) {
using namespace asio::error;
if (ec.value() != would_block && ec.value() != try_again &&
ec.value() != connection_aborted &&
ec.value() != interrupted) {
const error_code::value_type err_val = ec.value();
// The following two cases can happen when this server is
// stopped: operation_aborted in case it's stopped after
// starting accept(). bad_descriptor in case it's stopped
// even before starting. In these cases we should simply
// stop handling events.
if (err_val == operation_aborted ||
err_val == bad_descriptor) {
return;
}
// Other errors should generally be temporary and we should
// keep waiting for new connections. We log errors that
// should really be rare and would only be caused by an
// internal erroneous condition (not an odd remote
// behavior).
if (err_val != would_block && err_val != try_again &&
err_val != connection_aborted &&
err_val != interrupted) {
LOG_ERROR(logger, ASIODNS_TCP_ACCEPT_FAIL).
arg(ec.message());
}
}
} while (ec);
/// Fork the coroutine by creating a copy of this one and
/// scheduling it on the ASIO service queue. The parent
/// will continue listening for DNS connections while the
/// will continue listening for DNS connections while the child
/// handles the one that has just arrived.
CORO_FORK io_.post(TCPServer(*this));
} while (is_parent());
// From this point, we'll simply return on error, which will
// immediately trigger destroying this object, cleaning up all
// resources including any open sockets.
/// Instantiate the data buffer that will be used by the
/// asynchronous read call.
data_.reset(new char[MAX_LENGTH]);
/// Start a timer to drop the connection if it is idle
if (*tcp_recv_timeout_ > 0) {
timeout_.reset(new asio::deadline_timer(io_));
timeout_->expires_from_now(
timeout_.reset(new asio::deadline_timer(io_)); // shouldn't throw
timeout_->expires_from_now( // consider any exception fatal.
boost::posix_time::milliseconds(*tcp_recv_timeout_));
timeout_->async_wait(boost::bind(&do_timeout, boost::ref(*socket_),
asio::placeholders::error));
......@@ -144,29 +162,22 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
CORO_YIELD async_read(*socket_, asio::buffer(data_.get(),
TCP_MESSAGE_LENGTHSIZE), *this);
if (ec) {
socket_->close();
CORO_YIELD return;
LOG_DEBUG(logger, DBGLVL_TRACE_BASIC, ASIODNS_TCP_READLEN_FAIL).
arg(ec.message());
return;
}
/// Now read the message itself. (This is done in a different scope
/// to allow inline variable declarations.)
CORO_YIELD {
InputBuffer dnsbuffer(data_.get(), length);
uint16_t msglen = dnsbuffer.readUint16();
const uint16_t msglen = dnsbuffer.readUint16();
async_read(*socket_, asio::buffer(data_.get(), msglen), *this);
}
if (ec) {
socket_->close();
CORO_YIELD return;
}
// Due to possible timeouts and other bad behaviour, after the
// timely reads are done, there is a chance the socket has
// been closed already. So before we move on to the actual
// processing, check that, and stop if so.
if (!socket_->is_open()) {
CORO_YIELD return;
LOG_DEBUG(logger, DBGLVL_TRACE_BASIC, ASIODNS_TCP_READDATA_FAIL).
arg(ec.message());
return;
}
// Create an \c IOMessage object to store the query.
......@@ -174,7 +185,12 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
// (XXX: It would be good to write a factory function
// that would quickly generate an IOMessage object without
// all these calls to "new".)
peer_.reset(new TCPEndpoint(socket_->remote_endpoint()));
peer_.reset(new TCPEndpoint(socket_->remote_endpoint(ec)));
if (ec) {
LOG_DEBUG(logger, DBGLVL_TRACE_BASIC, ASIODNS_TCP_GETREMOTE_FAIL).
arg(ec.message());
return;
}
// The TCP socket class has been extended with asynchronous functions
// and takes as a template parameter a completion callback class. As
......@@ -183,7 +199,8 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
// the underlying Boost TCP socket - DummyIOCallback is used. This
// provides the appropriate operator() but is otherwise functionless.
iosock_.reset(new TCPSocket<DummyIOCallback>(*socket_));
io_message_.reset(new IOMessage(data_.get(), length, *iosock_, *peer_));
io_message_.reset(new IOMessage(data_.get(), length, *iosock_,
*peer_));
// Perform any necessary operations prior to processing the incoming
// packet (e.g., checking for queued configuration messages).
......@@ -198,8 +215,7 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
// If we don't have a DNS Lookup provider, there's no point in
// continuing; we exit the coroutine permanently.
if (lookup_callback_ == NULL) {
socket_->close();
CORO_YIELD return;
return;
}
// Reset or instantiate objects that will be needed by the
......@@ -210,25 +226,24 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
// Schedule a DNS lookup, and yield. When the lookup is
// finished, the coroutine will resume immediately after
// this point.
// this point. On resume, this method should be called with its
// default parameter values (because of the signature of post()'s
// handler), so ec shouldn't indicate any error.
CORO_YIELD io_.post(AsyncLookup<TCPServer>(*this));
assert(!ec);
// The 'done_' flag indicates whether we have an answer
// to send back. If not, exit the coroutine permanently.
if (!done_) {
// TODO: should we keep the connection open for a short time
// to see if new requests come in?
socket_->close();
CORO_YIELD return;
return;
}
if (ec) {
CORO_YIELD return;
}
// Call the DNS answer provider to render the answer into
// wire format
(*answer_callback_)(*io_message_, query_message_,
answer_message_, respbuf_);
(*answer_callback_)(*io_message_, query_message_, answer_message_,
respbuf_);
// Set up the response, beginning with two length bytes.
lenbuf.writeUint16(respbuf_->getLength());
......@@ -240,13 +255,22 @@ TCPServer::operator()(asio::error_code ec, size_t length) {
// (though we have nothing further to do, so the coroutine
// will simply exit at that time).
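The TCP code above reads a 2-byte length field before each DNS message and writes one before each response. As a standalone illustration of that framing (the length is sent in network byte order, most significant byte first), here is a small sketch using only the standard library. frameMessage and readLength are hypothetical helpers, not functions from this code base.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: prepends the 2-byte big-endian length used for DNS
// messages carried over TCP.
std::vector<uint8_t> frameMessage(const std::vector<uint8_t>& msg) {
    if (msg.size() > 0xffff) {
        throw std::length_error("message does not fit a 16-bit length field");
    }
    std::vector<uint8_t> out;
    out.reserve(msg.size() + 2);
    out.push_back(static_cast<uint8_t>(msg.size() >> 8));    // high byte
    out.push_back(static_cast<uint8_t>(msg.size() & 0xff));  // low byte
    out.insert(out.end(), msg.begin(), msg.end());
    return out;
}

// Hypothetical helper: decodes the length field from the first two bytes
// of received data.  The caller must guarantee at least two bytes.
uint16_t readLength(const uint8_t* data) {
    return static_cast<uint16_t>((static_cast<unsigned>(data[0]) << 8) |
                                 data[1]);
}
```

A receiver first reads exactly two bytes, decodes them with something like readLength(), and then reads exactly that many bytes as the message body, which mirrors the two async_read steps in the coroutine.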