
StreamDNS - unified transport for DNS over TLS and DNS over TCP (and possibly more)

Artem Boldariev requested to merge artem-stream-dns into main

This merge request contains a DNS transport suitable for replacing both DNS over TLS and DNS over TCP, as described in #3374 (closed).

This merge request replaces the TCP and TLS transports for DNS with a new unified transport - Stream DNS. The main goal is to reduce the number of TLS implementations and the number of DNS transports that carry data in the same format, differing only in encryption.

Last but not least, this merge request is a foundational part of PROXY protocol support for both DNS over TCP and DNS over TLS (the DNS over HTTP/2 transport is already structured in a way that allows extending it to support PROXYv2).

To give a higher-level view, this merge request does the following:

  • Extends the generic TLS transport used in DoH to make it compatible with TCP. That means implementing more of the functionality found in the TCP transport, mostly read timer management.
  • Adds manual read timer control for both TCP and TLS. This is required by the timer-control semantics that a DNS transport imposes.
  • Adds the isc_dnsbuffer_t and isc_dnsstream_assembler_t data types. The first is a wrapper on top of isc_buffer_t specifically tuned to handle small DNS messages; the second is a state machine for handling incoming DNS messages. This decouples the code that assembles DNS messages out of data received over the network from the networking code itself, allowing better testing of this critical part - in fact, one-third of the code is unit tests (see the sketch after this list).
  • Adds the new transport itself - Stream DNS. It is a drop-in replacement for both the DNS over TCP and DNS over TLS transports and is significantly smaller: around 1100 lines of code (without supporting code), compared to around 3000 lines for DNS over TCP and DNS over TLS combined. This is achieved by delegating networking and encryption to the generic TCP and TLS code.
  • Integrates the new transport into the code base, replacing the TLS DNS and TCP DNS transports.
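
To illustrate the kind of work isc_dnsstream_assembler_t performs, here is a minimal, self-contained sketch of cutting a TCP/TLS byte stream into DNS messages using the two-octet, network-order length prefix from RFC 1035, section 4.2.2. The names below (assembler_t, assembler_feed, the callback type) are illustrative and are not BIND's actual API; a real implementation would also grow its buffer on demand rather than preallocating the maximum.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void (*dns_msg_cb_t)(const uint8_t *msg, size_t len, void *arg);

typedef struct {
	uint8_t	     buf[2 + UINT16_MAX]; /* one maximum-size framed message */
	size_t	     used;		  /* bytes currently buffered */
	dns_msg_cb_t cb;		  /* called once per complete message */
	void	    *cbarg;
} assembler_t;

static void
assembler_init(assembler_t *a, dns_msg_cb_t cb, void *cbarg) {
	a->used = 0;
	a->cb = cb;
	a->cbarg = cbarg;
}

/*
 * Feed raw bytes received from the network; the callback is invoked once
 * for every complete DNS message found in the stream, and any trailing
 * partial message is kept for the next call.
 */
static void
assembler_feed(assembler_t *a, const uint8_t *data, size_t len) {
	while (len > 0) {
		/* Copy as much as currently fits into the staging buffer. */
		size_t space = sizeof(a->buf) - a->used;
		size_t n = (len < space) ? len : space;

		memcpy(a->buf + a->used, data, n);
		a->used += n;
		data += n;
		len -= n;

		/* Extract every complete message currently buffered. */
		for (;;) {
			if (a->used < 2) {
				break; /* length prefix not complete yet */
			}
			size_t msglen = ((size_t)a->buf[0] << 8) | a->buf[1];
			if (a->used < msglen + 2) {
				break; /* message body not complete yet */
			}
			a->cb(a->buf + 2, msglen, a->cbarg);
			/* Move any trailing bytes to the front. */
			a->used -= msglen + 2;
			memmove(a->buf, a->buf + 2 + msglen, a->used);
		}
	}
}
```

Because the logic is driven purely by feeding bytes in and receiving messages through a callback, it can be unit-tested without any networking at all, which is the decoupling described in the list above.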

Despite the fundamental nature of the changes (replacing one of the core DNS transports), none of the existing system or unit tests had to be adapted to work with it, proving that it is truly a drop-in replacement.

Initial Testing Results

We (@artem and @pspacek) have done initial testing of the transport from this branch, used both as a TLS DNS and a TCP DNS replacement. Overall, it delivers what was initially promised:

  • It gets rid of one TLS implementation;
  • It replaces two transport codebases (TCP DNS and TLS DNS) with one;
  • It is relatively small code-wise, with a clear separation between the networking code, the TLS implementation, and the DNS message processing;
  • There is no code duplication;
  • It is not slower - on the contrary, it can be faster, especially compared to the old TLS DNS implementation.

It is worth noting that if we intend to support the PROXY protocol at some point, this design provides a natural way to implement it.

However, memory usage is higher. This is mostly due to the layered design and to the strategy of avoiding additional memory allocations when handling typical small DNS messages by using small preallocated memory buffers, which can be larger than the buffers the old code would allocate. The old code is structured to allocate as little memory as possible; the new code is written to avoid memory allocations as much as possible. That should provide better performance and reduce stress on the memory manager, which should be beneficial on a long-running server.
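
As an illustration of that allocation strategy, here is a rough sketch of the idea behind a small preallocated buffer. The names and the 512-byte threshold below are made up for the example and do not reflect BIND's actual isc_dnsbuffer_t interface.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DNSBUF_INLINE_SIZE 512 /* illustrative: fits a typical DNS message */

typedef struct {
	uint8_t	 inline_buf[DNSBUF_INLINE_SIZE];
	uint8_t	*data; /* points at inline_buf or at a heap block */
	size_t	 size; /* current capacity in bytes */
} dnsbuf_t;

static void
dnsbuf_init(dnsbuf_t *b) {
	b->data = b->inline_buf;
	b->size = sizeof(b->inline_buf);
}

/*
 * Make sure the buffer can hold 'needed' bytes.  The common case - a
 * small DNS message - touches no allocator at all; only oversized
 * messages fall back to the heap.
 */
static int
dnsbuf_reserve(dnsbuf_t *b, size_t needed) {
	uint8_t *p;

	if (needed <= b->size) {
		return 0; /* allocation-free fast path */
	}
	p = malloc(needed);
	if (p == NULL) {
		return -1;
	}
	memcpy(p, b->data, b->size);
	if (b->data != b->inline_buf) {
		free(b->data);
	}
	b->data = p;
	b->size = needed;
	return 0;
}

static void
dnsbuf_clear(dnsbuf_t *b) {
	if (b->data != b->inline_buf) {
		free(b->data);
	}
	dnsbuf_init(b);
}
```

The trade-off is exactly the one described above: the preallocated region is often larger than what the old code would have allocated for a given message, but the common path performs no heap allocation at all.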

Due to the layered design, there is a set of socket and handle objects for each layer, which also increases memory usage to a certain extent.

In addition, some interesting peculiarities were found during testing. However, let's start with memory usage.

TCP memory usage

TLS memory usage

One can notice that the increased memory usage occurs both for DNS over TCP and DNS over TLS, as can be seen in the charts above, obtained under 1x load (these results are reproducible).

We should note that both the new branch and the main branch handle 4x load over TCP about equally well (and neither can handle more than 4x). I believe that the new code should have performed even better, but the current networking code is not a bottleneck in BIND anyway. Everything here is fine, taking into account the increased memory usage described above.

The results for TCP can be found here.

TCP pipeline results

TCP 4x Response Rate

TCP 4x Latency

With DNS over TLS the results are a bit more interesting (and they are reproducible). Firstly, BIND continues to serve queries under both 5x and 6x load, although a significant number of them are SERVFAILs (obviously, 6x causes more of them - to the point where it does not make much sense to test under this load).

TLS pipeline results

TLS Response Rate

But it needs to be said that the main branch also has a SERVFAIL spike, so it might not be new - just more visible under a higher load.

This TCP-only pipeline seems saner - most importantly, there are no big chunks of SERVFAILs. The failures also seem saner - the server simply does not keep up, which is fine, sort of.

TCP Pipeline Rcodes

In this pipeline, it is clear that a 6x load is way too much. After two minutes, it barely manages to respond to requests at all (without even looking at the RCODEs).

TLS Pipeline Response Rate 6x

So, it is safe to say that testing 6x is a waste of resources for now.

Even under 5x load it seems broken - it should not take 5+ seconds before the server starts responding. Also, the number of non-OK RCODEs is suspicious.

TLS Pipeline Rcodes 5x

As we said above, 4x load seems to be handled quite acceptably:

TLS Response Rate 4x

TLS Rcodes 4x

How can we interpret the given results?

We agreed on the following:

  1. This might not be a transport-layer issue as the server continues to serve requests to the best of its abilities.
  2. The new code behaves better (or not worse) than the old one.
  3. The transport layer is not the bottleneck.

So, overall, the new transport code is capable of handling more load: despite the excessive SERVFAILs, it continued to serve clients, while the old code cannot cope with that load. The new code should not become a bottleneck as we resolve performance issues in other parts of BIND.

Shotgun Pipelines for Deeper Performance Investigation:

Closes #3374 (closed).
