# BIND issues
https://gitlab.isc.org/isc-projects/bind9/-/issues

## Debug messages logging network traffic only include the address of one peer
https://gitlab.isc.org/isc-projects/bind9/-/issues/4345 · Michał Kępień · 2024-02-24

Even with `-d 99` used on the command line, `named` only logs lines
like:

    28-Sep-2023 14:31:23.212 sending packet to 2001:503:ba3e::2:30#53

or:

    28-Sep-2023 14:31:23.232 received packet from 2001:503:ba3e::2:30#53
However, network traffic is always sent **from** one socket **to**
another. The currently available debug messages do not include the
sender's address (first example) or the receiver's address (second
example). As a result, just bumping up the log level is often not
enough to diagnose certain issues and a network traffic sniffer has to
be used in order to learn the details of the packets being exchanged.
This lack of detail sometimes also makes debugging system test issues
harder than it has to be. With multiple tests being run in parallel,
knowing the exact addresses and ports that were used by each running
`named` instance is crucial for determining whether a test failure was
caused by an unexpected interaction between tests or not. (Such issues
happened more than once in the past, particularly when network code
and/or the system test framework were being worked on.)
Debug messages logging network traffic should be extended to include
information about both sides of each communication channel.
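For illustration, here is a minimal Python sketch (a stand-in for `named`'s C code, using plain BSD sockets rather than the Network Manager) of what the extended message could look like: once a socket is bound and connected, both endpoints are available via `getsockname()`/`getpeername()`, so the log line can name both sides. The peer address/port and the message format are illustrative only.

```python
import socket

def traffic_log_line(sock, direction="sending packet"):
    # Sketch of the proposed message format: include BOTH endpoints.
    # getsockname()/getpeername() are the standard calls exposing them.
    laddr, lport = sock.getsockname()[:2]
    raddr, rport = sock.getpeername()[:2]
    return f"{direction} from {laddr}#{lport} to {raddr}#{rport}"

# A connected UDP socket; connect() on UDP only records the default
# peer, so no listener is required on the other side.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("127.0.0.1", 0))        # port 0: let the kernel choose
s.connect(("127.0.0.1", 5353))  # arbitrary peer for illustration
line = traffic_log_line(s)
print(line)
s.close()
```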
While this issue is technically only tangential to #4344, having
detailed network-level information available would greatly improve the
benefits of the feature proposed here.

Milestone: May 2024 (9.18.27, 9.18.27-S1, 9.19.24) · Assignee: Michał Kępień

## Enable extraction of exact local socket addresses
https://gitlab.isc.org/isc-projects/bind9/-/issues/4344 · Michał Kępień · 2024-02-24

The Network Manager API is currently unable to expose the exact
address/port that a local wildcard/TCP socket is bound to. This limits
the level of detail available to all sorts of traffic-logging code
(debug messages, dnstap, etc.)
This has been previously discussed (in dnstap context) in #3143. Back
then, it quickly [emerged][1] that extracting the exact address that a
local wildcard/TCP socket is bound to requires issuing a system call.
Unfortunately, the function that would be responsible for doing this is
[called on a hot path][2]. After running some performance tests, it
[became obvious][3] that doing that unconditionally is a non-starter
performance-wise. The proposal was scrapped and replaced with a [note
in documentation](!6472).
However, the problem persists and limits the capabilities of not just
dnstap, but also logging code. In some cases, more detailed logging is
preferred over raw performance and there should be some way for the user
to express their preference in that regard.
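As a toy Python stand-in for the C code path in question: after binding to a wildcard port, only the kernel knows the exact local address, and retrieving it requires a `getsockname()` system call per lookup, which is exactly the per-packet cost that made the earlier proposal a non-starter on the hot path.

```python
import socket

# After binding to port 0 (wildcard), the application does not know the
# exact local port; only the kernel does. Retrieving it requires a
# system call, which is what makes doing this unconditionally on a hot
# path too expensive.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("0.0.0.0", 0))
addr, port = s.getsockname()  # the getsockname(2) syscall in question
print(addr, port)
s.close()
```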
[1]: https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5816#note_272336
[2]: https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5816#note_272404
[3]: https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5816#note_272407

Milestone: May 2024 (9.18.27, 9.18.27-S1, 9.19.24) · Assignee: Michał Kępień

## Add an option to named-compilezone to retain comments
https://gitlab.isc.org/isc-projects/bind9/-/issues/4585 · Marco Davids · 2024-02-18

### Description

`named-compilezone` strips comments from zone files.

### Request

There might be use cases where `named-compilezone` is used as a cleanup tool while any comments that are present need to be retained.
It would be great if this could be achieved by some command-line option.
Obviously there are some caveats, but it seems that these can be addressed by defining certain conditions that must be met and by properly documenting the right way of working. For example, just as a suggestion: only retain comments that appear on the same line as a record in the zone file, like this:
Undefined (may fail):
```
; comment 1
www AAAA 2001:db8::1
; comment 2
```
Well defined (will work):
`www AAAA 2001:db8::1 ; comment 1`
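A hypothetical sketch of the suggested condition (function name and logic are illustrative, not part of any existing tool): trailing comments that share a line with a record have an unambiguous owner and could be re-attached after compilation, while standalone comment lines do not.

```python
def split_trailing_comments(zone_text):
    """Illustrative only: separate well-defined (same-line) comments
    from undefined (standalone) ones. Naively splits on ';', which a
    real master-file parser must not do inside quoted strings."""
    kept, undefined = [], []
    for line in zone_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(";"):
            undefined.append(stripped)           # no record to attach to
        elif ";" in stripped:
            record, _, comment = stripped.partition(";")
            kept.append((record.strip(), comment.strip()))
        else:
            kept.append((stripped, None))        # record, no comment
    return kept, undefined

zone = "; comment 1\nwww AAAA 2001:db8::1 ; comment 2\n"
kept, undefined = split_trailing_comments(zone)
```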
### Links / references
n/a

Status: Not planned

## Add support for QUIC and DNS over QUIC/DoQ (RFC 9250)
https://gitlab.isc.org/isc-projects/bind9/-/issues/4582 · Artem Boldariev · 2024-03-05

_This section is very likely to be changed/updated/expanded in the future._
## Overview
Relatively recent additions to the set of transport protocols for DNS are QUIC (DoQ, covered by [RFC 9250](https://www.rfc-editor.org/rfc/rfc9250.html)) and HTTP/3 (DoH3), which also works on top of QUIC.
We need a generic implementation of the QUIC protocol in BIND's codebase to proceed with these new transports.
QUIC is a sophisticated transport that works on top of UDP and uses encryption on top of TLSv1.3. It is covered by multiple RFCs and is being actively worked on. The list of RFCs includes the following:
- https://www.rfc-editor.org/rfc/rfc9250.html
- https://www.rfc-editor.org/rfc/rfc8999.html
- https://www.rfc-editor.org/rfc/rfc9000.html
- https://www.rfc-editor.org/rfc/rfc9001.html
- https://www.rfc-editor.org/rfc/rfc9002.html
- https://www.rfc-editor.org/rfc/rfc9369.html
The protocol includes a lot of functionality resembling that of HTTP/2. Most notably, it supports multiple uni- or bi-directional streams per connection. This aspect might have been influenced by the need to carry a new version of the HTTP protocol (HTTP/3), which uses the multi-stream nature of QUIC instead of the protocol-specific multiplexing found in HTTP/2.
Each bi-directional stream (the kind we are most interested in, as these are used for DoQ) acts, from the point of view of the higher-level code, similarly to a TCP connection, though in the case of DNS, no more than one request/query per stream is allowed. That means that DNS pipelining is achieved by relying on the multistream nature of QUIC (again, similarly to HTTP/2). The catch is that each stream (effectively a separate "connection") shares TLS parameters with the others.
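As a concrete illustration of the one-query-per-stream pipelining: RFC 9000 (Section 2.1) encodes a stream's initiator and directionality in the two least-significant bits of the stream ID, so the client-initiated bidirectional streams that carry DoQ queries are numbered 0, 4, 8, and so on, one fresh stream per query.

```python
# RFC 9000, Section 2.1: bit 0x01 of a stream ID marks server-initiated
# streams and bit 0x02 marks unidirectional ones. Client-initiated
# bidirectional streams, used for DoQ queries, therefore step by 4.
def doq_stream_ids(n):
    return [i * 4 for i in range(n)]

ids = doq_stream_ids(3)
for sid in ids:
    assert sid & 0x01 == 0  # client-initiated
    assert sid & 0x02 == 0  # bidirectional
```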
Another important aspect of QUIC is client connection end-point migration. That is, a client may change its location (= IP address and UDP port) while keeping the connection active. That functionality requires complete virtualisation of networking connections, which are now being identified not by IP address and port combination but by abstract connection identifiers (connection IDs). That brings a notion of a "connection path" into the picture as well as a procedure of "path migration" for a client.
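A toy sketch of what "identified by connection ID rather than by address and port" means in practice (all class and method names here are hypothetical): the demultiplexer looks packets up by connection ID, so a packet arriving from a new source address still reaches the same connection, and the connection's active path is simply updated.

```python
class ToyConn:
    def __init__(self, path):
        self.path = path  # current (address, port) the peer uses

class ToyDemux:
    """Looks up connections by connection ID (CID), not by 4-tuple."""
    def __init__(self):
        self.by_cid = {}

    def register(self, cid, conn):
        self.by_cid[cid] = conn

    def dispatch(self, cid, src):
        conn = self.by_cid[cid]
        if conn.path != src:
            conn.path = src  # simplistic stand-in for path migration
        return conn

demux = ToyDemux()
conn = ToyConn(("192.0.2.1", 4433))
demux.register(b"\x01\x02\x03", conn)
# The client moves to a new address; the CID still finds the connection.
migrated = demux.dispatch(b"\x01\x02\x03", ("198.51.100.7", 50000))
```

A real implementation would also validate the new path before using it; the point here is only that the 4-tuple no longer identifies the connection.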
Though this is my personal opinion, QUIC seems to be at least partially a result of large companies' experience of running user-space TCP/IP stacks at scale, i.e. when the TCP/IP stack is implemented as a user-space library in an application which "directly" uses a dedicated network card. Indeed, in the case of QUIC, many things that we expect an in-kernel TCP/IP implementation to do are brought under the control of a user-space application, including, but not limited to, such intricacies as congestion control. In this case, UDP can be seen as a portable kernel interface to the network card, with the additional advantage of allowing multiple applications to use a network card simultaneously (which is not the case with user-space TCP/IP stacks).
QUIC is often described as a replacement for the TCP protocol. One of the authors of "Computer Networks: A Systems Approach" [argues](https://www.theregister.com/2022/10/07/quic_tcp_replacement/) that it is, in fact, an addition to the Internet protocol suite, meant to implement a missing paradigm: a basis for Remote Procedure Call (RPC) protocols. That is, it might be considered a replacement for TCP only in cases where TCP was used due to the lack of a better alternative.
It should be noted that while QUIC is meant to be used as a universal transport for DNS, it, unlike HTTP/2 (DoH), can be used for zone transfers as well. It is [not always guaranteed](https://www.theregister.com/2021/08/04/dissecting_performance_of_production_quic/) that it will provide a significant performance boost. In fact, it might require more traffic in some cases, as even the initial QUIC message should be no less than 1200 bytes, which is a lot by common DNS measurements. However, it might compensate for that by lower latency due to 0-RTT support. Also, it seems to be more like a client-oriented protocol due to the ability to migrate client connections to new addresses (which is great for portable mobile devices). That being said, it is not clear how beneficial it would be to use it for server-to-server communications for things like zone transfers: servers do not change addresses often, and the protocol itself is more verbose than, e.g. DoT, so I fail to see the immediate benefits for this case (although it is standardised).
## The State of Open Source QUIC Implementations and the Great OpenSSL Schism
As stated before, QUIC incorporates TLSv1.3. Thus, most implementations decided to delegate the crypto-related, TLS-specific bits to OpenSSL and its derivatives, which makes sense as these libraries are widely deployed and used. However, these libraries initially lacked the QUIC-specific parts in their TLS implementations.
As the early QUIC adopters were also QUIC implementors, they forked OpenSSL and added the missing bits. The related changes to the API ended up in multiple OpenSSL forks, namely [QuicTLS](https://quictls.github.io/) (maintained by Akamai and Microsoft), [LibreSSL](https://www.libressl.org/) (maintained by OpenBSD), and [BoringSSL](https://boringssl.googlesource.com/boringssl) (maintained by Google). These libraries only implement the low-level TLS 1.3 bits related to QUIC but do not have any internal QUIC implementations, as their early adopters developed their own.
For quite some time, the original OpenSSL had [a merge request](https://github.com/openssl/openssl/pull/8797) opened to implement the same API and remain mostly compatible with its forks, as was the case for a long time. However, OpenSSL authors decided to provide a high-level implementation of QUIC of their own making and eventually closed the MR. That caused a lot of drama, about which you can read [here](https://daniel.haxx.se/blog/2021/10/25/the-quic-api-openssl-will-not-provide/) or [here](https://github.com/haproxy/haproxy/issues/680#issuecomment-1433118828) as well as in other places. Regarding OpenSSL, it is worth keeping multiple things in mind.
Firstly, [OpenSSL's implementation](https://www.openssl.org/docs/manmaster/man7/openssl-quic.html) is not ready yet, as it includes only a minimal client-side implementation, starting from OpenSSL 3.2. The server-side support was planned for 3.3 (April 2024) according to [the project's roadmap](https://www.openssl.org/roadmap.html), but it was eventually moved to 3.4 (October 2024), as there are [many uncompleted tasks](https://github.com/orgs/openssl/projects/2/views/31?pane=issue&itemId=31713456), some of which are marked as "Epic". Even after that, we will most likely have only the most basic (_Minimal Working Product_) implementation, not as battle-tested as some others and with missing features.
Secondly, OpenSSL does not allow the use of third-party implementations of QUIC: with OpenSSL, the only option is to use the internal QUIC implementation. That was [noted by Tatsuhiro Tsujikawa](https://github.com/ngtcp2/ngtcp2/issues/898#issuecomment-1692538880), the principal author of nghttp2/ngtcp2/nghttp3.
As a result of these decisions, most QUIC implementations chose to depend on QuicTLS and other forks that provide a similar API. That list includes [MS-QUIC](https://github.com/microsoft/msquic), [Quiche](https://blog.cloudflare.com/enjoy-a-slice-of-quic-and-rust/), Chromium QUIC, as well as the internal implementation in HAProxy (not exposed as a redistributable library, as it is very specific). Most other libraries are likely doing the same.
There are notable exceptions to this, though.
Firstly, NGINX does not depend on any fork-related functionality to implement QUIC support, nor does it depend on the OpenSSL implementation of QUIC. One could get a different impression after reading [the announcement of this functionality in NGINX](https://www.nginx.com/blog/quic-http3-support-openssl-nginx/), which says that they implement _only_ an OpenSSL compatibility layer in order to remain compatible with both OpenSSL and its now numerous forks. It should be noted, though, that in this particular case, whoever wrote the announcement was being modest: NGINX in fact includes [its own in-house QUIC implementation](https://github.com/nginx/nginx/tree/master/src/event/quic). And yes, it seems that they managed to do it without relying on any QUIC-related functionality in OpenSSL or its forks (like QuicTLS, LibreSSL, and BoringSSL). Their code seems to work with basically any OpenSSL-like library with TLSv1.3 support.
Secondly, there is [ngtcp2](https://github.com/ngtcp2/ngtcp2), which itself does not depend on any cryptographic library per se. The library may use one of the provided backends implemented on top of QuicTLS, GnuTLS, PicoTLS or BoringSSL and shipped as separate libraries, but it does not have to, as an application can provide a custom implementation of the [ngtcp2 crypto API](https://nghttp2.org/ngtcp2/crypto_apiref.html) as described in [the programmer's guide](https://nghttp2.org/ngtcp2/programmers-guide.html). The library itself is [very complete](https://nghttp2.org/ngtcp2/apiref.html) and seems to implement all the intricacies of the QUIC protocol, unlike OpenSSL's implementation at the time of writing (and likely for quite some time after that).
At this point, it is clear that as far as QUIC support goes, the OpenSSL and its numerous forks **will remain incompatible**. _That is not a technical decision and, thus, hard to resolve._
## Implementing QUIC in BIND
For the reasons given above, I think that we should choose the ngtcp2 library as the basis for our QUIC transport. Additionally, it is much more mature than the implementation that OpenSSL will initially provide, and it has been considered stable for a while (1.0.0 was released on Oct 15, 2023). Moreover, before implementing the final IETF QUIC, it implemented a number of drafts, so it is safe to say that the authors have been tracking the development of the protocol very closely. Considering that the most recent RFCs have been implemented as well (like [QUIC version 2](https://www.rfc-editor.org/rfc/rfc9369.html), which is in fact a minor update of the protocol), that appears to still be the case. We can pair it with our own crypto API implementation, inspired by the code from NGINX and the currently provided crypto libraries. That will ensure that BIND remains compatible with the other OpenSSL forks, in particular on the platforms that use them by default (like OpenBSD). That should give us the flexibility of NGINX without the burden of maintaining our own QUIC implementation; I cannot justify doing that, any more than I can justify depending on any of the numerous OpenSSL forks.
I cannot justify waiting for OpenSSL to have their implementation completed either, as it is not clear how good the initial implementation will be, and I am not sure if the higher-level API they intend to provide is going to be the best fit for us. Ngtcp2 already looks more promising in that regard, not to mention that we have, in my opinion, a very good experience of using [nghttp2](https://nghttp2.org/) from the same author. Also, it is worth noting that at this stage, we have an internal subsystem for managing TLS contexts used by DoT and DoH, so it is very desirable to use it for QUIC as well and having our own ngtcp2 crypto API implementation should make it possible. We can use the code [from NGINX](https://github.com/nginx/nginx/blob/master/src/event/quic/ngx_event_quic_openssl_compat.c) and existing [ngtcp2 crypto libraries](https://github.com/ngtcp2/ngtcp2/tree/main/crypto) as examples.
Apart from choosing the library which will serve as the foundation of our QUIC implementation, there are many other problems to solve, no less challenging, that will require some thinking and trial and error. It is therefore better for us not to wait for OpenSSL, so that we have more time to iron out our QUIC-related code before the new release. In my opinion, the code related to connection and stream management and, ideally, connection migration is the most challenging part, though the choice of the library will certainly affect its structure.
I think that it would be best to build on our experience of structuring the transport code, in a way similar to Stream DNS and PROXYv2. That is very desirable, as it allows testing without a direct dependence on the networking code. By such a design, I mean implementing the most important parts of the QUIC code as a black box to which we pass data and which calls the necessary callbacks when required.
On the highest level, bi-directional QUIC streams (in which we are interested the most for now) should map well to Stream DNS or a very similar transport (a QUIC Stream based on `isc_dnsstream_assembler_t`).
QUIC has some characteristics not present in our other transports, such as the ability of both end-points to create new streams at any moment, as well as the concept of a generic multiplexed transport itself. These things are likely to be resolved in a manner similar to HTTP/2, using a virtual connection per stream. Another thing to mention is connection migration. Our code is not ready for that (and likely never will be), so we should set a QUIC stream's end-point information at the moment of creation and then never update it, from the point of view of the higher-level code, for compatibility reasons; in fact, we can still do the actual connection migration with all the open streams.
So, in short, it seems that we should attempt to use ngtcp2 with a custom crypto API implementation first. The initial plan was to take the client-side support for QUIC from OpenSSL and combine it with the "OpenSSL Compatibility Layer" (as they named it) from NGINX (which turned out to be a complete, internal-only, in-house QUIC implementation), but for the reasons given above, that will not work: it is not possible to combine the QUIC-related functionality in OpenSSL with any third-party QUIC implementation at all (and likely never will be). Ngtcp2 is the only crypto-library-agnostic implementation of QUIC at the moment.
I would treat depending on OpenSSL for our QUIC implementation as a backup plan.
## Bridging the Gap: How We Could Make an Ngtcp2 Crypto API Implementation
Let's see how we can proceed with providing our own ngtcp2 crypto API implementation.
### Ngtcp2 Crypto API Implementations Structure
As noted above, in order to work, ngtcp2 requires a crypto API implementation. The project provides several of them, for the crypto libraries that have explicit support for QUIC. At the time of writing, the list of these libraries includes QuicTLS, BoringSSL, GnuTLS, WolfSSL, and PicoTLS.
Each crypto API implementation library consists of two parts.
Firstly, there is a high-level [shared part](https://github.com/ngtcp2/ngtcp2/blob/main/crypto/shared.c). Among other things, it includes default implementations of all callbacks used by ngtcp2 and some important API calls, most notably `ngtcp2_crypto_derive_and_install_rx_key()` and `ngtcp2_crypto_derive_and_install_tx_key()`, which are mentioned in [the ngtcp2 programmer's guide](https://nghttp2.org/ngtcp2/programmers-guide.html).
Secondly, there is a low-level part that provides the foundation for the functionality of the shared part. There is an implementation for each of the above-mentioned supported crypto libraries.
The shared part and a low-level part, linked together, form an ngtcp2 crypto API implementation.
Of these implementations, the most interesting for us is the one for [QuicTLS](https://github.com/ngtcp2/ngtcp2/blob/main/crypto/quictls/quictls.c) and, to a somewhat lesser extent, [BoringSSL](https://github.com/ngtcp2/ngtcp2/blob/main/crypto/boringssl/boringssl.c), as both are OpenSSL forks. In fact, we can use most of the code from there without much adaptation, as QuicTLS is essentially OpenSSL + QUIC-related API (it is regularly rebased on top of the "mainline" OpenSSL).
The QuicTLS crypto API implementation does not use many QUIC-related API calls. I have identified only the following:
- `SSL_CTX_set_quic_method()` - we would be better off using the very similar `SSL_set_quic_method()`;
- `SSL_provide_quic_data()`;
- `SSL_set_quic_transport_params()`/`SSL_get_peer_quic_transport_params()`;
- `SSL_process_quic_post_handshake()` - this one appears to be optional but omitting it will leave us without 0-RTT support (at least, for now).
There is one problem with how the crypto API libraries are structured: there is no way to use only the shared part without depending on one of the above-mentioned crypto libraries (which would not work for us anyway).
So, in our ngtcp2 crypto API implementation, we would need to provide a replacement for the shared part - at least for the callbacks and `ngtcp2_crypto_derive_and_install_rx_key()`/ `ngtcp2_crypto_derive_and_install_tx_key()`. That is doable but very unfortunate, as it includes a lot of crypto-related things, not to mention that it is extra code to maintain.
Regarding the missing QUIC API in OpenSSL, there is a solution from NGINX which might work for us as well.
### NGINX OpenSSL Compatibility Layer
As noted above, NGINX includes an OpenSSL compatibility layer as a part of its QUIC implementation. It includes implementations of the following functions:
- `SSL_set_quic_method()`
- `SSL_provide_quic_data()`
- `SSL_set_quic_transport_params()`/`SSL_get_peer_quic_transport_params()`
NGINX's [OpenSSL compatibility layer](https://github.com/nginx/nginx/blob/master/src/event/quic/ngx_event_quic_openssl_compat.c) clearly provides an internal implementation of the missing parts of the BoringSSL QUIC API, which is not that different from the QuicTLS API.
One missing thing is `SSL_process_quic_post_handshake()`, which is needed for 0-RTT support (to be more precise, for TLS early data processing). In fact, NGINX explicitly turns off TLS early data support when using QUIC. It is not clear if we can easily provide a replacement for this function, but even if not, we can live without 0-RTT support for now. At the time of writing, it is still optional in many crypto libraries.
It is worth noting that NGINX is a server application, so the provided QUIC API implementation might need some adjustments to work for client-side code.
At this point, it seems that despite some limitations, it is possible to provide a portable ngtcp2 crypto API implementation.

Milestone: BIND 9.21.x · Assignee: Artem Boldariev

## RFC 9471 DNS Glue Requirements in Referral Responses
https://gitlab.isc.org/isc-projects/bind9/-/issues/4540 · Peter Davies · 2024-02-08

[RFC 9471](https://www.rfc-editor.org/rfc/rfc9471.html) - DNS Glue Requirements in Referral Responses

It would be of help to users to implement RFC 9471 and allow BIND to reply with TC=1 when
glue records would make a UDP reply larger than the maximum allowed.

3.2. Glue for Sibling Domain Name Servers

This document clarifies that when a name server generates a referral response, it MUST
include all available glue records in the additional section. If, after adding glue for all in-domain
name servers, the glue for all sibling domain name servers does not fit due to message size
constraints, the name server MAY set TC=1 but is not obligated to do so.

Milestone: BIND 9.19.x

## An option to not have bind9/dnssec-settime (possibly other tools) reset permissions on a .private file
https://gitlab.isc.org/isc-projects/bind9/-/issues/4532 · Dan Mahoney · 2024-01-16

### Description
The `named` process and `dnssec-settime` (perhaps other tools) will take it upon themselves to change the permissions of a private key on certain changes.
However, we track our key-directory (and other configs) using git, with a group-shared repository.
Typical permissions on .private files are bind:bind with mode 660, but because a normal user (in the bind group) diffs/commits/pushes the repository, these keys can also be user:bind mode 660.
(Noting as well that our tooling is not more comfortable running git tasks as root, complaining of other permissions issues. Also, the less we can do as root, the better.)
With BIND's usual permissions model, one cannot do a git diff/git log if the file is owned by bind. If the file is owned by user:bind, bind loses access to it on the permissions change.
Changing the umask under which the process runs doesn't seem to fix this; we tried.
Running a periodic cron job to fix this is a possible workaround, but feels like it shouldn't be necessary.
### Request
For command line tools, an option to not do this.
For `named`, an `options` statement that lets us turn this off.
Both retaining the current behavior by default.
### Links / references

Status: Not planned

## Implement kTLS support in BIND
https://gitlab.isc.org/isc-projects/bind9/-/issues/4505 · Artem Boldariev · 2024-02-14
Recent versions of Linux and FreeBSD support TLS encryption in the kernel (kTLS). One of its benefits is that when TLS encryption is performed by the kernel, it can use additional hardware features otherwise not available in user space, including offloading TLS encryption to NICs that support it (e.g. [NVIDIA Mellanox ConnectX-6 Dx](https://www.nvidia.com/en-us/networking/ethernet/connectx-6-dx/)), almost completely freeing the CPU from this task; even with hardware-accelerated encryption within the CPU, some cycles are still required. Using kTLS might also reduce memory copying in some cases.
Of course, kernel-space encryption is more limited than what OpenSSL and its derivatives provide in user space: these limitations are imposed by hardware, e.g. NICs might not support anything but AES-128 (aka `TLS_AES_128_GCM_SHA256`, the only cipher mandatory for TLS v1.3). If it is good enough for web servers, it should be good enough for DNS, too.
Even when kTLS is used, the handshake itself happens in the user space (e.g. using OpenSSL) with negotiated parameters passed to the kernel using `setsockopt()` calls on a TCP socket descriptor.
OpenSSL provides support for kTLS encryption natively since version 3.X (see the `SSL_OP_ENABLE_KTLS` [option](https://www.openssl.org/docs/manmaster/man3/SSL_set_options.html)) but, as far as I understand it, it does so only when OpenSSL manages the underlying TCP socket file descriptor natively, which is not our case, as we are using LibUV for that. However, the idea of kTLS is that, with it enabled, we are supposed to pass unencrypted data to `send()` and `recv()`; that is, a kTLS-enabled socket works (mostly) like a plain TCP socket from the higher-level perspective. We might therefore try the following approach to implement kTLS, which *might* work:
1. We use our existing code (`tlsstream.c`) to handle handshake, just like we do now;
2. After completing the handshake, we pass the negotiated information to the kernel. OpenSSL might have some interfaces for that. In the worst case, we might need to do that by hand using `setsockopt()`;
3. Then, we add new code paths to `tlsstream.c` to bypass TLS connection objects (`isc_tls_t`) and use the underlying TCP connection directly, which, by now, works in "kTLS-mode", providing transparent TLS encryption;
4. Control messages, like TLS shutdown, will require additional care.
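To make step 2 slightly more concrete, here is a hedged Python sketch of the very first part of the kernel handoff on Linux: selecting the "tls" upper-layer protocol on an established TCP socket. The `TCP_ULP` value 31 is taken from Linux's `<netinet/tcp.h>` (it is not exposed by Python's `socket` module), and the sketch deliberately stops before installing the negotiated keys via the `TLS_TX`/`TLS_RX` socket options; on kernels without the `tls` module, or on other platforms, the call simply fails.

```python
import socket

TCP_ULP = 31  # from Linux <netinet/tcp.h>; not in Python's socket module

def try_enable_ktls(sock):
    """Attach the kernel TLS ULP to an established TCP socket.
    Returns "enabled" or "unavailable" (non-Linux, tls module missing).
    Installing the actual keys (TLS_TX/TLS_RX setsockopts, with the
    parameters negotiated during the userspace handshake) would follow."""
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_ULP, b"tls")
        return "enabled"
    except OSError:
        return "unavailable"

# An established loopback connection to try it on.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
cli = socket.create_connection(srv.getsockname())
peer, _ = srv.accept()
result = try_enable_ktls(cli)
print(result)
for s in (cli, peer, srv):
    s.close()
```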
That is how I see the initial plan, which might or might not work. There can (and likely will) be unforeseen obstacles that could overcomplicate the code base so much that implementing this becomes unfeasible, e.g. by requiring a separate kTLS-only transport. Furthermore, it might require some assistance from LibUV. It will take some trial and error.
That is mostly written with Linux in mind. If the kTLS interface in FreeBSD is similar enough (it seems so at the first glance), we should support both platforms.
The issue is created mostly to dump the information from my mind and to keep kTLS on our radar: we might want to do this, as at least `dnsdist` has experimental support for it. It will matter even more in the future, as it now seems that encrypted DNS transports will keep growing in importance, to the point of replacing the good ol' Do53 at some point.
For sure, it is not 9.20 material - rather 9.21-9.22, if we are lucky, as it is a big feature. Also, I foresee a similar concept eventually appearing for QUIC, too (kQUIC?). Also, I am quite certain that we *will* need #3504 for this (implemented here: !8576).
See also:
1. https://docs.kernel.org/networking/tls.html
2. https://man.freebsd.org/cgi/man.cgi?query=ktls&apropos=0&sektion=0&manpath=FreeBSD+13.0-RELEASE+and+Ports&arch=default&format=html
3. https://delthas.fr/blog/2023/kernel-tls/ - mostly discusses it in the context of HTTP and `sendfile()` acceleration, but contains many references on the topic.
4. https://docs.nvidia.com/networking/display/ofedv512580/kernel+transport+layer+security+(ktls)+offloads

Milestone: Long-term · Assignee: Artem Boldariev

## Feature request - client.bind chaos class queries
https://gitlab.isc.org/isc-projects/bind9/-/issues/4426 · Ray Bellis · 2023-11-20

Status: Not planned

## Check that a zone is served by IPv6 servers if it has AAAA records
https://gitlab.isc.org/isc-projects/bind9/-/issues/4370 · Mark Andrews · 2023-12-05

One thing that is often forgotten when turning on an IPv6 service is to ensure that the zone holding the AAAA records for that service is also served over IPv6. This can relatively easily be checked for by named-checkzone by looking for AAAA records in the zone contents, including glue AAAA records, and then checking that there are AAAA records published for some of the nameservers if any are found (in zone or elsewhere). This can also sometimes be determined by named without needing to look beyond the zone's contents, but cannot be guaranteed.

Status: Not planned

## Randomize client address to improve security against cache poisoning
https://gitlab.isc.org/isc-projects/bind9/-/issues/4358 · Kasper Dupont · 2023-10-10

### Description
Existing solutions to increase request entropy only provide a few extra bits. Port number randomization provides at most 16 bits of entropy, and randomizing the casing of the name will often provide less than that. Moreover, for responses spanning multiple packets, that extra entropy may only protect the first packet of the response.
### Request
Take advantage of the larger address space provided by IPv6 to randomize both the client IP address and port. Using a /72 prefix would leave 56 bits for randomization, which is more than request ID, port number, and name randomization combined. The IP address is part of each packet and will thus protect all packets of the response. It does not require the other methods of adding entropy to be disabled; combined, the entropy can be even higher.
Ideally the prefix length used will be configurable. Supporting only multiples of 8 will keep the implementation simpler. A /72 prefix length will be preferred in at least some deployments. It avoids randomizing the multicast and globally unique bits of the address. Additionally it's usable with providers who only route a /64 to customers. (Use cases exist for /48, /56, /64, /72, and /104 with /72 being the best compromise if only a single prefix length is supported.)
To use this feature the server administrator would need to:
- Ensure a prefix is routed to the host
- Choose a sub-prefix of that to use for source IP randomization (could be the entire range if the routed prefix is used for nothing else)
- Add a local route for the chosen sub-prefix (this functionality exists on Linux; I don't know which other OSes support it.)
- Configure the chosen sub-prefix for address randomization in the BIND configuration.
The BIND recursion code will need to do the following:
- Before calling `bind()` on a newly created socket, set the IP_FREEBIND option.
- In the arguments to `bind()`, provide not just a port number but also a local IP address constructed by combining the configured prefix with a random bitstring to produce a total of 128 bits.
- Ensure that any replies not sent to the correct local IP address are ignored either by the kernel or by BIND itself.
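The address-construction and bind steps above can be sketched as follows (Python for brevity, not BIND code; `2001:db8::/72` is a documentation prefix standing in for the operator's routed sub-prefix, and the socket helper assumes Linux with the local route from the admin steps in place, e.g. `ip -6 route add local 2001:db8::/72 dev lo`):

```python
import ipaddress
import secrets
import socket

def random_source_address(prefix: str) -> str:
    """Combine the configured prefix with random low-order bits to
    produce a full 128-bit address (56 random bits for a /72)."""
    net = ipaddress.IPv6Network(prefix)
    host = secrets.randbits(128 - net.prefixlen)
    return str(net.network_address + host)

def make_randomized_socket(prefix: str, port: int = 0) -> socket.socket:
    """Bind a UDP socket to a random address inside the prefix.
    IP_FREEBIND (Linux) lets bind() succeed even though the address
    is not configured on any interface."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_FREEBIND, 1)
    sock.bind((random_source_address(prefix), port))
    return sock
```

Every outgoing query would get a fresh address from `random_source_address()`, giving 56 bits of entropy per query on top of the randomized port.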
When using TCP, IP randomization should also be used, but it should not change how long the TCP connections are kept alive. So multiple queries could be sent over a TCP connection where IP randomization was applied only once.
### Links / references
The security feature requested here already exists in Unbound and can be configured using outgoing-interface: https://nlnetlabs.nl/documentation/unbound/unbound.conf/

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4348
implement QPDB databases (Evan Hunt, 2024-03-08)

Create QP-trie based databases to take the place of RBTDB, for use as a:
- [ ] zone database
- [ ] cache

Milestone: BIND 9.21.x
Assignee: Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/4303
dnstap logging based on rcode (Petr Špaček, 2023-09-13)

### Motivation
When FORMERR or SERVFAIL happens in the middle of resolution, we don't have exact information about what we sent and what exactly came back. We have to guess and attempt to reproduce the problem with `dig` or other tools.
### Request
Add a parameter to the `dnstap` statement which would allow logging just selected RCODEs, presumably FORMERR and/or SERVFAIL. I imagine that this could be so low-touch that it could be running in production indefinitely (as a cyclic buffer).
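The requested behaviour can be illustrated like this (plain Python, not BIND code; the class name and capacity are made up for the example):

```python
from collections import deque

# Illustrative RCODE values from the DNS RCODE registry.
NOERROR, FORMERR, SERVFAIL = 0, 1, 2

class RcodeFilteredLog:
    """Sketch of the request: keep only messages whose RCODE is in a
    configured set, inside a fixed-size cyclic buffer."""

    def __init__(self, rcodes, capacity=1000):
        self.rcodes = set(rcodes)
        self.buffer = deque(maxlen=capacity)  # oldest entries drop off

    def log(self, rcode, message):
        if rcode in self.rcodes:
            self.buffer.append((rcode, message))

log = RcodeFilteredLog({FORMERR, SERVFAIL}, capacity=2)
log.log(NOERROR, "NOERROR response")   # filtered out
log.log(SERVFAIL, "SERVFAIL from ns1") # kept
log.log(FORMERR, "FORMERR from ns2")   # kept
log.log(SERVFAIL, "SERVFAIL from ns3") # kept; evicts the oldest entry
```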
### Links / references

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4256
implement 0x20 (Evan Hunt, 2023-08-22)

A recent conversation on dnsop reminded me that several of the open source servers have implemented the 0x20 draft, and now Google Public DNS has done so as well, and we still haven't.
The idea is to add entropy to outgoing queries by randomizing the case of letters in the query name. There are two parts to this:
1. The resolver requires responses to have an exact bit-for-bit copy of the name that was sent, and ignores responses that don't. We'd probably need a `server` option to relax this requirement in the event that a remote server was known to be responding persistently with the QNAME downcased. (This is arguably something we might want to do just for the sake of better protocol compliance; our current practice of case-insensitive QNAME matching seems a little iffy to me.)
2. When sending queries, the resolver randomly capitalizes letters in query names. We'd need a `view` option to decide whether to do this. For a first iteration I'd default to off.
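The two parts can be sketched as follows (hypothetical helper names, not the draft's or BIND's code; a real implementation matches on wire-format labels rather than presentation strings):

```python
import secrets

def randomize_case(qname: str) -> str:
    """Part 2: randomly flip the case of each letter in the QNAME."""
    return "".join(
        ch.upper() if ch.isalpha() and secrets.randbits(1) else ch.lower()
        for ch in qname
    )

def response_matches(sent_qname: str, received_qname: str) -> bool:
    """Part 1: accept only an exact bit-for-bit copy of the name sent."""
    return sent_qname == received_qname

sent = randomize_case("www.example.com.")
```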
Pros:
- cheap way to increase entropy, so why not
- ticks off a feature-parity box with unbound, knot resolver, google public DNS, probably others
Cons:
- doesn't add much entropy for short QNAMEs, which are more frequent now with QNAME minimization, and kinda important
- some increase in complexity
- may break resolution with some servers that work now
- we already have DNS COOKIE and should prioritize that

Not planned
Assignee: Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/4239
Add dnstap-output support for tcp (Mr Ben, 2023-08-02)

### Description
goals: Add dnstap-output support for TCP
### Request
Currently, the output of dnstap only supports a file or a Unix domain socket for generating logs. I would like to output logs over TCP, so that the log server can be on other hosts, reducing the overhead on the DNS server itself.
The reason for this requirement is that in production environments, even if the DNS server generates logs locally, they are still shipped elsewhere over TCP/UDP for analysis and detection. It would be better to output directly to other hosts.
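dnstap rides on Frame Streams, which length-prefixes each encoded message; the same byte stream works over a TCP connection just as it does over a Unix socket. A rough sketch of the data-frame framing (handshake and control frames omitted; `data_frame`/`read_frame` are illustrative names):

```python
import struct

def data_frame(payload: bytes) -> bytes:
    """Prefix a payload with its 32-bit big-endian length, as the
    Frame Streams data-frame format does."""
    return struct.pack(">I", len(payload)) + payload

def read_frame(buf: bytes) -> tuple[bytes, bytes]:
    """Split one frame off the front of a buffer, returning
    (payload, remainder)."""
    (length,) = struct.unpack(">I", buf[:4])
    return buf[4 : 4 + length], buf[4 + length :]
```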
### Links / references
Like CoreDNS, which supports TCP output for dnstap logs:
dnstap /tmp/dnstap.sock
dnstap unix:///tmp/dnstap.sock full
dnstap tcp://127.0.0.1:6000 full
dnstap tcp://example.com:6000 full
Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4233
dig: add +human option (Julia Evans, 2023-08-08)

### Description
I've spent a lot of time explaining `dig` to newcomers to DNS over the years, and I've found that they generally find `dig`'s output format to be very inscrutable. Of course, there's always `+short` or `+noall +answer` for a terser output, but I generally want to explain more advanced DNS concepts to people (like glue records or SOA records for example), and for that you do need the full output.
Some specific things that I find confusing in dig's default output: ([example output here](https://gist.github.com/jvns/2d252e3f54a86ee6b2ea001e733a8e78))
* There's some ASCII art decoration (`<<>> DiG 9.10.6 <<>>`, `->>HEADER<<-`) that feels very ad hoc and it's hard to tell initially if those symbols are supposed to have some technical meaning. (why is `->>HEADER<<-` styled like that, but not `OPT PSEUDOSECTION`?)
* the header is split across 2 lines, and it's not completely clear that the second line is also part of the header
* overall, it's not obvious which pieces of information are part of the DNS response itself and which aren't. For example, is `global options: +cmd` part of the DNS response? (of course it isn't, I don't think that's immediately obvious)
* There's no newline between `OPT PSEUDOSECTION` and `QUESTION SECTION`, but there is a newline between `QUESTION SECTION` and `ANSWER SECTION`. There seems to be no reason for that inconsistency.
* In `MSG SIZE rcvd: 56`, why are there two spaces between `SIZE` and `rcvd`? Are there more fields in `MSG SIZE`? (I checked the source code and the answer is no, the code says `printf(";; MSG SIZE rcvd: %u\n", bytes);`, so it seems like this is just an ad hoc choice)
* The `;;` prefix is confusing to many people. I realize it's because `;` is the comment character in a zone file, but I personally do not use `bind` or zone files and most DNS users I talk to don't either: they either use web-based admin consoles to administer their DNS records or do it through an API like Route 53. So the `;` character isn't familiar.
These might sound a little nitpicky -- each of these things on its own is pretty minor, and of course most users of dig learn to ignore them. But taken together I've found that newcomers are often misled into thinking that DNS responses are much more complicated than they actually are, which is really unfortunate.
I find the way Wireshark displays DNS packets to be much more clear ([screenshot](https://jvns.ca/images/dns-wireshark.png)), even though they're both working at around the same level of detail.
### Request
I realize that `dig`'s default output format needs to remain relatively stable because people parse it in scripts. But would ISC be open to adding a `+human` (or something) option to dig that's designed to be more intuitive for newcomers to dig? Similarly to how `du` has an `-h` option.
I'm imagining something like this:
```
$ dig +human example.com
Received response from 192.168.1.1:53 (UDP), 68 bytes in 16ms
HEADER:
  status: NOERROR
  opcode: QUERY
  id: 15451
  flags: qr rd ra
  records: QUESTION: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
OPT PSEUDOSECTION:
  EDNS: version: 0, flags: None, udp: 4096
QUESTION SECTION:
  example.com. IN A
ANSWER SECTION:
  example.com. 78709 IN A 93.184.216.34
```
I think the important thing is to make it easy for a newcomer to see at a glance that there are 4 parts to this DNS response (the header, the EDNS record, the question, and the answer)
I created a very rough proof of concept at https://github.com/jvns/dig-pretty that parses dig's `+yaml` format and outputs a format like what I suggested above, with a tiny bit of syntax highlighting for the DNS status code.
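A toy renderer in the spirit of that proof of concept (the input dict layout is an assumption for illustration, not dig's actual `+yaml` schema):

```python
def render_human(resp: dict) -> str:
    """Render a parsed DNS response in the proposed +human style."""
    lines = ["HEADER:"]
    for key in ("status", "opcode", "id", "flags"):
        lines.append(f"  {key}: {resp['header'][key]}")
    for section in ("QUESTION", "ANSWER"):
        lines.append(f"{section} SECTION:")
        lines.extend(f"  {rr}" for rr in resp.get(section.lower(), []))
    return "\n".join(lines)

example = {
    "header": {"status": "NOERROR", "opcode": "QUERY",
               "id": 15451, "flags": "qr rd ra"},
    "question": ["example.com. IN A"],
    "answer": ["example.com. 78709 IN A 93.184.216.34"],
}
```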
### Alternatives I've considered
* `+short` or `+noall +answer` are great for a lot of use cases, but as I mentioned above, they don't work for more advanced usage like looking at the `SOA` record on an `NXDOMAIN` response.
* We already have `+yaml`, but I find `+yaml` to be extremely verbose (the output for `dig +yaml example.com` doesn't fit in my terminal window), and it's really a machine format and not a human format.
* There are also alternative DNS tools (like `dog`) that aim to be more user friendly, but in general I've found those tools to be missing important features that `dig` has.
Thanks for considering this! I love dig and would love to see it become a little more approachable.
### Links / references

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4218
Extract the no DS proof, if any, from the referral and save/validate it. (Mark Andrews, 2023-07-25)

Insecure referrals contain a no DS proof. We are currently not using it and instead are making DS queries and validating those. Extracting the no DS proof should speed up resolution for data in insecure zones.

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4216
Report missing DNSSEC algorithms in returned RRSIGs (Mark Andrews, 2023-11-02)

While we only require one DNSSEC algorithm to validate a response, reporting missing algorithms listed in the DS RRset will be useful for the overall health of the system.

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4162
SHA-1 removal (Petr Špaček, 2023-06-27)

### Description
From [NIST announcement](https://csrc.nist.gov/news/2022/nist-transitioning-away-from-sha-1-for-all-apps):
> As a result, NIST will transition away from the use of SHA-1 for applying cryptographic protection to **all applications** by December 31, 2030
### Request
Don't get caught asleep at the wheel.
### Links / references
Send questions about the transition in an email to sha-1-transition@nist.gov. Visit the [Policy on Hash Functions](https://csrc.nist.gov/projects/hash-functions/nist-policy-on-hash-functions) page to learn more.

Milestone: Long-term
Due date: 2030-12-31

https://gitlab.isc.org/isc-projects/bind9/-/issues/4161
Support quantum safe DNSSEC algorithms (Petr Špaček, 2023-06-27)

### Description
Reportedly, the US government is going to mandate post-quantum algorithm support from 2026 onward, with no legacy algorithms allowed after 2033.
### Request
Explore how we can integrate quantum safe algorithms for early experimentation.
Many algorithms are already available as OpenSSL provider here: https://github.com/open-quantum-safe/oqs-provider
### Additional details
* [FALCON implementation in PowerDNS](https://indico.dns-oarc.net/event/42/contributions/902/attachments/871/1601/Post-Quantum%20DNSSEC%20with%20FALCON-512%20and%20PowerDNS(2).pdf)
* [Verisign's presentation](https://indico.dns-oarc.net/event/46/contributions/985/attachments/938/1728/OARC40-ResearchAgendaForAPQCDNSSEC-Final.pdf)
Word of mouth from Red Hat crypto people I talked to:
Right now it seems that NIST might standardize 5 algorithms, with several variants for each algorithm with intent to provide 128/256 bit-equivalent of security.
Rambling about candidate algorithms for DNSSEC:
- HSS/LMS & XMSS^MT algorithms are extremely susceptible to key reuse. One key reuse ruins the whole thing. Don't use it.
- Falcon-512 has the smallest signatures by a large margin (around 666 bytes). CRYSTALS-Dilithium is built on the same principle but has larger signatures (about 2420 bytes). The problem is, both are reportedly built on shaky grounds because we as humankind don't fully understand the math behind them, so the chances of these algorithms being broken in a couple of years are non-negligible.
- The remaining candidate algorithm is SPHINCS+-128. That one is the most solid because it's based on ordinary hashes, which are well understood. The catch is that one signature is about 7856 bytes :exploding_head:
Consequently, this sounds like we need very good, very solid TCP/TLS/QUIC support in client and server, so we are not limited to UDP packet sizes. That's IMHO the only way to go without significantly changing the protocol.
(Or we can go and engineer DNS 2.0 :grinning:)
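The packet-size arithmetic behind that conclusion, using the signature sizes quoted above (1232 bytes is the EDNS buffer size widely recommended since DNS Flag Day 2020 to avoid IPv6 fragmentation):

```python
# Signature sizes (bytes) quoted above; header and name overhead ignored.
EDNS_UDP_LIMIT = 1232
SIG_SIZES = {"Falcon-512": 666, "CRYSTALS-Dilithium": 2420, "SPHINCS+-128": 7856}

# Which single signatures could still fit in a typical UDP response?
# Only Falcon-512; the others force TCP/TLS/QUIC.
fits_udp = {name: size <= EDNS_UDP_LIMIT for name, size in SIG_SIZES.items()}
```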
### Links / references

Milestone: Long-term
Due date: 2026-01-01

https://gitlab.isc.org/isc-projects/bind9/-/issues/4057
dig XDG basedir support (Paul Tötterman, 2023-05-12)

### Description
Check ${XDG_CONFIG_HOME}/dig/digrc in addition to ~/.digrc
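The requested lookup order could be sketched as follows (hypothetical helper, not dig's code; per the spec, an unset `$XDG_CONFIG_HOME` falls back to `~/.config`):

```python
import os

def digrc_path(environ=os.environ) -> str:
    """Prefer ${XDG_CONFIG_HOME}/dig/digrc, falling back to ~/.digrc."""
    xdg = environ.get("XDG_CONFIG_HOME")
    if not xdg:
        xdg = os.path.join(environ.get("HOME", ""), ".config")
    candidate = os.path.join(xdg, "dig", "digrc")
    if os.path.exists(candidate):
        return candidate
    return os.path.join(environ.get("HOME", ""), ".digrc")
```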
### Links / references
https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html

Not planned