ISC Open Source Projects issues (https://gitlab.isc.org/groups/isc-projects/-/issues)

https://gitlab.isc.org/isc-projects/bind9/-/issues/216
implement rndc healthcheck (Brian Conry, 2018-04-24)

As discussed in the "BIND triage cribsheet for on-call Support" session from our All Hands meeting.
The basic premise was a way to have named describe, from the perspective of its running configuration, what should be visible from a specific network location.
This should help operators recognize when named is running an incorrect configuration and also provide them with specific things that can be probed when an external-to-BIND issue is suspected.
From the meeting notes:
* rndc healthcheck - is named doing what you think it should be doing per its configuration?
* rndc healthcheck ```<address>``` (so that it can check what it's possible to do from <address>...
* Access to these zones (loaded from these files...)
* ACLs that this address is in (or not in)
* Views - and access to ...
* Is recursion allowed?
* Do some dig commands based on the above ...
* rndc status - could it list the ports and interfaces that it is *currently* listening on - and what it was configured on when started
* Could it check what the backlog is in the UDP and TCP socket buffers for those listen sockets?
* ls -l on the named binary (when was it last updated) and ls -l named.conf (when was it last updated?)
* this could be part of 'rndc healthcheck'
See the agenda on the wiki for a link to the meeting notepad.

https://gitlab.isc.org/isc-projects/bind9/-/issues/226
refactor DLZ into a dyndb module? (Evan Hunt, 2018-04-24)

There was an offhand remark made at the ISC all-hands, which I think has potential.
It would be nice to reduce the number of external database APIs for which BIND needs native support. Currently we have SDB (obsolete but still in use for built-in zones such as authors.bind), SDLZ (the DLZ driver API, semi-obsolete but still used for the dlz-dlopen driver and for one of the system tests), and the DLZ module API (currently supported, and provided by the dlz-dlopen driver mentioned before). Finally there's dyndb (the new hotness, with many advantages over the others but seriously lacking in module support).
We could remove SDLZ and DLZ support from named by writing a dyndb module that provides the functions of the dlz-dlopen driver and the various helper functions in sdlz.c and dlz.c. The overall architecture could be substantially simplified, while keeping the interface for DLZ modules themselves unchanged.
We can keep configuration stable by having the dyndb module loaded automatically if "dlz" is used in named.conf. I think we still need to keep some DLZ-specific functions such as dns_view_searchdlz(), but much of the DLZ enabling code could be moved out of named and pulled in only when needed.
Assignee: Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/262
Add an RPZ mode to named-checkzone (Mark Andrews, 2018-05-23)

It might be useful to have an RPZ mode added to named-checkzone to check that the contents are valid for use by RPZ.

https://gitlab.isc.org/isc-projects/bind9/-/issues/296
Add EdDSA algorithm support to inline test (Ondřej Surý, 2018-07-12)

While culling the GOST code, I noticed that there's no EdDSA algorithm test in the `inline` system test.

https://gitlab.isc.org/isc-projects/bind9/-/issues/299
Change ECDSA to Deterministic Usage Elliptic Curve Digital Signature Algorithm (RFC 6979) (Ondřej Surý, 2022-07-29)

https://gitlab.isc.org/isc-projects/bind9/-/issues/349
Set __attribute__((warn_unused_result)) on all functions that return values to catch unused results (Mark Andrews, 2018-08-30)
Status: Not planned. Assignee: Mark Andrews.

https://gitlab.isc.org/isc-projects/bind9/-/issues/352
Refactored RPZ has a scalability design problem (Ghost User, 2021-10-04)

(Don't let that title sound discouraging - IMO it is still better than the old RPZ code, but more work is necessary.)
Last week I heard from a provider of feed data that they need resolvers to react quickly. The current 60s default update rate, plus the fact that RPZ does a full iteration of the DB, will not work for them: their data is not latency-tolerant and their policy zones are large.
There was a separate bug ticket by a popular reporter from another ISC customer in the last 1-2 months that asked whether this 60s update time could be improved to match the old RPZ behavior.
I feel this can be addressed. BIND RPZ would need to hook not just into `dns_db_updatenotify_register()`, but also into a `newdbversion` signal and every `add` and `del` update to the DB. Note that the RPZ view summary data structure is still not versioned, so the RPZ code would have to separately batch and keep aside a copy of the update diffs until the DB version commit passes, and then apply just the diffs to the RPZ view summary data structure instead of iterating the zone. I think versioning the view summary data structure would be too much work and would complicate the structure; as long as the diff is applied to it after the DB version commit passes, there should be no problem of inconsistent data.
Status: Long-term

https://gitlab.isc.org/isc-projects/bind9/-/issues/370
Support not returning until the additional section is fully populated (Mark Andrews, 2018-07-12)

For SRV to be as useful as CNAME, the additional section needs to be populated even when there was a cache miss for the target servers' A and AAAA records. Add code to fetch these records before returning the answer to the SRV request. Additionally, CNAME records need to be chased in the additional section.
Additionally there needs to be a way to indicate that the additional section is complete.
Add a flag to enable this experimental behaviour. In future this should be triggered by an EDNS option indicating that the client wants the additional section to be populated.
Assignee: Mark Andrews

https://gitlab.isc.org/isc-projects/bind9/-/issues/377
RPZ qname triggers are always searched even if they can be skipped (Cathy Almond, 2024-03-27)

Not sure if this is a bug or not, but noticed by a customer who reported:
```
I happen to notice a minor possible glitch in RPZ query handling.
When loading an RPZ, the corresponding bit for 'rpzs->have.qname' is
always set as the existence of the origin name is regarded as the
existence of a QNAME trigger for the root name (.). So the following
check in rpz_rewrite_name() is pointless unless 'allowed_zbits' clears
some of the bits in case of "qname-wait-recurse yes":
zbits = rpz_get_zbits(client, qtype, rpz_type);
zbits &= allowed_zbits;
	if (zbits == 0)
		return (ISC_R_SUCCESS);
Since the root name is never subject to RPZ rewrite, we could actually
optimize it a bit by not setting have.qname for the RPZ's origin
name. This may be a minor optimization, though, since
dns_rpz_find_name() should be relatively cheap, and I guess it's quite
unlikely that we use RPZs that don't have any QNAME triggers. So you
may or may not want to "fix" it.
I primarily checked it for 9.11.3-S2, but I believe it's the same for all recent versions including 9.10.x.
```

https://gitlab.isc.org/isc-projects/bind9/-/issues/421
Add support-status command to rndc (Ondřej Surý, 2021-10-15)

While reading https://www.ubuntu.com/info/release-end-of-life it occurred to me that it would be good to have an `rndc support-status` command that would print the support status of the current version and the current release branch.
As it doesn't get automatically executed, it could easily just call home to ask.
@vicky @cathya What do you think?
Status: Not planned. Assignee: Ondřej Surý.

https://gitlab.isc.org/isc-projects/bind9/-/issues/430
Disable case preservation when case-sensitive compression is disabled [ISC-support #13227] (Brian Conry, 2021-10-04)

### Description
An ISC Support customer reports a significant performance impact in some situations, caused by case preservation.
Specifically:
>>>
- allocation of a new dns_name_t and name copy in dns_compress_add()
- name copy in rdataset.c:towiresorted() and call to dns_rdataset_getownercase()
>>>
### Request
>>>
Admittedly these should be generally minor overhead, but in scenarios where the overall query processing is relatively cheap the performance drop can be non-negligible. Right now it exceeds the acceptable level of performance regression in our release engineering standard. If we suppress the above change the performance is (still worse than before but) acceptable.
Meanwhile, trying to preserve the case of the owner name is quite moot in practice if, for example, case-sensitive name compression is disabled. So we wonder if the case-preserving feature can be made configurable, at least when responding to normal queries. I'm attaching a patch (for 9.11.3-S2) that implements this idea by disabling case preservation when case-sensitive name compression is disabled (this could also be a separate configuration option).
>>>
### Links / references
[case.diff](/uploads/657d56eaf8595854ae83c892f438d714/case.diff)

https://gitlab.isc.org/isc-projects/bind9/-/issues/455
lazy catalog zone configuration makes a server lame (Tony Finch, 2021-10-04)

I restarted my test server while another task was frequently repeating several queries; the server got stuck in a funny state where it would SERVFAIL these queries persistently.
tl;dr: I think this problem might affect normal authoritative servers that use catalog zones: there's a chance that lazy zone configuration might cause the server to return referrals rather than proper answers during startup, leading to it being unfairly marked lame by recursive servers for 10 minutes.
I reproduced this with vanilla 9.12.2 and a simplified configuration. The setup is:
* an auth view, containing:
* a catalog zone listing all our local zones
* a local root zone
* a rec view, containing:
* static-stub zone configurations for all the auth zones pointing at the auth view
If I make frequent queries for `cam.ac.uk` while the server starts, it usually gets into this SERVFAIL state.
When a cam.ac.uk query happens while the server is starting (before catz configuration is complete), the rec view gets a referral from the root in the auth view. It then decides to mark the auth view as lame for cam.ac.uk, so everything is broken for 10 minutes (the `lame-ttl`).
If there's no root zone, the rec view gets REFUSED; it SERVFAILs a few queries while the server is starting but it doesn't mark the auth view as lame - surprisingly, REFUSED is treated a lot more gently than an unexpected referral.
I can trigger the same problem with *only* the catalog zone, with no root zone, but in this case it is harder to lose the race. If I repeatedly query for www.cl.cam.ac.uk, and the server loads `cam.ac.uk` before `cl.cam.ac.uk`, then it can get a referral instead of the expected answer.
```
while sleep 0.05; do dig +tries=1 +timeout=1 @::1 www.cl.cam.ac.uk | grep status & done
```
I can't trigger the same effect with explicitly configured zones, i.e. if I replace the auth view configuration with `include "../etc/named.slave"`.
These attachments have the relevant config; I can provide access to a VM if you want to try this config out for yourselves.
[named.conf](/uploads/9e077f0e27f49446462fc93010243728/named.conf)
[make-static-stub](/uploads/08300990a5b352629e8cc2efa57e9c2a/make-static-stub)

https://gitlab.isc.org/isc-projects/bind9/-/issues/459
[RT 39964] logging of NXDOMAIN query-responses (response source and type) (Vicky Risk, 2024-03-13)

Edited/compressed from the original in classic bugs-RT
for analyzing queries resulting in NXDOMAIN responses...
Please add the following to normal query log:
a) what kind of answer was given (nxrrset, rrset (how many) or servfail)
b) where the answer came from (authoritative, from cache, or the result of a recursive search)
The actual content of the answer is not needed outside some very special debug-cases and for these cases you have to do a complete network trace anyway.
spawned from #39253: dnstap wish-list: Query log limited by zone/domain & Answer logging
Reference to https://support.isc.org/Ticket/Display.html?id=8385 added
----
* This is response, not query information
<from Ray>
NB: recording these either means two separate log entries (one for query, one for response) or if they're merged that the query log will now be in response order rather than request order.
------
This request is for 'normal' query logging, but many have asked for this via **dnstap** as well. We would love to get this in dnstap if that is possible, realizing there is a standards/schema issue with dnstap.
------
Tagging with 9.13.3 because we would really like to try for this in 9.14.0. This is a fairly long-standing and frequently-heard request.
Status: Not planned

https://gitlab.isc.org/isc-projects/kea/-/issues/22
stringop-truncation warnings (Francis Dupont, 2022-11-02)

G++ 8 has a new warning, -Wstringop-truncation, which is emitted when strncat or strncpy (only the latter in Kea) fails to terminate its result (i.e. append a null character).
On Fedora 28 there are spurious warnings on local/unix socket address or ifname fields because they are filled using strncpy.
I have mixed feelings about this: IMHO the issue is not in Kea but in the system header files, which should add a `nonstring` attribute but did not, so no action is a possible answer to this...
Status: backlog

https://gitlab.isc.org/isc-projects/bind9/-/issues/497
[ISC-support #13383] Add RRset size-limiting option to protect masters from rogue dynamic updates and slaves from damaging IXFR/AXFRs (Cathy Almond, 2024-02-20)

Based on https://support.isc.org/Ticket/Display.html?id=13383
The issue is that although the DNS protocol does not restrict how many RRs can exist in a single RRset, once the number of RRs becomes significantly large, both dynamic updates and inbound IXFRs become highly CPU-intensive. Encountering oversized RRsets in a production environment is almost certainly the result of a rogue client or a poorly-considered configuration/design, but the outcome for any DNS server having to maintain oversized RRsets can be significantly degraded performance.
(This scenario is similar to the one that led to the introduction of the BIND option max-rrset, which protects servers hosting zones provided by 3rd parties from being overwhelmed by a zone that unexpectedly grows to a significant size - more than the hosting server can handle.)
---
Some more background on the case leading to this request: in a zone that has an RRset consisting of 4K RRs of type A for the same hostname, it was observed that adding a new RR with a different TTL can take in excess of several seconds, and the task performing the update consumes a significant amount of CPU doing so.
The reason why adding an additional RR causes so much internal zone maintenance activity is complex, but it is directly due to the TTL change and what this implies for maintenance of the .jnl file, which is used both to satisfy outbound IXFR requests and for rolling forward/loading the zone locally in case of an unexpected interruption of named (a normal rndc stop will write the current version of the zone to disk, but a SIGTERM or crash won't, although see the option flush-zones-on-shutdown, which defaults to "no" and can influence this).
When the TTL of one member of the RRset changes, all of the other RRs in the set have to be updated to adopt the new TTL at the same time (a difference in TTL between members of the same RRset is not permitted). Although the TTL is stored on a per-RRset basis internally to BIND, this is, unfortunately, not the end of the story. When the RRset is being updated, named has to generate the set of zone content changes for the .jnl file - and it's here that this becomes not a single RR add, but the deletion of 4K RRs and the addition of 4K+1 RRs. That is exactly what named does when it applies this update to its own zone.
However, this gets worse (and this next part explains why the cost of maintaining large RRsets grows quadratically rather than linearly). When you add an RR to a pre-existing RRset, in order to prevent duplicates, for each 'add' named has to check the new candidate against all of the other RRs already in the set. So, when adding the 200th RR, we have to check it against 199 others, but for the 4001st, it's checking against 4000 others.
Yes, potentially we could optimise this internally for the single case where we know we've checked the one RR that we're adding, on the basis that we know the only reason we're doing 4K deletes and adds is to change the TTL on the whole lot, but that really we're just adding one new RR. But that is not going to help a slave server receiving an IXFR of the same update, and it's not going to help this server if it needs to reload/replay the zone update from the .jnl file if it gets interrupted.
So similarly, just accepting an IXFR for a zone with 4K+ new RRs to be added to a single RRset is also going to impact named, irrespective of the TTLs on those RRs. (Basically, oversized RRsets are bad news for a server that is properly checking the contents of zones that it is loading/updating).
---
Therefore, going back to the request for a new option... it's clearly bordering on insane to have RRsets of this size - DNS certainly wasn't designed to handle them (think of the query responses and the clients!) and BIND certainly wasn't optimised internally for this extreme scenario. Therefore, please could we have a new option (with sane defaults on a per-RTYPE basis) to prevent accidental craziness that degrades performance.
Something like:
max-rrset [rrtype] integer;
(Defaults to be discussed... but they're potentially going to be different for PTR vs other RTYPEs.)
Status: Not planned

https://gitlab.isc.org/isc-projects/kea/-/issues/37
revamp subnet sanity checks (Ghost User, 2022-11-02)

On one side, decide what should be checked:
- interface in shared network
- "same subnet" (cf #5423)
- malformed prefix
etc
And apply this to documentation and code in:
- plain subnet configuration
- in shared network subnet configuration
- subnet REST API
Should be done after #5423 (definition of "same subnet") and client-class in pools.
Status: backlog

https://gitlab.isc.org/isc-projects/kea/-/issues/38
Updating DNS entry on host reservation change (Ghost User, 2022-11-02)

I sent these questions to kea-users@lists.isc.org two days ago, but nothing happened and I can't see my message in the thread list. So I decided to create a new ticket.
My previous message:
I'm trying to bond Kea with BIND. When a new lease is created or expires, it works well: in these cases I get correct records in the "forward" and "reverse" DNS zones. But when I change an IP address in a host reservation entry in the MySQL database, a new address is allocated to the customer and new correct entries appear in DNS. However, an old entry for the previous IP address still remains in the "reverse" DNS zone. Thus, I now have a "ghost" entry in my DNS.
I could manually remove the lease BEFORE changing the reservation entry; I guess that would work. But maybe there is a routine solution for this issue?
Status: backlog

https://gitlab.isc.org/isc-projects/kea/-/issues/39
shared-network option takes precedence over option defined in client class (Ghost User, 2022-11-02)

When kea6 is configured with a shared-network that contains an option, and a subnet (within that shared-network) which has an assigned class with the same option defined, Kea ignores the option defined in the class.
Example configuration:
```
{
    "Dhcp6": {
        "renew-timer": 1000,
        "rebind-timer": 2000,
        "preferred-lifetime": 3000,
        "valid-lifetime": 4000,
        "client-classes": [
            {
                "name": "Client_Class_1",
                "test": "substring(option[1].hex,8,2)==0xf2f1",
                "option-data": [
                    {
                        "csv-format": true,
                        "code": 23,
                        "data": "2001:db8::888",
                        "name": "dns-servers",
                        "space": "dhcp6"
                    }
                ]
            }
        ],
        "interfaces-config": {
            "interfaces": [ "eth2" ]
        },
        "lease-database": {
            "type": "memfile"
        },
        "shared-networks": [
            {
                "name": "name-abc",
                "interface": "eth2",
                "option-data": [
                    {
                        "csv-format": true,
                        "code": 23,
                        "data": "2001:db8::1",
                        "name": "dns-servers",
                        "space": "dhcp6"
                    }
                ],
                "subnet6": [
                    {
                        "subnet": "2001:db8:a::/64",
                        "client-class": "Client_Class_1",
                        "pools": [
                            {
                                "pool": "2001:db8:a::1-2001:db8:a::10"
                            }
                        ]
                    }
                ]
            }
        ]
    }
}
```
The packet is evaluated correctly, but option 23 has the value configured at the shared-network level, not the one from the class.
```
DEBUG [kea-dhcp6.eval/18704] EVAL_DEBUG_EQUAL Popping 0xF2F1 and 0xF2F1 pushing result 'true'
INFO [kea-dhcp6.dhcp6/18704] EVAL_RESULT Expression Client_Class_1 evaluated to 1
```
but the message is created incorrectly:
```
DHCP6_RESPONSE_DATA responding with packet type 2 data is localAddr=[ff02::1:2]:547 remoteAddr=[fe80::800:27ff:fe00:1]:546
msgtype=2(ADVERTISE), transid=0xeda107
type=00001, len=00010: 00:03:00:01:66:55:44:33:f2:f1
type=00002, len=00014: 00:01:00:01:21:81:be:d4:08:00:27:19:b8:2a
type=00003(IA_NA), len=00040: iaid=39866, t1=1000, t2=2000,
options:
type=00005(IAADDR), len=00024: address=2001:db8:a::1, preferred-lft=3000, valid-lft=4000
type=00023, len=00016: 2001:db8::1
```
Full logs and a network capture are attached.
The number of subnets within the shared-network, or the number of shared-networks, makes no difference; the bug still occurs.
When a client has a reservation with option X, it correctly overrides the option configured at the shared-network level.
Status: backlog

https://gitlab.isc.org/isc-projects/kea/-/issues/41
Kea should be able to print performance metrics (Ghost User, 2023-01-09)

When debugging an issue, it became clear that finding out how long it takes Kea to process a packet and actually send a response is difficult. It requires matching different log entries, which is sometimes very problematic if there are multiple packets sent from a client.
We should develop a way to measure how long it takes to process a packet. The easiest way will be to use a stopwatch (see src/lib/util/stopwatch.h). I think we should remember the timestamp somewhere in Pkt4 (and possibly Pkt6) very early when the packet is received (perhaps in Pkt4 constructor?) and then print the interval value once the response packet is being sent out.
I think it would be useful to have a separate logger for this, maybe called "performance" or "perf"? If the concept proves useful, we may soon extend it to print more detailed information about the different stages (it took X ms to find a host reservation, Y ms to select a lease, Z ms to do the DNS update, etc.).
Status: backlog

https://gitlab.isc.org/isc-projects/kea/-/issues/44
make database config parsing more flexible (Ghost User, 2022-11-02)

Cf. #5528 comments (look for "line 125").
Status: backlog