ISC Open Source Projects issues
https://gitlab.isc.org/groups/isc-projects/-/issues
2022-12-06T13:11:16Z

https://gitlab.isc.org/isc-projects/stork/-/issues/314
Req 2.2.2: Show reservation options (Tomek Mrugalski, 2022-12-06)

Stork is able to show host reservations with some details, but options are not displayed.
The complex part here is that options can have varied syntax (string, boolean, address, integers, structure, empty, etc).
This requirement calls for displaying the options and their values in some form.
In particular, anything related to PXE should be displayed.
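As an illustration only (Stork itself is written in Go/TypeScript, and every name below is made up, not Stork code), a type-aware renderer could dispatch on the option's syntax kind:

```python
# Hypothetical sketch of type-aware DHCP option rendering; none of these
# names come from Stork. Each option kind maps to its own display syntax.

def format_option(code: int, value, kind: str) -> str:
    """Render one DHCP option for display; `kind` selects the syntax."""
    if kind == "empty":
        rendered = ""
    elif kind == "bool":
        rendered = "true" if value else "false"
    elif kind == "ipv4-address":
        # value is a 4-byte sequence
        rendered = ".".join(str(b) for b in value)
    elif kind == "uint":
        rendered = str(int(value))
    elif kind == "string":
        rendered = str(value)
    else:
        # structured or unknown syntax: fall back to a hex dump
        rendered = value.hex() if isinstance(value, (bytes, bytearray)) else repr(value)
    return f"option {code}: {rendered}"

# PXE-related example: option 66 (TFTP server name) rendered as a string.
print(format_option(66, "boot.example.org", "string"))
```

A real implementation would drive the `kind` lookup from an option-definition table rather than hard-coding it, but the dispatch shape is the same.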
This is a follow-up to Req 2.2: #45.
Label: backlog

https://gitlab.isc.org/isc-projects/stork/-/issues/309
events with deleted object should not be presented with a link to deleted object (Michal Nowikowski, 2020-10-06)

The function that fetches events from the database should check whether the related objects have been deleted and, if so, patch the event text by adding a `deleted="true"` attribute to the object tag in the text.
Label: outstanding

https://gitlab.isc.org/isc-projects/bind9/-/issues/1916
Check ECS response in DiG for RFC compliance (Mark Andrews, 2024-03-13)

We have seen servers that return ECS responses that don't meet this requirement:
```
RFC 7871, 7.2.1. Authoritative Nameserver
FAMILY, SOURCE PREFIX-LENGTH, and ADDRESS in the response MUST match
those in the query. Echoing back these values helps to mitigate
certain attack vectors, as described in Section 11.
```
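The check the RFC mandates is a plain field-by-field comparison of the query's and the response's ECS option. A hedged sketch (the `ECS` tuple below is an illustration, not dig's internal representation):

```python
# Sketch of the RFC 7871 section 7.2.1 check: FAMILY, SOURCE PREFIX-LENGTH,
# and ADDRESS in the response must echo those in the query.
from typing import NamedTuple

class ECS(NamedTuple):
    family: int          # 1 = IPv4, 2 = IPv6
    source_prefix: int   # SOURCE PREFIX-LENGTH
    address: bytes       # address bytes, truncated to the prefix

def ecs_matches(query: ECS, response: ECS) -> bool:
    """True when the response echoes the query's ECS fields."""
    return (query.family == response.family
            and query.source_prefix == response.source_prefix
            and query.address == response.address)
```

A compliant server echoing `1/24/192.0.2.0` passes; one that rewrites the prefix length or address would trigger the proposed warning.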
Add a warning when the ECS response fails to meet this requirement.
Label: Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/1905
Forbid the asterisk (*) single character domains on non-leaf level in the master zones (Ondřej Surý, 2023-11-02)

Currently, `sub.*.example.com` is a valid and legal domain name. But not everything that is legal is right, and this is a perfect example: the domain in question is not a wildcard domain name covering `sub.<anything>.example.com`, but a single domain name `sub.*.example.com` in which `*` is a literal asterisk character. Remember that `*` has no special meaning in the `QNAME`; it is only treated as a wildcard when loading the zone files.
Label: Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/1900
Runtime system test fails badly when run as root on non-linux systems (Mark Andrews, 2023-11-02)

Lots of the sub-tests depend on capabilities being enabled to get "permission denied" when run as root.
```
% sudo sh run.sh runtime
Making check in dyndb/driver
make[1]: Nothing to be done for `check'.
Making check in dlzexternal/driver
make[1]: Nothing to be done for `check'.
/Applications/Xcode.app/Contents/Developer/usr/bin/make feature-test makejournal pipelined/pipequeries rndc/gencheck rpz/dnsrps tkey/keycreate tkey/keydelete
make[2]: `feature-test' is up to date.
make[2]: `makejournal' is up to date.
make[2]: `pipelined/pipequeries' is up to date.
make[2]: `rndc/gencheck' is up to date.
make[2]: `rpz/dnsrps' is up to date.
make[2]: `tkey/keycreate' is up to date.
make[2]: `tkey/keydelete' is up to date.
/Applications/Xcode.app/Contents/Developer/usr/bin/make check-TESTS
S:runtime:2020-06-01T09:18:45+1000
T:runtime:1:A
A:runtime:System test runtime
I:runtime:PORTS:5330,5331,5332,5333,5334,5335,5336,5337,5338,5339
I:runtime:starting servers
I:runtime:verifying that named started normally (1)
I:runtime:verifying that named checks for conflicting named processes (2)
I:runtime:verifying that 'lock-file none' disables process check (3)
I:runtime:checking that named refuses to reconfigure if working directory is not writable (4)
I:runtime:failed
I:runtime:checking that named refuses to reconfigure if managed-keys-directory is not writable (5)
I:runtime:failed
I:runtime:checking that named refuses to reconfigure if new-zones-directory is not writable (6)
I:runtime:failed
I:runtime:checking that named recovers when configuration file is valid again (7)
I:runtime:failed
I:runtime:checking that named refuses to start if working directory is not writable (8)
I:runtime:failed
I:runtime:checking that named refuses to start if managed-keys-directory is not writable (9)
I:runtime:failed
I:runtime:checking that named refuses to start if new-zones-directory is not writable (10)
I:runtime:failed
I:runtime:checking that named logs control characters in octal notation (11)
I:runtime:checking that named escapes special characters in the logs (12)
I:runtime:checking that named logs an ellipsis when the command line is larger than 8k bytes (13)
I:runtime:verifying that named switches UID (14)
I:runtime:failed
I:runtime:exit status: 8
I:runtime:stopping servers
R:runtime:FAIL
E:runtime:2020-06-01T09:19:22+1000
FAIL: runtime
============================================================================
Testsuite summary for BIND 9.17.1-dev
============================================================================
# TOTAL: 1
# PASS: 0
# SKIP: 0
# XFAIL: 0
# FAIL: 1
# XPASS: 0
# ERROR: 0
============================================================================
See bin/tests/system/run.log
Please report to info@isc.org
============================================================================
make[3]: *** [run.log] Error 1
make[2]: *** [check-TESTS] Error 2
make[1]: *** [check-am] Error 2
make: *** [check-recursive] Error 1
%
```
Additionally, it appears that the following also fails on CentOS 8 (reported on bind-users), which is what prompted me to check.
```
I:runtime:verifying that named switches UID (14)
I:runtime:failed
```
Label: Not planned

https://gitlab.isc.org/isc-projects/kea/-/issues/1260
avoid more race conditions (Razvan Becheriu, 2021-10-20)

It seems that addLease, updateLease, and deleteLease are called in several other places; we should lock the resource there as well:
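Kea itself is C++, so any real fix would use `std::mutex` with `std::lock_guard`; the Python sketch below (all names illustrative) only shows the shape of the fix: funnel every mutation path through one lock so the call sites listed below cannot race each other.

```python
# Illustrative only: serialize add/update/delete on a shared lease store
# behind a single lock, so concurrent callers cannot interleave mid-update.
import threading

class LeaseStore:
    def __init__(self):
        self._lock = threading.Lock()
        self._leases = {}

    def add_lease(self, addr, lease) -> bool:
        with self._lock:          # one guard for every mutation path
            if addr in self._leases:
                return False
            self._leases[addr] = lease
            return True

    def update_lease(self, addr, lease) -> bool:
        with self._lock:
            if addr not in self._leases:
                return False
            self._leases[addr] = lease
            return True

    def delete_lease(self, addr) -> bool:
        with self._lock:
            return self._leases.pop(addr, None) is not None
```

The point of the audit below is that the lock must be taken on *every* path that mutates leases, not just the obvious entry points.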
```
Dhcpv4Srv::processRelease
Dhcpv4Srv::declineLease
Dhcpv6Srv::releaseIA_NA
Dhcpv6Srv::releaseIA_PD
Dhcpv6Srv::declineLease
Dhcpv6Srv::generateFqdn
LeaseCmdsImpl::lease6BulkApplyHandler - there is a leaseDelete which can cause other races.
LeaseCmdsImpl::lease4DelHandler - will cause race
LeaseCmdsImpl::lease6DelHandler - will cause race
AllocEngine::allocateReservedLeases6
from AllocEngine::allocateLeases6
from Dhcpv6Srv::assignIA_NA
from Dhcpv6Srv::assignIA_PD
from AllocEngine::renewLeases6
from Dhcpv6Srv::extendIA_NA
from Dhcpv6Srv::extendIA_PD
AllocEngine::allocateGlobalReservedLeases6
from AllocEngine::allocateReservedLeases6
AllocEngine::removeNonmatchingReservedLeases6
from AllocEngine::allocateLeases6
from Dhcpv6Srv::assignIA_NA
from Dhcpv6Srv::assignIA_PD
from AllocEngine::renewLeases6
from Dhcpv6Srv::extendIA_NA
from Dhcpv6Srv::extendIA_PD
AllocEngine::removeNonmatchingReservedNoHostLeases6
from AllocEngine::removeNonmatchingReservedLeases6
from AllocEngine::allocateLeases6
from Dhcpv6Srv::assignIA_NA
from Dhcpv6Srv::assignIA_PD
from AllocEngine::renewLeases6
from Dhcpv6Srv::extendIA_NA
from Dhcpv6Srv::extendIA_PD
AllocEngine::removeNonreservedLeases6
from AllocEngine::allocateLeases6
from Dhcpv6Srv::assignIA_NA
from Dhcpv6Srv::assignIA_PD
from AllocEngine::renewLeases6
from Dhcpv6Srv::extendIA_NA
from Dhcpv6Srv::extendIA_PD
AllocEngine::reuseExpiredLease
from AllocEngine::allocateUnreservedLeases6
from AllocEngine::allocateLeases6
from Dhcpv6Srv::assignIA_NA
from Dhcpv6Srv::assignIA_PD
from AllocEngine::renewLeases6
from Dhcpv6Srv::extendIA_NA
from Dhcpv6Srv::extendIA_PD
AllocEngine::createLease6
from AllocEngine::allocateUnreservedLeases6
from AllocEngine::allocateLeases6
from Dhcpv6Srv::assignIA_NA
from Dhcpv6Srv::assignIA_PD
from AllocEngine::renewLeases6
from Dhcpv6Srv::extendIA_NA
from Dhcpv6Srv::extendIA_PD
from AllocEngine::allocateReservedLeases6
from AllocEngine::allocateLeases6
from Dhcpv6Srv::assignIA_NA
from Dhcpv6Srv::assignIA_PD
from AllocEngine::renewLeases6
from Dhcpv6Srv::extendIA_NA
from Dhcpv6Srv::extendIA_PD
from AllocEngine::allocateGlobalReservedLeases6
from AllocEngine::allocateLeases6
from Dhcpv6Srv::assignIA_NA
from Dhcpv6Srv::assignIA_PD
from AllocEngine::renewLeases6
from Dhcpv6Srv::extendIA_NA
from Dhcpv6Srv::extendIA_PD
AllocEngine::extendLease6
from AllocEngine::renewLeases6
from Dhcpv6Srv::extendIA_NA
from Dhcpv6Srv::extendIA_PD
AllocEngine::updateLeaseData
from AllocEngine::allocateLeases6
from Dhcpv6Srv::assignIA_NA
from Dhcpv6Srv::assignIA_PD
AllocEngine::deleteExpiredReclaimedLeases6 - will cause race
AllocEngine::deleteExpiredReclaimedLeases4 - will cause race
AllocEngine::reclaimLeaseInDatabase
from AllocEngine::reclaimExpiredLease Lease4Ptr
from AllocEngine::reclaimExpiredLease Lease6Ptr
AllocEngine::reclaimExpiredLease Lease4Ptr
from AllocEngine::reclaimExpiredLeases4 - safe
from AllocEngine::renewLease4
from AllocEngine::reuseExpiredLease4
AllocEngine::reclaimExpiredLease Lease6Ptr
from AllocEngine::reuseExpiredLease
from AllocEngine::extendLease6
from AllocEngine::reclaimExpiredLeases6 - safe
AllocEngine::createLease4
from AllocEngine::allocateOrReuseLease4
from AllocEngine::discoverLease4
from AllocEngine::requestLease4
AllocEngine::requestLease4
from AllocEngine::allocateLease4
from Dhcpv4Srv::assignLease
AllocEngine::renewLease4
from AllocEngine::discoverLease4
from AllocEngine::allocateLease4
from Dhcpv4Srv::assignLease
from AllocEngine::requestLease4
from AllocEngine::allocateLease4
from Dhcpv4Srv::assignLease
AllocEngine::reuseExpiredLease4
from AllocEngine::allocateOrReuseLease4
from AllocEngine::discoverLease4
from AllocEngine::requestLease4
from AllocEngine::allocateUnreservedLease4
from AllocEngine::discoverLease4
from AllocEngine::requestLease4
```
Label: outstanding

https://gitlab.isc.org/isc-projects/bind9/-/issues/1896
spurious root queries on timeout (Evan Hunt, 2023-11-02)

Reported by Ke Li <kl3158@columbia.edu> against 9.11.18 and 9.16.1.
```
Dear BIND authors,
We have documented specific cases where BIND9 (9.11.18 and 9.16.1)
generates requests to root servers which we think are not very
useful. We would like to know if it is a known behavior or if there is
an underlying design choice for these queries that we do not understand.
Below is a brief overview of what we found.
The behavior we found is that when BIND9 has TLD servers' addresses in
the cache (servers which are authoritative for domains like "com"), and BIND9
gets an A or AAAA type request like "some.example.com" from users, it still
sends requests like "ns1.example.com" to root, and the root server replies
with addresses of the TLD servers again. The pattern looks like this:
user asks BIND9:                  Query: bidder.criteo.com, Type A
BIND9 asks TLD servers:           To: 192.42.93.30 (g.gtld)  Query: bidder.criteo.com, Type A
Get a response from TLD servers:  From: 192.42.93.30 (g.gtld)  Query: bidder.criteo.com
  Response: NS ns23.criteo.com NS ns22.criteo.com NS ns25.criteo.com NS ns26.criteo.com NS ns27.criteo.com NS ns28.criteo.com. All with A-type records in "Additional Records".
BIND9 asks one of the nameservers. No reply.  To: 74.119.119.1 (ns25.criteo.com)  Query: bidder.criteo.com, Type A
BIND9 asks another nameserver.  To: 182.161.73.4 (ns28.criteo.com)  Query: bidder.criteo.com, Type A
And at the same time, BIND9 sends requests to root:
  To: 192.58.128.30 (j.root)  Query: ns22.criteo.com Type AAAA
  To: 192.58.128.30 (j.root)  Query: ns23.criteo.com Type AAAA
  To: 192.58.128.30 (j.root)  Query: ns27.criteo.com Type AAAA
  To: 192.58.128.30 (j.root)  Query: ns25.criteo.com Type AAAA
  To: 192.58.128.30 (j.root)  Query: ns26.criteo.com Type AAAA
  To: 192.58.128.30 (j.root)  Query: ns28.criteo.com Type AAAA
We deployed a BIND9 v9.11.18 instance and a BIND9 v9.16.1 locally and
loaded web traffic captured by Wireshark on port 53. Then we analyzed
the data and made several observations about these interesting requests to root.
1. They are requesting authoritative nameservers of a subdomain or a
hostname. For example, "ns23.criteo.com" and "ns22.criteo.com" are authoritative
nameservers for criteo.com.
2. They are requesting records that are not in the last-level
nameserver's response. For example, in the response from the TLD server to
BIND9's request on "bidder.criteo.com", there is no AAAA-type record (in
"Additional Records") for nameserver "ns23.criteo.com", so BIND9 later
sends an AAAA type request on "ns23.criteo.com" to root.
3. If BIND9 times out when it queries one of these nameservers, BIND9
will generate these requests to root. For example, after getting the
response from the TLD server on "bidder.criteo.com", BIND9 goes ahead
and sends a request on "bidder.criteo.com" to "ns25.criteo.com", but
there is no reply. Then BIND9 will send the request to another name
server (randomly chosen), "ns28.criteo.com", and also generate requests to
root.
Therefore, we guess this kind of request is generated by timeouts when
BIND9 queries nameservers. We then tried to validate our hypothesis. We
manually created timeouts with iptables rules banning the IPs of some
nameservers, and the same behavior happened. A simple test pcap file as an
example is attached, with an explanation. Also, the configuration file of our
deployment is attached. We then validated our hypothesis on a recursive
resolver at an academic institution running BIND9 v9.11.14, and found
that around 80% of the A and AAAA queries to the root servers were in this pattern.
We'd appreciate it if you could help us understand this behavior. We are mainly
curious about the reason behind it. Is it a necessary design or is it
avoidable? We think maybe the DNS root servers would be spared some queries
if BIND9 could avoid this kind of behavior.
Thank you very much!
```
Label: Not planned

https://gitlab.isc.org/isc-projects/stork/-/issues/294
Export list of subnets with current utilization in csv file (Vicky Risk <vicky@isc.org>, 2023-01-09)

In discussing use of Stork with a Kea user, they mentioned that they need to be able to export the current pool size and utilization in a CSV file, for use in a separate Excel-based application that helps forecast pool utilization.
Label: outstanding

https://gitlab.isc.org/isc-projects/kea/-/issues/1253
subnet inheritance inconsistencies (Francis Dupont, 2022-11-02)

There are some inconsistencies (nothing critical, so not a bug, but lost opportunities to simplify code and improve performance) in the way subnets are handled for at least relay, interface name, and v6 interface id:
- relay is a direct field of Network, is derived in syntax parsing and checked for both subnet and parent shared network for subnet selection.
- interface name (getIface) is inherited using getProperty, checked in sharedNetworksSanityChecks after syntax parsing and checked for both subnet and parent shared network for subnet selection.
- interface id (v6 option) is inherited using getProperty and subject of #652.
Ideas are:
- get rid of the syntax derivation when possible (in particular when the other inheritance mechanism applies)
- avoid spurious inheritance in CB cmds (aka #652)
- apply a subset of sharedNetworksSanityChecks in merging
- conversely, use inheritance to make only subnet-level checks in subnet selection (note this means a subnet should be attached to its parent shared network before being added to the global subnet container)
Related to #513 (sharedNetworksSanityChecks not applied to config backend) and #554 (select subnet performance).
Label: backlog

https://gitlab.isc.org/isc-projects/bind9/-/issues/1879
ARM and named man page incorrect regarding -U and number of listeners (Cathy Almond, 2024-02-14)

As verified in the 9.16.3 ARM. From [Support ticket #16280](https://support.isc.org/Ticket/Display.html?id=16280)
The ARM still says (about the options for starting named):
```
-U #listeners
Use #listeners worker threads to listen for incoming UDP packets on each address. If
not specified, named will calculate a default value based on the number of detected CPUs:
1 for 1 CPU, and the number of detected CPUs minus one for machines with more than 1
CPU. This cannot be increased to a value higher than the number of CPUs. If -n has been
set to a higher value than the number of detected CPUs, then -U may be increased as high
as that value, but no higher. On Windows, the number of UDP listeners is hardwired to 1
and this option has no effect.
```
This is in fact untrue - we're using '-n' throughout (apart from Windows), as of 9.12 and up.
E.g. from named starting up:
> ...
> 16-Apr-2020 05:51:48.172 found 24 CPUs, using 24 worker threads
> 16-Apr-2020 05:51:48.172 using 24 UDP listeners per interface
> 16-Apr-2020 05:51:48.201 using up to 21000 sockets
> ...
I expect this changed post-9.11 at some point when we changed how the legacy server sockets code works.
Please fix the ARM and man page appropriately (maybe in the next maintenance releases?).
Label: Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/1878
Improve mirror zone implementation by using an iterator for the zone validation step (Cathy Almond, 2023-11-02)

Tying loosely into [Support ticket #16268](https://support.isc.org/Ticket/Display.html?id=16268)
We were never able to replicate problems with client resolution during mirror zone refresh (using IXFR), and neither were we able to replicate the 'slowness' of updating the zone itself. The suspicion is now that the reported issues were due to `something else` and what we were seeing was a symptom, not the cause.
However, along the way (and in #1802 and #1803) what was re-exposed is that the validation step for mirror zone updates doesn't take place within an iterator, so it's anti-social, in that it doesn't relinquish the CPU/thread it's working on until it's finished. This is documented as a known feature of the mirror zone implementation, and most of the time it really should not matter (it doesn't take long to validate the entire root zone, which is what the mirror zone implementation was designed for).
This issue ticket is a placeholder to note that we considered this something that we'd like to do, although it's not burningly urgent in the bigger picture of Things That Need To Be Done.
(We also uncovered that the validation step takes place against the entire zone for each increment being applied (increment = bundled set of changes between SOA start and end RRs, not each individual change), so it is potentially inefficient when pulling IXFRs rather than AXFRs from the root servers; but this has to be balanced against the rate of flux of the mirror zone (low for the root zone), so it is probably not worth tackling this either.)
Label: Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/1836
Return Extended EDNS Errors (EDE) (Mark Andrews, 2023-11-02)

This is a Summary issue. Please open individual issues for individual code points.
There are 25 defined EDE codes. Some will be possible to return easily, others not so. See [RFC 8914](https://datatracker.ietf.org/doc/html/rfc8914) for an explanation of the error codes.
* [ ] 0 - Other
* [ ] 1 - Unsupported DNSKEY Algorithm #2715
* [ ] 2 - Unsupported DS Digest Type #2715
* [x] 3 - Stale Answer #2267 (9.18.3, 9.19.1)
* [x] 4 - Forged Answer #3410 (9.19.5)
* [ ] 5 - DNSSEC Indeterminate #2715
* [ ] 6 - DNSSEC Bogus #2715
* [ ] 7 - Signature Expired #2715
* [ ] 8 - Signature Not Yet Valid #2715
* [ ] 9 - DNSKEY Missing #2715
* [ ] 10 - RRSIGs Missing #2715
* [ ] 11 - No Zone Key Bit Set #2715
* [ ] 12 - NSEC Missing #2715
* [ ] 13 - Cached Error
* [ ] 14 - Not Ready
* [x] 15 - Blocked #3410 (9.19.5)
* [x] 16 - Censored #3410 (9.19.5)
* [x] 17 - Filtered #3410 (9.19.5)
* [x] 18 - Prohibited (9.17.21), #3410 (9.19.5)
* [x] 19 - Stale NXDOMAIN Answer #2267 (9.18.3, 9.19.1)
* [ ] 20 - Not Authoritative
* [ ] 21 - Not Supported
* [ ] 22 - No Reachable Authority #2268
* [ ] 23 - Network Error
* [ ] 24 - Invalid Data
Label: Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/1831
Feature request: Separate NXDOMAIN cache with its own max-ncache-size (Cathy Almond, 2024-03-01)

This relates to PRSD DDoS attacks, and the effect on participating resolvers when the domain under onslaught is able to keep responding and does not die or rate-limit the resolvers.
The scenario is one in which a very large number of unique names are being queried, the objective being to bypass cached NXDOMAINs in resolvers and to force every name to become a query to the authoritative servers for the domain (or hosting provider) that is being attacked.
Typically, the target servers will either die, or will commence rate-limiting their perceived attackers. In the case of a resolver, this will result in a large number of recursive queries being backlogged while they wait for the server responses that never arrive.
BIND uses fetch-limits to mitigate the non-responding servers scenario.
But in the situation where the servers never die or never rate-limit, the outcome is rather different. Resolvers that can cope with the increase in traffic (which usually isn't actually that much), instead see a rapid increase in memory consumption (and decrease in cache hits!) due to the NXDOMAIN responses that are received and then cached (never to be used again).
One mitigation for resolver operators has been to reduce max-ncache-ttl to silly small values - but the effectiveness of this depends on the structure of the cache nodes and how often opportunistic cache cleaning hits these nodes.
Yes, overmem (LRU-based logic) cache-cleaning will help with this, but for many, it is going to be at the expense of 'positive' cache content, and regular clients will start to suffer with more cache-misses, as well as cache churn increasing as negative and positive cache content keeps being 'swapped'.
Mark suggested keeping negative answers in a separate cache, where they could have their own max-ncache-size and churn all by themselves, without affecting main cache.
This sounds like A Good Idea - but one that we've never yet got around to, as part of ongoing DDoS mitigation work.
(Also tagging this as 'Customer' since I can find many a customer ticket where customers have been bitten by this when one specific and well-known DNS hosting company has been under attack, and their servers never falter in sending back NXDOMAIN responses to their 'attackers'.)
Label: Not planned

https://gitlab.isc.org/isc-projects/stork/-/issues/270
add support for InfluxData TICK Stack for monitoring in parallel to Prometheus and Grafana (Michal Nowikowski, 2020-05-12)
Label: outstanding

https://gitlab.isc.org/isc-projects/bind9/-/issues/1818
Follow-up from "WIP: Convert documentation from docbook to sphinx-doc syntax" (Ondřej Surý, 2021-10-05)

The following discussion from !1761 should be addressed:
- [ ] @each started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/1761#note_63240): (+1 comment)
> - I notice `make doc` doesn't generate PDFs, I had to `make pdf` explicitly.
> - `make all` apparently includes a `make doc` step, now, which seems unnecessary to me.
> - Running `make all` in a clean directory always seems to fail while building doc the first time, though it always seems to succeed on the second attempt. I haven't had time to dig in to what's causing this.
> - Is the directory name `_build` a requirement of sphinx? It's an admittedly minor quibble, but I wish the HTML still appeared in doc/arm, or at most one level down instead of two.
> - Why do we have both `_build/html` and `_build/dirhtml`?
> - `make clean` and `make docclean` both fail to clean up generated documentation.
> - The files in doc/arm/man appear to be generated from RST source, but are checked into git, whereas elsewhere you seem to have removed all the generated documentation; what's the purpose of having these checked in?
> - Chapter numbering is very weird - what used to be section 3.4 on plugins is now a very short chapter by itself, and many of the sections of chapter 4 seem to have been split out into separate chapters too. The result is that what used to be 8 chapters and 4 appendices is now 19 chapters. The appendices are no longer at the end.
> - (Not a regression, but it's also really weird to have the advanced DNS features turn up in the ARM before the basic configuration directions. We should reorder the chapters.)
>
> Looking over the generated PDF's for the ARM and the release notes:
> - The ISC logo has become kind of tiny now.
> - Many of the grammars in the ARM display weirdly. They're enclosed in boxes but they spill out past the margins, and there are XML tags in them. The ones in the named.conf man page in chapter 19 are fine, but in the earlier parts of the ARM they're broken. I'm guessing this is a bug in the rst-zoneopt.pl and/or rst-grammars.pl scripts.
> - Tables that fill a whole page look weird (see pages 50, 60 and 125 for examples). I don't think this is a regression, though.
> - Odd formatting in a table on page 61 as well.
> - Option definitions should probably have a line break after the option name. (See page 77 for examples of what I mean - the first line of the text describing the option flows right after the option name, and it looks to me like it should start on the line below.)
> - The table defining rrset-order values is positioned incorrectly on page 93.
> - Tables spill over the right margins on pages 127 and 135.
> - Table appears to incorrectly contain another table on page 132.
> - Table formatting error on page 137.
>
> Looking over the generated HTML:
> - The formatting here is generally very nice!
> - The same problem also applies here with grammars - XML tags are visible.
> - RFC references in chapter 17 are formatted oddly, with author names showing up as "VixieP" and "AndrewsM", commas missing.

https://gitlab.isc.org/isc-projects/kea/-/issues/1218
huge difference of performance gain when memfile is configured with persist false (Wlodzimierz Wencel, 2022-11-02)

I've executed two tests: multi-threading + memfile with persist false (there is no writing to the file).
And the results are surprising, revealing a possible inefficiency.
Results:
* multi threading v6 memfile persist true: 36k leases/s
* multi threading v6 memfile persist false: 47k leases/s
* gain: 30%
* multi threading v4 memfile persist true: 20k leases/s
* multi threading v4 memfile persist false: 37k leases/s
* gain: 85%
Please investigate whether there is an inefficiency in the way Kea saves v4 leases to the file.
Label: backlog

https://gitlab.isc.org/isc-projects/kea/-/issues/1206
throwing exceptions on destructors causes call to terminate and should be fixed (Razvan Becheriu, 2020-08-31)

As stated in:
http://www.cs.technion.ac.il/users/yechiel/c++-faq/dtors-shouldnt-throw.html
Throwing exceptions in destructors can result in a call to terminate.
Although this is hard to control or enforce over time, we should at least try to fix this by always adding a 'try catch' block to non-trivial destructors.
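In C++ this means wrapping the destructor body in `try { ... } catch (...) {}`. The Python sketch below mirrors the pattern only for illustration (the registry/singleton names are made up, not Kea code):

```python
# Illustrative pattern: cleanup that runs during destruction swallows
# exceptions instead of letting them escape (which in C++ can terminate).
class SingletonClient:
    """Hypothetical object whose teardown touches a complex singleton."""
    def __init__(self, registry):
        self._registry = registry

    def __del__(self):
        try:
            self._registry.unregister(self)   # may throw in real code
        except Exception:
            pass  # never propagate an exception out of a destructor

class FlakyRegistry:
    """Stand-in for a singleton whose teardown path can throw."""
    def unregister(self, obj):
        raise RuntimeError("backend unavailable")
```

Destroying a `SingletonClient` bound to a `FlakyRegistry` completes quietly: the error is contained inside the destructor rather than escaping to the caller.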
This ticket should at least handle destructors which call functions accessing singleton instances (which are usually complex).
Label: outstanding

https://gitlab.isc.org/isc-projects/stork/-/issues/260
Search by IP (Tomek Mrugalski, 2022-11-16)

If a new machine is added using its name, e.g. `agent-kea`, then it's not possible to search for it using its IP. In my case the agent-kea is a docker container running on 127.0.0.1. I'd like to be able to find it by searching for the IP address.
Label: backlog

https://gitlab.isc.org/isc-projects/stork/-/issues/257
Configurable thresholds for alerting (Vicky Risk <vicky@isc.org>, 2023-07-25)

Obviously one of the main purposes of a dashboard is to display warnings and alerts: indicators of degradation or failures.
As an administrator I am going to want to adjust the thresholds for these alerts so that they reflect the conditions that are specifically alarming to me - which may vary depending on the criticality of the service being monitored (e.g. is it a paid production service, a service for internal users, a test network, or something like free guest wifi).
I would like Stork to assign defaults for these thresholds, enable me to adjust the default values (e.g. global to Stork), and override them on a per server basis.
Thresholds we may want to enable alarming on, eventually:
* [ ] pool utilization (high - red, approaching high - yellow)
* [ ] LPS (high, low) (Ideally what we want is variance from the usual LPS, but I don't know if there is any way for us to determine what is usual, given there may be quite a lot of daily and weekly variation.)
* [ ] cpu utilization (on Kea, maybe also on the db backend?)
* [ ] # of rejected leases?
* [ ] is there something we should monitor wrt the LFC, does it build up a backlog or something?
* [ ] database connection quality (delay in responses?)
* [ ] other platform factors (temperature, is that a thing we get?, is there some alarm about low available memory?)
* [ ] report when an updated package is available in the UI (likely that Stork packages will have security vulnerabilities because of web dependencies from outside Stork) - presumably when there is a new package for the same Stork version, that is due to a security issue.
* [ ] conflicts when the operator is running multiple Keas with the same address range, using a shared lease db
* [ ] ring buffer length/size to identify when over-long buffers cause cascading retriesbackloghttps://gitlab.isc.org/isc-projects/bind9/-/issues/1793failed query to a `forward only` forwarder increments `serverquota` counter (...2023-11-02T16:58:14ZCathy Almondfailed query to a `forward only` forwarder increments `serverquota` counter (spilled due to server quota)As observed in [Support ticket #16297](https://support.isc.org/Ticket/Display.html?id=16297)
I was inspecting the stats output and was very surprised to see this:
` 13779 spilled due to server quota`
The server in question does not have `fetches-per-server` configured, so this defaults to zero (unlimited). But yet...
Looking at the code - I suspect there's a failure mode that drops through the 'out' block in fctx_getaddresses() without resetting all_spilled (which starts at 'true').
```c
static isc_result_t
fctx_getaddresses(fetchctx_t *fctx, bool badcache) {
dns_rdata_t rdata = DNS_RDATA_INIT;
isc_result_t result;
dns_resolver_t *res;
isc_stdtime_t now;
unsigned int stdoptions = 0;
dns_forwarder_t *fwd;
dns_adbaddrinfo_t *ai;
bool all_bad;
dns_rdata_ns_t ns;
bool need_alternate = false;
bool all_spilled = true;
```
...
```c
/*
* If all of the addresses found were over the
* fetches-per-server quota, return the configured
* response.
*/
if (all_spilled) {
result = res->quotaresp[dns_quotatype_server];
inc_stats(res, dns_resstatscounter_serverquota);
}
```
This is a server that is using global forwarding, so we skip case 'normal_nses', which is where 'all_spilled' is normally reset from true to false during processing:
```c
if (fctx->fwdpolicy == dns_fwdpolicy_only)
goto out;
```
So I'm guessing that what's been 'counted' and then reported here is failures in getting responses back from any of the global forwarders (which tallies quite nicely with the problem I'm investigating - even though this wasn't a counter I was expecting to see in the stats!).
The assumption seems to be that if it's a failure for any reason other than fetch-limits, something will reset the 'all_spilled' flag - it would appear that assumption is flawed for some configurations and situations. Could someone have a look at this please - it should be an easy one to fix.
I note that this has also been noticed before on bind-users:
https://lists.isc.org/pipermail/bind-users/2016-June/097011.html
I observed this in 9.11.15-S1, but the code path still looks the same on master.
Requested changes:
- [ ] fix serverquota counter
- [ ] add a new counter specifically for the situation when all forwarders have failedNot planned