BIND issues
https://gitlab.isc.org/isc-projects/bind9/-/issues
2023-05-18T15:18:30Z

https://gitlab.isc.org/isc-projects/bind9/-/issues/4076
Exceeded time limit waiting for 'too many DNS UPDATEs queued' in ns1/named.run on Windows
2023-05-18T15:18:30Z
Michal Nowak

This happens way too often on Windows ([3393969](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3393969) & [3394153](https://gitlab.isc.org/isc-private/bind9/-/jobs/3394153) just today).
```
I:nsupdate:check that update is rejected if quota is exceeded (62)
I:nsupdate:exceeded time limit waiting for 'too many DNS UPDATEs queued' in ns1/named.run
I:nsupdate:failed
```
In isc-projects/bind9#3846, we bumped the number of parallel queries from 10 to 20, but I don't think bumping it even further is the way to go.
Michal Nowak

https://gitlab.isc.org/isc-projects/bind9/-/issues/3874
The !7530 broke mkeys test on Windows
2023-02-27T14:58:05Z
Ondřej Surý

(It would also be OK to disable this test on Windows.)
```
S:mkeys:2023-02-15T03:17:19-0800
T:mkeys:1:A
A:mkeys:System test mkeys
I:mkeys:PORTRANGE:10000 - 10099
I:mkeys:starting servers
I:mkeys:check for signed record (1)
I:mkeys:check positive validation with valid trust anchor (2)
I:mkeys:check positive validation using delv (3)
I:mkeys:check for failed validation due to wrong key in managed-keys (4)
I:mkeys:check new trust anchor can be added (5)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:check new trust anchor can't be added with bad initial key (6)
I:mkeys:ns3 refreshing managed keys for '_default'
I:mkeys:remove untrusted standby key, check timer restarts (7)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:restore untrusted standby key, revoke original key (8)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:refresh managed-keys, ensure same result (9)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:restore revoked key, ensure same result (10)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:reinitialize trust anchors, add second key to bind.keys
I:mkeys:check that no key from bind.keys is marked as an initializing key (11)
I:mkeys:reinitialize trust anchors, revert to one key in bind.keys
I:mkeys:check that standby key is now trusted (12)
I:mkeys:revoke original key, add new standby (13)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:revoke standby before it is trusted (14)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:wait 20 seconds for key add/remove holddowns to expire (15)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:revoke all keys, confirm roll to insecure (16)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:check for insecure response (17)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:reset the root server (18)
I:mkeys:ns1 server reload successful
I:mkeys:reinitialize trust anchors
I:mkeys:check positive validation (19)
I:mkeys:revoke key with bad signature, check revocation is ignored (20)
I:mkeys:ns1 server reload successful
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:check validation fails with bad DNSKEY rrset (21)
I:mkeys:restore DNSKEY rrset, check validation succeeds again (22)
I:mkeys:ns1 server reload successful
I:mkeys:reset the root server with no keys, check for minimal update (23)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:reset the root server with no signatures, check for minimal update (24)
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:restore root server, check validation succeeds again (25)
I:mkeys:ns1 server reload successful
I:mkeys:ns2 refreshing managed keys for '_default'
I:mkeys:check that trust-anchor-telemetry queries are logged (26)
I:mkeys:check that trust-anchor-telemetry queries are received (27)
I:mkeys:check 'rndc-managed-keys destroy' (28)
I:mkeys:ns2 destroying managed-keys database for '_default'
I:mkeys:check that trust-anchor-telemetry queries contain the correct key (29)
I:mkeys:check initialization fails if managed-keys can't be created (30)
I:mkeys:check failure to contact root servers does not prevent key refreshes after restart (31)
I:mkeys:exceeded time limit waiting for 'Returned from key fetch in keyfetch_done() for 'sub.foo':' in ns5/named.run
I:mkeys:failed
I:mkeys:check 'rndc managed-keys' and islands of trust root unreachable (32)
I:mkeys:failed
I:mkeys:check key refreshes are resumed after root servers become available (33)
I:mkeys:exceeded time limit waiting for 'Returned from key fetch in keyfetch_done() for 'sub.foo': success' in ns5/named.run
I:mkeys:exceeded time limit waiting for 'Returned from key fetch in keyfetch_done() for 'sub.foo': success' in ns5/named.run
I:mkeys:failed
I:mkeys:reinitialize trust anchors, add unsupported algorithm (34)
I:mkeys:ignoring unsupported algorithm in managed-keys (35)
I:mkeys:introduce unsupported algorithm rollover in authoritative zone (36)
I:mkeys:ignoring unsupported algorithm in rollover (37)
I:mkeys:ns1 server reload successful
I:mkeys:ns6 refreshing managed keys for '_default'
I:mkeys:check 'rndc managed-keys' and views (38)
I:mkeys:check 'rndc managed-keys' and islands of trust now that root is reachable (39)
I:mkeys:failed
I:mkeys:exit status: 4
I:mkeys:stopping servers
R:mkeys:FAIL
E:mkeys:2023-02-15T03:20:09-0800
```
See:
* https://gitlab.isc.org/isc-projects/bind9/-/jobs/3159567
* https://gitlab.isc.org/isc-projects/bind9/-/jobs/3159566
March 2023 (9.16.39, 9.16.39-S1, 9.18.13, 9.18.13-S1, 9.19.11)
Mark Andrews

https://gitlab.isc.org/isc-projects/bind9/-/issues/3865
Failing windows system tests due to libuv version check
2023-03-25T07:25:40Z
Tom Krizek

Nightly Windows system tests started to fail with: `c:\builds\isc-projects\bind9\lib\isc\netmgr\netmgr.c:265: fatal error: libuv version too new: running with libuv 1.44.2 when compiled with libuv 1.44.2 will lead to libuv failures`, e.g. [job#3150313](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3150313)
These failures started after https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/7483 was merged. The likely culprit is the addition of `#define MAXIMAL_UV_VERSION UV_VERSION(1, 39, 99)` in [`lib/isc/netmgr/netmgr.c#L249`](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/7483/diffs#882b07ab55f884856324e81fb113e650836a94d9_249_249).
This issue seems to affect only the Windows build, so it only appears on the `v9_16` branch (the last branch that supports Windows builds).
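To make the suspected failure concrete, here is a minimal Python sketch of the comparison. It assumes that `UV_VERSION` packs its arguments the same way libuv's `UV_VERSION_HEX` does, and that the check rejects any runtime version above the cap; the helper names `uv_version` and `too_new` are hypothetical and do not come from the BIND sources.

```python
def uv_version(major: int, minor: int, patch: int) -> int:
    """Pack a version as (major << 16) | (minor << 8) | patch.

    Assumption: this mirrors libuv's UV_VERSION_HEX layout.
    """
    return (major << 16) | (minor << 8) | patch


# Cap taken from the define quoted in the issue text.
MAXIMAL_UV_VERSION = uv_version(1, 39, 99)


def too_new(runtime: int, maximal: int = MAXIMAL_UV_VERSION) -> bool:
    """Hypothetical check: reject any runtime version above the cap."""
    return runtime > maximal


running = uv_version(1, 44, 2)  # version reported in the failing job
print(hex(running), hex(MAXIMAL_UV_VERSION), too_new(running))
```

Under these assumptions, libuv 1.44.2 compares greater than 1.39.99, which would be consistent with the "libuv version too new" error even though the compile-time and runtime versions are identical.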
See #3840
March 2023 (9.16.39, 9.16.39-S1, 9.18.13, 9.18.13-S1, 9.19.11)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2891
Missing parenthesis in the `atomic_load_explicit` macro
2021-09-01T10:18:23Z
Arаm Sаrgsyаn

Parentheses are missing in the definition of the `atomic_load_explicit` macro in `lib/isc/win32/include/isc/stdatomic.h`, which results in always evaluating to `sizeof(bool)` instead of getting the size of the actual expression.
September 2021 (9.16.21, 9.16.21-S1, 9.17.18)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2837
BIND 9 doesn't run on Windows when the number of workers is 8 or 12
2021-09-15T21:06:34Z
legacy1

This is simple to test: with BIND 9.16.19 and a CPU with more than 7 cores/threads, BIND will not start. Older versions such as 9.16.15 or 9.17.12 work beyond 7 cores/threads.
Please fix, thanks.
September 2021 (9.16.21, 9.16.21-S1, 9.17.18)
Arаm Sаrgsyаn

https://gitlab.isc.org/isc-projects/bind9/-/issues/2727
dig package for Windows
2023-11-02T16:24:19Z
Vicky Risk vicky@isc.org

We plan to end support for Windows with 9.18. It seems like there are a number of users who need dig on Windows, so if we can build a dig package for Windows and host that on our website for download, that would be very useful.
Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/2555
journal test fails on Windows
2021-03-22T10:46:56Z
Michal Nowak

The newly added `journal` system test [fails](https://gitlab.isc.org/isc-projects/bind9/-/jobs/1553451) on Windows on `v9_16`:
```
S:journal:2021-03-04T19:27:15-0800
T:journal:1:A
A:journal:System test journal
I:journal:PORTRANGE:8800 - 8899
I:journal:starting servers
I:journal:check outdated journal rolled forward (dynamic) (1)
I:journal:check outdated empty journal did not cause an error (dynamic) (2)
I:journal:check outdated journals were updated or removed (dynamic) (3)
I:journal:failed
I:journal:check updated journal has correct RR count (dynamic) (4)
I:journal:failed
I:journal:check new-format journal rolled forward (dynamic) (5)
I:journal:check new-format empty journal did not cause error (dynamic) (6)
I:journal:check new-format journals were updated or removed (dynamic) (7)
I:journal:check outdated up-to-date journal succeeded (ixfr-from-differences) (8)
I:journal:check outdated journal was updated (ixfr-from-differences) (9)
I:journal:failed
I:journal:check journal with mixed headers succeeded (version 1,2,1,2) (10)
I:journal:check journal with mixed headers was updated (version 1,2,1,2) (11)
I:journal:failed
I:journal:check journal with mixed headers succeeded (version 2,1,2,1) (12)
I:journal:check journal with mixed headers was updated (version 2,1,2,1) (13)
I:journal:failed
I:journal:check there are no journals left un-updated (14)
I:journal:failed
I:journal:check journal downgrade/upgrade (15)
I:journal:check max-journal-size works after journal update (16)
I:journal:failed
I:journal:check max-journal-size works with non-updated journals (17)
I:journal:check journal index consistency (18)
I:journal:exit status: 7
I:journal:stopping servers
I:journal:pytest not installed, skipping python tests
R:journal:FAIL
E:journal:2021-03-04T19:27:25-0800
```
Starting with this test failing:
```
I:journal:check outdated journals were updated or removed (dynamic) (3)
I:journal:failed
```
Its `changed.db.jnl` file has the `;BIND LOG V9` version header, but `;BIND LOG V9.2` is expected (as happens on Unix).
March 2021 (9.11.29, 9.11.29-S1, 9.16.13, 9.16.13-S1, 9.17.11)
Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/2514
Windows 10 Insider Dev fails to resolve DoH queries against BIND
2021-03-29T12:31:49Z
triatic

### Summary
I cannot get DNS over HTTPS in BIND to work with Windows 10 Insider Dev, whereas I have no problems with Unbound. Unbound returns the full chain for my Let's Encrypt TLS certificate whereas BIND does not, which may explain the failure.
### BIND version used
BIND 9.17.10-1+ubuntu20.10.1+isc+1-Ubuntu (Development Release) <id:>
### Steps to reproduce
Install Windows 10 Insider (Dev channel) and attempt to resolve queries against a BIND DoH server configured with a letsencrypt (or similar) certificate.
### What is the current *bug* behavior?
DNS resolution fails.
### What is the expected *correct* behavior?
DNS resolution should succeed.
April 2021 (9.11.30/9.11.31, 9.11.30-S1/9.11.31-S1, 9.16.14/9.16.15, 9.16.14-S1/9.16.15-S1, 9.17.12)
Artem Boldariev

https://gitlab.isc.org/isc-projects/bind9/-/issues/2285
Windows-specific system test framework glitches
2021-11-02T09:30:25Z
Michał Kępień

On Windows, system tests may (rarely) fail because of system test
framework imperfections.
Two types of intermittent issues have been observed in the past few
months:
1. Issues with starting servers ([example][1]).
```
S:addzone:2020-07-03T09:45:21+0100
T:addzone:1:A
A:addzone:System test addzone
I:addzone:PORTS:,5150,5151,5152,5153,5154,5155,5156,5157,5158
Value "" invalid for option p (number expected)
I:ns3:ns3/sign.sh
I:addzone:starting servers
Value "" invalid for option port (number expected)
usage: start.pl [--noclean] [--restart] [--port <port>] test-directory [server-directory [server-options]]
I:addzone:starting servers failed
R:addzone:FAIL
E:addzone:2020-07-03T09:45:25+0100
```
This failure mode has not been investigated closely, but it looks
like an issue with the `bin/tests/system/get_ports.sh` script on
`main` (this script is only present on `main`) - it seems that it
can fail to set the `PORT` environment variable in certain
circumstances, which prevents test `named` instances from being
started.
2. Issues with PID reuse ([example][2]).
```
S:rndc:2020-11-16T06:56:52-0800
T:rndc:1:A
A:rndc:System test rndc
I:rndc:PORTRANGE:11000 - 11099
I:rndc:starting servers
I:rndc:preparing (1)
I:rndc:rndc freeze
I:rndc:checking zone was dumped (2)
...
S:rrsetorder:2020-11-16T06:57:52-0800
T:rrsetorder:1:A
A:rrsetorder:System test rrsetorder
I:rrsetorder:PORTRANGE:11500 - 11599
I:rndc:exit status: 0
I:rndc:stopping servers
I:rrsetorder:starting servers
I:rrsetorder:Order 'fixed' disabled at compile time
I:rrsetorder:Checking order fixed behaves as cyclic when disabled (master)
I:rrsetorder:Checking order cyclic (master + additional)
I:rrsetorder:Checking order cyclic (master)
I:rrsetorder:Checking order random (master)
I:rrsetorder:Random selection return 12 of 24 possible orders in 36 samples
I:rrsetorder:Checking order none (primary)
I:rrsetorder:Checking order cyclic (slave + additional)
I:rrsetorder:Checking order cyclic (slave)
I:rrsetorder:Checking order random (slave)
I:rndc:ns4 didn't die when sent a SIGTERM
I:rndc:stopping servers failed
R:rndc:FAIL
I:rrsetorder:Random selection return 12 of 24 possible orders in 36 samples
E:rndc:2020-11-16T06:58:55-0800
I:rrsetorder:Checking order none (secondary)
I:rrsetorder:Shutting down slave
I:rrsetorder:Checking for slave's on disk copy of zone
I:rrsetorder:Re-starting slave
I:rrsetorder:Checking order cyclic (slave + additional, loaded from disk)
I:rrsetorder:Checking order cyclic (slave loaded from disk)
I:rrsetorder:Checking order random (slave loaded from disk)
I:rrsetorder:Random selection return 12 of 24 possible orders in 36 samples
I:rrsetorder:Checking order none (secondary loaded from disk)
I:rrsetorder:Checking order cyclic (cache + additional)
I:rrsetorder:failed
I:rrsetorder:Checking order cyclic (cache)
I:rrsetorder:failed
I:rrsetorder:Checking order random (cache)
I:rrsetorder:Random selection return 0 of 24 possible orders in 36 samples
I:rrsetorder:failed
I:rrsetorder:Checking order none (cache)
I:rrsetorder:failed
I:rrsetorder:Checking default order (cache)
I:rrsetorder:Default selection return 0 of 24 possible orders in 36 samples
I:rrsetorder:failed
I:rrsetorder:Checking default order no match in rrset-order (cache)
I:rrsetorder:failed
I:rrsetorder:exit status: 5
I:rrsetorder:stopping servers
I:rrsetorder:ns1 died before a SIGTERM was sent
I:rrsetorder:stopping servers failed
R:rrsetorder:FAIL
E:rrsetorder:2020-11-16T07:10:23-0800
```
A similar failure mode was triggered in the course of [BIND 9.17.1
release testing][3]. The root cause of this problem is that signal
handlers do not work on Windows and thus when SIGTERM is sent to a
`named` process, it dies immediately without cleaning up its PID
file. To work around this, the system test framework relies on
`kill` returning an error for non-existing PIDs for detecting when a
given `named` instance is no longer alive. However, Windows [tends
to recycle PIDs][4]. If `named` instances belonging to one system
test are shut down while `named` instances belonging to another
system test are just starting up, the system test framework may
"confuse" `named` instances from these two tests with each other:
1. `stop.pl` attempts to stop `named` instance `ns1` for system
test `testA`. It sends it a SIGTERM. `ns1` for `testA` exits
without cleaning up its PID file.
2. `start.pl` starts up `named` instance `ns1` for system test
`testB`. It gets assigned the same PID as `ns1` for `testA` which
has just exited.
3. `stop.pl` tests whether `ns1` for `testA` is still alive. It
reads its PID file and attempts to `kill` the PID it read. Since
`ns1` for `testB` has the same PID, `stop.pl` assumes `ns1` for
`testA` is still alive.
4. After 1 minute, `stop.pl` decides to send a SIGABRT to `ns1` for
`testA`, but that one is already long gone - instead, the signal
hits `ns1` for `testB`, killing it (possibly in the middle of
`testB`). `stop.pl` reports that `ns1` for `testA` did not die
when it was sent a SIGTERM (even though it did).
5. `stop.pl` attempts to `kill` `ns1` for `testB`, but it was
already `kill`ed beforehand. `stop.pl` reports that `ns1` for
`testB` died before it was sent a SIGTERM.
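The five steps above can be replayed with a toy model. This is only a sketch: `FakeProcessTable` and `simulate` are hypothetical names, the table deliberately reuses the most recently freed PID to force the collision, and nothing here reflects how `start.pl` or `stop.pl` are actually implemented.

```python
class FakeProcessTable:
    """Toy process table that reuses the most recently freed PID first."""

    def __init__(self):
        self.alive = {}       # pid -> process name
        self.free = []        # PIDs available for reuse
        self.next_pid = 1000

    def spawn(self, name):
        if self.free:
            pid = self.free.pop()
        else:
            pid = self.next_pid
            self.next_pid += 1
        self.alive[pid] = name
        return pid

    def kill(self, pid):
        """Terminate `pid`; return False if no such process exists."""
        if pid not in self.alive:
            return False
        del self.alive[pid]
        self.free.append(pid)
        return True


def simulate():
    table = FakeProcessTable()
    pid_file_a = table.spawn("testA/ns1")    # PID recorded in testA's PID file
    table.kill(pid_file_a)                   # step 1: SIGTERM, PID file is left behind
    pid_b = table.spawn("testB/ns1")         # step 2: the freed PID is handed out again
    looks_alive = pid_file_a in table.alive  # step 3: the liveness probe is fooled
    victim = table.alive.get(pid_file_a)     # step 4: process the SIGABRT would hit
    table.kill(pid_file_a)                   # step 4: the signal is delivered
    stop_b_found = table.kill(pid_b)         # step 5: testB/ns1 is already gone
    return {
        "pid_reused": pid_b == pid_file_a,
        "looks_alive": looks_alive,
        "sigabrt_victim": victim,
        "testB_stop_found_process": stop_b_found,
    }


print(simulate())
```

In this model the stale PID file is the only link between `stop.pl` and the process it meant to stop, which is why the recycled PID is enough to misdirect the signal.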
[1]: https://gitlab.isc.org/isc-private/bind9/-/jobs/1007156#L153
[2]: https://gitlab.isc.org/isc-private/bind9/-/jobs/1299877#L3517
[3]: https://wiki.isc.org/bin/view/QA/BindQaResults_9_11_18
[4]: https://stackoverflow.com/questions/26301382/does-windows-7-recycle-process-id-pid-numbers
BIND 9.17 Backburner

https://gitlab.isc.org/isc-projects/bind9/-/issues/2015
Shutdown crash on Windows: lib\isc\netmgr\netmgr.c:275: INSIST(r == 0) failed
2020-09-30T15:57:30Z
Michał Kępień

First observed in a scheduled pipeline run for
1dd265df8f5540c3c89d92ff9b364994bf96138d:
https://gitlab.isc.org/isc-projects/bind9/-/jobs/1014592
I am only making the issue confidential out of an abundance of caution as
the crash seems to be caused by a [call][1] to `uv_loop_close()`
returning a non-zero value.
[1]: https://gitlab.isc.org/isc-projects/bind9/-/blob/1dd265df8f5540c3c89d92ff9b364994bf96138d/lib/isc/netmgr/netmgr.c#L274
October 2020 (9.11.24, 9.11.24-S1, 9.16.8, 9.16.8-S1, 9.17.6)
Witold Krecicki

https://gitlab.isc.org/isc-projects/bind9/-/issues/1988
bad output rndc dnssec -status on Windows
2020-07-16T07:03:47Z
Matthijs Mekking matthijs@isc.org

The following discussion from !3780 should be addressed:
- [ ] @michal started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/3780#note_144605): (+1 comment)
> This looks fine to me, though I started a pipeline including Windows
> system tests that I would like to complete successfully before merging
> this MR:
>
> https://gitlab.isc.org/isc-projects/bind9/pipelines/45755

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Matthijs Mekking matthijs@isc.org

https://gitlab.isc.org/isc-projects/bind9/-/issues/1919
Fix documentation install on Windows
2020-09-03T10:09:46Z
Ondřej Surý

Currently the `libisc.vcxproj` contains stuff like this:
```
echo Copying the ARM and the Installation Notes.
copy ..\COPYRIGHT ..\Build\Release
copy ..\README ..\Build\Release
copy ..\HISTORY ..\Build\Release
copy readme1st.txt ..\Build\Release
copy index.html ..\Build\Release
copy ..\doc\arm\*.html ..\Build\Release
copy ..\doc\arm\Bv9ARM.pdf ..\Build\Release
copy ..\CHANGES ..\Build\Release
if Exist ..\CHANGES.SE copy ..\CHANGES.SE ..\Build\Release
copy ..\FAQ ..\Build\Release
```
As we currently don't build the documentation on Windows and we don't store it in git, I think the right way forward is to cherry-pick the artifact files from the `docs` build into the Windows zip file.
September 2020 (9.11.23, 9.11.23-S1, 9.16.7, 9.17.5)
Michał Kępień

https://gitlab.isc.org/isc-projects/bind9/-/issues/1899
TCP Accept Refactoring broke Windows
2020-06-08T12:19:30Z
Ondřej Surý

!3320, which was merged to master, broke TCP connections on Windows. This needs to be fixed on master (before we release 9.17.2) and also before we merge the backport to the BIND 9.16 branch.
June 2020 (9.11.20, 9.11.20-S1, 9.16.4, 9.17.2)
Witold Krecicki

https://gitlab.isc.org/isc-projects/bind9/-/issues/1762
Collect crash dumps on Windows
2021-06-14T06:36:42Z
Ondřej Surý

By default, crash dumps on Windows are collected outside the CI_BUILD_DIR, so they are not kept as part of the job artifacts.
BIND 9.17 Backburner

https://gitlab.isc.org/isc-projects/bind9/-/issues/1757
Workaround the signed atomic operations because of Windows stdatomic.h shim
2021-09-03T05:12:44Z
Ondřej Surý

We found out that our win32 stdatomic.h shim converts all signed integers to unsigned.
The proper workaround for this is to load the value into a local variable first and then use the variable instead of directly calling `atomic_load()`. We need to investigate the rest of the `atomic_int_*` usage in BIND 9 source code for similar errors:
```
lib/isc/hp.c:static atomic_int_fast32_t tid_v_base = ATOMIC_VAR_INIT(0);
lib/isc/include/isc/mutexatomic.h:typedef struct atomic_int_fast32 {
lib/isc/include/isc/mutexatomic.h:} atomic_int_fast32_t;
lib/isc/include/isc/mutexatomic.h:typedef struct atomic_int_fast64 {
lib/isc/include/isc/mutexatomic.h:} atomic_int_fast64_t;
lib/isc/include/isc/rwlock.h: atomic_int_fast32_t spins;
lib/isc/include/isc/rwlock.h: atomic_int_fast32_t write_requests;
lib/isc/include/isc/rwlock.h: atomic_int_fast32_t write_completions;
lib/isc/include/isc/rwlock.h: atomic_int_fast32_t cnt_and_flag;
lib/isc/log.c: atomic_int_fast32_t debug_level;
lib/isc/log.c: atomic_int_fast32_t highest_level;
lib/isc/netmgr/netmgr-int.h: atomic_int_fast64_t pktcount;
lib/isc/netmgr/netmgr-int.h: atomic_int_fast32_t rchildren;
lib/isc/netmgr/netmgr-int.h: atomic_int_fast32_t ah;
lib/isc/stats.c:typedef atomic_int_fast32_t isc__atomic_statcounter_t;
lib/isc/stats.c:typedef atomic_int_fast64_t isc__atomic_statcounter_t;
lib/isc/tests/task_test.c:atomic_int_fast32_t counter;
lib/isc/tests/task_test.c: atomic_int_fast32_t *value = (atomic_int_fast32_t *)event->ev_arg;
lib/isc/tests/task_test.c: atomic_int_fast32_t *value = (atomic_int_fast32_t *)event->ev_arg;
lib/isc/tests/task_test.c: atomic_int_fast32_t a, b;
lib/isc/tests/task_test.c: atomic_int_fast32_t a, b, c, d, e;
lib/isc/tests/task_test.c: atomic_int_fast32_t a, b, c, d, e; /* non valid states */
lib/isc/tests/timer_test.c:static atomic_int_fast32_t eventcnt;
```
September 2021 (9.16.21, 9.16.21-S1, 9.17.18)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/1742
Logging is broken on Windows
2020-04-08T13:28:32Z
Michał Kępień

3a24eacbb619b89eacf87281b4d1a73e68c19471 (!3321) seems to be the
culprit.
Last scheduled *master* pipeline for which Windows (mostly) worked:
https://gitlab.isc.org/isc-projects/bind9/pipelines/38307
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/802733
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/802734
First scheduled *master* pipeline for which Windows is broken:
https://gitlab.isc.org/isc-projects/bind9/pipelines/38401
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/804958
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/804959
I narrowed it down to 3a24eacbb619b89eacf87281b4d1a73e68c19471 by
testing the merge requests merged between these two scheduled pipelines.
April 2020 (9.11.18, 9.16.2, 9.17.1)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/1698
Converting isc_log to RWLOCK broke Windows
2020-03-24T04:47:48Z
Ondřej Surý

!3229 broke Windows system tests: the tests get stuck indefinitely. Compare https://gitlab.isc.org/isc-projects/bind9/-/jobs/777703 (ondrej/msvc-stuck-v1 branch) with https://gitlab.isc.org/isc-projects/bind9/-/jobs/777784 (ondrej/msvc-stuck-v2 branch).
There's nothing really obvious, but converting the rwlocks to native Windows SRWLock doesn't help, so the problem must lie in how we are using the locks on Windows.
April 2020 (9.11.18, 9.16.2, 9.17.1)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/1669
"kasp" system test is failing consistently on Windows
2020-04-09T06:50:14Z
Michał Kępień

The same four checks always fail on Windows:
```
kasp1.log-I:kasp:check next key event for zone step2.algorithm-roll.kasp (570)
kasp1.log-I:kasp:error: bad next key event time 20909 for zone step2.algorithm-roll.kasp (expect 21600)
kasp1.log:I:kasp:failed
--
kasp1.log-I:kasp:check next key event for zone step5.algorithm-roll.kasp (594)
kasp1.log-I:kasp:error: bad next key event time 24513 for zone step5.algorithm-roll.kasp (expect 25200)
kasp1.log:I:kasp:failed
--
kasp1.log-I:kasp:check next key event for zone step2.csk-algorithm-roll.kasp (618)
kasp1.log-I:kasp:error: bad next key event time 20916 for zone step2.csk-algorithm-roll.kasp (expect 21600)
kasp1.log:I:kasp:failed
--
kasp1.log-I:kasp:check next key event for zone step5.csk-algorithm-roll.kasp (642)
kasp1.log-I:kasp:error: bad next key event time 24519 for zone step5.csk-algorithm-roll.kasp (expect 25200)
kasp1.log:I:kasp:failed
--
kasp2.log-I:kasp:check next key event for zone step2.algorithm-roll.kasp (570)
kasp2.log-I:kasp:error: bad next key event time 20956 for zone step2.algorithm-roll.kasp (expect 21600)
kasp2.log:I:kasp:failed
--
kasp2.log-I:kasp:check next key event for zone step5.algorithm-roll.kasp (594)
kasp2.log-I:kasp:error: bad next key event time 24559 for zone step5.algorithm-roll.kasp (expect 25200)
kasp2.log:I:kasp:failed
--
kasp2.log-I:kasp:check next key event for zone step2.csk-algorithm-roll.kasp (618)
kasp2.log-I:kasp:error: bad next key event time 20962 for zone step2.csk-algorithm-roll.kasp (expect 21600)
kasp2.log:I:kasp:failed
--
kasp2.log-I:kasp:check next key event for zone step5.csk-algorithm-roll.kasp (642)
kasp2.log-I:kasp:error: bad next key event time 24564 for zone step5.csk-algorithm-roll.kasp (expect 25200)
kasp2.log:I:kasp:failed
--
kasp3.log-I:kasp:check next key event for zone step2.algorithm-roll.kasp (570)
kasp3.log-I:kasp:error: bad next key event time 20936 for zone step2.algorithm-roll.kasp (expect 21600)
kasp3.log:I:kasp:failed
--
kasp3.log-I:kasp:check next key event for zone step5.algorithm-roll.kasp (594)
kasp3.log-I:kasp:error: bad next key event time 24539 for zone step5.algorithm-roll.kasp (expect 25200)
kasp3.log:I:kasp:failed
--
kasp3.log-I:kasp:check next key event for zone step2.csk-algorithm-roll.kasp (618)
kasp3.log-I:kasp:error: bad next key event time 20943 for zone step2.csk-algorithm-roll.kasp (expect 21600)
kasp3.log:I:kasp:failed
--
kasp3.log-I:kasp:check next key event for zone step5.csk-algorithm-roll.kasp (642)
kasp3.log-I:kasp:error: bad next key event time 24545 for zone step5.csk-algorithm-roll.kasp (expect 25200)
kasp3.log:I:kasp:failed
```
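As a quick way to quantify the failures, the sketch below parses the `error:` lines quoted above and computes how far each reported time falls short of the expected one. It only does arithmetic on the log text and makes no claim about the cause; `PATTERN` and `shortfalls` are hypothetical helper names.

```python
import re

# The twelve "error:" lines copied from the log excerpt above.
LOG = """\
kasp1.log-I:kasp:error: bad next key event time 20909 for zone step2.algorithm-roll.kasp (expect 21600)
kasp1.log-I:kasp:error: bad next key event time 24513 for zone step5.algorithm-roll.kasp (expect 25200)
kasp1.log-I:kasp:error: bad next key event time 20916 for zone step2.csk-algorithm-roll.kasp (expect 21600)
kasp1.log-I:kasp:error: bad next key event time 24519 for zone step5.csk-algorithm-roll.kasp (expect 25200)
kasp2.log-I:kasp:error: bad next key event time 20956 for zone step2.algorithm-roll.kasp (expect 21600)
kasp2.log-I:kasp:error: bad next key event time 24559 for zone step5.algorithm-roll.kasp (expect 25200)
kasp2.log-I:kasp:error: bad next key event time 20962 for zone step2.csk-algorithm-roll.kasp (expect 21600)
kasp2.log-I:kasp:error: bad next key event time 24564 for zone step5.csk-algorithm-roll.kasp (expect 25200)
kasp3.log-I:kasp:error: bad next key event time 20936 for zone step2.algorithm-roll.kasp (expect 21600)
kasp3.log-I:kasp:error: bad next key event time 24539 for zone step5.algorithm-roll.kasp (expect 25200)
kasp3.log-I:kasp:error: bad next key event time 20943 for zone step2.csk-algorithm-roll.kasp (expect 21600)
kasp3.log-I:kasp:error: bad next key event time 24545 for zone step5.csk-algorithm-roll.kasp (expect 25200)
"""

PATTERN = re.compile(
    r"bad next key event time (\d+) for zone (\S+) \(expect (\d+)\)"
)


def shortfalls(log: str):
    """Return (zone, expected - reported) in seconds for every error line."""
    result = []
    for match in PATTERN.finditer(log):
        reported, zone, expected = match.groups()
        result.append((zone, int(expected) - int(reported)))
    return result


deltas = [delta for _, delta in shortfalls(LOG)]
print(len(deltas), min(deltas), max(deltas))  # 12 636 691
```

Every reported time is earlier than expected, by between 636 and 691 seconds across the three runs.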
I am not sure how long this has been going on because another Windows-specific issue (!3184) has been masking this problem.
@matthijs: I have not yet looked at this problem closely, please take a look if you have some time. Keep in mind that the `kasp` test takes a lot longer to run on Windows than on Unix systems (about 15 minutes; yes, you read that right). We will need to get this fixed before tagging or else we will have trouble producing release tarballs (as CI pipelines will not be able to pass).
April 2020 (9.11.18, 9.16.2, 9.17.1)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1310
Fix system tests on Windows after merging libuv work
2019-11-29T07:48:12Z
Michał Kępień

Merging !2528 broke a significant number of system tests on Windows.
In Jenkins, the following tests failed (I only performed one run so far):
- `autosign`
- `digdelv`
- `dnssec`
- `keepalive`
- `legacy`
- `mirror`
- `mkeys`
- `nsupdate`
- `padding`
- `pipelined`
- `rpz`
- `rrl`
- `statistics`
- `stub`
- `synthfromdnssec`
- `tkey`
- `upforwd`
- `wildcard`
- `zero`
See: https://jenkins.isc.org/view/BIND_Parameterized/job/bind9-parameterized-win2012-x64/353/console
I also did one test run in GitLab CI[^1] and the failed tests were those listed above + the `pending` system test.
Some notes:
- The `zero` system test alone takes some 15-20 *minutes* to complete on Windows. I recall having issues with this test when I was first trying to add Windows to GitLab CI. From what I recall, the issue was that a significant number of queries are sent during that test and Windows is just unable to keep up with logging at `-d 99`. I initially worked around it by putting `named.args` files in place that did *not* include `-d 99` because all those logs are not really needed in that test. However, I eventually came up with a different fix (!2398) that seemed to be good enough until now. Perhaps decreasing logging verbosity is what we will need in the end? I have not yet investigated why that test takes so long to complete with current *master*.
- In the `nsupdate` test, `named` hangs in weird ways - both in Jenkins and GitLab CI, I had to "intervene" by killing binaries manually or else the test was stuck. FWIW, I have not yet tried running that test on its own, it was always run as part of the whole test suite.
**While we can skip a single development release on Windows, releasing BIND 9.16.0 will require sorting all these issues out.**
[^1]: !2556 is a prerequisite for *any* system test to work in GitLab CI
December 2019 (9.11.14, 9.14.9, 9.15.7)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1114
GeoIP2 support breaks compilation on Windows
2019-07-03T16:53:20Z
Michał Kępień

```
c:\cygwin64\home\jenkins\workspace\bind9-master-win2012-x64-vs2017\lib\dns\include\dns\geoip.h(110): error C2016: C requires that a struct or union has at least one member [c:\cygwin64\home\jenkins\workspace\bind9-master-win2012-x64-vs2017\lib\ns\win32\libns.vcxproj]
```
https://jenkins.isc.org/job/bind9-master-win2012-x64-vs2017/247/
BIND 9.15.2