-
-
Notifications
You must be signed in to change notification settings - Fork 362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fatal error: Could not initialize thread / error: reading root hints #1038
fatal error: Could not initialize thread / error: reading root hints #1038
Comments
From what I can see, the threads each open a file with fopen(..), and then everything is a thread local variable, and then they call getc(file) on their own file. And in this manner the threads read the file. That is the same file for every thread, but they have their own fopen(..) call and FILE object. They call their own getc(..) call and the variables are otherwise owned by themselves. So there could be no race condition on the contents of the file. If I try it here, on Linux and on FreeBSD, I get no race conditions. So it does not reproduce, either. There is development in a code branch, where the iterator stub hints are unshared from the threads, and as a result would be read only once, at reload time. This is in #1015 . The change would likely, also if you did not use the feature implemented there, remove the issue here. But it is better if we could find out what the cause is. How can I reproduce the failure, just a number of threads and root hints does not trigger it for me? Perhaps use valgrind --tool=helgrind on it and report if it finds failures, apart from other output, about pp_init and about default_event_base, that we do not need right now, about the root hints read? |
@wcawijngaards
In my experience, there is a better chance of reproducing on a bare-metal install (it didn’t reproduce on a VM for me) and giving at least some load (i run a couple of scripts with a looped nslookups from client). And restart unbound every 5 minutes. The error is realy sporadic. Under the described conditions, it may take a day before the error appears |
The issue seems to be that opening the same file in several places is not supported by the standard, and not portable. This makes it misbehave. The logs are very helpful in finding the problem, thank you. The changes to unshare the hints structure should provide a fix for this. Then, the root hints are read once, from one thread, after reload. |
@wcawijngaards thanks for looking into it. do you have any ETA for #1041 in an upcoming Unbound release? We can get by for the time being in OPNsense. BTW, this fast reload is pretty neat... been looking forward to something like this for years. Happy to see it progress! Cheers, |
After some review and running tests I think that the change would likely move to the code repository. Then this change is going to be part of the next release. But there is no release date for that right now. |
Ok, don't need specifics. Sounds good. :) |
@wcawijngaards sounds great! |
The fix should result in the issue no longer creating a problem, since there is one structure that is read once. |
* nlnet/master: (45 commits) - Fix for NLnetLabs#1062: declaration before statement, avoid print of null, and redundant check for array size. And changelog note for merge of NLnetLabs#1062. Fix potential overflow bug while parsing port in function cfg_mark_ports - Set version number to 1.20.0 for release. - Fix for the DNSBomb vulnerability CVE-2024-33655. Thanks to Xiang Li from the Network and Information Security Lab of Tsinghua University for reporting it. - Fix doxygen comment for errinf_to_str_bogus. - Cleanup unnecessary strdup calls for EDE strings. - Man page entry for unbound-checkconf -q. - Fix NLnetLabs#876: [FR] can unbound-checkconf be silenced when configuration is valid? - Add unit tests for cachedb and subnet cache expired data. - Fix cachedb with serve-expired-client-timeout disabled. The edns subnet module deletes global cache and cachedb cache when it stores a result, and serve-expired is enabled, so that the global reply, that is older than the ecs reply, does not return after the ecs reply expires. - Fix doc unit test for out of directory build. - Fix to disable fragmentation on systems with IP_DONTFRAG, with a nonzero value for the socket option argument. Changelog note for NLnetLabs#1041 and NLnetLabs#1038. - Merge NLnetLabs#1041: Stub and Forward unshare. This has one structure for them and fixes NLnetLabs#1038: fatal error: Could not initialize thread / error: reading root hints. Update locking management for iter_fwd and iter_hints methods. (NLnetLabs#1054) - Fix configure flto check error, by finding grep for it. - Fix ci workflow for macos for moved install locations. - Merge NLnetLabs#1053: Remove child delegations from cache when grandchild delegations are returned from parent. - When a granchild delegation is returned, remove any cached child delegations up to parent to not cause delegation invalidation because of an expired child delegation that would never be updated. Most likely to happen without qname-minimisation. Reported by Roland van Rijswijk-Deij. - Fix edns subnet to sort rrset references when storing messages in the cache. This fixes a race condition in the rrset locks. - Add checklock feature verbose_locking to trace locks and unlocks. ...
Hi!
Describe the bug
Sporadic fatal errors possible at unbound start if the
root-hints
option is used.The server may become unresponsive as a result.
The copy of the https://www.internic.net/domain/named.root is used for root-hints.
Errors seen (can occur either during one start attempt or be limited to one error per unsuccessful attempt):
error: reading root hints /root.hints 19:1: Empty label
error: reading root hints /root.hints 2:37: Syntax error, could not parse the RR's type
error: reading root hints /root.hints 2:9: Syntax error, could not parse the RR's type
error: reading root hints /root.hints 8:14: Syntax error, could not parse the RR's class
error: reading root hints /root.hints 13:15: Conversion error, ip6 addr expected
and so on..
so it ends up with:
error: Could not set root or stub hints
andfatal error: Could not initialize thread
(for one or more threads)The sporadic nature of the errors and their content (error in reading the record type in a comment line, mismatch between the record type and the address type) suggest that there may be a thread race when reading a file?
To reproduce
Steps to reproduce the behavior:
root-hints
optionExpected behavior
A clear and concise description of what you expected to happen.
System:
unbound -V
output:Additional information
not yet
Thanks!
cc @fichtner
The text was updated successfully, but these errors were encountered: