apexLookupRule mapped every findApex failure to Crit, including transport
and resolver faults like "lookup nemunai.re on 127.0.0.11:53: server
misbehaving", which indicates a flaky recursive resolver, not a broken
delegation. That made the check flap into Crit whenever the resolver
hiccuped, the same class of false positive the chain path already fixed.

Mark apex-lookup failures that stem from a transport/resolver fault
(resolveZoneNSAddrs net errors, recursiveExchange transport errors, and
SERVFAIL/REFUSED seen during the SOA walk) as transient via a typed
error, surface it as ApexLookupTransient, and have apexLookupRule report
Unknown for those. Definitive failures (NXDOMAIN-only walk, no resolvable
NS) still drive Crit.

A transport-level query failure (connection refused, timeout, network
unreachable) means the alias state could not be observed, not that the
alias is misconfigured. Mapping it to Warn made the check flap whenever a
flaky auth server alternated between refusing connections (Warn) and
answering SERVFAIL (Crit). Report TermQueryErr as Unknown so only
definitive DNS evidence drives Warn/Crit.

Extract querySiblings from observeCoexistence so both CNAME and DNAME
coexistence checks share the same parallel RRset scan. Add
observeDNAMECoexistence (called from Collect) that populates
AliasData.DNAMECoexistence for each DNAME node in DNAMESubstitutions.
Add the dname_coexistence rule (RFC 6672 §2.3) that flags any sibling
RRsets at a DNAME owner as Crit, with matching tests.