Registered on the default FlagSet; ListenAndServe intercepts it and
exits 0/1 after probing /health on the -listen address. Lets checker-*
Dockerfiles add HEALTHCHECK without any main.go change, even though
the runtime image is scratch (no shell, no curl, no wget).
Reporters can now read rule output via ctx.States() instead of
re-deriving severity/hints from the raw payload, keeping the rules
screen and the HTML report aligned on a single source of truth.
Move server.go and interactive.go (and their tests) from the root
checker/ package into checker/server/. Plugin and builtin consumers of
the SDK now import only checker/ and no longer drag net/http,
html/template, or the form-rendering code into their artifacts: on
checker-dane.so this drops the binary by ~1.2 MB and removes 170
html/template symbols along with the net/http contribution that came
from the SDK itself.
Breaking for standalone consumers (main.go):
NewServer(p) -> server.New(p)
CheckerInteractive -> server.Interactive
InteractiveRelatedProviders -> server.Siblings
Providers that only satisfy the interactive interfaces structurally
(method set match, no explicit type reference) need no source change;
only main.go has to switch its import path and the constructor name.
Add an opt-in InteractiveRelatedProviders interface: a checker served
by /check can declare sibling ObservationProviders the SDK will run
in-process after the primary Collect. The SDK auto-fills any sibling
option tagged AutoFill==AutoFillDiscoveryEntries from the primary's
DiscoverEntries output, mirroring the host's AutoFill wiring so
siblings that work in-host work here too. Results are exposed through
ObservationGetter.GetRelated and ReportContext.Related, unblocking
cross-checker rules (e.g. DANE reading checker-tls probes) on the
standalone /check flow.
Providers that implement the new CheckerInteractive interface
(RenderForm + ParseForm) get a built-in HTML form on GET /check and
a consolidated result page on POST /check that runs the standard
Collect -> Evaluate -> GetHTMLReport / ExtractMetrics pipeline. This
lets a checker be used directly from a browser outside of happyDomain,
with the checker itself resolving what the host would normally
auto-fill (typically via its own DNS queries).
Also guards NewServer against a nil Definition() so providers that
advertise CheckerDefinitionProvider without a ready definition no
longer panic at registration.
Rules that iterate over multiple elements (certificates, CAA records,
nameservers, …) previously had to squash per-element results into a
single concatenated message. Evaluate now returns []CheckState and
CheckState carries an opaque Subject, so each element gets its own
structured state. The server injects a StatusUnknown placeholder when
a rule returns nothing, to avoid silently dropping the rule.
Add the plumbing that lets a checker receive (at evaluation, report
rendering, and metrics extraction) observations produced by other
checkers on DiscoveryEntry records it originally published.
Surface changes:
- RelatedObservation struct: one downstream observation, tagged with
the producing CheckerID and the Ref matching the DiscoveryEntry
it covers.
- ObservationGetter gains GetRelated(ctx, key), so rules can opt in
to cross-checker composition. mapObservationGetter (remote
/evaluate path) returns empty; the host owns lineage resolution.
- ReportContext interface: Data() + Related(key). Reporters consume
it instead of a raw json.RawMessage, which collapses the former
legacy/Ctx duplicate and gives one uniform signature:
GetHTMLReport(ctx ReportContext) (string, error)
ExtractMetrics(ctx ReportContext, t time.Time) ([]CheckMetric, error)
- NewReportContext(data, related) and StaticReportContext(data) build
fixed-payload contexts for entry points without an ObservationContext.
- ExternalReportRequest gains a Related map so the host can ship
pre-composed lineage to a remote checker over /report. The SDK's
/report handler threads it through to the reporter via
NewReportContext, closing the wire gap that previously forced
remote reports to a StaticReportContext with no related data.
Tests cover the Related map round-trip end-to-end via a peeking provider.
Introduce a DiscoveryEntry struct and an optional DiscoveryPublisher
interface that providers can co-implement to declare things worth
probing by other checkers (TLS endpoints, HTTP probes, ACME challenges,
DNSSEC keys, ...) without having to re-parse raw observations.
DiscoveryEntry carries an opaque Payload: the SDK does not interpret
it. Producers and consumers agree on the Payload schema through a
separate contract (eg. a small shared Go package imported by
both) identified by the free-form Type string. This keeps the SDK
free of protocol-specific concepts; new entry families can appear
without touching it.
The /collect HTTP handler type-asserts the provider against
DiscoveryPublisher immediately after Collect and forwards the
resulting entries in ExternalCollectResponse.Entries.
Adds HealthResponse carrying inflight count, total requests, 1/5/15-min
EWMA load averages, uptime, and NumCPU so a scheduler can pick the least
busy worker. A background sampler updates the load averages every 5s,
stopped by a new idempotent Close method. Work endpoints (/collect,
/evaluate, /report) are wrapped with a trackWork middleware; /health
and /definition are excluded so polling traffic does not pollute the
signal.
The previous implementation skipped empty fields, which meant targets
differing only in which fields were populated could produce the same
string (e.g. {UserId:"A"} and {DomainId:"A"} both gave "A"). This
caused key collisions when the string was used in storage index keys.
Keep Status as a plain integer on the wire so that OpenAPI/swaggo
can generate correct enum definitions without needing to handle
the custom string encoding. The String() method is preserved for
logging and debugging.
Errors from provider.Collect() and json.Marshal were returned with HTTP
200, making failures invisible to monitoring, proxies, and clients that
check status codes. Return 500 Internal Server Error so HTTP-level
tooling can detect failures without parsing the response body.
Prevent silent bugs where a CheckerDefinition with an unset ID
would be registered under the empty string key, becoming unfindable
and potentially colliding with other empty-ID registrations.
Make StatusUnknown the zero value (0) so an uninitialized CheckState
reads as "no signal yet" rather than as healthy. Push StatusOK and
StatusInfo to negative values so the natural int ordering matches
severity ordering: aggregators can simply take max() to compute the
worst status, and Unknown correctly sits above OK/Info but below Warn.
Status now (un)marshals as its string name ("OK", "WARN", ...) so the
wire format is stable across any future renumbering. UnmarshalJSON
still accepts raw ints for backward compatibility with older snapshots
and clients.
RegisterChecker and RegisterObservationProvider previously overwrote
existing entries silently, which lets a misconfigured plugin shadow a
built-in (or another plugin) without any signal. Refuse the duplicate
and keep the existing entry instead.
RegisterExternalizableChecker now performs the dedup check before
appending the "endpoint" AdminOpt, so a rejected re-registration no
longer mutates the live definition.