Commit graph

2,092 commits

Author SHA1 Message Date
b27b2745d6 compliance: MTA-STS record sync validators
Adds checks for svcs.MTA_STS against RFC 8461 sec. 3.1.

The validator surfaces:

- Wrong owner name (must be _mta-sts.<domain>).
- Missing or non-STSv1 v= tag.
- Missing id= tag.
- id= containing characters outside [A-Za-z0-9] or longer than 32 chars.
2026-04-28 17:57:26 +07:00
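The MTA-STS checks listed for b27b2745d6 can be sketched roughly as follows. This is an illustration in Go, not the validator from the commit (which is part of the web frontend); `checkMTASTS`, `isAlnum`, and the issue strings are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// checkMTASTS is a hypothetical helper: it returns the issues found in an
// MTA-STS TXT record published at owner for the given domain.
func checkMTASTS(owner, domain, txt string) []string {
	var issues []string
	if owner != "_mta-sts."+domain {
		issues = append(issues, "wrong owner name")
	}
	// Split "v=STSv1; id=..." into a tag map.
	tags := map[string]string{}
	for _, field := range strings.Split(txt, ";") {
		k, v, ok := strings.Cut(strings.TrimSpace(field), "=")
		if ok {
			tags[strings.TrimSpace(k)] = strings.TrimSpace(v)
		}
	}
	if tags["v"] != "STSv1" {
		issues = append(issues, "missing or non-STSv1 v= tag")
	}
	id, ok := tags["id"]
	switch {
	case !ok:
		issues = append(issues, "missing id= tag")
	case id == "" || len(id) > 32 || !isAlnum(id):
		issues = append(issues, "invalid id= tag")
	}
	return issues
}

// isAlnum reports whether s only contains characters from [A-Za-z0-9].
func isAlnum(s string) bool {
	for _, r := range s {
		if !(r >= 'a' && r <= 'z' || r >= 'A' && r <= 'Z' || r >= '0' && r <= '9') {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(checkMTASTS("_mta-sts.example.com", "example.com", "v=STSv1; id=20260428T1757"))
	fmt.Println(checkMTASTS("example.com", "example.com", "v=STSv1; id=bad-id"))
}
```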
99dace151e compliance: DMARC record validators
Adds compliance checks for svcs.DMARC against RFC 7489.

The validator parses the published TXT and surfaces:

- Wrong owner name (record must live at _dmarc.<domain>).
- Missing or non-DMARC1 v= tag.
- Missing, unknown, or "monitoring-only" p= policy.
- Invalid sp= subdomain policy.
- Invalid adkim/aspf alignment values.
- pct= out of [0..100] (error) and pct < 100 (info, partial deployment).
- Non-positive or non-numeric ri=.
- fo= entries other than 0 / 1 / d / s and rf= formats other than afrf.
- Empty or malformed rua/ruf URIs (mailto and http(s) accepted; mailto
  size suffix !N preserved).
2026-04-28 17:57:26 +07:00
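As a small illustration of the pct= rule above (out of range is an error, below 100 is informational), here is a sketch in Go. The function name `checkPct` and the severity strings are assumptions made for the example; the commit's own validator is frontend code and is not shown here.

```go
package main

import (
	"fmt"
	"strconv"
)

// checkPct classifies a DMARC pct= value (hypothetical helper).
// Non-numeric or out-of-range values are errors; values below 100
// indicate a partial deployment and are only informational.
func checkPct(raw string) string {
	n, err := strconv.Atoi(raw)
	switch {
	case err != nil || n < 0 || n > 100:
		return "error"
	case n < 100:
		return "info"
	default:
		return "ok"
	}
}

func main() {
	for _, v := range []string{"100", "50", "150", "abc"} {
		fmt.Printf("pct=%s -> %s\n", v, checkPct(v))
	}
}
```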
e2bf286a56 compliance: SPF async recursive flatten
Wires the new POST /api/resolver/spf-flatten endpoint into the SPF
validator. The async path runs after the local checks, debounced and
cancellable through EditorCompliance, and surfaces:

- spf.recursive-many-lookups / spf.recursive-too-many-lookups based on
  the recursive lookupCount returned by the backend
- spf.too-many-void-lookups when more than 2 NXDOMAIN/NoData responses
  occur during the walk (RFC 7208 §4.6.4)
- per-include diagnostics: spf.include-loop, spf.include-no-spf,
  spf.include-resolver-error, spf.include-error — pointing at the exact
  domain and mechanism that failed
2026-04-28 17:57:26 +07:00
5f8862c384 resolver: POST /api/resolver/spf-flatten endpoint
Adds a recursive SPF flatten endpoint sized for the compliance UI:

- happydns.SPFFlattenRequest accepts a {domain, record?} pair so the UI
  can preview an unsaved record without persisting it first; the optional
  inline record bypasses the root TXT lookup.
- happydns.SPFFlattenResponse returns the recursive tree with per-node
  Mechanism / Domain / Record / LookupsHere / Error fields, plus the
  RFC 7208 §4.6.4 budget counters (LookupCount, VoidLookups, Exceeded,
  VoidExceeded, Truncated).
- Hard caps at 10 lookups, 2 void lookups, depth 12, 2s per query and
  10s overall. Cycle detection via the visited-domain set.
- Resolver selection mirrors ResolveQuestion (local / custom / default
  to 1.1.1.1) with the same IPv6-bracket handling.
2026-04-28 17:57:08 +07:00
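The budget accounting described for 5f8862c384 can be sketched as a recursive walk. This is a simplified illustration, not the endpoint's code: it only follows include: mechanisms, an in-memory map stands in for DNS, timeouts are omitted, and the names `budget`, `walk`, and `Loops` are hypothetical. Only the caps (10 lookups, 2 void lookups, depth 12) and the visited-domain set come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	maxLookups = 10
	maxVoid    = 2
	maxDepth   = 12
)

// budget mirrors the counters named in the commit message, plus a
// hypothetical Loops field listing domains that were revisited.
type budget struct {
	LookupCount  int
	VoidLookups  int
	Exceeded     bool
	VoidExceeded bool
	Truncated    bool
	Loops        []string
}

// walk follows include: mechanisms starting at domain. zone maps a
// domain to its SPF record and replaces real TXT queries in this sketch.
func walk(domain string, zone map[string]string, visited map[string]bool, b *budget, depth int) {
	if depth > maxDepth {
		b.Truncated = true
		return
	}
	if visited[domain] {
		b.Loops = append(b.Loops, domain)
		return
	}
	visited[domain] = true
	record, ok := zone[domain]
	if !ok {
		// No record: counts as a void lookup.
		b.VoidLookups++
		if b.VoidLookups > maxVoid {
			b.VoidExceeded = true
		}
		return
	}
	for _, term := range strings.Fields(record) {
		if !strings.HasPrefix(term, "include:") {
			continue
		}
		b.LookupCount++
		if b.LookupCount > maxLookups {
			b.Exceeded = true
			return
		}
		walk(strings.TrimPrefix(term, "include:"), zone, visited, b, depth+1)
	}
}

func main() {
	zone := map[string]string{
		"a.example": "v=spf1 include:b.example include:missing.example -all",
		"b.example": "v=spf1 include:a.example ~all",
	}
	b := &budget{}
	walk("a.example", zone, map[string]bool{}, b, 0)
	fmt.Printf("%+v\n", *b)
}
```

Because the visited set is shared across the whole walk, a domain included twice through different branches is reported the same way as a true cycle in this sketch.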
48530dd212 compliance: SPF local validators
Extracts the SPF parser/serializer out of the editor into
$lib/services/spf.ts (matching dmarc.ts / mta_sts.ts) and adds a sync
validator that flags non-recursive issues against RFC 7208:

- missing or wrong v=spf1
- absence / multiplicity / non-final placement of ‘all’
- redirect= combined with ‘all’ or duplicated
- ptr deprecation (RFC 7208 §5.5)
- local DNS-lookup budget (warn ≥8, error >10) — recursive flatten will
  come later via an async backend endpoint
- mechanisms missing values, empty terms, duplicates, length cap
2026-04-28 17:55:40 +07:00
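The local DNS-lookup budget mentioned above (warn at 8 or more, error above 10) can be sketched as a count over the terms of a single record. This is an illustration in Go rather than the $lib/services/spf.ts code; `countLookups` and `severity` are hypothetical names. The set of lookup-causing terms follows RFC 7208 section 4.6.4.

```go
package main

import (
	"fmt"
	"strings"
)

// countLookups counts the terms of one SPF record that trigger a DNS
// query: include, a, mx, ptr, exists and the redirect modifier.
func countLookups(record string) int {
	n := 0
	for _, term := range strings.Fields(record) {
		// Drop the optional qualifier, then keep the name before ":", "=" or "/".
		term = strings.ToLower(strings.TrimLeft(term, "+-~?"))
		name := term
		if i := strings.IndexAny(term, ":=/"); i >= 0 {
			name = term[:i]
		}
		switch name {
		case "include", "a", "mx", "ptr", "exists", "redirect":
			n++
		}
	}
	return n
}

// severity applies the thresholds from the commit message.
func severity(n int) string {
	switch {
	case n > 10:
		return "error"
	case n >= 8:
		return "warning"
	default:
		return "ok"
	}
}

func main() {
	record := "v=spf1 include:_spf.example.net a mx/24 ip4:192.0.2.0/24 -all"
	n := countLookups(record)
	fmt.Println(n, severity(n))
}
```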
faf21f63b1 compliance: DKIM record sync validators
Validates a DKIM TXT record (svcs.DKIMRecord) at edit time:

- Selector: must be present, must match the label charset.
- Version: only "DKIM1" is accepted (RFC 6376 sec. 3.6.1).
- Public key: detects missing p=, empty p= (revocation, warning), and
  non-base64 payloads. Warns on RSA keys shorter than ~2048 bits and
  errors on RSA keys shorter than ~1024 bits per RFC 8301.
- Algorithms: warns on SHA-1 (RFC 8301) and unknown hashes; flags
  unknown key types or service types.
- Flags: surfaces t=y (testing) as info; warns on unknown flags.
- Granularity: marks g= as deprecated since RFC 6376.
2026-04-28 17:55:40 +07:00
893597688e compliance: add records-compliance infra (types, registry, editor panel)
Introduces the frontend-only compliance framework that lets each editor
contribute spec-conformance checks.

No validators are registered yet.
2026-04-28 17:55:40 +07:00
4d37d77772 compliance: Document common wanted scenarios 2026-04-28 17:55:40 +07:00
7251d93619 web: simplify service editors with single-source-of-truth state
Replace dueling parse/stringify $effects across service editors with
one-time top-level init plus a single write-back $effect. Remount
editors via {#key value} in ServiceEditor so children no longer need
inbound-sync logic.
2026-04-28 17:55:40 +07:00
93e148e8c0 Add external_whois checker for dangling pointer expiry
Subscribes to dangling.external-target.v1 entries via AutoFillDiscoveryEntries
and runs RDAP per registrable domain (deduped, parallelised, capped at 8
concurrent), publishing a per-Ref Facts map consumed by checker-dangling.
2026-04-28 13:06:23 +07:00
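The "deduped, parallelised, capped at 8 concurrent" pattern from 93e148e8c0 can be sketched with a buffered channel used as a semaphore. This is a generic Go illustration, not the checker's code: `lookupAll` is a hypothetical name, the lookup callback stands in for the RDAP query, and the inputs are assumed to already be registrable domains.

```go
package main

import (
	"fmt"
	"sync"
)

const maxConcurrent = 8

// lookupAll runs lookup once per distinct domain, with at most
// maxConcurrent calls in flight, and returns a per-domain result map.
func lookupAll(domains []string, lookup func(string) string) map[string]string {
	// Deduplicate while keeping the first-seen order.
	seen := make(map[string]bool)
	var unique []string
	for _, d := range domains {
		if !seen[d] {
			seen[d] = true
			unique = append(unique, d)
		}
	}

	facts := make(map[string]string, len(unique))
	sem := make(chan struct{}, maxConcurrent)
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	for _, d := range unique {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrent lookups are running
		go func(d string) {
			defer wg.Done()
			defer func() { <-sem }()
			v := lookup(d)
			mu.Lock()
			facts[d] = v
			mu.Unlock()
		}(d)
	}
	wg.Wait()
	return facts
}

func main() {
	facts := lookupAll(
		[]string{"example.com", "example.org", "example.com"},
		func(d string) string { return "looked up " + d },
	)
	fmt.Println(len(facts), facts["example.org"])
}
```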
cb400979c2 Add legacy records checker 2026-04-28 13:06:23 +07:00
4a4505c9cc Add TLS checker 2026-04-28 13:06:23 +07:00
2e6c5c6be6 Bump checker-sdk to v1.5.0 2026-04-28 13:06:23 +07:00
1297898628 storage: widen ListDiscoveryEntriesByTarget to narrower scopes
Domain- and user-scoped consumers were missing every discovery entry
published below their scope. The exact-match prefix dscent-tgt|{u/d/s}|
introduced in 9c6398b1b only returned entries stored at the literal
target string, so a domain-scoped consumer like checker-tls or
checker-caa never received the tls.endpoint.v1 entries that
service-scoped producers (checker-dane, checker-smtp, checker-sip,
checker-srv, checker-stun-turn) publish under the same domain. The
symptom on the consumer side was "No TLS endpoints have been discovered
for this target yet." even when producers had run.

Drop the trailing "|" from the iteration prefix when the target lacks a
ServiceId (and the DomainId for user scope) so the prefix scan matches
narrower scopes too. RawURLEncoded identifiers contain neither "/" nor
"|", so slash boundaries in the encoded "u/d/s" target form remain
unambiguous. Service-scoped lookups stay exact. Each matching key is
parsed back into its actual stored target before fetching the primary
record, so the returned StoredDiscoveryEntry.Target reflects where the
entry was published, not the (broader) target that found it.
2026-04-28 13:06:23 +07:00
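The prefix change in 1297898628 can be illustrated as follows. The exact key encoding is not shown in the commit message, so this sketch assumes the target is serialised as user/domain/service with empty trailing parts omitted; `target`, `scanPrefix`, and `matches` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// target is a stand-in for the real check target (assumed shape).
type target struct{ UserID, DomainID, ServiceID string }

// scanPrefix builds the iteration prefix for the dscent-tgt index.
// Service-scoped targets keep the trailing "|" (exact match); broader
// scopes stop at a "/" boundary so narrower targets are matched too.
func scanPrefix(t target) string {
	switch {
	case t.ServiceID != "":
		return "dscent-tgt|" + t.UserID + "/" + t.DomainID + "/" + t.ServiceID + "|"
	case t.DomainID != "":
		return "dscent-tgt|" + t.UserID + "/" + t.DomainID + "/"
	default:
		return "dscent-tgt|" + t.UserID + "/"
	}
}

// matches reports whether a stored key would be visited by the scan.
func matches(key string, t target) bool {
	return strings.HasPrefix(key, scanPrefix(t))
}

func main() {
	key := "dscent-tgt|u1/d1/s1|checker-dane|tls.endpoint.v1|ref"
	fmt.Println(matches(key, target{UserID: "u1", DomainID: "d1"}))
	fmt.Println(matches(key, target{UserID: "u1", DomainID: "d10"}))
}
```

Stopping at the "/" boundary is what keeps a scan for domain d1 from also picking up entries of a domain whose identifier merely starts with d1.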
392e7fb344 checkers: thread rule States into ReportContext for reports
Propagate the persisted CheckEvaluation.States through BuildReportContext
and the HTTP report transport so reporters can render rule-driven
sections (hints, severity) without re-deriving them from raw data. When
no evaluation is available the context carries nil states, matching the
SDK's documented nil-safe fallback to data-only rendering.
2026-04-28 13:06:23 +07:00
fa781a537d checkers: migrate to checker-sdk-go v1.2.0 []CheckState signature
Rules now return []CheckState, the engine stamps RuleName from the rule,
and the HTTP rule-result lookup matches on RuleName rather than Code.
domain_contact emits one state per role (Subject) instead of a
concatenated single-state message.
2026-04-28 13:06:23 +07:00
ce29074530 backup: include cross-checker discovery entries in backup/restore
Extend Backup to carry the two new KV indexes introduced by the
discovery mechanism.
2026-04-28 13:06:23 +07:00
5f38d1454f checkers: resolve Related in ReportContext for HTML + metrics reports
Complete the ReportContext composition path so reporters can fold
downstream observations into their output:

  - checker.BuildReportContext wraps a raw payload plus the engine's
    RelatedObservationLookup in a lazy ReportContext: Related(key) is
    resolved on first access and cached. When no lookup is wired the
    context falls back to sdk.StaticReportContext, matching the
    pre-existing behaviour.
  - GetHTMLReportWithContext / GetMetricsWithContext: new helpers that
    accept a pre-built ReportContext, for callers that want to feed
    Related into a reporter explicitly.
  - The execution controller now builds a ReportContext via the
    engine's RelatedLookup method before calling the HTML reporter.
    When the engine is wired with discovery storage, the reporter sees
    the producer's consumer lineage through ctx.Related(consumerKey).
  - HTTPObservationProvider implements CheckerHTMLReporter and
    CheckerMetricsReporter: both forward to POST /report with
    ExternalReportRequest{Key, Data, Related}. A 501 response is
    surfaced as an explicit "does not support /report" error. These
    methods are available for callers that want to route reports to
    remote checkers; the default in-process reporter dispatch is
    unchanged.
2026-04-28 13:06:23 +07:00
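The "resolved on first access and cached" behaviour described for 5f38d1454f can be sketched like this. The types below are stand-ins invented for the example (`related`, `lazyReportContext`, `newLazyReportContext`); the real ReportContext interface comes from the checker SDK and is not reproduced here.

```go
package main

import "fmt"

// related is a simplified stand-in for a related observation.
type related struct {
	CheckerID string
	Data      string
}

// lazyReportContext wraps a raw payload plus an optional lookup function.
type lazyReportContext struct {
	data   []byte
	lookup func(key string) []related
	cache  map[string][]related
}

func newLazyReportContext(data []byte, lookup func(string) []related) *lazyReportContext {
	return &lazyReportContext{data: data, lookup: lookup, cache: map[string][]related{}}
}

// Data returns the raw payload unchanged.
func (c *lazyReportContext) Data() []byte { return c.data }

// Related resolves key through the lookup on first access and serves
// later calls from the cache. Without a lookup it returns nil, which
// corresponds to data-only rendering.
func (c *lazyReportContext) Related(key string) []related {
	if c.lookup == nil {
		return nil
	}
	if v, ok := c.cache[key]; ok {
		return v
	}
	v := c.lookup(key)
	c.cache[key] = v
	return v
}

func main() {
	calls := 0
	ctx := newLazyReportContext([]byte(`{}`), func(key string) []related {
		calls++
		return []related{{CheckerID: "checker-tls", Data: key}}
	})
	ctx.Related("tls.endpoint.v1")
	ctx.Related("tls.endpoint.v1")
	fmt.Println("lookups performed:", calls)
}
```

The sketch is not safe for concurrent use; a shared context would need a mutex around the cache.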
69c263c56e checkers: compose cross-checker observations via GetRelated
Close the discovery loop described in docs/checker-discovery.md: entries
published in commit 3 now feed consumer checkers, and their observations
flow back to the original producer.

Three tightly-coupled changes:

  - CheckerOptionsUsecase gains an optional DiscoveryEntryStorage
    dependency (WithDiscoveryEntryStore). When a checker declares
    AutoFill="discovery_entries" on an option,
    BuildMergedCheckerOptionsWithAutoFill populates it with the entries
    stored for the target: all producers, no host-side filtering by
    Type. The method also returns the concrete list of entries injected
    so the engine can persist lineage for them.

  - CheckerEngine records a DiscoveryObservationRef per (entry, obs key)
    tuple after the snapshot is stored. The ref is keyed under the
    *producer* (ProducerID, Target, Ref) while carrying the consumer's
    key and the snapshot pointer, so a later GetRelated from the
    producer can reach the consumer's observation in one lookup.

  - ObservationContext exposes SetRelatedLookup (called once per run by
    the engine) and implements GetRelated on top of the installed
    closure. The engine's closure walks the producer's published
    entries, resolves each ref's observation refs, loads the snapshots,
    and materialises []RelatedObservation. Stale refs (entry gone,
    snapshot TTL'd) are skipped silently: implicit GC, as the doc
    permits.
2026-04-28 13:06:23 +07:00
901227e10f checkers: harvest discovery entries during collection
Wire the newly-added DiscoveryEntryStorage into the execution pipeline:

  - ObservationContext tracks DiscoveryEntry records published by each
    provider. After Collect, providers that implement DiscoveryPublisher
    are asked for their entries (on the native Go value, no JSON round
    trip), and the results are cached by observation key.
  - HTTPObservationProvider also implements DiscoveryPublisher: it
    records the Entries field of the remote /collect response and
    surfaces them through DiscoverEntries. Each override instance is
    scoped to a single execution run, so no locking is needed.
  - CheckerEngine.runPipeline calls ReplaceDiscoveryEntries after
    persisting the snapshot, always replacing the previous set for
    (checkerID, target), including when a run produces none, so stale
    entries from earlier cycles self-heal.
2026-04-28 13:06:23 +07:00
b7da1f5e23 checkers: add storage for discovery entries and observation lineage
Introduce the two KV indexes that back the cross-checker discovery
mechanism described in docs/checker-discovery.md:

  dscent|{producer}|{target}|{type}|{ref}         primary record
  dscent-tgt|{target}|{producer}|{type}|{ref}     target lookup (auto-fill)
  dscobs|{producer}|{target}|{ref}|{consumer}|{k} observation lineage
  dscobs-snap|{snapshotId}|...                     cascade on snapshot delete

ReplaceDiscoveryEntries is the canonical publication path: the whole
set previously stored for (producer, target) is cleared, then the new
set is written. The observation-lineage side uses a single upsert per
(producer, target, ref, consumer, key) tuple, with a snapshot-scoped
reverse index so deleting a snapshot cascades cleanly. Putting a ref
under a new snapshot removes the previous snap-index so a later
cascade on the old snapshot does not wipe the refreshed primary.

Adds StoredDiscoveryEntry and DiscoveryObservationRef to the host-only
model, DiscoveryEntryStorage / DiscoveryObservationStorage to the
checker usecase storage surface, embeds both in storage.Storage, and
regenerates the instrumented wrapper. Unit tests cover round-trip,
atomic replace, multi-producer aggregation, upsert, and cascade
delete.

No pipeline wiring yet.
2026-04-28 13:06:23 +07:00
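The replace semantics of ReplaceDiscoveryEntries (clear the whole set for a producer and target, then write the new set) can be sketched over a plain map. This is an illustration only: a Go map stands in for the KV store, `entry` and `replaceEntries` are hypothetical names, and only the two key layouts dscent|... and dscent-tgt|... are taken from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a simplified stand-in for a stored discovery entry.
type entry struct{ Type, Ref, Payload string }

// replaceEntries drops every entry previously stored for
// (producer, target) in both indexes, then writes the new set.
func replaceEntries(store map[string]string, producer, target string, entries []entry) {
	primaryPrefix := "dscent|" + producer + "|" + target + "|"
	targetPrefix := "dscent-tgt|" + target + "|" + producer + "|"

	// Deleting from a map while ranging over it is allowed in Go.
	for key := range store {
		if strings.HasPrefix(key, primaryPrefix) {
			rest := key[len(primaryPrefix):] // "{type}|{ref}"
			delete(store, key)
			delete(store, targetPrefix+rest)
		}
	}
	for _, e := range entries {
		suffix := e.Type + "|" + e.Ref
		store[primaryPrefix+suffix] = e.Payload
		store[targetPrefix+suffix] = "" // index entry, points back at the primary
	}
}

func main() {
	store := map[string]string{}
	replaceEntries(store, "checker-dane", "u/d/s", []entry{
		{Type: "tls.endpoint.v1", Ref: "mx1", Payload: "mx1.example.com:25"},
	})
	fmt.Println(len(store))
	// A run that publishes nothing still clears the previous set.
	replaceEntries(store, "checker-dane", "u/d/s", nil)
	fmt.Println(len(store))
}
```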
10d0e81c10 checkers: adopt ReportContext-based reporter signatures
Update happyDomain to the new checker-sdk-go reporter contract, where
CheckerHTMLReporter.GetHTMLReport and CheckerMetricsReporter.ExtractMetrics
take a ReportContext instead of a raw json.RawMessage. The ReportContext
will later carry cross-checker related observations; for now every call
site wraps the raw payload via sdk.StaticReportContext, so behavior is
unchanged.

Also re-export the new discovery-related SDK types (DiscoveryEntry,
DiscoveryPublisher, RelatedObservation, ReportContext,
AutoFillDiscoveryEntries) as aliases under happydns, and satisfy the
extended ObservationGetter interface on ObservationContext and the
test stub with a no-op GetRelated.

No new behavior: plumbing for the upcoming discovery pipeline.
2026-04-28 13:06:23 +07:00
240b0819a5 checker: treat ApplyToZone as ApplyToDomain in scheduler and status
CheckTarget has no zone identifier, so zone-scoped checkers were
silently dropped by the scheduler and ListCheckerStatuses, leaving
external_whois (the only ApplyToZone checker) never planned nor
listed. Surface them at the domain scope, matching the existing
treatment in checker_options_usecase.
2026-04-28 13:06:23 +07:00
2a9848962a docs: document domain-level checkers (expiry, lock, contact) 2026-04-28 13:06:23 +07:00
b651714dcf chore(deps): lock file maintenance
2026-04-28 12:37:17 +07:00
9472e7fe2e log: render happydns.Identifier values via .String() in log messages
2026-04-25 21:50:57 +07:00
c411d6b4ea Handle 2 edge cases in database migration 2026-04-25 21:50:57 +07:00
47fd9cd066 tidy go mod 2026-04-25 21:50:57 +07:00
2a37c2db43 fix(deps): update module github.com/oracle/nosql-go-sdk to v1.4.8
2026-04-22 20:11:35 +00:00
99f53084fb scheduler: extract serviceCheckerApplies helper
The predicate guarding service-checker auto-scheduling was duplicated
across buildQueue and two sites in NotifyDomainChange. Pull it into a
single helper so the rule lives in one place.
2026-04-22 16:09:13 +07:00
cd957a7667 scheduler: only auto-schedule service checkers with LimitToServices
Service-level checkers without LimitToServices no longer get enqueued
for every matching service: they must be activated explicitly via a
CheckPlan. Domain checkers and service checkers that declare a
LimitToServices whitelist keep their previous auto-discovery behavior.
2026-04-22 16:08:27 +07:00
fa7700355a backup: include core checker entities in backup/restore
Extend the admin backup to cover checker configurations, plans,
evaluations and executions — previously these were stored but silently
lost on restore. Add RestoreX storage methods so primary records keep
their original Id and secondary indexes are rebuilt (Create* generates
new IDs, Update* requires an existing record to clean stale indexes).
2026-04-22 12:45:45 +07:00
da1eb33faf tidy: add drop_invalid flag to delete undecodable records
Thread a dropInvalid bool through every TidyUpUseCase method and
expose it as a drop_invalid query parameter on POST /tidy (default
true). When set, Tidy deletes records that fail to decode — e.g.
legacy executions and evaluations whose CheckState.Status was stored
as a string before the SDK switched it to int — instead of leaving
them stuck in the store to log on every iteration.

Also reset KVIterator.err on exhaustion so a prior decode failure
does not surface as a spurious iteration error.
2026-04-22 12:45:45 +07:00
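The drop_invalid behaviour from da1eb33faf can be sketched with encoding/json: a legacy record whose status was stored as a string fails to decode into an int field, and is deleted only when the flag is set. The `checkState` struct and `tidy` function below are simplified stand-ins, and a map replaces the real store.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checkState is a reduced stand-in: Status is an int, so legacy
// records that stored it as a string no longer decode.
type checkState struct {
	Status int `json:"status"`
}

// tidy scans the store and counts records that fail to decode.
// When dropInvalid is true those records are deleted as well.
func tidy(store map[string][]byte, dropInvalid bool) (invalid, dropped int) {
	for key, raw := range store {
		var s checkState
		if err := json.Unmarshal(raw, &s); err != nil {
			invalid++
			if dropInvalid {
				delete(store, key)
				dropped++
			}
		}
	}
	return invalid, dropped
}

func main() {
	store := map[string][]byte{
		"exec-1":      []byte(`{"status":2}`),
		"exec-legacy": []byte(`{"status":"ok"}`),
	}
	invalid, dropped := tidy(store, true)
	fmt.Println(invalid, dropped, len(store))
}
```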
9ef5717f5b chore(deps): lock file maintenance
2026-04-20 00:12:07 +00:00
0bee7695bc web: sort sessions by last used date in SessionsManager
Fixes: https://feedback.happydomain.org/posts/23/current-connection-sorted-list
2026-04-17 18:04:04 +07:00
a4f3595142 Use Node.js workspace
2026-04-17 12:58:17 +07:00
63bf11f1f5 CI: Generate a NOTICE file to keep dependency licenses 2026-04-17 12:38:10 +07:00
a3aa74a7bf CI: Fix path to files to deploy with release
2026-04-17 11:17:56 +07:00
66d420b737 Fix WHOIS lookup not detecting non-existent .com domains
v0.8.0-rc
The whoisparser library does not return ErrNotFoundDomain for Verisign
"No match" responses — it parses them into a result with an empty
Domain field. Add a post-parse check to detect this case and return
ErrDomainDoesNotExist.
2026-04-17 09:55:39 +07:00
4f20a2ff06 Add Prometheus export documentation
2026-04-16 17:08:05 +07:00
57de739f80 web: add Prometheus metrics URL link to checker config page 2026-04-16 17:08:05 +07:00
e857b1fb99 web-admin: wire dashboard to /metrics with collapsible details
Replaces the three REST count calls with a single Prometheus scrape that
auto-refreshes every 15s, surfaces queue/worker/in-flight/RSS/version/uptime
as featured cards, and tucks counters and Go runtime stats under a
"Show more metrics" Collapse.
2026-04-16 17:08:05 +07:00
504660367e checkers: add filter predicate to ListExecutionsBy* storage methods
Metrics endpoints now skip incomplete/planned executions by passing a
`doneExecution` filter so only fully-evaluated runs contribute to the
Prometheus output.
2026-04-16 17:08:01 +07:00
35d4d84004 checkers: add Prometheus text format for metrics export
The metrics endpoints now negotiate response format via the Accept
header: application/json returns the JSON array, anything else returns
the Prometheus text exposition format.
2026-04-16 17:02:31 +07:00
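The Accept-based negotiation from 35d4d84004 can be sketched with the standard library. This is a minimal illustration, not the handler from the commit: `wantsJSON` is a hypothetical helper, and it ignores q-values, treating any listed application/json entry as a request for JSON and everything else as a request for the Prometheus text format.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// wantsJSON reports whether the Accept header lists application/json.
// Any other value, including an empty header, selects the text format.
func wantsJSON(accept string) bool {
	for _, part := range strings.Split(accept, ",") {
		mediaType, _, err := mime.ParseMediaType(strings.TrimSpace(part))
		if err == nil && mediaType == "application/json" {
			return true
		}
	}
	return false
}

func main() {
	for _, accept := range []string{"application/json", "text/plain", ""} {
		fmt.Printf("%q -> json=%v\n", accept, wantsJSON(accept))
	}
}
```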
1ca806852e Instrument check scheduler with Prometheus metrics
Track queue depth on enqueue and pop, active worker count, check execution
duration per checker, and check result status counters.
2026-04-16 17:02:31 +07:00
7899b2f0e8 Instrument DNS provider adapter with Prometheus metrics
Add providerName field to DNSControlAdapterNSProvider and wrap GetZoneRecords,
GetZoneCorrections, CreateDomain, and ListZones with timing and call counters
using happydomain_provider_api_calls_total and happydomain_provider_api_duration_seconds.
2026-04-16 17:02:31 +07:00
5660003311 Add storage stats Prometheus collector for business entity counts
Expose four live gauges queried at each scrape via a custom Collector:
- happydomain_registered_users_total
- happydomain_domains_total
- happydomain_zones_total
- happydomain_providers_total
2026-04-16 17:02:31 +07:00
7ac13175c6 Wire metrics into app: HTTP middleware, storage instrumentation, build info
- Add HTTP metrics middleware to public router in setupRouter()
- Wrap storage with InstrumentedStorage after initialization
- Set build info metric from main() with actual version string
- Promote prometheus/client_golang to direct dependency
2026-04-16 17:02:31 +07:00
b4c6492936 Expose /metrics endpoint on admin socket via promhttp 2026-04-16 17:02:31 +07:00
b7beefed3f Add Prometheus metrics package with HTTP middleware and storage instrumentation
- internal/metrics/metrics.go: defines all metric variables (http, scheduler,
  provider, storage, build info) using promauto for zero-config registration
- internal/metrics/http.go: Gin middleware recording request count, duration,
  and in-flight gauge using c.FullPath() to avoid high-cardinality labels
- internal/app/instrumented_storage.go: InstrumentedStorage wrapper implementing
  storage.Storage, recording operation counts and durations for all entities
2026-04-16 17:02:31 +07:00