feat!: v2 architecture — Source interface, deferred YAML, secrets resolver, namespaced types

Closes #4.

What changed

postern v2 collapses the source model into one interface, defers adapter config parsing, and settles the v1 config contracts before user adoption locks them in. It builds on @graysongordon-gl's foundation in !9 (closed) and intentionally goes further than that work was scoped, while the window is open.

The shape

| v1 | v2 |
| --- | --- |
| Push and pull adapters are separate types; server type-switches | One Source interface (Run(ctx, emit) error + Ready()); push parks on ctx, pull loops |
| Adapter types listed in a central switch | Adapters self-register from init() into a registry; new adapter = 1 file |
| config.Source carries typed fields per adapter (Azure *Azure, GCP *GCP, ...) | SourceBlock captures the adapter subtree as opaque yaml.Node; each factory decodes its own struct |
| Adapters call os.Getenv directly | secrets.Resolver interface; EnvResolver today; Vault/file/K8s drop-in later |
| Hand-rolled shutdown ordering in main | errgroup for concurrent run; explicit shutdown choreographer drains dispatcher last |
| /healthz only | /livez (liveness) + /readyz (ANDs every source's Ready()) |
| Single postern -validate -config x.yml flag mode | Subcommands: `postern run |
| Prometheus metrics only | Prometheus + optional OTel traces+logs via OTEL_EXPORTER_OTLP_ENDPOINT (eager-init, zero cost when unset). OTel semconv resource attributes |
| Stats plumbed by typed struct fields | Stater interface; /status aggregates per-component |
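
For reviewers who want the single-interface and self-registration rows in concrete form, here is a minimal sketch. It is not the code in internal/source/source.go: the Event type, the emit signature, the bool return of Ready, and the Factory, Register, and New names are all assumptions made for illustration. Only the Run(ctx, emit) error + Ready() shape comes from the table above.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Event is a hypothetical stand-in for the normalized event a source emits.
type Event struct {
	ID   string
	Type string
}

// Source follows the shape named in the table. The emit signature and the
// bool return type of Ready are assumptions.
type Source interface {
	Run(ctx context.Context, emit func(Event)) error
	Ready() bool
}

// Factory (hypothetical) builds a Source from its raw, not-yet-decoded
// config subtree, so each adapter decodes only its own struct.
type Factory func(raw []byte) (Source, error)

var (
	mu       sync.RWMutex
	registry = map[string]Factory{}
)

// Register is what an adapter would call from its init().
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := registry[name]; dup {
		panic("source type registered twice: " + name)
	}
	registry[name] = f
}

// New looks up a factory by type name instead of using a central switch.
func New(name string, raw []byte) (Source, error) {
	mu.RLock()
	f, ok := registry[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown source type %q", name)
	}
	return f(raw)
}

// pushSource models a push adapter: events arrive elsewhere (for example
// via an HTTP handler), so Run only parks until the context is cancelled.
type pushSource struct{}

func (pushSource) Run(ctx context.Context, _ func(Event)) error {
	<-ctx.Done()
	return nil
}

func (pushSource) Ready() bool { return true }

func init() {
	Register("cloudevents", func([]byte) (Source, error) { return pushSource{}, nil })
}

func main() {
	src, err := New("cloudevents", nil)
	if err != nil {
		panic(err)
	}
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // Run returns as soon as ctx is done
	fmt.Println(src.Ready(), src.Run(ctx, func(Event) {}))

	_, err = New("nope", nil)
	fmt.Println(err)
}
```

With this shape, adding an adapter means adding one file whose init() calls Register. The server never needs to know the concrete type.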

Source type names

Renamed for namespace clarity (room for azure.servicebus, gcp.eventarc, aws.sns later):

  • azure -> azure.eventgrid
  • gcp -> gcp.pubsub
  • sqs -> aws.sqs
  • cloudevents (unchanged)
  • kafka (unchanged; protocol-neutral)

Validation on GCP

Deployed to Cloud Run (us-central1), fired an event through a Pub/Sub push subscription with OIDC, and triggered a real pipeline:

postern logs:

event matched route   route=validation-route event_id=19071643724467287
pipeline triggered    route=validation-route project=82254617 ref=main variables=4

Triggered pipeline (2530045716) job log:

postern v2 fired this pipeline
BUCKET=validation-bucket
OBJECT=manifest.json
EVENT_TIME=2026-05-16 13:05:00 +0000 UTC
MESSAGE_ID=19071643724467287

All four extracted variables flowed end-to-end: Pub/Sub message attributes -> postern's gcp.pubsub adapter (OIDC verified, base64 decoded) -> normalized CloudEvent -> route extract (dot-paths + envelope shortcuts) -> dispatcher worker -> GitLab trigger API -> CI variables.
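
The middle of that chain (envelope decode, base64, dot-path extract) can be sketched as follows. This is not the gcp.pubsub adapter itself. The envelope field names (message.data, message.attributes, message.messageId, subscription) follow the documented Pub/Sub push payload. The normalize and lookup helpers, the shape of the normalized map, and the attribute keys bucket and object are assumptions for illustration. OIDC verification is omitted.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// pushEnvelope matches the JSON body a Pub/Sub push subscription delivers.
type pushEnvelope struct {
	Message struct {
		Data       string            `json:"data"` // base64-encoded payload
		Attributes map[string]string `json:"attributes"`
		MessageID  string            `json:"messageId"`
	} `json:"message"`
	Subscription string `json:"subscription"`
}

// normalize (hypothetical) decodes the envelope into a generic document
// that a route's dot-path extractor can walk.
func normalize(body []byte) (map[string]any, error) {
	var env pushEnvelope
	if err := json.Unmarshal(body, &env); err != nil {
		return nil, fmt.Errorf("decode push envelope: %w", err)
	}
	raw, err := base64.StdEncoding.DecodeString(env.Message.Data)
	if err != nil {
		return nil, fmt.Errorf("decode message data: %w", err)
	}
	attrs := make(map[string]any, len(env.Message.Attributes))
	for k, v := range env.Message.Attributes {
		attrs[k] = v
	}
	return map[string]any{
		"id":         env.Message.MessageID,
		"attributes": attrs,
		"data":       string(raw),
	}, nil
}

// lookup (hypothetical) resolves a dot-path such as "attributes.bucket".
func lookup(doc map[string]any, path string) (string, bool) {
	var cur any = doc
	for _, key := range strings.Split(path, ".") {
		m, ok := cur.(map[string]any)
		if !ok {
			return "", false
		}
		cur, ok = m[key]
		if !ok {
			return "", false
		}
	}
	s, ok := cur.(string)
	return s, ok
}

// sampleBody builds a push body reusing values from the validation run;
// the attribute key names are illustrative.
func sampleBody() []byte {
	body, _ := json.Marshal(map[string]any{
		"message": map[string]any{
			"data":       base64.StdEncoding.EncodeToString([]byte("payload")),
			"attributes": map[string]string{"bucket": "validation-bucket", "object": "manifest.json"},
			"messageId":  "19071643724467287",
		},
		"subscription": "projects/example/subscriptions/example-push",
	})
	return body
}

func main() {
	doc, err := normalize(sampleBody())
	if err != nil {
		panic(err)
	}
	for _, path := range []string{"attributes.bucket", "attributes.object", "id"} {
		v, _ := lookup(doc, path)
		fmt.Printf("%s=%s\n", path, v)
	}
}
```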

Breaking changes (acceptable; no users)

  • Config schema requires version: 1 and uses deferred parse with flattened adapter blocks.
  • Source type names renamed (see above).
  • /healthz replaced by /livez + /readyz.
  • CLI uses subcommands.
  • Chart probe paths updated.

Test coverage

go test -race ./... green:

  • internal/config -- version handling, deferred parse, validation
  • internal/dispatch -- start idempotence, drain, backpressure
  • internal/health -- livez, readyz (ready + one-not-ready), status aggregator
  • internal/ratelimit -- per-source, global fallback, override
  • internal/router -- preserved
  • internal/source/azure -- 8 tests covering validation handshake, dispatch, batch, secret check
  • internal/source/cloudevents -- 9 tests covering bearer auth, parse, dispatch reject
  • internal/trigger -- prescriptive errors, dry-run, source label

Pull adapter tests (source/sqs, source/kafka, source/gcp) are smoke-only at the type-registration level; their network-touching paths are validated end-to-end via the GCP run above. Follow-ups can deepen these.

What's next (separate MRs)

  • README/docs expansion of the OTel + readyz story.
  • Subspace-relay in starfleet-engineering rewired to point at this deployment as the production-shape bridge.
  • Per-adapter test deepening for pull sources.

For @graysongordon-gl

This extends what you started in !9 (closed) substantially. Two reasons we went further in one MR instead of staging:

  1. We have no users, so breaking the config schema and source-type names is free. Every week we delay, that gets harder.
  2. The dispatcher's dual-context shutdown, errgroup-with-choreographer pattern, and pure-factory rule are interlocking -- doing them piecemeal would leave half-broken intermediate states.

Your review is welcome. The big design decisions are in internal/source/source.go (the contract) and cmd/postern/run.go (the orchestrator). Everything else flows from those two.
