Lifecycle and hooks

A Kumiko process moves through four lifecycle states in fixed order:

starting → ready → draining → stopped

Each transition is observable, the order never reverses, and the /health/ready endpoint reflects the current state. The framework runs this lifecycle for every process — API server, worker, outbox poller — and there is no per-feature variant.
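The forward-only rule can be sketched as a small state machine. This is a minimal illustration, not Kumiko's internal implementation: the `Lifecycle` class and its method names are hypothetical, and the sketch assumes that skipping forward (for example, a failed boot going straight to stopped) is allowed.

```typescript
// Hypothetical sketch: class and method names are illustrative only.
type LifecycleState = "starting" | "ready" | "draining" | "stopped";

const ORDER: readonly LifecycleState[] = ["starting", "ready", "draining", "stopped"];

type TransitionListener = (from: LifecycleState, to: LifecycleState) => void;

class Lifecycle {
  private current: LifecycleState = "starting";
  private listeners: TransitionListener[] = [];

  // Observers are notified of every transition.
  onTransition(listener: TransitionListener): void {
    this.listeners.push(listener);
  }

  // Only forward moves are accepted; going back or staying put throws.
  transition(to: LifecycleState): void {
    const from = this.current;
    if (ORDER.indexOf(to) <= ORDER.indexOf(from)) {
      throw new Error(`illegal transition ${from} -> ${to}`);
    }
    this.current = to;
    for (const listener of this.listeners) listener(from, to);
  }

  // Status code a readiness endpoint could derive from the state alone.
  readinessStatus(): 200 | 503 {
    return this.current === "ready" ? 200 : 503;
  }
}
```

Keeping the order in one array means the "never reverses" rule is a single index comparison rather than a table of allowed pairs.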

This page is about the framework lifecycle, the boot-time validation it performs, and the hook points where feature code can attach. The detailed hook contract — phases, ordering, transactional behaviour — lives on Events and projections.

Boot is sixteen phases in a fixed order. Each one waits for the previous, each one has a timeout, and each one decides what happens on failure:

1. Load configuration (kumiko.config.ts, env, defaults)
2. Initialise observability (tracing, metrics, structured logs)
3. Initialise secrets provider (env, vault, KMS)
4. Runtime checks (Bun version, polyfills)
5. Register features (every defineFeature() runs)
6. Boot validation (the feature graph is checked)
7. Connect database (with retry/backoff)
8. Schema baseline check (api-evolution, optional)
9. Migration check (apply / warn / exit on pending)
10. Connect Redis (with retry/backoff)
11. Initialise search adapter (Meilisearch healthcheck)
12. Initialise file storage (S3-compatible)
13. Start outbox poller
14. Start jobs worker (if jobs feature is loaded)
15. Bind API listener
16. State → ready

The fail behaviour is per phase. Configuration errors are immediate exit; network connectivity gets retried with exponential backoff; observability falls back to a no-op provider rather than blocking boot. The point of the phase model is that the order of failure is deterministic — operators reading a boot log always see the same sequence, and a particular phase failing always means the same thing.
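One way to picture per-phase failure behaviour is a runner that attaches a policy to each phase. The sketch below is an assumption-laden illustration: `Phase`, `FailurePolicy`, and `runPhases` are hypothetical names, the retry parameters are made up, and per-phase timeouts are left out for brevity.

```typescript
// Illustrative names only; none of these are Kumiko's internal API.
type FailurePolicy =
  | { kind: "exit" } // e.g. configuration errors
  | { kind: "retry"; attempts: number; baseMs: number } // e.g. database, Redis
  | { kind: "fallback"; use: () => void }; // e.g. no-op observability provider

interface Phase {
  name: string;
  run: () => Promise<void>;
  onFailure: FailurePolicy;
}

interface BootResult {
  ok: boolean;
  failedPhase?: string;
  log: string[];
}

type Sleep = (ms: number) => Promise<void>;
const realSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs phases strictly in order and stops at the first unrecoverable failure.
async function runPhases(phases: Phase[], sleep: Sleep = realSleep): Promise<BootResult> {
  const log: string[] = [];
  for (const phase of phases) {
    const policy = phase.onFailure;
    for (let attempt = 1; ; attempt++) {
      try {
        await phase.run();
        log.push(`${phase.name}: ok`);
        break;
      } catch (err) {
        if (policy.kind === "fallback") {
          policy.use(); // degrade instead of blocking boot
          log.push(`${phase.name}: fallback`);
          break;
        }
        if (policy.kind === "retry" && attempt < policy.attempts) {
          const delayMs = policy.baseMs * 2 ** (attempt - 1); // exponential backoff
          log.push(`${phase.name}: retry in ${delayMs}ms`);
          await sleep(delayMs);
          continue;
        }
        log.push(`${phase.name}: failed (${String(err)})`);
        return { ok: false, failedPhase: phase.name, log };
      }
    }
  }
  return { ok: true, log };
}
```

Because the loop never starts a phase before the previous one has settled, the log order is the same on every boot, which is the determinism property described above.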

Two settings affect this in production. KUMIKO_STARTUP_TIMEOUT (default two minutes) caps the total boot time; a process that hasn’t reached ready by then exits, which prevents zombie workers stuck on unreachable infrastructure. The migrations.mode setting controls phase 9: exit on pending, auto-apply, or warn. Pick what fits your deploy strategy.
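The three behaviours of the migration check can be read as a small decision function. The mode strings below ("exit", "apply", "warn") are assumed spellings, since this page describes the behaviours but not the literal values, and `decideMigrations` is a hypothetical helper.

```typescript
// Assumed mode names; check the configuration reference for the real values.
type MigrationsMode = "exit" | "apply" | "warn";

type MigrationDecision =
  | { action: "continue" }
  | { action: "apply"; count: number }
  | { action: "warn"; message: string }
  | { action: "exit"; code: 1; message: string };

// Decides what the migration phase does, given the mode and pending migrations.
function decideMigrations(mode: MigrationsMode, pending: string[]): MigrationDecision {
  if (pending.length === 0) return { action: "continue" };
  const message = `${pending.length} pending migration(s): ${pending.join(", ")}`;
  switch (mode) {
    case "apply":
      return { action: "apply", count: pending.length };
    case "warn":
      return { action: "warn", message };
    case "exit":
      return { action: "exit", code: 1, message };
  }
}
```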

Phase 6 — boot validation — is where Kumiko earns its “no runtime surprises” claim. Before any handler runs, the framework walks the entire feature graph and confirms it is internally consistent:

| Check | What it catches |
| --- | --- |
| r.requires("x") resolves | Missing dependency that would crash on first call |
| Cross-feature handler reference exists | ctx.write("orders:create") when orders has no such handler |
| No circular dependencies | A requires B requires A would otherwise loop |
| Config keys read by features are declared | Typos turn into immediate boot errors, not runtime null |
| No entity name collisions | Two features each declaring r.entity("user", …) |
| encrypted and searchable are mutually exclusive | Encrypted columns cannot be indexed in plaintext |
| Registrar extension referenced without r.requires | r.customFields(…) without owning the dependency |
| $user.* ownership bindings exist | Typo in $user.teamId becomes a boot error |

Anything that fails is reported as a list — not the first error, all of them. The exit code is non-zero, the log carries the full details, and the process never reaches ready. Production never sees any of these because production never starts with them.

This is the single most consequential property of the lifecycle model: configuration mistakes cannot ship. They become CI failures, not incident reports.
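To make the "all errors, not the first" behaviour concrete, here is a minimal sketch of a validator that covers three of the checks (unresolved requires, entity name collisions, circular dependencies). `FeatureDecl` and `validateFeatureGraph` are hypothetical; the real feature graph carries far more than this.

```typescript
// Hypothetical, simplified view of a registered feature.
interface FeatureDecl {
  name: string;
  requires: string[];
  entities: string[];
}

// Returns every problem found; an empty list means the graph is consistent.
function validateFeatureGraph(features: FeatureDecl[]): string[] {
  const errors: string[] = [];
  const byName = new Map(features.map((f) => [f.name, f] as const));

  // Check 1: every declared dependency resolves to a registered feature.
  for (const f of features) {
    for (const dep of f.requires) {
      if (!byName.has(dep)) errors.push(`${f.name}: requires unknown feature "${dep}"`);
    }
  }

  // Check 2: no two features declare the same entity name.
  const owners = new Map<string, string>();
  for (const f of features) {
    for (const entity of f.entities) {
      const owner = owners.get(entity);
      if (owner !== undefined) {
        errors.push(`entity "${entity}" declared by both ${owner} and ${f.name}`);
      } else {
        owners.set(entity, f.name);
      }
    }
  }

  // Check 3: depth-first search for cycles in the requires graph.
  const state = new Map<string, "visiting" | "done">();
  const visit = (name: string, path: string[]): void => {
    if (state.get(name) === "done") return;
    if (state.get(name) === "visiting") {
      const cycle = [...path.slice(path.indexOf(name)), name];
      errors.push(`circular dependency: ${cycle.join(" -> ")}`);
      return;
    }
    state.set(name, "visiting");
    for (const dep of byName.get(name)?.requires ?? []) {
      if (byName.has(dep)) visit(dep, [...path, name]);
    }
    state.set(name, "done");
  };
  for (const f of features) visit(f.name, []);

  return errors;
}
```

A boot step built on this would log the whole list and exit non-zero whenever it is non-empty, so one CI run surfaces every mistake at once.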

The framework gives feature authors three families of attachment points across the lifecycle. Each one runs at a different phase and has a different transactional contract.

Boot-time hooks. A feature’s defineFeature body runs in phase 5. Anything you do there — registering entities, declaring events, attaching hooks, computing derived configuration — is a boot-time hook. By the time phase 6 runs, every feature has finished registering, and the registrar is frozen. There is no “register a handler at runtime” path.
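The "frozen registrar" idea can be illustrated in a few lines. This `Registrar` class is a hypothetical stand-in for explanation only; it is not the object that defineFeature hands to feature code.

```typescript
// Hypothetical stand-in: registration is allowed only until freeze() is called.
class Registrar {
  private frozen = false;
  private handlers = new Map<string, () => void>();

  handler(name: string, fn: () => void): void {
    if (this.frozen) {
      throw new Error(`cannot register "${name}": registrar is frozen`);
    }
    this.handlers.set(name, fn);
  }

  // Called once every feature has finished registering, before validation.
  freeze(): void {
    this.frozen = true;
  }

  has(name: string): boolean {
    return this.handlers.has(name);
  }
}
```

Freezing before validation means the graph that gets checked is exactly the graph that serves traffic.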

Lifecycle hooks (per write). An r.hook("postSave", "incident", …) call attaches to the lifecycle of an entity write, not the process lifecycle. It runs once per matching write, in either the inTransaction or the afterCommit phase, as you choose. The full contract is on Events and projections; for the purposes of this page, the relevant point is that these hooks fire while the process is in the ready state, not at boot or during shutdown.

Background workers. Jobs and projections that run outside the request path are owned by the framework, not by feature code. A feature declares them — r.job("daily-report", { trigger: { cron: "0 9 * * *" } }, handler), r.multiStreamProjection({ … }) — and the framework starts them in phases 13-14, supervises them via heartbeat, and stops them during shutdown. Jobs accept three trigger shapes: { on: eventDef } (event-driven), { cron: "…" } (scheduled), or { manual: true } (queue-only).
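The three trigger shapes map naturally onto a TypeScript union. The types below are a sketch inferred from the shapes listed above; `EventDef` is reduced to a hypothetical placeholder, and `triggerKind` and `runsOnLeaderOnly` are illustrative helpers rather than framework functions.

```typescript
// Hypothetical placeholder for an event definition.
interface EventDef {
  name: string;
}

// One variant per trigger shape described above.
type JobTrigger = { on: EventDef } | { cron: string } | { manual: true };

function triggerKind(trigger: JobTrigger): "event" | "cron" | "manual" {
  if ("on" in trigger) return "event";
  if ("cron" in trigger) return "cron";
  return "manual";
}

// Cron jobs are the only ones tied to the leader; the rest go through the queue.
function runsOnLeaderOnly(trigger: JobTrigger): boolean {
  return triggerKind(trigger) === "cron";
}
```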

Feature code never calls lifecycle.registerStartupPhase(…). The process-lifecycle API is internal, and the available registrar methods are the public surface.

When the process receives SIGTERM (the orchestrator’s “stop, please”), the lifecycle moves to draining and the shutdown sequence runs:

1. draining state begins
2. /health/ready returns 503 (load balancer drains traffic)
3. Linger 3 seconds (give the LB time to react)
4. Close the API listener (no new connections)
5. Wait for in-flight requests (drain timeout, default 30s)
6. Close SSE broker connections
7. Stop the outbox poller (finish current batch)
8. Stop the jobs worker (finish current jobs)
9. Close Redis and database pools
10. Flush observability (send pending traces and metrics)
11. State → stopped, exit(0)

The whole sequence has a hard timeout (KUMIKO_SHUTDOWN_TIMEOUT, default 40 seconds). After that, the process force-exits with a warning log. The hard cap exists because Kubernetes will send SIGKILL after its own grace period — better to exit cleanly with a logged warning than to be killed in the middle of a flush.

Background components register shutdown hooks during boot. They run in LIFO order: the last component to start is the first one to stop. Feature code does not register shutdown hooks; that surface is reserved for core features like core-jobs that own background work.
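LIFO ordering under a hard deadline can be sketched as follows. `ShutdownRegistry` is a hypothetical name, and the sketch reports a "forced" outcome instead of actually exiting the process so that it stays easy to test.

```typescript
// Hypothetical sketch of a LIFO shutdown registry with a hard timeout.
interface ShutdownHook {
  name: string;
  stop: () => Promise<void>;
}

interface ShutdownResult {
  outcome: "clean" | "forced";
  stopped: string[];
}

class ShutdownRegistry {
  private hooks: ShutdownHook[] = [];

  // Components register in start order during boot.
  register(hook: ShutdownHook): void {
    this.hooks.push(hook);
  }

  // Stops components in reverse start order, giving up after hardTimeoutMs.
  async run(hardTimeoutMs: number): Promise<ShutdownResult> {
    const stopped: string[] = [];
    const sequence = (async () => {
      for (const hook of [...this.hooks].reverse()) {
        try {
          await hook.stop();
          stopped.push(hook.name);
        } catch {
          stopped.push(`${hook.name} (error)`); // keep going; one failure must not block the rest
        }
      }
      return "clean" as const;
    })();
    const deadline = new Promise<"forced">((resolve) => {
      const timer = setTimeout(() => resolve("forced"), hardTimeoutMs);
      void sequence.then(() => clearTimeout(timer)); // do not keep the process alive
    });
    const outcome = await Promise.race([sequence, deadline]);
    return { outcome, stopped };
  }
}
```

A caller would log a warning and force-exit on "forced", and exit with code 0 on "clean".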

Two endpoints expose the lifecycle to the outside world:

  • /health is liveness. It returns 200 as long as the process is alive, regardless of state. Orchestrators use this to decide whether to restart the container.
  • /health/ready is readiness. It returns 200 only in the ready state, with all dependencies healthy. Orchestrators use this to decide whether to send traffic.

Readiness includes per-component checks: database latency, Redis connectivity, search adapter, outbox poller heartbeat, jobs worker heartbeat, scheduler leader status. A 503 from /health/ready carries the failing checks in the body, so an operator looking at one HTTP response sees which subsystem is unhealthy.
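The difference between the two endpoints can be sketched as two pure functions. The response body shape below is an assumption, since this page only says that a 503 carries the failing checks, and all names are hypothetical.

```typescript
type ProcessState = "starting" | "ready" | "draining" | "stopped";

interface CheckResult {
  name: string; // e.g. "database", "redis", "outbox-heartbeat"
  healthy: boolean;
  detail?: string;
}

interface ReadinessResponse {
  status: 200 | 503;
  body: { state: ProcessState; failing: CheckResult[] }; // assumed body shape
}

// Liveness ignores state and dependencies: the process answering is the signal.
function liveness(): { status: 200 } {
  return { status: 200 };
}

// Readiness needs both the ready state and every component check passing.
function readiness(state: ProcessState, checks: CheckResult[]): ReadinessResponse {
  const failing = checks.filter((check) => !check.healthy);
  const ok = state === "ready" && failing.length === 0;
  return { status: ok ? 200 : 503, body: { state, failing } };
}
```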

A Kumiko deployment with multiple processes needs to ensure that scheduled jobs run once. A daily report at 09:00 should fire on one worker, not on all of them. The framework runs a Redis-backed leader election: every jobs worker tries to claim the leader lock at boot; the holder refreshes every five seconds; followers wait. Only the leader runs cron-scheduled jobs. Workers that pick up event-triggered or manually triggered jobs do so via the queue, which fans out work correctly across all workers.

Feature code has no am-I-leader accessor. The election is a process concern, and the abstraction is “schedule this; the framework runs it once across the cluster”.
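The claim-and-refresh pattern behind the election can be sketched with an in-memory stand-in for the Redis lock. Everything here is an assumption made for illustration: `LeaderLock` is a hypothetical class, the 15-second TTL is invented (this page only gives the five-second refresh interval), and a real implementation would rely on an atomic Redis operation rather than a local object.

```typescript
// In-memory stand-in for a Redis key with a TTL; time is passed in explicitly
// so the behaviour is deterministic.
class LeaderLock {
  private holder: string | null = null;
  private expiresAt = 0;
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Each worker calls this on a timer. Returns true if the caller is leader
  // after the call: either it claimed a free or expired lock, or it refreshed its own.
  heartbeat(workerId: string, now: number): boolean {
    const free = this.holder === null || now >= this.expiresAt;
    if (free || this.holder === workerId) {
      this.holder = workerId;
      this.expiresAt = now + this.ttlMs;
      return true;
    }
    return false; // someone else holds a live lock; stay a follower
  }
}
```

If the leader dies, it stops refreshing, the lock expires, and the next follower heartbeat takes over, so cron jobs resume without any feature code being involved.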

The fixed-order startup, boot-time validation, and graceful shutdown add up to two operational properties:

  • Misconfiguration cannot reach production. The feature graph, required handlers, registered config keys, and access rules are all checked before the API listener binds. CI exits 1; production never sees the broken state.
  • Deploys are uneventful. The shutdown sequence and health endpoints are written for rolling and blue-green deployments. New instance comes up, /health/ready flips to 200, traffic routes there; old instance receives SIGTERM, drains, exits clean.

Everything around the handler — process state, retries, leader election, hook ordering — is the framework’s job. The handler body is just the business decision.

Two validation hooks (one per write handler) plus a postSave entity hook that logs every save in the same transaction:

// --- Validation hook on create: reject banned words + length ---
r.hook("validation", articleCreate, (data) => {
  const title = data["title"] as string;
  if (title.toLowerCase().includes("spam")) {
    return [{ field: "title", error: "title_contains_banned_word" }];
  }
  if (title.length > 200) {
    return [{ field: "title", error: "title_too_long" }];
  }
  return null;
});

// --- Validation hook on update: length check on title changes ---
r.hook("validation", articleUpdate, (data) => {
  const changes = data["changes"] as Record<string, unknown> | undefined;
  const title = changes?.["title"] as string | undefined;
  if (title && title.length > 200) {
    return [{ field: "title", error: "title_too_long" }];
  }
  return null;
});

// --- postSave entity hook: log all saves ---
r.entityHook("postSave", article, async (result: SaveContext) => {
  hookLog.push({
    type: result.isNew ? "created" : "updated",
    data: { id: result.id, changes: result.changes },
  });
});

Full source: samples/recipes/lifecycle-hooks — covers preDelete and postDelete too.