Operations / Workers
Overview
Operations owns backend entrypoints, local commands, worker process shape, migrations, heartbeat reporting, and health diagnostics. Business rules stay in the modules each worker calls.
Use backend/RUNBOOK.md for step-by-step local commands and incident checks.
Responsibilities
- Route CLI commands from `backend/src/index.ts`.
- Run SQLite migrations before API/worker work.
- Provide long-running and run-once worker entrypoints.
- Record worker heartbeats for schedulers that use `instrumentRun`.
- Expose internal worker health and ingestion health surfaces.
- Keep worker cadence and process defaults in environment config.
Boundary Rules
- Workers orchestrate module services; they should not contain durable product rules.
- Worker heartbeat state belongs to Operations; module state belongs to module-owned tables.
- Worker commands should be safe to run locally after `npm run db:migrate`.
- Run-once commands should reuse the same service path as long-running workers.
Runtime Flow
Most workers follow this shape:
```text
Load env
-> open SQLite
-> run migrations
-> construct module dependencies
-> run one cycle immediately
-> repeat on interval unless run-once
-> log cycle summary or error
-> close database on shutdown
```
`ingest` is the exception: it starts multiple source schedulers and stream clients, then stops them on shutdown. It records source health in `ingestion_runs`, not `worker_heartbeats`.
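The loop portion of the common worker shape can be sketched as follows. This is a minimal illustration, not the code in `backend/src/workers/*`: `runWorkerLoop`, `WorkerLoopOptions`, and the injected `sleep`/`log` hooks are hypothetical names, and env loading, SQLite setup, and migrations are left out. Continuing after a failed cycle is an assumption based on the "log cycle summary or error" step.

```typescript
// Hypothetical sketch of the "run one cycle, then repeat unless run-once" shape.
// None of these names are taken from the real worker modules.

interface WorkerLoopOptions {
  /** True for the run-once path: execute a single cycle and return. */
  once: boolean;
  /** Delay between cycles, in milliseconds. */
  intervalMs: number;
  /** One unit of work; resolves to a short summary for the log line. */
  runCycle: () => Promise<string>;
  /** Returns true once shutdown has been requested. */
  shouldStop: () => boolean;
  /** Injected so tests can skip real waiting. */
  sleep?: (ms: number) => Promise<void>;
  /** Injected so tests can capture log lines. */
  log?: (line: string) => void;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/** Runs cycles until stopped and returns how many cycles were executed. */
async function runWorkerLoop(opts: WorkerLoopOptions): Promise<number> {
  const sleep = opts.sleep ?? defaultSleep;
  const log = opts.log ?? ((line: string) => console.log(line));
  let cycles = 0;

  // The first cycle runs immediately; later cycles wait for the interval.
  do {
    try {
      const summary = await opts.runCycle();
      log(`cycle ok: ${summary}`);
    } catch (err) {
      // Assumption: a failed cycle is logged and the loop keeps going.
      log(`cycle error: ${err instanceof Error ? err.message : String(err)}`);
    }
    cycles += 1;
    if (opts.once || opts.shouldStop()) break;
    await sleep(opts.intervalMs);
  } while (!opts.shouldStop());

  return cycles;
}
```

Because the run-once path is just the same loop with `once: true`, a run-once command exercises the same cycle function as the long-running worker, which is the property the boundary rules ask for.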
Main Code Paths
| Path | Purpose |
|---|---|
| `backend/src/index.ts` | CLI command router for API, migrations, workers, status checks, and one-off commands. |
| `backend/src/app/env.ts` | Worker interval defaults and runtime config parsing. |
| `backend/src/observability/worker-heartbeats.ts` | Heartbeat insert/update helper and health read model. |
| `backend/src/workers/ingest.ts` | Long-running ingestion process. |
| `backend/src/workers/decisions.ts` | Clone-scoped debug decision worker. |
| `backend/src/workers/prop-decisions.ts` | Prop-account decision worker. |
| `backend/src/workers/mark-to-market.ts` | Prop-account mark-to-market worker. |
| `backend/src/workers/backtesting.ts` | Backtest queue worker. |
| `backend/src/workers/audition-evaluator.ts` | Due-audition evaluator worker. |
| `backend/src/workers/arena-entry.ts` | Promotion Queue / Arena Entry sync worker. |
| `backend/src/db/migrator.ts` | SQLite migration runner. |
State And Tables
| Table | Owner | Purpose |
|---|---|---|
| `worker_heartbeats` | Operations | Last start/completion/success/error, status, summary, latency, consecutive errors, total runs/errors. |
| `ingestion_runs` | Data Ingestion | Source/channel run status and freshness for source schedulers. |
| `backtest_jobs` | Backtesting | Queue lease, progress, attempt, result, and failure state for async backtests. |
Workers also write module-owned tables such as clone_decision_runs, prop_*, audition_runs, promotion_queue_entries, and arena_clone_memberships.
Routes And Workers
For local single-terminal runtime work, npm run workers:all starts the non-ingestion runtime workers: prop decisions, mark-to-market, backtests, audition evaluation, and Arena Entry sync.
| Worker | Command | Default cadence | Unit of work | Health behavior |
|---|---|---|---|---|
| Ingestion | npm run ingest / npm run ingest:local | Source-specific | Hyperliquid stream/features/maintenance, DefiLlama, Polymarket | Records ingestion_runs; no worker_heartbeats row. |
| Clone decisions | npm run decisions:worker | 60s | Active clone decision subjects, or --clone-id subset | Whole cycle heartbeat as clone_decisions. |
| Prop decisions | npm run prop:decisions:worker | 60s | Active prop accounts, default limit 100 | Whole cycle heartbeat as prop_decisions; per-account failures are logged and isolated. |
| Mark-to-market | npm run prop:mark-to-market:worker | 60s | Active prop accounts, default limit 100 | Heartbeat as prop_mark_to_market; per-account failures become ledger/audit outcomes where possible. |
| Backtests | npm run backtests:worker | 15s | Leased queued backtest jobs | Heartbeat as backtesting; leases default 5 minutes, max concurrent 2, max per user 1. |
| Audition evaluator | npm run lifecycle:auditions:worker | 5m | Due audition runs | Heartbeat as lifecycle_audition_evaluator. |
| Arena Entry | npm run lifecycle:arena-entry:worker | 5m | Passed auditions into Queue/Arena slots | Heartbeat as lifecycle_arena_entry. |
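As a rough illustration of keeping these cadences in environment config, the sketch below resolves an interval from an env map and falls back to the defaults in the table. The env key names and `resolveIntervalMs` are hypothetical; the real keys and parsing live in `backend/src/app/env.ts` and may differ.

```typescript
// Hypothetical env keys mapped to the default cadences listed in the table.
// The actual variable names in backend/src/app/env.ts are not shown in this doc.
const DEFAULT_INTERVAL_MS = {
  DECISIONS_INTERVAL_MS: 60 * 1000,
  PROP_DECISIONS_INTERVAL_MS: 60 * 1000,
  PROP_MARK_TO_MARKET_INTERVAL_MS: 60 * 1000,
  BACKTESTS_INTERVAL_MS: 15 * 1000,
  AUDITION_EVALUATOR_INTERVAL_MS: 5 * 60 * 1000,
  ARENA_ENTRY_INTERVAL_MS: 5 * 60 * 1000,
} as const;

type IntervalKey = keyof typeof DEFAULT_INTERVAL_MS;

/** Returns the configured interval, or the table default when the key is unset. */
function resolveIntervalMs(
  env: Record<string, string | undefined>,
  key: IntervalKey,
): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return DEFAULT_INTERVAL_MS[key];

  const parsed = Number(raw);
  // Reject NaN, fractions, zero, and negatives rather than silently looping too fast.
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${key} must be a positive integer of milliseconds, got "${raw}"`);
  }
  return parsed;
}
```

Failing fast on a malformed value is a design choice of this sketch: a worker that silently falls back to a default can hide a misconfigured deployment.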
Run-once commands:
```bash
cd backend
npm run decisions:run-once -- --clone-id=42 --force
npm run prop:decisions:run-once
npm run prop:mark-to-market:once
npm run backtests:run-once
npm run lifecycle:auditions:evaluate-due
npm run lifecycle:arena-entry:run-once
```
Health routes:
| Route | Auth | Purpose |
|---|---|---|
| `GET /api/v1/internal/health/workers` | Service token | Worker heartbeat health and staleness. |
| `GET /api/v1/ingestion/health` | Bearer auth | Source/channel ingestion health. |
| `GET /api/v1/ingestion/dashboard` | Public currently | Source dashboard summary. |
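One way to think about the staleness reported by the worker health route is as a comparison between the last completed cycle and the worker's cadence. The sketch below is an assumption about how such a read model could classify a heartbeat, not the logic in `worker-heartbeats.ts`; `classifyWorker`, the `WorkerHealth` labels, and the three-cycle threshold are all hypothetical.

```typescript
// Hypothetical health labels; the real route may expose different values.
type WorkerHealth = "healthy" | "failing" | "stale" | "never_ran";

interface HeartbeatSnapshot {
  /** Epoch milliseconds of the last completed cycle, or null if none completed. */
  lastCompletedAtMs: number | null;
  /** Number of failed cycles since the last success. */
  consecutiveErrors: number;
}

/**
 * Classifies a heartbeat. Assumption: a worker is stale once more than
 * `staleAfterCycles` cadence intervals have passed without a completed cycle.
 */
function classifyWorker(
  hb: HeartbeatSnapshot,
  cadenceMs: number,
  nowMs: number,
  staleAfterCycles = 3,
): WorkerHealth {
  if (hb.lastCompletedAtMs === null) return "never_ran";
  if (nowMs - hb.lastCompletedAtMs > cadenceMs * staleAfterCycles) return "stale";
  return hb.consecutiveErrors > 0 ? "failing" : "healthy";
}
```

Under these assumptions, a 60s worker whose last completion was four minutes ago would be reported as stale regardless of its error count, since a silent worker is a different problem from a failing one.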
Failure Behavior
- `instrumentRun` marks a worker `running` at cycle start and `succeeded` or `failed` at completion.
- Failed cycles increment `consecutive_errors` and preserve the last error string.
- Backtest workers requeue expired running jobs when leases lapse.
- Prop decision worker catches failures per account so one bad account does not stop the cycle.
- Prop mark-to-market catches account-level failures and records failure details when possible.
- Source ingestion records source/channel failures in `ingestion_runs`; source schedulers use in-process running flags to avoid overlapping runs.
- There is no general dead-letter queue yet.
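The heartbeat transitions described in this list can be sketched with an in-memory map standing in for the `worker_heartbeats` table. This is an illustration under stated assumptions, not the real `instrumentRun`: the function name `instrumentRunSketch`, the field names, resetting the error streak on success, and rethrowing the error to the caller are all guesses.

```typescript
type HeartbeatStatus = "running" | "succeeded" | "failed";

// Hypothetical in-memory shape; the real table columns may be named differently.
interface Heartbeat {
  worker: string;
  status: HeartbeatStatus;
  lastStartedAtMs: number;
  lastCompletedAtMs: number | null;
  lastError: string | null;
  consecutiveErrors: number;
  totalRuns: number;
  totalErrors: number;
}

// Stand-in for the worker_heartbeats table, keyed by worker name.
const heartbeats = new Map<string, Heartbeat>();

async function instrumentRunSketch<T>(
  worker: string,
  cycle: () => Promise<T>,
  now: () => number = Date.now,
): Promise<T> {
  const prev = heartbeats.get(worker);
  // Mark the worker as running at cycle start, carrying counters forward.
  const hb: Heartbeat = {
    worker,
    status: "running",
    lastStartedAtMs: now(),
    lastCompletedAtMs: prev?.lastCompletedAtMs ?? null,
    lastError: prev?.lastError ?? null,
    consecutiveErrors: prev?.consecutiveErrors ?? 0,
    totalRuns: (prev?.totalRuns ?? 0) + 1,
    totalErrors: prev?.totalErrors ?? 0,
  };
  heartbeats.set(worker, hb);

  try {
    const result = await cycle();
    hb.status = "succeeded";
    hb.consecutiveErrors = 0; // Assumption: a success resets the streak.
    return result;
  } catch (err) {
    hb.status = "failed";
    hb.consecutiveErrors += 1;
    hb.totalErrors += 1;
    hb.lastError = err instanceof Error ? err.message : String(err);
    throw err; // Assumption: the caller decides whether to keep looping.
  } finally {
    hb.lastCompletedAtMs = now();
  }
}
```

In this sketch the last error string survives a later success, which keeps the most recent failure visible for debugging even after the worker recovers.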
Debugging Notes
- `npm run cadence` prints ingestion cadence definitions.
- `npm run ingestion:status` and `npm run ingestion:check` summarize source health.
- `GET /api/v1/internal/health/workers` shows worker staleness and consecutive errors.
- If a worker looks alive but no module state changes, check the module's due/lease/idempotency condition before assuming the process is stuck.
- For backtests, check `backtest_jobs.leased_by`, `lease_expires_at_ms`, `attempt_count`, `status`, and `progress_pct`.
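When reading those `backtest_jobs` columns, it can help to picture the lease check the worker is expected to apply. The sketch below operates on plain objects rather than SQLite rows. The column names come from the list above, but `requeueExpiredLeases` and the exact `status` values are hypothetical and only approximate the requeue rule stated under Failure Behavior.

```typescript
// Column names follow the debugging note; the status values are assumptions.
interface BacktestJobRow {
  id: number;
  status: "queued" | "running" | "completed" | "failed";
  leased_by: string | null;
  lease_expires_at_ms: number | null;
  attempt_count: number;
}

/**
 * Moves running jobs whose lease has lapsed back to the queue.
 * Mutates the rows in place and returns the ids that were requeued.
 */
function requeueExpiredLeases(jobs: BacktestJobRow[], nowMs: number): number[] {
  const requeued: number[] = [];
  for (const job of jobs) {
    const expired =
      job.status === "running" &&
      job.lease_expires_at_ms !== null &&
      job.lease_expires_at_ms <= nowMs;
    if (!expired) continue;

    // Release the lease so another worker can pick the job up again.
    job.status = "queued";
    job.leased_by = null;
    job.lease_expires_at_ms = null;
    requeued.push(job.id);
  }
  return requeued;
}
```

A job that stays `running` with a `lease_expires_at_ms` in the past and a growing `attempt_count` is therefore a sign that workers keep leasing it and dying, rather than that the queue itself is stuck.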
Tests
Worker tests are intentionally split by blast radius:
| Layer | Scope | Default use |
|---|---|---|
| Entry smoke | Boot each run-once worker against an empty migrated SQLite database. | Fast guard that worker composition still loads. |
| Worker smoke E2E | Seed one realistic domain condition, run the worker once, then assert module-owned state and the heartbeat row. | Main local/CI confidence check for worker behavior. |
| Process smoke | Start API and/or long-running worker processes as child processes, wait for readiness or heartbeat, then shut down. | Slower optional check for CLI/process wiring, not required for the default suite. |
Current worker smoke E2E scenarios:
| Worker | Given | When | Then |
|---|---|---|---|
| Prop mark-to-market | Active prop account with an open BTC position and a newer mid price. | runMarkToMarketWorker({ once: true }). | Balance equity is repriced and prop_mark_to_market heartbeat succeeds. |
| Prop decisions | Active prop account with a default strategy draft and noop model provider. | runPropDecisionWorker({ once: true }). | A prop-account decision run is recorded with a skipped hold action and prop_decisions heartbeat succeeds. |
| Audition evaluator | Running audition whose window has elapsed. | runAuditionEvaluatorWorker({ once: true }). | Audition transitions to passed and lifecycle_audition_evaluator heartbeat succeeds. |
| Arena Entry | Passed audition with an approved public profile. | runArenaEntryWorker({ once: true }). | Clone is promoted into Arena membership and lifecycle_arena_entry heartbeat succeeds. |
| Backtests | Queued backtest job with a default strategy graph. | runBacktestWorker({ once: true }). | Job completes through the queue path, metrics are written, and backtesting heartbeat succeeds. |
```bash
cd backend
npm run typecheck
npm run test:workers:smoke
```
Known Gaps
- Production process supervision and alert routing.
- Generic retry/dead-letter handling for non-backtest workers.
- Backup/restore checks and reconciliation dashboards.
- Pending-purchase and payout reconciliation workers for future Economy V2.