Architecture
The platform splits into three loosely coupled tiers and one metadata index. Each tier holds one kind of trust: the Control Plane decides, the Agent executes, and the Providers own the values.
flowchart LR
subgraph user_plane ["User / Dev plane"]
UI["UI SPA"]
Dev["Developer"]
Approver["Approver"]
Dev --> UI
Approver --> UI
end
subgraph control_plane ["Control Plane (Secrets Bridge)"]
API["api — Fiber v3"]
Postgres[("Postgres<br/>append-only audit")]
Redis[("Redis<br/>locks, cache,<br/>rate limit")]
Worker["worker<br/>sweepers + gitops"]
KMS["KMS backend<br/>local / vault-transit / aws-kms"]
end
subgraph workload ["Workload network (per cluster)"]
Agent["agent<br/>outbound only"]
Vault[("HashiCorp Vault")]
AWSSM[("AWS Secrets Manager")]
AzureKV[("Azure Key Vault")]
GCPSM[("GCP Secret Manager")]
end
UI <--> API
API <--> Postgres
API <--> Redis
API <--> KMS
Worker <--> Postgres
Worker <--> Redis
Agent -->|"outbound HTTPS"| API
Agent --> Vault
Agent --> AWSSM
Agent --> AzureKV
Agent --> GCPSM
Trust boundaries: what holds what
| Component | What it holds | What it must never hold |
|---|---|---|
| UI | A short-lived JWT in React state | Any provider credential, any plaintext value beyond the reveal-once modal lifetime |
| API | KMS-wrapped envelopes (single-shot, TTL'd), audit rows, policy rules, role catalogs | Raw plaintext values, provider master credentials |
| Postgres | Hashes + envelopes + metadata + append-only audit | The decryption key — that lives in the KMS backend, outside Postgres |
| Redis | Idempotency tokens, distributed locks, rate-limit windows, pub/sub messages | Any value-bearing payload (the runtime package doesn't even expose a generic `Set()`) |
| Worker | The same Postgres + Redis access as the API | Provider credentials, plaintext values |
| Agent | Provider credentials for its own network boundary, plaintext values for the duration of one request | Anything from another cluster / account; no inbound listener; no Postgres / Redis client |
| Providers | The actual secret values | n/a — these are unchanged systems |
The four request lifecycles
sequenceDiagram
participant Dev
participant UI
participant API
participant Approver
participant Agent
participant Vault
Dev->>UI: Submit read request<br/>(prod/db/password, justification)
UI->>API: POST /requests/read
API->>API: Resolve policy → workflow
API-->>UI: status: pending
Approver->>UI: Click Approve
UI->>API: POST /requests/:id/approve
API->>API: Enqueue read job
API-->>UI: status: approved
Agent->>API: POST /jobs/claim
API-->>Agent: read job payload
Agent->>Vault: GetValue prod/db/password
Vault-->>Agent: bundle with DB_PASSWORD
Agent->>API: POST /agents/:id/wraps<br/>(envelope-encrypted)
Agent->>API: POST /jobs/:id/complete (succeeded)
API-->>Agent: 204
Dev->>UI: Click Reveal (one-time)
UI->>API: GET /requests/:id/wraps/:wrap_id<br/>?user_id=dev
API->>API: Atomic consume + KMS decrypt
API-->>UI: plaintext (base64, single-shot)
UI->>UI: Show in modal, clear on close
sequenceDiagram
participant Dev
participant UI
participant API
participant Approver
participant Agent
participant Vault
Dev->>UI: Submit patch request<br/>prod/db/password = new-value
UI->>API: POST /requests with key_values
API->>API: KMS-wrap each value<br/>resolve policy → workflow
API-->>UI: status: pending
Approver->>UI: Approve
UI->>API: POST /requests/:id/approve
API->>API: Enqueue patch job
Agent->>API: POST /jobs/claim
API-->>Agent: patch job +<br/>list of wrap ids
loop For each wrap
Agent->>API: GET /agents/:id/wraps/:wrap_id
API->>API: KMS-decrypt + single-shot consume
API-->>Agent: plaintext (envelope-sealed if<br/>agent has registered pubkey)
end
Agent->>Vault: GetValue(prod/db/password)
Vault-->>Agent: existing bundle
Agent->>Vault: PutValue (merged bundle)
Agent->>API: POST /jobs/:id/complete (succeeded)
API-->>UI: status: executed
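The read-merge-write step the agent performs in the patch lifecycle above might look like the following sketch. It assumes a bundle is a flat string map and that patched keys overwrite existing ones while untouched keys survive; the diagram only says "merged bundle", so these semantics, the `mergeBundle` name, and the `DB_USER` key are assumptions.

```go
package main

import "fmt"

// mergeBundle returns a new bundle holding every key of existing,
// overlaid with the keys from patch. Neither input map is modified.
// The shallow overlay is an assumed merge rule, not a documented one.
func mergeBundle(existing, patch map[string]string) map[string]string {
	merged := make(map[string]string, len(existing)+len(patch))
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range patch {
		merged[k] = v
	}
	return merged
}

func main() {
	existing := map[string]string{"DB_USER": "app", "DB_PASSWORD": "old-value"}
	patch := map[string]string{"DB_PASSWORD": "new-value"}
	merged := mergeBundle(existing, patch)
	fmt.Println(len(merged), merged["DB_USER"], merged["DB_PASSWORD"])
}
```

Building a fresh map rather than mutating `existing` keeps the pre-patch bundle available if the write has to be retried.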
sequenceDiagram
participant Admin
participant API
participant Agent
participant Vault
Admin->>API: POST /jobs job_type=discover
Agent->>API: POST /jobs/claim
API-->>Agent: discover job payload
Agent->>Vault: ListMetadata(scope)
Vault-->>Agent: 47 secrets with<br/>custom_metadata tags
Agent->>API: POST /agents/:id/secrets/bulk<br/>cluster + provider + items
API->>API: Upsert on cluster+provider+ref<br/>preserve first_seen_at<br/>refresh labels jsonb
Agent->>API: POST /jobs/:id/complete (succeeded)
Admin->>API: GET /secrets?label=Environment:prod<br/>&label=Team:billing
API-->>Admin: filtered list (GIN index)
GitOps observation is off by default. To enable it, set SB_GITOPS_ENABLED=true on the api and SB_WORKER_GITOPS_ENABLED=true on the worker; both flags are required.
sequenceDiagram
participant API
participant Worker
participant ArgoCD
Note over API: Patch request transitions to executed
API->>API: Fan out observations for every<br/>app mapping bound to this secret
loop Every 15s
Worker->>API: Claim next observation
Worker->>ArgoCD: GET /api/v1/applications/:name/resource-tree<br/>read-only transport, no POST
ArgoCD-->>Worker: app status (filtered)
Worker->>API: Update observed_state
alt Healthy AND Synced
Worker->>API: Transition to applied
else OperationPhase failed
Worker->>API: Transition to failed
else timeout_at reached
Worker->>API: Transition to applied_unverified
end
end
Defense-in-depth on the wire
Three layers of encryption protect every value-bearing flow. Each layer is listed with the threats it is meant to cover:
TLS (mandatory) — passive sniffers, compromised system CA
↓
Wire envelope (Piece 8b) — TLS-terminating proxies, CP-side log leaks
↓
Storage envelope (Piece 8a) — DB exfiltration, offline backup theft
| Flow | Envelope scheme |
|---|---|
| CP → Agent | X25519 ECDH + HKDF-SHA256 + AES-256-GCM, sealed to the agent's registered public key. Even a TLS-terminating gateway sees only ciphertext. |
| Agent → CP | CP issues a KMS-wrapped data-encryption key; agent uses it for AES-256-GCM and zeroes the plaintext DEK immediately. The CP unwraps via the KMS backend. |
| At rest | KMS-envelope: per-row data key, wrapped by the configured KeyManager (LocalKMS for dev, Vault Transit or AWS KMS for production). |
Why polyrepo
Eight repositories, one product. The core library is infra-free —
no Postgres driver, no Redis client, no Fiber import — so the
agent and controller can depend on it without pulling in
storage. The api and worker share Postgres/Redis but the worker
imports api/pkg/* only (never internal/*), so the dependency
direction is one-way and reviewable.
CI in every repo greps go.sum to enforce these boundaries — for
example the agent's CI fails the build if it ever picks up
jackc/pgx, lib/pq, redis/go-redis, or similar drivers.
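The boundary check described above is done with grep in CI; the same idea can be sketched in Go for illustration. The `bannedModules` helper and the sample go.sum content (including its placeholder hashes and version numbers) are made up for the example; only the banned module names come from the text.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// bannedModules returns the distinct module paths in a go.sum body
// that contain any of the banned substrings.
func bannedModules(goSum string, banned []string) []string {
	seen := map[string]bool{}
	var hits []string
	sc := bufio.NewScanner(strings.NewReader(goSum))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue
		}
		mod := fields[0] // go.sum lines start with the module path
		for _, b := range banned {
			if strings.Contains(mod, b) && !seen[mod] {
				seen[mod] = true
				hits = append(hits, mod)
			}
		}
	}
	return hits
}

func main() {
	// Illustrative go.sum content with placeholder hashes.
	sample := `golang.org/x/crypto v0.21.0 h1:PLACEHOLDER=
golang.org/x/crypto v0.21.0/go.mod h1:PLACEHOLDER=
github.com/jackc/pgx/v5 v5.5.0 h1:PLACEHOLDER=
github.com/jackc/pgx/v5 v5.5.0/go.mod h1:PLACEHOLDER=
`
	banned := []string{"jackc/pgx", "lib/pq", "redis/go-redis"}
	hits := bannedModules(sample, banned)
	fmt.Println(hits)
	// A CI wrapper would exit non-zero when len(hits) > 0.
}
```

Checking go.sum rather than go.mod also catches storage drivers that arrive as transitive dependencies.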
Read more: REFACTOR_PLAN.md §4 polyrepo dependency graph.