This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
OSAPI is a Linux system management REST API and CLI written in Go 1.25. It uses NATS JetStream for distributed async job processing with a KV-first, stream-notification architecture.
For setup, building, testing, and contributing, see the Docusaurus docs:
- @docs/docs/sidebar/development/development.md - Prerequisites, setup, code style, testing, commit conventions
- @docs/docs/sidebar/development/contributing.md - PR workflow and contribution guidelines
- @docs/docs/sidebar/development/testing.md - How to run tests and list just recipes
- @docs/docs/sidebar/development/ui-development.md - UI prerequisites, setup, code style, components
- @docs/docs/sidebar/architecture/principles.md - Guiding principles (simplicity, minimalism, design philosophy)
- @docs/docs/sidebar/architecture/api-guidelines.md - API design guidelines (REST conventions, endpoint structure)
- @docs/docs/sidebar/architecture/ui.md - UI architecture, embedding, component layers
- @docs/docs/sidebar/usage/configuration.md - Configuration reference (osapi.yaml, env overrides)
- @docs/docs/sidebar/architecture/architecture.md - Architecture overview (links to system and job architecture)
Quick reference for common commands:

    just deps          # Install all dependencies
    just build         # Build production binary (React UI + Go)
    just test          # Run all tests (lint + unit + coverage)
    just go::unit      # Run unit tests only
    just go::unit-int  # Run integration tests (requires running osapi)
    just go::vet       # Run golangci-lint
    just go::fmt       # Auto-format (gofumpt + golines)
    go test -run TestName -v ./internal/job/...  # Run a single test
    just react::dev    # Start UI dev server (http://localhost:5173)
    just react::build  # Production UI build
    just react::lint   # Run ESLint on UI
    just react::fmt    # Format UI with Prettier

- `cmd/` - Cobra CLI commands (`client`, `node agent`, `controller.api`, `nats server`)
- `internal/controller/api/` - Echo REST API. Node-targeted handlers nest under `node/{domain}/`. Controller-only handlers are top-level (`job/`, `health/`, etc.). Each domain has its own `gen/` with OpenAPI spec. Combined spec: `api/gen/api.yaml`
- `internal/job/` - Job domain types, subject routing. `client/` for high-level ops
- `internal/agent/` - Node agent: consumer/handler/processor pipeline for job execution
- `internal/telemetry/tracing/` - OpenTelemetry tracer initialization, slog trace handler, context propagation
- `internal/telemetry/metrics/` - Per-component Prometheus metrics server with isolated registries
- `internal/provider/` - Operation implementations organized by category then domain. Browse the directory to see current providers
- `internal/telemetry/process/` - Agent self-metrics (CPU%, RSS, goroutines) and process condition evaluation for heartbeat
- `internal/controller/notify/` - Pluggable condition notification system: watches registry KV for condition transitions, dispatches via `Notifier` interface (`log` backend)
- `internal/config/` - Viper-based config from `osapi.yaml`. Struct fields use `validate` tags (same validator as API handlers). Defaults are set via `viper.SetDefault()` in `cmd/root.go`
- `pkg/sdk/` - Go SDK for programmatic REST API access (`client/` client library). See @docs/docs/sidebar/sdk/guidelines.md for SDK development rules
- Shared `nats-client` and `nats-server` are sibling repos linked via `replace` in `go.mod`
- `ui/` - React 19 + TypeScript + Vite + Tailwind CSS v4. Embedded into the Go binary at build time.
- `ui/src/sdk/gen/` - Generated TypeScript SDK (orval) from the combined OpenAPI spec. DO NOT EDIT.
- `ui/src/sdk/fetch.ts` - Hand-written fetch mutator: auth token + base URL wiring for orval.
- `ui/src/components/ui/` - Reusable UI primitives. Every visual pattern is a component.
- `ui/src/components/layout/` - Page structure (Navbar, PageLayout, ContentArea, NetworkMapBackground).
- `ui/src/components/domain/` - Domain-specific components (blocks, cards, pickers).
- `ui/src/hooks/` - Data fetching, state, keyboard navigation hooks.
- `ui/src/lib/` - `cn.ts`, `auth.tsx`, `permissions.ts`, `features.ts`.
- `ui/embed.go` - `//go:embed dist/*` directive exposing `ui.Assets`.
- `internal/controller/api/ui/` - SPA serving handler (static files + `index.html` fallback for client-side routing).
- Config: `controller.ui.enabled` in `osapi.yaml` (default `true`).

`//go:embed dist/*` requires `ui/dist/` to have files at compile time. Always use `just build` / `just test` / `just ready`, which run `just react::build` first. Running `go build` / `go test` directly without a prior UI build will fail.

- One component per file. Use `cva` for variants, `cn()` for conditional classes.
- Icons from lucide-react only. No inline styles; Tailwind only.
- Always use the `Text` component for styled text. Always use `Dropdown`, never `<select>`.
- Tailwind scale only (`text-xs`, `text-sm`). Never arbitrary pixel values.
- File naming: components `kebab-case.tsx`, hooks `use-kebab-case.ts`, utilities `kebab-case.ts`.
- Colors defined in `ui/src/index.css` via Tailwind v4 `@theme`. Never use raw hex values.
- Block = single operation (command, file deploy, cron create, etc.); Stack = saved composition of one or more blocks with targets. Do NOT use "runlist"; that term has been replaced.
- Regenerate the TypeScript SDK via `just generate` at the repository root (runs `redocly join`, `go generate`, copies the combined spec to `ui/src/sdk/gen/api.yaml`, and runs `orval`).

When adding a new domain, follow existing domains as reference.
Node-targeted operations live under
internal/controller/api/node/{domain}/. Controller-only operations
live under internal/controller/api/{domain}/. Read existing
domains before creating new ones — the codebase IS the reference.
Every domain MUST be consistent across all layers: provider, agent
processor, API handler, SDK service, CLI commands, docs, and tests.
When adding a new domain, look at a recently completed domain (like
ntp or sysctl) and replicate the same set of artifacts across
every layer. If something exists for sysctl but not for your new
domain, it's missing.
The principle: pick any existing domain and find/grep for it
across the codebase. Your new domain should appear in all the same
places. This includes code, tests, examples, SDK docs, CLI docs,
feature docs, docusaurus config, and permissions tables.
Providers are the operations layer — they execute the actual work on
agent hosts. Every operation under /node/{hostname}/... is backed
by a provider. The request flows:
CLI → SDK → REST API → Job Client → NATS → Agent → Provider
The provider runs on the agent, not the controller. It receives parameters from the job payload and returns a result.
Three provider patterns exist. Check existing providers for examples of each:
Direct providers interact with the system directly via commands or system calls. No file management.
Meta providers delegate file writes to file.Deployer for
SHA tracking, idempotency, and template rendering.
Direct-write providers manage their own config files via
avfs.VFS without file.Deployer. They use the osapi- filename
prefix to identify managed files.
Meta providers depend on `file.Deployer` (the narrow interface):

    type Deployer interface {
        Deploy(ctx context.Context, req DeployRequest) (*DeployResult, error)
        Undeploy(ctx context.Context, req UndeployRequest) (*UndeployResult, error)
    }

Meta providers store domain-specific metadata in the
FileState.Metadata map (e.g., schedule, interval, user for cron).
The file provider persists this in the file-state KV alongside SHA,
path, and mode — one KV bucket for all providers.
Reference: look at existing meta providers in the codebase.
Platform-specific providers:

    internal/provider/{category}/{domain}/
      types.go               - Provider interface + domain types
      debian.go              - Debian-family implementation
      debian_{operation}.go  - Per-operation file (large methods)
      debian_docker.go       - Container-aware variant (if needed)
      darwin.go              - macOS stub (returns ErrUnsupported)
      linux.go               - Generic Linux stub (returns ErrUnsupported)
      mocks/                 - Generated gomock mocks
      generate.go            - //go:generate mockgen directive

SDK-based providers (no platform variants):

    internal/provider/{category}/{domain}/
      types.go     - Provider interface + domain types
      {domain}.go  - Single implementation (e.g., docker.go)
      client.go    - API client interface for testing
      mocks/       - Generated gomock mocks
      generate.go  - //go:generate mockgen directive

For top-level providers: internal/provider/{domain}/.
For categorized providers: internal/provider/{category}/{domain}/.
Look at existing providers to see both patterns.

    // types.go - package {domain}
    type Provider interface {
        List(ctx context.Context) ([]Entry, error)
        Get(ctx context.Context, name string) (*Entry, error)
        Create(ctx context.Context, entry Entry) (*CreateResult, error)
        Update(ctx context.Context, entry Entry) (*UpdateResult, error)
        Delete(ctx context.Context, name string) (*DeleteResult, error)
    }

Every method takes `context.Context` as the first parameter.
Result types include `Changed bool` for mutations and `Error string`
for per-operation error reporting.
All provider mutations follow Ansible-style desired-state semantics. Operations MUST be idempotent:

| Operation | Resource exists | Resource absent |
|---|---|---|
| Create | `Changed: false, nil` | Creates it |
| Update | Updates it | Error (not found) |
| Delete | Removes it | `Changed: false, nil` |

- Create when the resource already exists returns success with `Changed: false`, since the desired state (present) is already met.
- Delete when the resource doesn't exist returns success with `Changed: false`, since the desired state (absent) is already met.
- Update when the resource doesn't exist returns an error because there is nothing to update.
- `ErrUnsupported` (wrong OS family) maps to `StatusSkipped` at the agent layer, which is distinct from `Changed: false`.
Every concrete provider struct MUST embed provider.FactsAware and
include a compile-time check:

    // Compile-time check: Debian must satisfy FactsSetter.
    var _ provider.FactsSetter = (*Debian)(nil)

    type Debian struct {
        provider.FactsAware
        logger *slog.Logger
        // ...
    }

The provider must also be passed to `provider.WireProviderFacts()`
in internal/agent/agent.go so facts are injected at startup.
OSAPI follows Ansible's OS family naming. Implementations are
selected at runtime via platform.Detect():

- `debian.go` - Debian family (Ubuntu, Debian, Raspbian)
- `darwin.go` - macOS (for development)
- `linux.go` - generic Linux fallback

Unsupported platforms return `provider.ErrUnsupported`. The agent
marks the job as skipped (not failed) so the caller knows the
operation isn't available on that host rather than broken.

    // darwin.go
    func (d *Darwin) List(
        _ context.Context,
    ) ([]Entry, error) {
        return nil, fmt.Errorf("cron: %w", provider.ErrUnsupported)
    }

There are three provider implementation patterns. The naming convention determines the struct name, constructor, and file layout.

1. Platform-specific providers (most common)
One struct per OS family, each in its own file. Constructor names
follow New{Platform}Provider(). Methods that are large or testable
go in separate files named {platform}_{operation}.go.

| Struct | Constructor | File(s) |
|---|---|---|
| `Debian` | `NewDebianProvider(...)` | `debian.go`, `debian_get_*.go` |
| `Darwin` | `NewDarwinProvider(...)` | `darwin.go`, `darwin_get_*.go` |
| `Linux` | `NewLinuxProvider()` | `linux.go`, `linux_get_*.go` |

Most providers under `node/` and `network/` follow this pattern.
2. Container-aware platform providers
When a provider's behavior differs inside a Docker container (e.g.,
hostname is read-only, DNS uses /etc/resolv.conf instead of
resolvectl), add a DebianDocker variant alongside the regular
Debian struct. The agent selects it via platform.IsContainer().

| Struct | Constructor | File(s) |
|---|---|---|
| `DebianDocker` | `NewDebianDockerProvider(...)` | `debian_docker.go`, `debian_docker_*.go` |

`DebianDocker` either embeds `Debian` (delegating reads, overriding
writes) or stands alone. It satisfies the same `Provider` interface.

    // agent_setup.go wiring
    case "debian":
        if platform.IsContainer() {
            hostProvider = nodeHost.NewDebianDockerProvider()
        } else {
            hostProvider = nodeHost.NewDebianProvider(execManager)
        }

Examples: `node/host` (embeds `Debian`, blocks `UpdateHostname`),
`network/dns` (standalone, reads `/etc/resolv.conf` directly).
3. SDK-based providers (no platform variants)
Providers that talk to an external API (not the OS) use a single
Client struct with New() / NewWithClient() constructors.
No debian.go / darwin.go / linux.go files — the provider
works the same on all platforms. Availability is checked at startup
(e.g., Docker daemon ping).

| Struct | Constructor | File(s) |
|---|---|---|
| `Client` | `New()` | `docker.go` |
| | `NewWithClient(c)` | (same file, testing) |

    // agent_setup.go wiring: no platform switch
    dockerClient, err := dockerNewFn()
    if err == nil {
        if pingErr := dockerClient.Ping(ctx); pingErr == nil {
            dockerProvider = dockerClient
        }
    }

Examples: `container/docker`.
Embed provider.FactsAware in the provider struct to access agent
facts (OS family, architecture, hostname, network interfaces) at
runtime. The agent wires facts via provider.WireProviderFacts().

    type Debian struct {
        provider.FactsAware
        logger *slog.Logger
        fs     avfs.VFS
    }

Facts are available in template rendering via `{{ .Facts.os_family }}`
when using the file provider's template support.
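
How a `.Facts` lookup resolves can be sketched with the standard library's `text/template`. Whether the file provider uses `text/template` internally is an assumption here, as are the `renderWithFacts` helper and the `hostname` fact key; only `os_family` appears in the text above.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderWithFacts exposes facts under the .Facts key of the template data.
// It is a hypothetical helper, not the file provider's API.
func renderWithFacts(
	body string,
	facts map[string]any,
) (string, error) {
	tmpl, err := template.New("managed-file").
		Option("missingkey=error").
		Parse(body)
	if err != nil {
		return "", fmt.Errorf("parse template: %w", err)
	}

	var out strings.Builder
	data := map[string]any{"Facts": facts}
	if err := tmpl.Execute(&out, data); err != nil {
		return "", fmt.Errorf("execute template: %w", err)
	}

	return out.String(), nil
}

func main() {
	facts := map[string]any{"os_family": "debian", "hostname": "web-01"}

	out, err := renderWithFacts(
		"# managed on {{ .Facts.hostname }} ({{ .Facts.os_family }})\n",
		facts,
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out) // # managed on web-01 (debian)
}
```

The `missingkey=error` option makes a typo in a fact name fail rendering instead of silently writing `<no value>` into a managed file.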
Two files connect a provider to the agent:

- `internal/agent/processor_{domain}.go`: create helper functions that dispatch sub-operations to the provider. If the domain gets its own category (like `schedule`, `docker`), create a `NewXxxProcessor` factory. If the domain belongs under an existing category (like `node`), add a `case` to that category's processor and delegate to helpers in a new file:

      // processor_{domain}.go
      func process{Domain}Operation(
          provider {domain}.Provider,
          logger *slog.Logger,
          req job.Request,
      ) (json.RawMessage, error) {
          // switch on sub-operation, call provider, marshal result
      }

- `cmd/agent_setup.go`: create the provider and register it with the `ProviderRegistry`. For new categories, use a separate `Register` call. For existing categories (e.g., `node`), pass the provider to the existing processor factory:

      // New category example (like schedule, docker):
      registry.Register("mydomain",
          agent.NewMyDomainProcessor(myProv, log), myProv)

      // Existing category example (node):
      // Add your provider as a parameter to NewNodeProcessor
      // and include it in the providers list for FactsAware wiring.
      // Read cmd/agent_setup.go to see the current parameter list.

That's it. No changes to `agent/types.go`, `agent/agent.go`, or
the `JobClient` interface. The registry handles dispatch and
FactsAware wiring automatically.
- Filesystem: Use `avfs` (`memfs.New()` for in-memory, `failfs.New()` for targeted error injection). Never use `afero`.
- Mocks: Use gomock for all interfaces (`FileDeployer`, `KeyValue`, `ObjectStore`). Generated mocks live in `{package}/mocks/`.
- Platform stubs: Test that Darwin and Linux stubs return `ErrUnsupported` for every method.
- export_test.go: Use for testing unexported variable swaps (e.g., `marshalJSON`). Public tests import via the bridge.
- Table-driven: One suite method per provider method, all scenarios as rows.

For node-targeted domains, create
internal/controller/api/node/{domain}/gen/ with three hand-written
files. For controller-only domains, create
internal/controller/api/{domain}/gen/ instead:
- `api.yaml` - OpenAPI spec with paths, schemas, and `BearerAuth` security
- `cfg.yaml` - oapi-codegen config (`strict-server: true`, import-mapping for `common/gen`)
- `generate.go` - `//go:generate` directive

Mutable domains MUST use separate verbs for create and update:

- `POST` - create a new resource (key/name in request body)
- `PUT` - update an existing resource (key/name from path parameter)
- `GET` - read/list resources
- `DELETE` - remove a resource

Do NOT combine create and update into a single "set" or "upsert"
endpoint. The cron domain is the reference: POST creates,
PUT /{name} updates. This separation gives clear 404 semantics
(update fails if not found, create fails if already exists) and
matches REST conventions.
The OpenAPI spec is the source of truth for input validation. All user input must be validated, and the spec must declare how:

- Request body properties: Add `x-oapi-codegen-extra-tags` with `validate:` tags. These generate Go struct tags that `validation.Struct()` enforces at runtime.

      properties:
        address:
          type: string
          x-oapi-codegen-extra-tags:
            validate: required,ip

- Path parameters (UUID): Use `format: uuid` on the schema. This causes oapi-codegen to generate the `openapi_types.UUID` type, and the router validates the format before the handler runs. No manual validation needed in the handler.

      parameters:
        - name: id
          in: path
          required: true
          schema:
            type: string
            format: uuid

- Query parameters: Place `x-oapi-codegen-extra-tags` at the parameter level (sibling of `name`/`in`/`schema`), NOT inside `schema:`. At parameter level, oapi-codegen generates `validate:` tags on the `*Params` struct fields. Use `enum` for constrained string values (generates `oneof` validation).

      parameters:
        - name: limit
          in: query
          required: false
          x-oapi-codegen-extra-tags:
            validate: omitempty,min=1,max=100
          schema:
            type: integer
            default: 20
            minimum: 1
            maximum: 100

  Then in the handler, validate with a single call:

      if errMsg, ok := validation.Struct(request.Params); !ok {
          return gen.GetFoo400JSONResponse{Error: &errMsg}, nil
      }

  NOTE: `x-oapi-codegen-extra-tags` on path parameters does NOT generate tags on `RequestObject` structs in strict-server mode (upstream limitation, see oapi-codegen issue). Keep the `x-oapi-codegen-extra-tags` in the spec for documentation and add a YAML comment noting validation is handled manually. Path params that need validation beyond `format: uuid` (e.g., `valid_target`) use a shared helper like `node.validateHostname()` which calls `validation.Var()`.

IMPORTANT: every endpoint with user input MUST have:

- `x-oapi-codegen-extra-tags` with `validate:` tags on all request body properties and query params in the OpenAPI spec
- `validation.Struct(request.Params)` in the handler for query params, `validation.Struct(request.Body)` for request bodies
- A `400` response defined in the OpenAPI spec for endpoints that accept user input
- HTTP wiring tests (`TestXxxHTTP` / `TestXxxRBACHTTP` methods in the `*_public_test.go` suite) that send raw HTTP through the full Echo middleware stack and verify:
  - Validation errors return correct status codes and error messages
  - RBAC: 401 (no token), 403 (wrong permissions), 200 (valid token)

Defense-in-depth validation: When validation calls cannot
currently fail (e.g., all fields use omitempty), keep the call
but add a comment explaining why. This guards against future field
additions breaking validation silently:

    // Defense in depth: current fields use omitempty so validation
    // always passes, but guards against future field additions.
    if errMsg, ok := validation.Struct(request.Body); !ok {
        return gen.PostFoo400JSONResponse{Error: &errMsg}, nil
    }

For node-targeted domains, create
`internal/controller/api/node/{domain}/`. For controller-only
domains, create `internal/controller/api/{domain}/`:

- `types.go` - domain struct, dependency interfaces (e.g., `Checker`)
- `{domain}.go` - `New()` factory, compile-time interface check: `var _ gen.StrictServerInterface = (*Domain)(nil)`
- One file per endpoint (e.g., `{operation}_get.go`). Every handler that accepts user input MUST call `validation.Struct()` and return a 400 on failure.
- Tests: `{operation}_get_public_test.go` (testify/suite, table-driven). Must cover validation failures (400), success, and error paths. Each public test suite also includes HTTP wiring methods:
  - `TestXxxHTTP` - sends raw HTTP through the full Echo middleware stack to verify validation (valid input, invalid input → 400).
  - `TestXxxRBACHTTP` - verifies auth middleware: no token (401), wrong permissions (403), valid token (200). Uses `api.New()` + `{domain}.Handler()` + `server.RegisterHandlers()` to wire through `ScopeMiddleware`.

  Follow existing handler test files in the codebase.

Every operation under /node/{hostname}/... MUST support broadcast
targeting (_all, _any, hostname, label selectors). The handler
checks job.IsBroadcastTarget(hostname) and routes to a broadcast
function. Both single-target and broadcast paths return the same
collection response shape.
Response pattern (all node-targeted operations):

    {
      "job_id": "...",
      "results": [
        {"hostname": "web-01", "error": "", ...domain fields...},
        {"hostname": "web-02", "error": "unsupported", ...}
      ]
    }

Every result item MUST have `hostname` and `error` fields.
Single-target returns 1 result; broadcast returns N results.
Failed/skipped agents appear as entries with error set.
Handler pattern:

    func (s *Handler) PostOperation(ctx, request) {
        validate(request)
        hostname := request.Hostname
        if job.IsBroadcastTarget(hostname) {
            return s.postOperationBroadcast(ctx, hostname, ...)
        }
        // Single-target: wrap in collection with 1 result.
    }

Job client: the `JobClient` interface has 4 generic methods:
`Query`, `QueryBroadcast`, `Modify`, `ModifyBroadcast`. Handlers
call these with a category string and operation constant. No new
methods are needed when adding operations. Example:

    jobID, resp, err := s.JobClient.Modify(
        ctx, hostname, "node", job.OperationSysctlCreate, data)

Read existing handlers in the codebase for reference.
Each domain package exports a Handler() function that creates the
handler, wraps it with auth middleware, and returns route
registration closures. No changes to the Server struct are needed.
Create `handler.go` in your domain package:

    // internal/controller/api/node/{domain}/handler.go
    package {domain}

    func Handler(
        logger *slog.Logger,
        jobClient client.JobClient,
        signingKey string,
        customRoles map[string][]string,
    ) []func(e *echo.Echo) {
        var tokenManager api.TokenValidator = authtoken.New(logger)

        h := New(logger, jobClient)

        strictHandler := gen.NewStrictHandler(h,
            []gen.StrictMiddlewareFunc{
                func(handler strictecho.StrictEchoHandlerFunc,
                    _ string,
                ) strictecho.StrictEchoHandlerFunc {
                    return api.ScopeMiddleware(
                        handler, tokenManager, signingKey,
                        gen.BearerAuthScopes, customRoles,
                    )
                },
            },
        )

        return []func(e *echo.Echo){
            func(e *echo.Echo) {
                gen.RegisterHandlers(e, strictHandler)
            },
        }
    }

Add a `handler_public_test.go` that tests route registration and
middleware execution. Follow existing domain handler tests.

- `cmd/controller_setup.go` - add one line to `registerControllerHandlers`:

      handlers = append(handlers,
          {domain}API.Handler(log, jc, signingKey, customRoles)...)

  Add the import for your domain package.

The SDK client library lives in pkg/sdk/client/. Its generated HTTP client
uses the same combined OpenAPI spec as the server
(internal/controller/api/gen/api.yaml). Follow the rules in
@docs/docs/sidebar/sdk/guidelines.md — especially: never expose gen
types in public method signatures, add JSON tags to all result types,
and wrap errors with context.
When modifying existing API specs:

- Make changes to the domain's `gen/api.yaml` (under `api/node/{domain}/` for node-targeted domains or `api/{domain}/` for controller-only domains)
- Run `just generate` to regenerate server code (this also regenerates the combined spec via `redocly join`)
- Run `go generate ./pkg/sdk/client/gen/...` to regenerate the SDK client
- Update the SDK service wrappers in `pkg/sdk/client/{domain}.go` if new response codes were added
- Update CLI switch blocks in `cmd/` if new response codes were added

When adding a new API domain:

- Add a service with four files in `pkg/sdk/client/`:
  - `{service}.go` - `{Service}Service` struct + methods
  - `{service}_types.go` - SDK result types + gen→SDK conversions
  - `{service}_public_test.go` - service method tests
  - `{service}_types_public_test.go` - conversion function tests

  Each service gets its own files. Do NOT add methods or types to an existing service's files.
- Add a field to the `Client` struct in `osapi.go` and wire it in `New()`
- Run `go generate ./pkg/sdk/client/gen/...` to pick up the new domain's spec from the combined `api.yaml`
- Add an SDK example in `examples/sdk/client/{service}.go`, one file per SDK service (e.g., `hostname.go`, `disk.go`, `ntp.go`). The example file name matches the Client field name in lowercase.
- Add an SDK doc page under the appropriate category subdirectory in `docs/docs/sidebar/sdk/client/`. SDK docs are grouped by concern (e.g., `node-info/`, `system-config/`, `operations/`). Place the new page in the matching group; look at the existing directory structure to find the right one. Use the Client field name as the page title (e.g., `# Power`), NOT the Go struct name. Update `client.md` to add the service to its category table.
- Add the new service to the SDK navbar dropdown in `docs/docusaurus.config.ts` under the matching category header. The dropdown is grouped the same way as the sidebar.

Method names MUST be clean verbs — NEVER repeat the service name.
The service struct already provides the namespace. Stuttering like
SysctlService.SysctlGet() is wrong — use SysctlService.Get().
Standard verbs:

| Verb | HTTP | Description |
|---|---|---|
| `List` | GET | List collection |
| `Get` | GET | Get single resource / read state |
| `Create` | POST | Create new resource |
| `Update` | PUT | Update existing resource |
| `Delete` | DELETE | Remove resource |

Rare exceptions for action operations (no persistent resource):

- `Ping.Do()` - one-shot action
- `Command.Exec()`, `Command.Shell()` - execute commands

Examples:

    // GOOD: clean verbs, no stuttering
    client.Sysctl.Get(ctx, host, key)
    client.Cron.Create(ctx, host, opts)
    client.Hostname.Update(ctx, host, name)
    client.NTP.Delete(ctx, host)
    client.Timezone.Get(ctx, host)

    // BAD: stuttering, repeats service name
    client.Sysctl.SysctlGet(ctx, host, key)
    client.NTP.NtpCreate(ctx, host, opts)

SDK examples live in `examples/sdk/client/`, one file per SDK
service. Follow the same principles as the orchestrator examples:

- One service per file: demonstrate the service's SDK operations. Don't mix in other services.
- Self-contained: for read-only operations, just call and print. For mutating operations, clean up at the start so the example is repeatable.
- Print results: decode and print at least one result so the example isn't silent.
- Keep it short: under ~100 lines of code (excluding license).
- Handle errors inline: use `log.Fatalf` for unexpected errors. For operations that may fail on some platforms, check the error and print a message instead of crashing.

- `cmd/client_node_{domain}.go` - parent command registered under `clientNodeCmd` (for node-targeted domains)
- `cmd/client_node_{domain}_{operation}.go` - one subcommand per endpoint (e.g., `client_node_sysctl_get.go`)
- All commands support `--json` for raw output
- Use `printKV` for inline key-value output and `printStyledTable` for multi-row tabular data (both in `cmd/ui.go`)
- Use flags (e.g., `--job-id`, `--audit-id`) instead of positional args for resource IDs
- Handle all API response codes in the `switch resp.StatusCode()` block: 200, 400 (`handleUnknownError`), 401/403 (`handleAuthError`), 404 (`handleUnknownError`), 500 (`handleUnknownError`). Match the responses declared in the OpenAPI spec.
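
The status-code switch in the last item can be sketched as below. The `classify` function and its labels are hypothetical stand-ins; the real CLI calls its own `handleAuthError` and `handleUnknownError` helpers, whose signatures are not shown here.

```go
package main

import (
	"fmt"
	"net/http"
)

// classify groups status codes the way the switch described above does.
// The returned labels are placeholders for the CLI's handler helpers.
func classify(
	status int,
) string {
	switch status {
	case http.StatusOK:
		return "success"
	case http.StatusUnauthorized, http.StatusForbidden:
		return "auth error"
	case http.StatusBadRequest, http.StatusNotFound, http.StatusInternalServerError:
		return "unknown error"
	default:
		return "unexpected status"
	}
}

func main() {
	for _, code := range []int{200, 400, 401, 403, 404, 500, 418} {
		fmt.Printf("%d -> %s\n", code, classify(code))
	}
}
```

Keeping an explicit `default` branch means a response code added to the spec later is surfaced instead of silently ignored.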
- `docs/docs/sidebar/features/{domain}-management.md` - feature page. Follow existing feature pages for the template.
- `docs/docs/sidebar/usage/cli/client/node/{domain}/{domain}.md` - CLI landing page with `<DocCardList />`
- `docs/docs/sidebar/usage/cli/client/node/{domain}/{verb}.md` - one page per CLI subcommand (e.g., `get.md`, `create.md`, `update.md`)
- Update `docs/docusaurus.config.ts`:
  - Add the new feature to the "Features" navbar dropdown
  - Add the new SDK service to the "SDK" → "Client Library" dropdown
- Update `docs/docs/sidebar/features/features.md` - add the new domain to the features landing page table
- Update `docs/docs/sidebar/usage/configuration.md` - add any new permissions to the roles table and permissions comments in the YAML reference
- Update `docs/docs/sidebar/features/authentication.md` - add new permissions to the roles/permissions tables
- Update `docs/docs/sidebar/architecture/architecture.md` - add link to the new feature page in the features list
- Update `docs/docs/sidebar/architecture/api-guidelines.md` - add new endpoints to the path pattern table
- Update `docs/docs/sidebar/architecture/system-architecture.md` - add endpoints to the health/endpoint tables if applicable

Verify with:

    just generate   # regenerate specs + code
    go build ./...  # compiles
    just go::unit   # tests pass
    just go::vet    # lint passes

ALL function signatures MUST use multi-line format:

    func FunctionName(
        param1 type1,
        param2 type2,
    ) (returnType, error) {
    }

Test layers:

- Unit tests (`*_test.go`, `*_public_test.go`): fast, mocked dependencies, run with `just go::unit`. Includes `TestXxxHTTP` / `TestXxxRBACHTTP` methods that send raw HTTP through real Echo middleware with mocked backends.
- Integration tests (`test/integration/`): build and start a real `osapi` binary, exercise CLI commands end-to-end. Guarded by the `//go:build integration` tag, run with `just go::unit-int`. New API domains should include a `{domain}_test.go` smoke suite. Write tests (mutations) must be guarded by `skipWrite(s.T())` so CI can run read-only tests by default (`OSAPI_INTEGRATION_WRITES=1` enables writes).

Conventions:

- ALL tests MUST use `testify/suite` with table-driven patterns
- Public tests: `*_public_test.go` in test package (e.g., `package job_test`) for exported functions. This is the default; all new tests should be public tests.
- Suite naming: `*_public_test.go` → `{Name}PublicTestSuite`
- Table-driven structure with `validateFunc` callbacks
- One suite method per function under test: all scenarios (success, errors, edge cases) as rows in one table
- Avoid generic file names like `helpers.go` or `utils.go`; name files after what they contain
- `types.go` is for types only: `types.go` files MUST contain only type definitions (structs, interfaces, constants, type aliases). Never put functions or methods in `types.go`; put them in a file named after what they do (e.g., `nats.go` for NATS config methods, `options.go` for option functions).
- Test file naming: every test file MUST have a corresponding production file with a matching name. `foo_public_test.go` tests `foo.go`. Never create test files with names that don't match a production file (e.g., don't create `check_error_public_test.go` if the code lives in `response.go`; name it `response_public_test.go`). If a production file is too large and you want to split tests by concern, split the production file first (e.g., `agent.go` → `agent.go` + `agent_drain.go` + `agent_timeline.go`), then create matching test files.
- Always use gomock (`go:generate mockgen`) for interface mocks. Generated mocks live in `{package}/mocks/` directories alongside their source interfaces. Never hand-roll mock structs.
- export_test.go pattern for testing unexported internals: create an `export_test.go` file in the production package that exposes unexported variables or functions to the `_test` package:

      // export_test.go - package file
      package file

      func SetMarshalJSON(fn func(interface{}) ([]byte, error)) {
          marshalJSON = fn
      }

      func ResetMarshalJSON() {
          marshalJSON = json.Marshal
      }

  Public tests then call `file.SetMarshalJSON(...)` and `defer file.ResetMarshalJSON()`. This avoids internal tests, import cycles, and hand-rolled stubs.
- TearDownSubTest: use `suite.TearDownSubTest()` to reset swapped variables between table-driven sub-tests, not `defer` inside the loop.
- Filesystem testing: use `avfs` (`memfs.New()` for in-memory, `failfs.New()` for targeted error injection). Never use `afero`. The only exception for hand-rolled types is stdlib interfaces like `fs.FS` or `net.Conn` where gomock is impractical.
- Non-blocking lifecycle: `Start()` returns immediately, `Stop(ctx)` shuts down with deadline
- Error wrapping: `fmt.Errorf("context: %w", err)`
- Early returns over nested if-else
- Unused parameters: rename to `_`
- Import order: stdlib, third-party, local (blank-line separated)
All logging uses Go's log/slog structured logger. Follow these rules:
- Subsystem labels: Every component that holds a logger MUST wrap it with `logger.With(slog.String("subsystem", "..."))` at construction time. This auto-tags every log line from that component. Examples: `"agent"`, `"agent.seed"`, `"api.schedule"`, `"provider.file"`, `"job.client"`, `"metrics"`, `"controller.heartbeat"`.
- Always use typed attributes: Use `slog.String("key", val)`, `slog.Int("key", val)`, `slog.Bool("key", val)`, `slog.Any("key", val)`. Never use positional pairs like `"key", val`: they compile but bypass type safety and are inconsistent with the codebase.
- Standard field names: `error` for errors, `hostname` for hosts, `path` for file paths, `job_id` for job IDs, `name` for entry names, `addr` for addresses.
- Error fields: Use `slog.String("error", err.Error())` for string context or `slog.Any("error", err)` to preserve the error type.
- Log levels: `Debug` for operation dispatch and idempotency skips, `Info` for lifecycle events and state changes, `Warn` for degraded but functional states, `Error` for failures that need attention.
golangci-lint with: errcheck, errname, goimports, govet, prealloc, predeclared, revive, staticcheck. Generated files (*.gen.go, *.pb.go) are excluded from formatting.
See @docs/docs/sidebar/development/development.md#branching for full conventions.
When committing changes via /commit, create a feature branch first if
currently on main. Branch names use the pattern type/short-description
(e.g., feat/add-dns-retry, fix/memory-leak, docs/update-readme).
See @docs/docs/sidebar/development/development.md#commit-messages for full conventions.
Follow Conventional Commits with the
50/72 rule. Format: type(scope): description.
When committing via Claude Code, end with:

    🤖 Generated with [Claude Code](https://claude.ai/code)

    Co-Authored-By: Claude <noreply@anthropic.com>
Implementation planning and execution uses the superpowers plugin workflows
(writing-plans and executing-plans). Plans live in docs/plans/.