feat: add Kubernetes-style metadata to all domain objects

## Problem Statement

OpenShell's top-level domain objects (Sandbox, Provider, etc.) lack a consistent metadata structure. Each object should carry a human-readable name, a creation timestamp, and a set of labels (key-value pairs) usable for filtering, similar to Kubernetes `ObjectMeta`. Today metadata is tracked inconsistently: some fields exist only in the database layer and are not exposed in the API, and there is no label-based filtering capability.

This feature would enable users to organize and query resources using labels (e.g., `openshell sandbox list --selector env=prod,tier=frontend`), improving resource management and automation workflows.

## Technical Context

OpenShell already tracks `name`, `id`, `created_at_ms`, and `updated_at_ms` in the persistence layer (the `objects` table), but this metadata is not consistently exposed in proto messages. The `Sandbox` proto has `created_at_ms`, but `Provider` does not. Labels are completely absent—the only label-like feature is `SandboxTemplate.labels`, which passes labels through to Kubernetes Pods, not for filtering OpenShell resources.

The system uses a generic `ObjectRecord` persistence model with a `UNIQUE (object_type, name)` constraint, making it well-suited for metadata extension. The current list operations (`list(object_type, limit, offset)`) have no filtering capabilities beyond pagination.

## Affected Components

| Component | Key Files | Role |
|-----------|-----------|------|
| Proto definitions | `proto/openshell.proto`, `proto/datamodel.proto` | Define API surface for Sandbox, Provider, and other domain objects |
| Persistence layer | `crates/openshell-server/src/persistence/mod.rs`, `migrations/{sqlite,postgres}/` | Store and query objects; needs a label column and filtering logic |
| gRPC handlers | `crates/openshell-server/src/grpc/sandbox.rs`, `crates/openshell-server/src/grpc/provider.rs` | Populate metadata in responses, parse labels in create/update requests |
| CLI | `crates/openshell-cli/src/main.rs`, `crates/openshell-cli/src/run.rs` | Add `--label` and `--selector` flags |
| Python SDK | `python/openshell/sandbox.py` | Expose labels in Python API |

## Technical Investigation

### Architecture Overview

OpenShell persists domain objects in a generic `objects` table with this schema:

```sql
CREATE TABLE objects (
    object_type TEXT NOT NULL,      -- "sandbox", "provider", "ssh_session", etc.
    id TEXT NOT NULL PRIMARY KEY,
    name TEXT NOT NULL,              -- human-friendly name (unique per type)
    payload BYTEA NOT NULL,          -- protobuf-encoded message
    created_at_ms BIGINT NOT NULL,   -- creation timestamp
    updated_at_ms BIGINT NOT NULL,   -- last update timestamp
    UNIQUE (object_type, name)
);
```

The persistence layer abstracts this via `ObjectRecord` and generic `put_message<T>()` / `get_message<T>()` methods that serialize/deserialize proto messages. List operations are simple: `list(object_type, limit, offset)` ordered by `created_at_ms ASC, name ASC`.

**Top-level domain objects:**
- **Sandbox** (`proto/openshell.proto:156`) — currently has `id`, `name`, `namespace`, `created_at_ms`, `current_policy_version`
- **Provider** (`proto/datamodel.proto:9`) — currently has `id`, `name`, `type`, `credentials`, `config` (no timestamps)
- **SshSession** (`proto/openshell.proto:405`) — has `id`, `name`, `sandbox_id`, `created_at_ms`, `expires_at_ms`, `revoked`
- **InferenceRoute** (Rust-only, `crates/openshell-server/src/inference.rs:35`) — has `id`, `name`, `provider_name`, `base_url`

The system already has Kubernetes influence: it uses K3s as the compute driver, and `SandboxTemplate` has `labels` and `annotations` that are passed through to Kubernetes Pods (not used for filtering OpenShell resources).

### Code References

| Location | Description |
|----------|-------------|
| `proto/openshell.proto:156` | Sandbox message definition |
| `proto/datamodel.proto:9` | Provider message definition |
| `crates/openshell-server/src/persistence/mod.rs:18` | `ObjectRecord` struct and persistence abstraction |
| `crates/openshell-server/src/persistence/mod.rs:131` | `list()` method signature (no filtering) |
| `crates/openshell-server/src/persistence/sqlite.rs:155` | SQLite list implementation |
| `crates/openshell-server/src/persistence/postgres.rs:132` | Postgres list implementation |
| `crates/openshell-server/src/grpc/sandbox.rs:46` | `CreateSandbox` handler |
| `crates/openshell-server/src/grpc/sandbox.rs:145` | `ListSandboxes` handler |
| `crates/openshell-server/src/grpc/provider.rs:297` | `ListProviders` handler |
| `crates/openshell-server/migrations/sqlite/001_create_objects.sql:1` | Database schema |
| `crates/openshell-cli/src/main.rs:1084` | CLI sandbox create command |
| `crates/openshell-cli/src/main.rs:651` | CLI provider create command |

### Current Behavior

**Creating a resource:**
- User calls `openshell sandbox create my-sandbox --image=...`
- CLI builds a `CreateSandboxRequest` with `name`, `spec`, etc.
- gRPC handler generates an `id` (UUID), sets `created_at_ms = now()`, stores in DB
- The `created_at_ms` is **manually set** in the handler (`grpc/sandbox.rs:147`) and returned in the response, but this is not done consistently for all object types

**Listing resources:**
- User calls `openshell sandbox list`
- CLI sends `ListSandboxesRequest` with `limit` and `offset`
- Handler queries `list("sandbox", limit, offset)` from DB
- Returns all sandboxes ordered by creation time
- **No filtering by labels or any other field**

**Metadata gaps:**
- `Provider` messages have no timestamp fields at all
- Labels don't exist anywhere (can't filter resources by labels)
- No shared metadata structure across objects

### What Would Need to Change

**1. Proto definitions:**

Add a shared `ObjectMeta` message and refactor domain objects to use it:

```protobuf
// proto/openshell.proto
message ObjectMeta {
  string id = 1;
  string name = 2;
  int64 created_at_ms = 3;
  map<string, string> labels = 4;
  int64 resource_version = 5;  // Incremented on each update for optimistic concurrency control
}

message Sandbox {
  ObjectMeta metadata = 1;          // NEW: replaces inline id, name, created_at_ms
  SandboxSpec spec = 2;              // renumbered from 4
  SandboxStatus status = 3;          // renumbered from 5
  SandboxPhase phase = 4;            // renumbered from 6
  uint32 current_policy_version = 5; // renumbered from 8
  // REMOVED: namespace field (now internal to compute driver only)
}

message Provider {
  ObjectMeta metadata = 1;          // NEW: replaces inline id, name, adds timestamps and labels
  string type = 2;                   // renumbered from 3
  map<string, string> credentials = 3; // renumbered from 4
  map<string, string> config = 4;    // renumbered from 5
}

message SshSession {
  ObjectMeta metadata = 1;          // NEW: replaces inline id, name, created_at_ms
  string sandbox_id = 2;             // renumbered from 4
  string token = 3;                  // renumbered from 5
  int64 expires_at_ms = 4;           // renumbered from 6
  bool revoked = 5;                  // renumbered from 7
}
```

**Note on `namespace` removal:**
The `namespace` field is being removed from the public `Sandbox` message because:
- It's not user-controllable (automatically set from server config)
- It's specific to the Kubernetes driver implementation
- It remains in `compute_driver.proto` (`DriverSandbox.namespace`) as an internal driver detail
- If needed for observability, it can be exposed later as driver-specific status information

**2. Database schema:**

Add `labels` and `resource_version` columns to the `objects` table:

```sql
-- migrations/sqlite/00X_add_labels.sql (SQLite: labels stored as a JSON string)
ALTER TABLE objects ADD COLUMN labels TEXT;
ALTER TABLE objects ADD COLUMN resource_version BIGINT NOT NULL DEFAULT 1;

-- migrations/postgres/00X_add_labels.sql (Postgres: native JSONB)
ALTER TABLE objects ADD COLUMN labels JSONB;
ALTER TABLE objects ADD COLUMN resource_version BIGINT NOT NULL DEFAULT 1;

Backfill existing rows: `UPDATE objects SET labels = '{}' WHERE labels IS NULL;`

**3. Persistence layer:**

Update `persistence/mod.rs`:
- Modify `ObjectRecord` to include `labels: Option<String>` (JSON-serialized) and `resource_version: i64`
- Update `put_message<T>()` to extract labels from proto, store in DB column, and increment `resource_version` on updates
- Update `get_message<T>()` to deserialize labels from DB and populate proto field
- Add `list_with_selector(object_type, label_selector, limit, offset)` method
- Parse simple label selector syntax: `key=value,key2=value2` (comma-separated equality matches)

Update SQLite/Postgres implementations:
- **Postgres:** Use the JSONB containment operator on the `labels` column for filtering, e.g. `labels @> '{"key": "value"}'::jsonb`
- **SQLite:** Parse JSON in application layer (or use `json_extract()` for simple cases)
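
The selector parsing and the application-layer fallback for SQLite could look roughly like the sketch below. It assumes labels have already been decoded from the stored JSON into a map. `Requirement`, `parse_selector`, and `selector_matches` are illustrative names, not existing OpenShell APIs.

```rust
use std::collections::BTreeMap;

/// One `key=value` equality requirement from a selector string.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Requirement {
    pub key: String,
    pub value: String,
}

/// Parses `key=value,key2=value2` into equality requirements.
/// An empty selector yields no requirements, which matches every object.
pub fn parse_selector(selector: &str) -> Result<Vec<Requirement>, String> {
    let trimmed = selector.trim();
    if trimmed.is_empty() {
        return Ok(Vec::new());
    }
    let mut requirements = Vec::new();
    for term in trimmed.split(',') {
        let (key, value) = term
            .split_once('=')
            .ok_or_else(|| format!("selector term '{term}' is missing '='"))?;
        let (key, value) = (key.trim(), value.trim());
        if key.is_empty() {
            return Err(format!("selector term '{term}' has an empty key"));
        }
        requirements.push(Requirement {
            key: key.to_string(),
            value: value.to_string(),
        });
    }
    Ok(requirements)
}

/// Returns true when every requirement is satisfied by `labels`.
pub fn selector_matches(labels: &BTreeMap<String, String>, requirements: &[Requirement]) -> bool {
    requirements
        .iter()
        .all(|r| labels.get(&r.key).map(String::as_str) == Some(r.value.as_str()))
}
```

If filtering happens in the application rather than in SQL, it would probably need to run before `limit` and `offset` are applied, otherwise pages could come back shorter than requested.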

**4. Label validation:**

Enforce Kubernetes-style label validation at the gRPC handler boundary:
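- Keys and values must be alphanumeric plus `-`, `_`, and `.`; keys may also use `/` to separate an optional prefix from the name (Kubernetes itself does not allow `/` in label values)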
- Maximum 63 characters per segment (prefix/name split by `/`)
- Reject invalid labels with descriptive error messages
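
A minimal validator for these rules might look like the following sketch. Two details are assumptions borrowed from Kubernetes practice rather than from the rules above: segments must start and end with an alphanumeric character, and values may be empty but may not contain `/`. `validate_label` is a hypothetical helper name.

```rust
/// Maximum length of one key segment or value, per the rules above.
const MAX_SEGMENT_LEN: usize = 63;

/// Checks one segment: non-empty, at most 63 characters, allowed characters
/// only, and (assumption) alphanumeric at both ends.
fn is_valid_segment(segment: &str) -> bool {
    let allowed = |c: char| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.');
    let first_ok = segment.chars().next().map_or(false, |c| c.is_ascii_alphanumeric());
    let last_ok = segment.chars().last().map_or(false, |c| c.is_ascii_alphanumeric());
    segment.len() <= MAX_SEGMENT_LEN && first_ok && last_ok && segment.chars().all(allowed)
}

/// Validates a single label, returning a descriptive message on failure.
pub fn validate_label(key: &str, value: &str) -> Result<(), String> {
    // A key is either `name` or `prefix/name`, so at most one '/' is allowed.
    if key.split('/').count() > 2 {
        return Err(format!("label key '{key}' has more than one '/'"));
    }
    for segment in key.split('/') {
        if !is_valid_segment(segment) {
            return Err(format!("label key '{key}' has an invalid segment '{segment}'"));
        }
    }
    // Assumption: an empty value is accepted, as it is in Kubernetes.
    if !value.is_empty() && !is_valid_segment(value) {
        return Err(format!("label value '{value}' is invalid for key '{key}'"));
    }
    Ok(())
}
```

A gRPC handler could call this for every entry in the request's label map and turn the first error into an invalid-argument response.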

**5. gRPC handlers:**

Update `grpc/sandbox.rs` and `grpc/provider.rs`:
- Validate labels on create/update requests (enforce Kubernetes conventions)
- Populate `metadata.labels` and `metadata.resource_version` in create responses
- Accept labels in create/update requests
- Add `label_selector` field to `ListSandboxesRequest` / `ListProvidersRequest`
- Call `list_with_selector()` instead of `list()`
- Implement optimistic concurrency: check `resource_version` on updates, return conflict error if mismatch
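
The optimistic concurrency check in the last bullet can be illustrated with an in-memory stand-in for the `objects` table. This is only a sketch: `InMemoryStore`, `StoredObject`, and `UpdateError` are hypothetical names, and the real check would live in the persistence layer.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a row in the `objects` table.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoredObject {
    pub payload: Vec<u8>,
    pub resource_version: i64,
}

/// Reasons an update can be refused.
#[derive(Debug, PartialEq, Eq)]
pub enum UpdateError {
    NotFound,
    Conflict { expected: i64, actual: i64 },
}

#[derive(Default)]
pub struct InMemoryStore {
    objects: HashMap<String, StoredObject>,
}

impl InMemoryStore {
    /// Inserts a new object at resource_version 1, matching the column default.
    pub fn create(&mut self, id: &str, payload: Vec<u8>) -> i64 {
        let object = StoredObject { payload, resource_version: 1 };
        self.objects.insert(id.to_string(), object);
        1
    }

    /// Applies the update only if the caller read the latest version,
    /// then bumps the version and returns the new value.
    pub fn update(
        &mut self,
        id: &str,
        expected_version: i64,
        payload: Vec<u8>,
    ) -> Result<i64, UpdateError> {
        let object = self.objects.get_mut(id).ok_or(UpdateError::NotFound)?;
        if object.resource_version != expected_version {
            return Err(UpdateError::Conflict {
                expected: expected_version,
                actual: object.resource_version,
            });
        }
        object.payload = payload;
        object.resource_version += 1;
        Ok(object.resource_version)
    }
}
```

Against a SQL backend the same idea is commonly expressed as an `UPDATE ... WHERE id = ? AND resource_version = ?` whose affected-row count is checked, so that the comparison and the write happen atomically.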

**6. CLI:**

Add flags to `openshell-cli/src/main.rs`:
- `openshell sandbox create --label key=value` — set labels on creation (repeatable flag)
- `openshell sandbox list --selector key=value,key2=value2` — filter by labels (simple equality syntax)
- Similar for `provider` commands

Parse label syntax and send in gRPC requests.
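
As a rough sketch of the CLI side, the values collected from a repeatable `--label` flag could be folded into a map as shown below. `collect_labels` is a hypothetical helper, and rejecting duplicate keys (rather than letting the last one win) is an assumption, not something the issue specifies.

```rust
use std::collections::BTreeMap;

/// Collects the values of a repeatable `--label key=value` flag into a map.
/// Duplicate keys are rejected rather than silently overwritten (assumption).
pub fn collect_labels(values: &[&str]) -> Result<BTreeMap<String, String>, String> {
    let mut labels = BTreeMap::new();
    for raw in values {
        let (key, value) = raw
            .split_once('=')
            .ok_or_else(|| format!("expected key=value, got '{raw}'"))?;
        if key.is_empty() {
            return Err(format!("label '{raw}' has an empty key"));
        }
        if labels.insert(key.to_string(), value.to_string()).is_some() {
            return Err(format!("label key '{key}' was given more than once"));
        }
    }
    Ok(labels)
}
```

The resulting map can be copied into the labels field of the create request; full character and length validation would still happen on the server.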

**7. Python SDK:**

Expose `sandbox.metadata.labels` and `sandbox.metadata.resource_version` in `python/openshell/sandbox.py`.

**8. Namespace field removal:**

Since `namespace` is being removed from the public `Sandbox` message:
- Update `grpc/sandbox.rs:89` — remove the line that sets `namespace` from server config
- Update `grpc/sandbox.rs:94` — remove `namespace` from `Sandbox` struct initialization
- Update `compute/mod.rs:434` and `compute/mod.rs:472` — remove `namespace` field assignments
- Update `cli/src/run.rs:2754` — remove namespace display from sandbox details view
- Update `cli/src/run.rs:3009` — remove namespace column from sandbox list table
- The `namespace` field remains in `DriverSandbox` (compute driver proto) for internal use by the Kubernetes driver

### Alternative Approaches Considered

**Shared `ObjectMeta` vs. Inline metadata fields:**

The design uses a **shared `ObjectMeta` message** (Kubernetes-like pattern) because:
- Provides consistent metadata structure across all domain objects
- Easy to extend in the future (add `annotations`, `deletion_timestamp`, etc.)
- Enables future identity tracking (e.g., `created_by`, `updated_by` fields) once the control plane integrates authentication/authorization — having a shared metadata structure means identity fields can be added once and apply to all resources
- Matches Kubernetes mental model (familiar to users)
- Eliminates field duplication across messages
- Since this is a new project, breaking changes are acceptable in favor of clean design

**Label storage:**
- **JSONB (Postgres)** — native indexing, fast queries, supports rich querying
- **TEXT (SQLite)** — stored as JSON string, parsed in application or via `json_extract()`

**Recommendation:** Use JSONB for Postgres, TEXT for SQLite. Document Postgres as recommended for production if label filtering performance matters.

**Namespace field removal:**

The `namespace` field is being removed from the public `Sandbox` message because:
- It's not user-controllable (set from server config `sandbox_namespace`, defaults to `"default"`)
- It's specific to the Kubernetes driver (doesn't apply to VM or other drivers)
- It remains in `compute_driver.proto` as `DriverSandbox.namespace` for internal driver use
- Removing it from the public API reduces clutter and keeps implementation details internal
- If needed for debugging/observability, it can be exposed later as driver-specific status

### Patterns to Follow

**Timestamp handling:**
- The codebase uses `int64` milliseconds (`created_at_ms`, `updated_at_ms`) consistently
- Continue this pattern rather than introducing RFC3339 strings or `google.protobuf.Timestamp`
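
For reference, producing a timestamp in this format needs only the standard library. `now_ms` is a hypothetical helper name; the handlers may already have an equivalent.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Current wall-clock time as milliseconds since the Unix epoch,
/// in the same `int64` form used by `created_at_ms`.
pub fn now_ms() -> i64 {
    let elapsed = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set before the Unix epoch");
    // `as_millis` returns u128; the value fits in i64 for any realistic date.
    elapsed.as_millis() as i64
}
```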

**Label validation:**
- Kubernetes label rules: alphanumeric + `-._/`, max 63 chars per segment
- Reject invalid labels at API boundary (gRPC handler)

**Existing metadata traits:**
- `ObjectType`, `ObjectId`, `ObjectName` traits in `compute/mod.rs` — extend these to include `labels()` method
- Keep the `UNIQUE (object_type, name)` constraint (essential for human-friendly references)
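
One possible shape for the `labels()` extension is sketched below. The signatures of the existing `ObjectType`, `ObjectId`, and `ObjectName` traits are not shown in this issue, so `ObjectLabels` is a hypothetical standalone trait, and the structs are hand-written mirrors of the proposed proto messages rather than generated code.

```rust
use std::collections::HashMap;

/// Hand-written mirror of the proposed `ObjectMeta` proto message.
#[derive(Debug, Clone, Default)]
pub struct ObjectMeta {
    pub id: String,
    pub name: String,
    pub created_at_ms: i64,
    pub labels: HashMap<String, String>,
    pub resource_version: i64,
}

/// Hypothetical trait, named by analogy with the existing metadata traits.
pub trait ObjectLabels {
    fn labels(&self) -> &HashMap<String, String>;
}

/// Simplified `Sandbox` carrying only the shared metadata.
#[derive(Debug, Clone, Default)]
pub struct Sandbox {
    pub metadata: ObjectMeta,
}

impl ObjectLabels for Sandbox {
    fn labels(&self) -> &HashMap<String, String> {
        &self.metadata.labels
    }
}

/// Generic helper that works for any object exposing labels.
pub fn has_label<T: ObjectLabels>(object: &T, key: &str, value: &str) -> bool {
    object.labels().get(key).map(String::as_str) == Some(value)
}
```

With every domain object delegating to `metadata.labels`, generic persistence code can read labels without knowing the concrete message type.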

## Proposed Approach

1. **Add shared `ObjectMeta` proto message** — define once, use across all domain objects (Sandbox, Provider, SshSession, etc.), including `resource_version` for optimistic concurrency control
2. **Refactor domain object messages** — replace inline `id`, `name`, `created_at_ms` with `ObjectMeta metadata` field
3. **Remove `namespace` from public Sandbox API** — keep it internal to `DriverSandbox` in `compute_driver.proto`
4. **Add labels and resource_version columns to database** — nullable JSONB (Postgres) / TEXT (SQLite) for labels, BIGINT for resource_version, backfill with `{}` and `1`
5. **Extend persistence layer** — serialize/deserialize labels between proto and DB, implement simple label-based filtering (`key=value,key2=value2` syntax), increment resource_version on updates
6. **Enforce strict label validation** — Kubernetes conventions (alphanumeric + `-._/`, max 63 chars) at gRPC handler boundary
7. **Update gRPC handlers** — populate `metadata.labels` and `metadata.resource_version` in responses, accept labels in create requests, support simple `label_selector` in list requests, implement optimistic concurrency checks
8. **Add CLI flags** — `--label key=value` for create commands (repeatable), `--selector key=value,key2=value2` for list commands
9. **Update Python SDK** — expose `sandbox.metadata.labels` and `sandbox.metadata.resource_version`

This approach prioritizes clean design and consistency over backward compatibility (acceptable for a new project).

## Scope Assessment

- **Complexity:** Medium
- **Confidence:** High — clear path, existing persistence layer is well-suited for this change
- **Estimated files to change:** 12-15
- **Issue type:** `feat`

## Risks & Open Questions

**Risks:**
- **SQLite label filtering performance:** No native JSONB indexing. For large datasets, filtering may be slow. Mitigate by recommending Postgres for production, or implement in-memory filtering.
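- **Proto field numbering:** Renumbering fields changes the wire format, which is a breaking change for existing clients and would also affect rows already stored in `objects.payload`, since those are protobuf-encoded with the old field numbers. Mitigate by versioning the proto package (`openshell.v2`) or deprecating old fields.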
- **SQL injection (CWE-89):** Label selector parsing must use parameterized queries, not string concatenation.
- **Resource exhaustion (CWE-400):** Label selectors should have a complexity limit (e.g., max 10 key-value pairs) to prevent DoS.
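
The last two mitigations can be combined in one small query builder, sketched here for the SQLite `json_extract()` path. It assumes SQLite's quoted-label JSON path form (`$."key"`) so that keys containing `.` or `/` address a single member. `build_label_filter` and `MAX_SELECTOR_TERMS` are hypothetical names, and the limit of 10 is simply the example figure from the list above.

```rust
/// Upper bound on selector terms; 10 is the example limit from the risk list.
const MAX_SELECTOR_TERMS: usize = 10;

/// Builds a WHERE fragment with `?` placeholders plus the values to bind.
/// Label text only ever appears in the bind list, never in the SQL string.
pub fn build_label_filter(pairs: &[(&str, &str)]) -> Result<(String, Vec<String>), String> {
    if pairs.len() > MAX_SELECTOR_TERMS {
        return Err(format!(
            "selector has {} terms; the limit is {}",
            pairs.len(),
            MAX_SELECTOR_TERMS
        ));
    }
    let mut clauses = Vec::with_capacity(pairs.len());
    let mut binds = Vec::with_capacity(pairs.len() * 2);
    for (key, value) in pairs {
        // A double quote would break out of the quoted JSON path label.
        if key.contains('"') {
            return Err(format!("label key '{key}' contains a double quote"));
        }
        clauses.push("json_extract(labels, ?) = ?");
        binds.push(format!("$.\"{key}\""));
        binds.push(value.to_string());
    }
    // With no terms, emit a clause that matches every row.
    let sql = if clauses.is_empty() {
        "1 = 1".to_string()
    } else {
        clauses.join(" AND ")
    };
    Ok((sql, binds))
}
```

The returned fragment would be appended to the existing list query and the bind values passed to the database driver in order, so a hostile value such as `x' OR '1'='1` is only ever compared as data.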

**Design decisions resolved:**
- ✅ **Use shared `ObjectMeta`** — clean design prioritized over backward compatibility
- ✅ **Remove `namespace` from public Sandbox API** — keep it internal to compute driver
- ✅ **Timestamp format:** Stick with `int64 created_at_ms` for consistency
- ✅ **Label storage:** JSONB (Postgres), TEXT (SQLite)
- ✅ **Label selector syntax:** Simple equality matching (`key=value,key2=value2`) — no complex operators on day one
- ✅ **Label validation:** Enforce strict Kubernetes conventions (alphanumeric + `-._/`, max 63 chars)
- ✅ **Add `resource_version` to `ObjectMeta`:** implement now for optimistic concurrency control

**Open questions:**
- **Index on labels column:** Add GIN index on Postgres JSONB for performance, or wait for benchmarks?
- **Should `updated_at_ms` be exposed in `ObjectMeta`?** Currently DB-only, could be useful for change tracking

## Test Considerations

- **Unit tests:**
  - Label serialization/deserialization in persistence layer
  - Label selector parsing (valid and invalid syntax)
  - Label validation (reject invalid labels per Kubernetes rules)
  - `resource_version` increments on each update

- **Integration tests:**
  - Create sandbox with labels, verify stored correctly
  - List with label selector, verify filtering works
  - SQLite vs. Postgres label filtering behavior
  - Optimistic concurrency: of two concurrent updates based on the same `resource_version`, one should succeed and the other should fail with a conflict error

- **E2E tests:**
  - CLI: `openshell sandbox create --label env=prod`, then `list --selector env=prod`
  - gRPC: `CreateSandboxRequest` with labels, `ListSandboxesRequest` with `label_selector`
  - Verify resource_version increments on updates

- **Migration tests:**
  - Verify migration is idempotent
  - Verify existing objects have empty labels and resource_version=1 after migration
  - Verify rollback doesn't break existing data

- **Test patterns to follow:**
  - Existing persistence layer tests in `crates/openshell-server/src/persistence/tests.rs`
  - CLI tests use `assert_cmd` crate pattern
  - E2E tests in `tests/` directory use running cluster

---

*Created by spike investigation. Use `build-from-issue` to plan and implement.*
