notifications (1/5): proto definitions and generated code by illegalprime · Pull Request #452 · block/proto-fleet

illegalprime · 2026-06-15T15:53:20Z

Summary

This is the bottom of a 5-PR stack introducing Notifications to Fleet — an operator-facing capability to manage where alerts are delivered, which alert rules are active, when alerts are temporarily suppressed during maintenance, and to review what has already fired. This PR delivers only the contract: the protobuf service/message definitions plus the generated Go and TypeScript code. It defines four services — ChannelService, RuleService, MaintenanceWindowService, and HistoryService — under notifications.v1. Nothing is wired up or served yet; these are the type/contract definitions consumed by the later PRs in the stack (history DB layer → authz → Grafana-backed service + handlers → client UI).

How it works

There is no runtime behavior in this PR. It establishes the API contract from which code is generated, and that generated code is what downstream PRs consume.

The shape of the eventual system, which this contract describes:

The contract lives in proto/notifications/v1/notifications.proto. It is compiled by buf into two consumer-facing artifacts: Go (message types + Connect handler interfaces/stubs) for the fleet-api server, and TypeScript (message types + client) for the web UI.
fleet-api is intended to be the security boundary: every RPC will be scoped to the caller's organization (note organization_id on Channel, Rule, MaintenanceWindow), and fleet-api will proxy a Grafana sidecar that actually performs delivery, rule evaluation, and suppression.
The four services map to four Grafana-backed concerns:
- ChannelService — alert delivery destinations, backed by Grafana contact points. Channels carry a kind (webhook / SMTP / Slack) and a corresponding config message. Secrets are write-only: zeroed on reads, with Channel.has_secret signalling presence so the UI can render a masked "set" state. TestChannel sends a synthetic alert through either a saved channel or an unsaved definition.
- RuleService — read-only plus pause/resume. Rules are provisioned by ops YAML, so there is deliberately no create/update/delete. Rule.id is the Grafana alert-rule UID (stable across deploys because the YAML pins it); enabled is the inverse of Grafana's isPaused.
- MaintenanceWindowService — temporary suppression of alerts during planned maintenance. A MaintenanceWindow carries a MaintenanceWindowScope that targets a rule, group, site, or set of devices, plus a time window and comment.
- HistoryService — read-only access to the delivered-notification log, newest-first and keyset-paginated on the entry id (before_id cursor + has_more). NotificationHistoryEntry fields are denormalized free-form strings sourced directly from Grafana alerts.
Input constraints are declared inline via buf.validate field options (e.g. channel name length 1–255, comment max 1024, non-negative page_size, non-empty ids), so validation is part of the contract rather than hand-rolled later.

The end-to-end request path (realized in later PRs, shown here for context):

flowchart LR
    UI["Web UI<br/>(generated TS client)"] -->|"Connect-RPC over HTTP"| API["fleet-api<br/>(generated Go handlers)"]
    API -->|"scope to caller org"| AUTHZ["authz / org scoping"]
    AUTHZ --> GRAFANA["Grafana sidecar"]
    GRAFANA --> CP["contact points<br/>(channels)"]
    GRAFANA --> RULES["provisioned alert rules"]
    GRAFANA --> MW["maintenance windows<br/>(alert suppression)"]
    API -->|"history read path"| DB[("notification_history<br/>(keyset paginated)")]

This PR contributes the leftmost and the fleet-api boundary box as type definitions only.

Contract shape

flowchart TB
    subgraph svc["notifications.v1 services"]
        CS["ChannelService<br/>List / Create / Update / Delete / Test"]
        RS["RuleService<br/>List / Pause / Resume"]
        MWS["MaintenanceWindowService<br/>List / Create / Update / Delete"]
        HS["HistoryService<br/>ListNotifications"]
    end
    CS --> Channel["Channel<br/>+ Webhook/Smtp/Slack config<br/>(write-only secrets)"]
    RS --> Rule["Rule<br/>(RuleTemplate, enabled)"]
    MWS --> MaintenanceWindow["MaintenanceWindow<br/>+ MaintenanceWindowScope<br/>(rule/group/site/device)"]
    HS --> Entry["NotificationHistoryEntry<br/>(keyset by id)"]

The TestChannel flow is the one RPC with non-obvious selection logic:

sequenceDiagram
    participant UI
    participant API as "fleet-api (future)"
    participant G as "Grafana (future)"
    UI->>API: "TestChannelRequest (id OR inline kind+config)"
    Note over API: "unsaved definition wins when both set"
    API->>G: "send synthetic alert"
    G-->>API: "delivery result"
    API-->>UI: "TestChannelResponse{ok, error, response_code}"

Areas of the code involved

Area / package / file	What changed	Why it matters for review
`proto/notifications/v1/notifications.proto`	New. Four services, their request/response messages, `Channel`/`Rule`/`MaintenanceWindow`/`NotificationHistoryEntry` types, enums (`ChannelKind`, `ValidationState`, `RuleTemplate`, `MaintenanceWindowScopeKind`), and `buf.validate` field constraints. The file is comment-free — there are no proto doc comments to read.	The only hand-authored file to review. This is the entire contract; everything else is derived from it.
`server/generated/grpc/notifications/v1/notifications.pb.go`	New. Generated Go message types.	Generated — skip. Any comments here are generator boilerplate (`// Code generated ... DO NOT EDIT`), not authored proto docs.
`server/generated/grpc/notifications/v1/notificationsv1connect/notifications.connect.go`	New. Generated Connect handler interfaces + client stubs for the four services.	Generated — skip. Confirms the four services map to handler stubs the later PRs implement.
`client/src/protoFleet/api/generated/notifications/v1/notifications_pb.ts`	New. Generated TypeScript message types / client.	Generated — skip. Comments here are generator boilerplate, not authored proto docs.

Key technical decisions & trade-offs

RuleService is read-only + pause/resume only — chosen over a full CRUD rule surface, because rules are provisioned by ops YAML; the API never authors them.
Secrets are write-only with a has_secret boolean — chosen over returning masked secret values, so secret material never leaves the server while the UI can still show a "set" state.
History pagination is keyset on entry id (before_id / has_more) — chosen over offset/limit, to keep newest-first paging stable and cheap as the log grows.
Channel config is a separate message per kind (Webhook/Smtp/Slack) rather than a generic key/value map — gives each destination type an explicit, validated schema at the cost of one message per new kind.
Validation lives in the proto via buf.validate — chosen over hand-written server checks, so constraints travel with the contract and apply uniformly to generated consumers.
Rule.id reuses the Grafana alert-rule UID rather than a Fleet-minted id — relies on the YAML pinning the UID so it stays stable across deploys.
TestChannelRequest accepts an id or an inline definition, with the inline definition winning — lets the UI test before save, at the cost of a slightly ambiguous request shape.
The proto is stripped of doc comments — the contract is expressed entirely through names and buf.validate constraints, keeping the hand-authored file minimal; semantic documentation lives in this stack's PRs and the downstream service code.

Testing & validation

This PR adds no runtime behavior, so there is nothing to unit-test here directly. Correctness is "the contract compiles and generates": the generated Go and TypeScript are present and build, which is the gate for the rest of the stack.
Not covered: org scoping/authz (notifications (3/5): notification:read and notification:manage permissions #454), the Grafana-backed service implementation + handler wiring (notifications (4/5): Grafana-backed service, handlers, and server wiring #455), the history DB read path (notifications (2/5): notification_history keyset listing #453), and the client UI (notifications (5/5): notifications UI #456). No handler is served or routed by this PR, and the buf.validate constraints are declared but not yet enforced at runtime until handlers exist.

github-actions · 2026-06-15T15:55:43Z

🔐 Codex Security Review

Note: This is an automated security-focused code review generated by Codex.
It should be used as a supplementary check alongside human review.
False positives are possible - use your judgment.

Scope summary

Reviewed pull request diff only (3f14d374f5c67e14b70014c14d582c2c27273fee...cd175c127b14bc0386b38e0f36e910d5d3111753, exact PR three-dot diff)

Model: gpt-5.5

💡 Click "edited" above to see previous reviews for this PR.

Review Summary

Overall Risk: HIGH

Findings

[HIGH] Channel list/response types can expose stored notification secrets

Category: gRPC
Location: proto/notifications/v1/notifications.proto:89
Description: ListChannelsResponse returns full Channel objects, and Channel embeds WebhookConfig, SmtpConfig, and SlackConfig. Those config messages contain secret-bearing fields such as bearer_header, SMTP password, and Slack webhook_url. The presence of has_secret suggests redaction is intended, but the response schema still provides normal fields for the raw secrets.
Impact: A caller that can list or receive channels could exfiltrate Slack webhook URLs, SMTP passwords, and bearer tokens if handlers populate these fields. These values may also end up in frontend state or logs.
Recommendation: Split request/write config from response/read models. Return a redacted ChannelSummary that omits secret values and includes only safe metadata plus has_secret. Do not reuse secret-bearing config messages in list/create/update responses.

[HIGH] Test/channel config accepts arbitrary outbound targets without validation

Category: gRPC
Location: proto/notifications/v1/notifications.proto:120
Description: TestChannelRequest can carry arbitrary webhook, Slack webhook, or SMTP host/port data. The target fields have no URI, scheme, host, port, length, or private-network restrictions.
Impact: A straightforward implementation of TestChannel becomes an SSRF primitive from the server: probing internal services, loopback, link-local/cloud metadata endpoints, or sending attacker-controlled requests with configured headers.
Recommendation: Add strict validation and server-side canonicalization before any outbound request: require HTTPS for webhooks/Slack, reject userinfo and redirects to disallowed networks, block loopback/private/link-local/multicast targets, constrain SMTP ports/hosts, set short timeouts, and rate-limit tests.

[MEDIUM] Channel create/update schema allows invalid or conflicting channel configurations

Category: gRPC
Location: proto/notifications/v1/notifications.proto:92
Description: CreateChannelRequest and UpdateChannelRequest accept kind, webhook, smtp, and slack as independent fields. kind permits CHANNEL_KIND_UNSPECIFIED, and clients can send no config or multiple configs at once.
Impact: Handlers must guess which config is authoritative, creating room for dropped secrets, wrong validation, inconsistent persistence, or bypasses where a request’s kind does not match the supplied secret config.
Recommendation: Use a required oneof for channel config, or add Buf validate/CEL rules enforcing exactly one config and matching kind. Add enum.defined_only and not_in: [0] for kind.

[MEDIUM] Maintenance windows can be created without required scope/time invariants

Category: Reliability
Location: proto/notifications/v1/notifications.proto:208
Description: CreateMaintenanceWindowRequest and UpdateMaintenanceWindowRequest do not require scope, starts_at, or ends_at, and there is no validation that ends_at is after starts_at. MaintenanceWindowScope also allows an unspecified kind and multiple/missing target fields.
Impact: Invalid windows can accidentally silence alerts too broadly or indefinitely, depending on handler defaults. Missing message fields can also lead to handler panics if dereferenced without checks.
Recommendation: Mark scope and timestamps required, reject unspecified scope kinds, enforce exactly one target field for the selected scope, cap device_ids, and add a CEL rule requiring ends_at > starts_at plus any maximum-duration policy.

[MEDIUM] Notification history page size is unbounded

Category: Reliability
Location: proto/notifications/v1/notifications.proto:253
Description: ListNotificationsRequest.page_size only validates gte = 0; there is no upper bound.
Impact: An authenticated caller can request very large notification pages, forcing large DB reads and large protobuf/JSON responses. Notification history is naturally unbounded over time, so this can become a memory/CPU pressure point.
Recommendation: Add an lte cap consistent with other list APIs, and have the handler clamp 0 to a safe default.

Notes

No hand-written server handlers, SQL, frontend views, plugin code, Docker files, or infrastructure changes are present in this diff. I did not find cryptostealing/pool-hijack logic in the changed files. The main risk is that the new protobuf contract bakes in unsafe defaults before the RPC implementation lands.

_{Generated by Codex Security Review |

Triggered by: @illegalprime |

Review workflow run}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c64a5c2fd8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d032a2dfba

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

illegalprime · 2026-06-15T17:39:03Z

Both Codex findings on this PR are resolved downstream in #455 (server layer), not here — they read as open only because Codex reviews this PR's proto diff in isolation:

Secret-carrying RPCs not log-redacted — notifications (4/5): Grafana-backed service, handlers, and server wiring #455 adds ChannelService Create/Update/Test to RedactedRequestProcedures in interceptors/config.go.
Read responses reuse secret configs — notifications (4/5): Grafana-backed service, handlers, and server wiring #455's service redacts secrets on read (empty placeholders + redactWebhookURL); ListChannels/CreateChannel/UpdateChannel never echo bearer headers, SMTP passwords, or Slack URLs back.

No change needed on this PR.

illegalprime · 2026-06-15T18:16:21Z

codex-connector triage:

[P1] Scope rule pauses per organization — false positive. PauseRule (service.go) does not flip Grafana's global isPaused; it "mutes a rule via a marker pause-silence" — buildPauseSilence(orgID, ruleID, …) is org-scoped, and pauseSilencedRules derives enabled per-org. So a pause by org A does not affect org B. No change.
[P2] Explicit secret-preservation semantics — handled in notifications (4/5): Grafana-backed service, handlers, and server wiring #455's service impl: requestHasNewSecret + carrySecretSettings carry the stored secret forward only when the destination is unchanged. The shared message shape is intentional (reads return redacted/empty secrets). No proto change.

flesher

Looks good. Where did we land on for the naming of "Silences" / how is it being reflected in the UI? I thought the general consensus was maintenance windows but forget where it ended up... Regardless we should try to align the api to whatever the UI calls them

illegalprime · 2026-06-15T19:05:31Z

Great point. Thanks!

Adds the notifications.proto service definitions (Channel, Rule, Silence, History) and the generated Go (Connect handlers + messages) and TypeScript clients. Generated code only; no behavior is wired up yet. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cd175c127b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

illegalprime requested a review from a team as a code owner June 15, 2026 15:53

github-actions Bot added javascript Pull requests that update javascript code client server shared labels Jun 15, 2026

github-actions Bot assigned illegalprime Jun 15, 2026

chatgpt-codex-connector Bot reviewed Jun 15, 2026

View reviewed changes

Comment thread proto/notifications/v1/notifications.proto Outdated

illegalprime force-pushed the eden/notifications-1-proto branch from c64a5c2 to d032a2d Compare June 15, 2026 16:02

chatgpt-codex-connector Bot reviewed Jun 15, 2026

View reviewed changes

Comment thread proto/notifications/v1/notifications.proto

illegalprime mentioned this pull request Jun 15, 2026

notifications: preliminary UI work #306

Closed

illegalprime force-pushed the eden/notifications-1-proto branch from d032a2d to 7a2ceb2 Compare June 15, 2026 18:35

flesher approved these changes Jun 15, 2026

View reviewed changes

illegalprime force-pushed the eden/notifications-1-proto branch from 7a2ceb2 to cd175c1 Compare June 15, 2026 19:24

chatgpt-codex-connector Bot reviewed Jun 15, 2026

View reviewed changes

Comment thread proto/notifications/v1/notifications.proto

This was referenced Jun 16, 2026

notifications (2/5): notification_history keyset listing #453

Merged

notifications (3/5): notification:read and notification:manage permissions #454

Merged

notifications (4/5): Grafana-backed service, handlers, and server wiring #455

Closed

illegalprime merged commit 591590e into main Jun 16, 2026
76 checks passed

illegalprime deleted the eden/notifications-1-proto branch June 16, 2026 14:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

notifications (1/5): proto definitions and generated code#452

notifications (1/5): proto definitions and generated code#452
illegalprime merged 1 commit into
mainfrom
eden/notifications-1-proto

illegalprime commented Jun 15, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 15, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

illegalprime commented Jun 15, 2026

Uh oh!

illegalprime commented Jun 15, 2026

Uh oh!

flesher left a comment

Uh oh!

illegalprime commented Jun 15, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

illegalprime commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

How it works

Contract shape

Areas of the code involved

Key technical decisions & trade-offs

Testing & validation

Uh oh!

github-actions Bot commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔐 Codex Security Review

Review Summary

Findings

[HIGH] Channel list/response types can expose stored notification secrets

[HIGH] Test/channel config accepts arbitrary outbound targets without validation

[MEDIUM] Channel create/update schema allows invalid or conflicting channel configurations

[MEDIUM] Maintenance windows can be created without required scope/time invariants

[MEDIUM] Notification history page size is unbounded

Notes

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

illegalprime commented Jun 15, 2026

Uh oh!

illegalprime commented Jun 15, 2026

Uh oh!

flesher left a comment

Choose a reason for hiding this comment

Uh oh!

illegalprime commented Jun 15, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

illegalprime commented Jun 15, 2026 •

edited

Loading

github-actions Bot commented Jun 15, 2026 •

edited

Loading