notifications (1/5): proto definitions and generated code#452
Conversation
🔐 Codex Security Review
Review SummaryOverall Risk: HIGH Findings[HIGH] Channel list/response types can expose stored notification secrets
[HIGH] Test/channel config accepts arbitrary outbound targets without validation
[MEDIUM] Channel create/update schema allows invalid or conflicting channel configurations
[MEDIUM] Maintenance windows can be created without required scope/time invariants
[MEDIUM] Notification history page size is unbounded
NotesNo hand-written server handlers, SQL, frontend views, plugin code, Docker files, or infrastructure changes are present in this diff. I did not find cryptostealing/pool-hijack logic in the changed files. The main risk is that the new protobuf contract bakes in unsafe defaults before the RPC implementation lands. Generated by Codex Security Review | |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c64a5c2fd8
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
c64a5c2 to
d032a2d
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d032a2dfba
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
Both Codex findings on this PR are resolved downstream in #455 (server layer), not here — they read as open only because Codex reviews this PR's proto diff in isolation:
No change needed on this PR. |
|
codex-connector triage:
|
d032a2d to
7a2ceb2
Compare
flesher
left a comment
There was a problem hiding this comment.
Looks good. Where did we land on for the naming of "Silences" / how is it being reflected in the UI? I thought the general consensus was maintenance windows but forget where it ended up... Regardless we should try to align the api to whatever the UI calls them
|
Great point. Thanks! |
Adds the notifications.proto service definitions (Channel, Rule, Silence, History) and the generated Go (Connect handlers + messages) and TypeScript clients. Generated code only; no behavior is wired up yet. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
7a2ceb2 to
cd175c1
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cd175c127b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Summary
This is the bottom of a 5-PR stack introducing Notifications to Fleet — an operator-facing capability to manage where alerts are delivered, which alert rules are active, when alerts are temporarily suppressed during maintenance, and to review what has already fired. This PR delivers only the contract: the protobuf service/message definitions plus the generated Go and TypeScript code. It defines four services —
ChannelService,RuleService,MaintenanceWindowService, andHistoryService— undernotifications.v1. Nothing is wired up or served yet; these are the type/contract definitions consumed by the later PRs in the stack (history DB layer → authz → Grafana-backed service + handlers → client UI).How it works
There is no runtime behavior in this PR. It establishes the API contract from which code is generated, and that generated code is what downstream PRs consume.
The shape of the eventual system, which this contract describes:
proto/notifications/v1/notifications.proto. It is compiled bybufinto two consumer-facing artifacts: Go (message types + Connect handler interfaces/stubs) for thefleet-apiserver, and TypeScript (message types + client) for the web UI.fleet-apiis intended to be the security boundary: every RPC will be scoped to the caller's organization (noteorganization_idonChannel,Rule,MaintenanceWindow), andfleet-apiwill proxy a Grafana sidecar that actually performs delivery, rule evaluation, and suppression.kind(webhook / SMTP / Slack) and a corresponding config message. Secrets are write-only: zeroed on reads, withChannel.has_secretsignalling presence so the UI can render a masked "set" state.TestChannelsends a synthetic alert through either a saved channel or an unsaved definition.Rule.idis the Grafana alert-rule UID (stable across deploys because the YAML pins it);enabledis the inverse of Grafana'sisPaused.MaintenanceWindowcarries aMaintenanceWindowScopethat targets a rule, group, site, or set of devices, plus a time window and comment.before_idcursor +has_more).NotificationHistoryEntryfields are denormalized free-form strings sourced directly from Grafana alerts.buf.validatefield options (e.g. channelnamelength 1–255, comment max 1024, non-negativepage_size, non-empty ids), so validation is part of the contract rather than hand-rolled later.The end-to-end request path (realized in later PRs, shown here for context):
flowchart LR UI["Web UI<br/>(generated TS client)"] -->|"Connect-RPC over HTTP"| API["fleet-api<br/>(generated Go handlers)"] API -->|"scope to caller org"| AUTHZ["authz / org scoping"] AUTHZ --> GRAFANA["Grafana sidecar"] GRAFANA --> CP["contact points<br/>(channels)"] GRAFANA --> RULES["provisioned alert rules"] GRAFANA --> MW["maintenance windows<br/>(alert suppression)"] API -->|"history read path"| DB[("notification_history<br/>(keyset paginated)")]This PR contributes the leftmost and the
fleet-apiboundary box as type definitions only.Contract shape
flowchart TB subgraph svc["notifications.v1 services"] CS["ChannelService<br/>List / Create / Update / Delete / Test"] RS["RuleService<br/>List / Pause / Resume"] MWS["MaintenanceWindowService<br/>List / Create / Update / Delete"] HS["HistoryService<br/>ListNotifications"] end CS --> Channel["Channel<br/>+ Webhook/Smtp/Slack config<br/>(write-only secrets)"] RS --> Rule["Rule<br/>(RuleTemplate, enabled)"] MWS --> MaintenanceWindow["MaintenanceWindow<br/>+ MaintenanceWindowScope<br/>(rule/group/site/device)"] HS --> Entry["NotificationHistoryEntry<br/>(keyset by id)"]The TestChannel flow is the one RPC with non-obvious selection logic:
sequenceDiagram participant UI participant API as "fleet-api (future)" participant G as "Grafana (future)" UI->>API: "TestChannelRequest (id OR inline kind+config)" Note over API: "unsaved definition wins when both set" API->>G: "send synthetic alert" G-->>API: "delivery result" API-->>UI: "TestChannelResponse{ok, error, response_code}"Areas of the code involved
proto/notifications/v1/notifications.protoChannel/Rule/MaintenanceWindow/NotificationHistoryEntrytypes, enums (ChannelKind,ValidationState,RuleTemplate,MaintenanceWindowScopeKind), andbuf.validatefield constraints. The file is comment-free — there are no proto doc comments to read.server/generated/grpc/notifications/v1/notifications.pb.go// Code generated ... DO NOT EDIT), not authored proto docs.server/generated/grpc/notifications/v1/notificationsv1connect/notifications.connect.goclient/src/protoFleet/api/generated/notifications/v1/notifications_pb.tsKey technical decisions & trade-offs
has_secretboolean — chosen over returning masked secret values, so secret material never leaves the server while the UI can still show a "set" state.before_id/has_more) — chosen over offset/limit, to keep newest-first paging stable and cheap as the log grows.Channelconfig is a separate message perkind(Webhook/Smtp/Slack) rather than a generic key/value map — gives each destination type an explicit, validated schema at the cost of one message per new kind.buf.validate— chosen over hand-written server checks, so constraints travel with the contract and apply uniformly to generated consumers.Rule.idreuses the Grafana alert-rule UID rather than a Fleet-minted id — relies on the YAML pinning the UID so it stays stable across deploys.TestChannelRequestaccepts an id or an inline definition, with the inline definition winning — lets the UI test before save, at the cost of a slightly ambiguous request shape.buf.validateconstraints, keeping the hand-authored file minimal; semantic documentation lives in this stack's PRs and the downstream service code.Testing & validation
buf.validateconstraints are declared but not yet enforced at runtime until handlers exist.