The Bee node returns 429 when stamp purchases are made back-to-back. This adds a 15-second delay between consecutive purchases in the replenishment loop and retry logic (3 attempts with backoff) in _purchase_stamp() for transient 429 errors. Fixes #175
Fix stamp pool 429 rate limiting during replenishment
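The spacing and retry behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual code: `RateLimitedError`, `purchase_with_retry`, `replenish`, and the 2-second base backoff are all hypothetical stand-ins. Only the 15-second gap and the 3 attempts come from the commit message.

```python
import time
from typing import Callable, List


class RateLimitedError(Exception):
    """Hypothetical error standing in for an HTTP 429 from the Bee node."""


def purchase_with_retry(
    purchase: Callable[[], str],
    max_attempts: int = 3,        # from the commit message
    base_delay: float = 2.0,      # assumption: the real backoff values are not stated
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call purchase(), retrying on 429 with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return purchase()
        except RateLimitedError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the error to the caller
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


def replenish(
    count: int,
    purchase: Callable[[], str],
    gap_seconds: float = 15.0,    # from the commit message
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Buy `count` stamps, pausing between consecutive purchases."""
    batch_ids = []
    for i in range(count):
        if i > 0:
            sleep(gap_seconds)  # no pause before the first purchase
        batch_ids.append(purchase_with_retry(purchase, sleep=sleep))
    return batch_ids
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller would normally leave it at the default.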
Add Prometheus metrics endpoint and monitoring foundation
…191) Add Alloy as a Docker service that scrapes /metrics from both gateway containers and pushes to Grafana Cloud Prometheus. Credentials stored in GitHub secrets, injected via .env file at deploy time. Scrapes staging (provenance_gateway_dev:8000) and production (provenance_gateway:8000) with environment labels.
…or to dashboard (#192)
- Alloy: staging → development, production → main (matches branch names)
- deploy.yml: GATEWAY_ENVIRONMENT=development for dev, =main for main
- Dashboard: add environment dropdown, filter all panels by $environment, add Bee API Errors panel
…cs (#193)
- Fix stamp pool metrics: use current_levels/reserve_config from get_status() instead of non-existent reserves key
- Add descriptions to all 17 dashboard panels explaining what each shows
- Add Bee API Errors panel to dashboard
- Update CLAUDE.md with production monitoring stack architecture and setup
- Update README.md with full monitoring section including Grafana Cloud setup
* Fix pool metrics: access dataclass attributes instead of dict keys
* Fix pool metrics dataclass access, add stamp provisioning and debug panels
- Fix pool metrics poller: use dataclass attributes instead of dict.get()
- Add 12 new dashboard panels: stamp provisioning breakdown (pool vs direct, by size), data volume, HTTP status codes, latency by endpoint, notary signing, gateway version info, deploy history
- Dashboard now has 23 panels across 7 rows
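The dataclass-versus-dict fix can be illustrated with a small sketch. The `PoolStatus` class and `pool_gauge_values` helper below are hypothetical; only the field names `current_levels` and `reserve_config` appear in the commit messages, and their real types may differ from the plain dicts assumed here.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PoolStatus:
    # Assumed shape: size name -> count. Only the field names come from the commits.
    current_levels: Dict[str, int] = field(default_factory=dict)
    reserve_config: Dict[str, int] = field(default_factory=dict)


def pool_gauge_values(status: PoolStatus) -> Dict[str, Dict[str, int]]:
    """Read pool levels via attribute access, as a metrics poller might."""
    # A dataclass instance has no .get(); status.get("reserves") would raise
    # AttributeError, so the fields are read as attributes instead.
    return {
        "available": dict(status.current_levels),
        "target": dict(status.reserve_config),
    }
```

For example, `pool_gauge_values(PoolStatus({"small": 2}, {"small": 5}))` yields one dict of available counts and one of target counts, which a poller could then copy into labelled gauges.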
Summary
Promotes all monitoring work from dev to production.
What's included
- /metrics endpoint with auto-instrumented HTTP metrics + custom business metrics
- datafund.grafana.net/d/gateway-overview
What happens on merge
- swarm_connect:main image with /metrics endpoint
- .env from GitHub secrets
- environment=main label
Issues addressed