Summary
x402d serves Prometheus metrics on :9090/metrics, but the endpoint returns HTTP 500 Internal Server Error. The process itself is healthy and serving traffic — only the metrics endpoint is broken. Suspected cause: corrupted/invalid method label on a metric (label-cardinality / invalid-label-value), which makes the Prometheus client registry fail to encode the exposition output.
Symptom / Impact
GET :9090/metrics → HTTP/1.1 500 Internal Server Error (no body served).
- Prometheus cannot scrape the
monitoring/x402d-native target → up=0.
- Mainnet alerts firing as a correct consequence:
X402dMetricsTargetDown (critical → telegram-critical) since 2026-06-03 10:53 UTC
- generic
TargetDown for job="monitoring/x402d-native" (warning) since 2026-06-03 11:00 UTC
- All native x402d metrics are currently blind on mainnet (no scrape data).
Environment
- Cluster: mainnet
goat-mainnet-usw2 (acct 361027011257, us-west-2)
- Namespace:
goat-network-app
- Pod (at time of report):
goat-x402d-649cdfc67b-jnc5j — Running 1/1, 53d uptime, 0 restarts
- Instance scraped:
10.102.113.110:9090
Repro
kubectl exec -n goat-network-app goat-x402d-649cdfc67b-jnc5j -c goat-x402d -- \
sh -c 'wget -S -qO- http://localhost:9090/metrics'
# => HTTP/1.1 500 Internal Server Error
Suspected fix area
- Metrics registration / labeling path where the
method label is set (likely an HTTP middleware/instrumentation recording raw or non-normalized request method, producing an invalid UTF-8 / high-cardinality / empty label value).
- Check the metrics handler for an error swallowed into a 500 during
gather()/encode, and normalize/validate the method label before observing.
Ops note
While this is being fixed, a time-boxed (7-day) Alertmanager silence scoped to job="monitoring/x402d-native" is being applied to suppress X402dMetricsTargetDown + the generic TargetDown for this job. The alert rules are NOT being deleted — they should re-fire if this isn't resolved when the silence expires. Real x402 balance alerts (X402GoatTssBalanceLow, X402OtherChainTssBalanceLow) are intentionally left firing.
Summary
x402dserves Prometheus metrics on:9090/metrics, but the endpoint returns HTTP 500 Internal Server Error. The process itself is healthy and serving traffic — only the metrics endpoint is broken. Suspected cause: corrupted/invalidmethodlabel on a metric (label-cardinality / invalid-label-value), which makes the Prometheus client registry fail to encode the exposition output.Symptom / Impact
GET :9090/metrics→HTTP/1.1 500 Internal Server Error(no body served).monitoring/x402d-nativetarget →up=0.X402dMetricsTargetDown(critical → telegram-critical) since 2026-06-03 10:53 UTCTargetDownforjob="monitoring/x402d-native"(warning) since 2026-06-03 11:00 UTCEnvironment
goat-mainnet-usw2(acct 361027011257, us-west-2)goat-network-appgoat-x402d-649cdfc67b-jnc5j— Running 1/1, 53d uptime, 0 restarts10.102.113.110:9090Repro
Suspected fix area
methodlabel is set (likely an HTTP middleware/instrumentation recording raw or non-normalized request method, producing an invalid UTF-8 / high-cardinality / empty label value).gather()/encode, and normalize/validate themethodlabel before observing.Ops note
While this is being fixed, a time-boxed (7-day) Alertmanager silence scoped to
job="monitoring/x402d-native"is being applied to suppressX402dMetricsTargetDown+ the genericTargetDownfor this job. The alert rules are NOT being deleted — they should re-fire if this isn't resolved when the silence expires. Real x402 balance alerts (X402GoatTssBalanceLow,X402OtherChainTssBalanceLow) are intentionally left firing.