7 changes: 4 additions & 3 deletions AGENTS.md
@@ -437,9 +437,10 @@ The `docs/` directory contains user-facing documentation:
- `docs/prompts.md` – MCP Prompts configuration guide
- `docs/logging.md` – MCP Logging guide (automatic K8s error logging, secret redaction)
- `docs/OTEL.md` – OpenTelemetry observability setup
- `docs/metrics.md` – Metrics toolset (Prometheus / Alertmanager via obs-mcp)
- `docs/tracing.md` – Tracing toolset (Grafana Tempo via obs-mcp)
- `docs/otelcol.md` – OpenTelemetry Collector toolset (component discovery, schemas, and config validation via obs-mcp)
- `docs/observability/metrics.md` – Metrics toolset (Prometheus / Alertmanager via obs-mcp)
- `docs/observability/tracing.md` – Tracing toolset (Grafana Tempo via obs-mcp)
- `docs/observability/logs.md` – Logs toolset (Grafana Loki via obs-mcp)
- `docs/observability/otelcol.md` – OpenTelemetry Collector toolset (component discovery, schemas, and config validation via obs-mcp)
- `docs/KIALI.md` – Kiali toolset configuration
- `docs/getting-started-kubernetes.md` – Kubernetes ServiceAccount setup
- `docs/getting-started-claude-code.md` – Claude Code CLI integration
8 changes: 4 additions & 4 deletions docs/README.md
@@ -22,13 +22,13 @@ Choose the guide that matches your needs:

## Toolset Guides

- **[Metrics](./metrics.md)** - Prometheus and Alertmanager tools (`metrics` toolset, via [obs-mcp](https://github.com/rhobs/obs-mcp))
- **[Tracing](./tracing.md)** - Grafana Tempo and TraceQL (`traces` toolset, via [obs-mcp](https://github.com/rhobs/obs-mcp))
- **[OpenTelemetry Collector](./otelcol.md)** - Component discovery, schemas, and config validation (`otelcol` toolset, via [obs-mcp](https://github.com/rhobs/obs-mcp))
- **[Metrics](./observability/metrics.md)** - Prometheus and Alertmanager tools (`metrics` toolset, via [obs-mcp](https://github.com/rhobs/obs-mcp))
- **[Tracing](./observability/tracing.md)** - Grafana Tempo and TraceQL (`traces` toolset, via [obs-mcp](https://github.com/rhobs/obs-mcp))
- **[Logs](./observability/logs.md)** - Grafana Loki and LogQL (`logs` toolset, via [obs-mcp](https://github.com/rhobs/obs-mcp))
- **[OpenTelemetry Collector](./observability/otelcol.md)** - Component discovery, schemas, and config validation (`otelcol` toolset, via [obs-mcp](https://github.com/rhobs/obs-mcp))
- **[OADP](OADP.md)** - Tools for OpenShift API for Data Protection (Velero backups, restores, schedules)
- **[Kiali](KIALI.md)** - Tools for Kiali ServiceMesh with Istio
- **[KubeVirt](kubevirt.md)** - KubeVirt virtual machine management tools
- **[Observability](OBSERVABILITY.md)** - Tools for Prometheus metrics and Alertmanager alerts

Review comment (PR author): The removed `OBSERVABILITY.md` link pointed to a nonexistent doc, likely left over from before the obs-mcp toolset.


## Feature Specifications

176 changes: 176 additions & 0 deletions docs/observability/logs.md
@@ -0,0 +1,176 @@
# Logs Toolset (`logs`)

This toolset provides tools for querying [Grafana Loki](https://grafana.com/oss/loki/) using LogQL and the Loki HTTP API.
It is implemented by the [`rhobs/obs-mcp`](https://github.com/rhobs/obs-mcp) package and registered into the openshift-mcp-server as the `logs` toolset.

For Prometheus and Alertmanager MCP tools, see the [metrics toolset guide](./metrics.md).
For Grafana Tempo and TraceQL (`traces` toolset), see the [tracing toolset guide](./tracing.md).
For OpenTelemetry Collector configuration assistance (`otelcol` toolset), see the [otelcol toolset guide](./otelcol.md).

## Workflow

1. Call **`loki_list_instances`** first to discover `LokiStack` instances along with their namespaces, multitenancy mode, and tenant names.
2. Use **`loki_label_names`** (and optionally **`loki_label_values`**) to learn which labels exist before writing LogQL queries.
3. Run **`loki_query_range`** with a LogQL query to retrieve matching log streams and lines.
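
As a rough illustration of the three steps above, the sketch below builds the MCP `tools/call` requests a client might send in order. The tool and parameter names come from this guide; the `tool_call` helper is hypothetical, and the namespace, stack name, tenant, and query values are placeholders rather than values from a real cluster.

```python
import json
from itertools import count

_ids = count(1)


def tool_call(name, arguments=None):
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP tool (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Placeholder target; in practice these values come from the loki_list_instances output.
target = {
    "lokiNamespace": "openshift-logging",
    "lokiName": "logging-loki",
    "tenant": "application",
}

requests = [
    # Step 1: discover LokiStack instances.
    tool_call("loki_list_instances"),
    # Step 2: learn which labels exist in the last hour.
    tool_call("loki_label_names", {**target, "start": "NOW-1h", "end": "NOW"}),
    # Step 3: run the LogQL range query.
    tool_call("loki_query_range", {**target, "query": '{namespace="default"}', "duration": "1h"}),
]

for request in requests:
    print(json.dumps(request))
```

An MCP client library would normally construct these requests for you; the point here is only the call order and which arguments carry over from discovery.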

## Tools

### loki_list_instances

**Discovery entry point.** Lists LokiStack instances visible in the Kubernetes API.

**Parameters:** none.

**Output:** The JSON entry for each instance includes `lokiNamespace`, `lokiName`, `status`, and the resolved `url`. Pass `lokiNamespace` and `lokiName`, along with a `tenant` where needed, as parameters to the other Loki tools.

---

### loki_label_names

List available Loki label names for a time range. Use this before writing LogQL queries to discover which labels are indexed.

**Parameters:**
- `lokiNamespace` (string, optional) — Kubernetes namespace of the LokiStack (from `loki_list_instances`)
- `lokiName` (string, optional) — Name of the LokiStack (from `loki_list_instances`)
- `tenant` (string, optional) — Loki tenant ID; for LokiStack gateway modes (e.g. openshift-network) use `network`
- `start` (string, optional) — Start time (RFC3339, Unix timestamp, `NOW`, or relative like `NOW-1h`)
- `end` (string, optional) — End time (RFC3339, Unix timestamp, `NOW`, or relative)
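
To make the accepted time formats concrete, here is a minimal sketch of how a client could interpret `start` and `end` values before sending them. It is not the server's parser: the relative grammar (`NOW-<n><unit>` with units `s`, `m`, `h`, `d`) and the choice to read Unix timestamps as seconds are assumptions made for illustration, and `parse_time` is a hypothetical name.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes for relative times; the guide only shows `NOW-1h`.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_time(value, now=None):
    """Interpret NOW, NOW-<n><unit>, a Unix timestamp in seconds, or RFC3339 (sketch)."""
    now = now or datetime.now(timezone.utc)
    if value == "NOW":
        return now
    relative = re.fullmatch(r"NOW-(\d+)([smhd])", value)
    if relative:
        amount, unit = int(relative.group(1)), relative.group(2)
        return now - timedelta(**{_UNITS[unit]: amount})
    if re.fullmatch(r"\d+(\.\d+)?", value):
        return datetime.fromtimestamp(float(value), tz=timezone.utc)
    # RFC3339 with a trailing "Z" is normalized so fromisoformat accepts it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


fixed_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
for example in ("NOW", "NOW-1h", "1704067200", "2024-01-01T00:00:00Z"):
    print(example, "->", parse_time(example, now=fixed_now).isoformat())
```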

---

### loki_label_values

List possible values for a Loki label key. Use this to build precise label matchers in LogQL.

**Parameters:**
- `label` (string, required) — Label key to inspect (e.g. `namespace`, `pod`, `container`, `SrcK8S_Namespace`)
- `lokiNamespace`, `lokiName`, `tenant`, `start`, `end` — same as `loki_label_names`

---

### loki_query_range

Execute a Loki LogQL range query and return matching log streams and lines.

**Parameters:**
- `query` (string, required) — LogQL query string (e.g. `{namespace="default"}`)
- `lokiNamespace` (string, optional) — Kubernetes namespace of the LokiStack
- `lokiName` (string, optional) — Name of the LokiStack
- `tenant` (string, optional) — Loki tenant ID
- `duration` (string, optional) — Lookback duration from now when start/end are omitted (e.g. `5m`, `1h`). Defaults to `15m`
- `start` (string, optional) — Start time (RFC3339, Unix, `NOW`, or relative)
- `end` (string, optional) — End time (RFC3339, Unix, `NOW`, or relative)
- `limit` (number, optional) — Maximum number of log lines to return. Defaults to 100, max 1000
- `direction` (string, optional) — Search direction: `backward` (default) or `forward`
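
The parameter list above can be sketched as a small client-side helper that applies the documented defaults (`15m` lookback, `limit` of 100 capped at 1000, `backward` direction). The helper name `query_range_args` is hypothetical. Whether the server rejects or clamps an out-of-range `limit`, and how it treats `duration` when only one of `start`/`end` is given, is not stated in this guide, so the behavior below is an assumption.

```python
def query_range_args(query, duration=None, start=None, end=None,
                     limit=100, direction="backward"):
    """Assemble loki_query_range arguments using the documented defaults (sketch only)."""
    if not query:
        raise ValueError("query is required")
    if direction not in ("backward", "forward"):
        raise ValueError("direction must be 'backward' or 'forward'")
    # Assumption: reject out-of-range limits on the client instead of relying on the server.
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")

    args = {"query": query, "limit": limit, "direction": direction}
    if start is None and end is None:
        # The lookback duration applies only when no explicit window is given.
        args["duration"] = duration or "15m"
    else:
        if start is not None:
            args["start"] = start
        if end is not None:
            args["end"] = end
    return args


print(query_range_args('{namespace="default"}'))
print(query_range_args('{namespace="default"}', start="NOW-1h", end="NOW", limit=500))
```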

---

## Enable the Toolset

### Command line

```bash
kubernetes-mcp-server --toolsets core,logs
```

### Configuration file (TOML)

```toml
toolsets = ["core", "logs"]
```

### MCP client configuration

```json
{
  "mcpServers": {
    "kubernetes": {
      "command": "npx",
      "args": ["-y", "kubernetes-mcp-server@latest", "--toolsets", "core,logs"]
    }
  }
}
```

You can enable **`metrics`**, **`traces`**, and **`logs`** together (same obs-mcp dependency, different toolsets):

```toml
toolsets = ["core", "metrics", "traces", "logs"]
```

---

## Configuration

Optional settings use a **`[toolset_configs.logs]`** section (the key is the toolset name `logs`).

```toml
[toolset_configs.logs]
# Where to read the bearer token from: "header" (default) or "kubeconfig".
# Set to "kubeconfig" when running locally (STDIO mode) so the token is read
# from your kubeconfig session (e.g. after `oc login`).
auth_mode = "kubeconfig"

# URL of the Loki API endpoint.
# Optional — if unset, use LokiStack discovery (loki_list_instances + lokiNamespace/lokiName).
# Example for a direct Loki endpoint:
# loki_url = "https://logging-loki-gateway-http.openshift-logging.svc.cluster.local:8080"
loki_url = ""

# Skip TLS certificate verification (development only). Default: false
insecure = false

# Resolve Loki query URLs via OpenShift Routes instead of in-cluster Services.
# Default: false
useRoute = false
```

### Configuration reference

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `auth_mode` | string | `"header"` | Bearer token source: `"header"` or `"kubeconfig"` |
| `loki_url` | string | — | Loki API endpoint URL (optional; use LokiStack discovery if unset) |
| `insecure` | bool | `false` | Skip TLS certificate verification |
| `useRoute` | bool | `false` | Use OpenShift `Route` resources for LokiStack gateway URLs |

---

## Authentication and TLS

Bearer token handling matches the [metrics toolset](./metrics.md) (see its **Authentication and TLS** section): `auth_mode` selects whether the token is read from the request header or from kubeconfig. TLS verification uses kubeconfig CA data, the OpenShift service CA when running in-cluster, and then the system trust store. Set `insecure = true` only when you cannot install the correct CA; it is not recommended in production.

### Loki URL resolution

When the `logs` toolset is enabled, the Loki URL is determined in this order:

1. `loki_url` in the `[toolset_configs.logs]` config section (if set)
2. `LOKI_URL` environment variable
3. Default: `http://localhost:3100` (kubeconfig mode only)

In `header` mode, you can either set `loki_url` **or** use LokiStack discovery (`loki_list_instances` + `lokiNamespace`/`lokiName` arguments on each tool call).
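
The resolution order above can be sketched as a single function. This is an illustration of the documented precedence, not the obs-mcp implementation; `resolve_loki_url` is a hypothetical name, and treating an empty `loki_url` as unset is an assumption based on the sample configuration.

```python
import os

DEFAULT_LOKI_URL = "http://localhost:3100"


def resolve_loki_url(config_url, auth_mode, env=None):
    """Return the static Loki URL, or None when LokiStack discovery must be used (sketch)."""
    env = os.environ if env is None else env
    # 1. loki_url from [toolset_configs.logs]; an empty string counts as unset (assumption).
    if config_url:
        return config_url
    # 2. LOKI_URL environment variable.
    if env.get("LOKI_URL"):
        return env["LOKI_URL"]
    # 3. The localhost default applies in kubeconfig mode only.
    if auth_mode == "kubeconfig":
        return DEFAULT_LOKI_URL
    # header mode without a URL: fall back to loki_list_instances discovery.
    return None


print(resolve_loki_url("", "kubeconfig", env={}))
print(resolve_loki_url("", "header", env={"LOKI_URL": "https://loki.example.test"}))
print(resolve_loki_url("", "header", env={}))
```

Returning `None` in the last case is a design choice of this sketch to signal that each tool call must carry `lokiNamespace` and `lokiName` instead.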

---

## Instance discovery

The server lists **`LokiStack`** objects cluster-wide and derives gateway base URLs from each resource. With **`useRoute = true`**, it prefers OpenShift `Route` hosts where available.

Chosen instances are **validated** against this discovery list before any request is sent, so callers cannot point tools at arbitrary URLs.
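
The validation step described above amounts to an allow-list lookup: a caller names a stack, and the URL is taken from the discovery result rather than from caller input. The sketch below assumes the discovery entries carry the fields listed under `loki_list_instances`; the function name `resolve_instance`, the sample entry, and its `status` value are hypothetical.

```python
def resolve_instance(discovered, loki_namespace, loki_name):
    """Return the discovered URL for a LokiStack, refusing anything not in the list."""
    for instance in discovered:
        if (instance["lokiNamespace"], instance["lokiName"]) == (loki_namespace, loki_name):
            return instance["url"]
    raise LookupError(f"LokiStack {loki_namespace}/{loki_name} was not found by discovery")


# Hypothetical discovery result with the fields documented for loki_list_instances.
discovered = [
    {
        "lokiNamespace": "openshift-logging",
        "lokiName": "logging-loki",
        "status": "Ready",
        "url": "https://logging-loki-gateway-http.openshift-logging.svc.cluster.local:8080",
    },
]

print(resolve_instance(discovered, "openshift-logging", "logging-loki"))
```

Because the caller supplies only a namespace and name, there is no parameter through which an arbitrary URL could be injected.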

---

## Prerequisites

- **Loki Operator** workloads in the cluster (`LokiStack` CRs) or a standalone Loki endpoint.
- **RBAC** on the MCP identity to **list** `LokiStack` objects cluster-wide. If **`useRoute`** is enabled, the server also **gets** `Route` resources in each Loki namespace to resolve external hosts.
- **Bearer token** with permission to reach the resolved Loki API (same patterns as the metrics toolset).

---

## Related documentation

- [Metrics toolset guide](./metrics.md) — Prometheus and Alertmanager (`metrics` toolset)
- [Tracing toolset guide](./tracing.md) — Grafana Tempo and TraceQL (`traces` toolset)
- [OpenTelemetry Collector toolset guide](./otelcol.md) — Component discovery, schemas, config validation (`otelcol` toolset)
- [OTEL.md](../OTEL.md) — OpenTelemetry export from this MCP server process (not the same as querying Loki in-cluster)
1 change: 1 addition & 0 deletions docs/metrics.md → docs/observability/metrics.md
@@ -4,6 +4,7 @@ This toolset provides tools for querying Prometheus/Thanos metrics and Alertmana
It is implemented by the [`rhobs/obs-mcp`](https://github.com/rhobs/obs-mcp) package and registered
into the openshift-mcp-server as the `metrics` toolset.

For Grafana Loki and LogQL (`logs` toolset), see the [logs toolset guide](./logs.md).
For Grafana Tempo and TraceQL (`traces` toolset), see the [tracing toolset guide](./tracing.md).
For OpenTelemetry Collector configuration assistance (`otelcol` toolset), see the [otelcol toolset guide](./otelcol.md).

4 changes: 3 additions & 1 deletion docs/otelcol.md → docs/observability/otelcol.md
@@ -9,6 +9,7 @@ Component schemas are embedded in the binary (via `redhat-opentelemetry-collecto
running Collector instance or cluster endpoint is required.

For Prometheus and Alertmanager MCP tools, see the [metrics toolset guide](./metrics.md).
For Grafana Loki and LogQL (`logs` toolset), see the [logs toolset guide](./logs.md).
For Grafana Tempo and TraceQL, see the [tracing toolset guide](./tracing.md).

## Workflow
@@ -127,5 +128,6 @@ No Prometheus, Tempo, or Collector endpoint URLs are needed.
## Related documentation

- [Metrics toolset guide](./metrics.md) — Prometheus and Alertmanager (`metrics` toolset)
- [Logs toolset guide](./logs.md) — Grafana Loki and LogQL (`logs` toolset)
- [Tracing toolset guide](./tracing.md) — Grafana Tempo and TraceQL (`traces` toolset)
- [OTEL.md](OTEL.md) — OpenTelemetry export from this MCP server process (not the same as Collector config assistance)
- [OTEL.md](../OTEL.md) — OpenTelemetry export from this MCP server process (not the same as Collector config assistance)
3 changes: 2 additions & 1 deletion docs/tracing.md → docs/observability/tracing.md
@@ -4,6 +4,7 @@ This toolset provides tools for querying [Grafana Tempo](https://grafana.com/doc
It is implemented by the [`rhobs/obs-mcp`](https://github.com/rhobs/obs-mcp) package and registered into the openshift-mcp-server as the `traces` toolset.

For Prometheus and Alertmanager MCP tools, see the [metrics toolset guide](./metrics.md).
For Grafana Loki and LogQL (`logs` toolset), see the [logs toolset guide](./logs.md).
For OpenTelemetry Collector configuration assistance (`otelcol` toolset), see the [otelcol toolset guide](./otelcol.md).

## Workflow
@@ -163,4 +164,4 @@ Chosen instances are **validated** against this discovery list before any reques
## Related documentation

- [Metrics toolset guide](./metrics.md) — Prometheus and Alertmanager (`metrics` toolset)
- [OTEL.md](OTEL.md) — OpenTelemetry export from this MCP server process (not the same as querying Tempo in-cluster)
- [OTEL.md](../OTEL.md) — OpenTelemetry export from this MCP server process (not the same as querying Tempo in-cluster)
23 changes: 23 additions & 0 deletions evals/tasks/observability/logs/loki-backend-reachability.yaml
@@ -0,0 +1,23 @@
kind: Task
apiVersion: mcpchecker/v1alpha2
metadata:
  name: loki-backend-reachability
  difficulty: easy
  parallel: true
  runs: 1
  labels:
    category: logs
    suite: observability
    toolType: smoke-test
  description: |
    Smoke test that the agent can reach Loki via loki_list_instances and report
    a discovered LokiStack. Run obs-mcp with --toolsets logs (or metrics,traces,logs).
spec:
  prompt:
    inline: |
      Is the Loki backend reachable? List LokiStack instances and report the
      name, namespace, and URL of any stack you find.
  verify:
    - llmJudge:
        contains: "obs-mcp-loki"
        reason: "Verify the agent discovered the obs-mcp-loki LokiStack from loki_list_instances"
26 changes: 26 additions & 0 deletions evals/tasks/observability/logs/loki-label-names.yaml
@@ -0,0 +1,26 @@
kind: Task
apiVersion: mcpchecker/v1alpha2
metadata:
  name: loki-label-names
  difficulty: medium
  parallel: true
  runs: 1
  labels:
    category: logs
    suite: observability
    toolType: exploration
  description: |
    Tests discovery workflow: loki_list_instances then loki_label_names with tenant
    network on the obs-mcp-loki stack (openshift-network mode).
spec:
  prompt:
    inline: |
      For LokiStack obs-mcp-loki in namespace obs-mcp-loki, tenant network, what
      label names are available for writing LogQL queries?
  verify:
    - llmJudge:
        contains: "SrcK8S_Namespace"
        reason: "NetObserv flow logs expose SrcK8S_Namespace as an indexed Loki label"
    - llmJudge:
        contains: "DstK8S_Namespace"
        reason: "NetObserv flow logs expose DstK8S_Namespace as an indexed Loki label"
22 changes: 22 additions & 0 deletions evals/tasks/observability/logs/loki-label-values.yaml
@@ -0,0 +1,22 @@
kind: Task
apiVersion: mcpchecker/v1alpha2
metadata:
  name: loki-label-values
  difficulty: medium
  parallel: true
  runs: 1
  labels:
    category: logs
    suite: observability
    toolType: exploration
  description: |
    Tests loki_label_values for SrcK8S_Namespace on the network tenant.
spec:
  prompt:
    inline: |
      For LokiStack obs-mcp-loki in namespace obs-mcp-loki with tenant network,
      what values exist for the SrcK8S_Namespace label?
  verify:
    - llmJudge:
        contains: "SrcK8S_Namespace"
        reason: "Verify the agent queried the SrcK8S_Namespace label"
21 changes: 21 additions & 0 deletions evals/tasks/observability/logs/loki-list-instances.yaml
@@ -0,0 +1,21 @@
kind: Task
apiVersion: mcpchecker/v1alpha2
metadata:
  name: loki-list-instances
  difficulty: easy
  parallel: true
  runs: 1
  labels:
    category: logs
    suite: observability
    toolType: discovery
  description: |
    Tests that the agent calls loki_list_instances before other Loki tools.
spec:
  prompt:
    inline: |
      Which LokiStack instances are available in this cluster?
  verify:
    - llmJudge:
        contains: "obs-mcp-loki"
        reason: "Verify the agent reported LokiStack instance details"
27 changes: 27 additions & 0 deletions evals/tasks/observability/logs/loki-query-network-flows.yaml
@@ -0,0 +1,27 @@
kind: Task
apiVersion: mcpchecker/v1alpha2
metadata:
  name: loki-query-network-flows
  difficulty: medium
  parallel: true
  runs: 1
  labels:
    category: logs
    suite: observability
    toolType: query
  description: |
    Tests loki_query_range with NetObserv flow log labels (SrcK8S_Namespace /
    DstK8S_Namespace) and tenant network—not kubernetes_namespace_name.
spec:
  prompt:
    inline: |
      Query NetObserv network flow logs from the last hour where the source or
      destination namespace is obs-mcp-loki. Use LokiStack obs-mcp-loki in namespace
      obs-mcp-loki with tenant network.
  verify:
    - llmJudge:
        contains: "SrcK8S_Namespace"
        reason: "Verify the agent used obs-mcp-loki indexed namespace labels in LogQL"
    - llmJudge:
        contains: "network"
        reason: "Verify the agent used tenant network for the openshift-network LokiStack"