CORENET-7091: Add enhancement proposal to productize ovn-kubernetes MCP tools by arkadeepsen · Pull Request #2002 · openshift/enhancements

arkadeepsen · 2026-05-08T09:40:20Z

No description provided.

openshift-ci-robot · 2026-05-08T09:40:25Z

@arkadeepsen: This pull request references CORENET-7091 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.


arghosh93 · 2026-05-08T14:06:33Z

+
+OVN-Kubernetes operators and support engineers often need Northbound and Southbound database views (`ovn-nbctl`, `ovn-sbctl`, traces, logical flows) while investigating connectivity and routing. These tools are already implemented in ovn-kubernetes-mcp, but OpenShift users benefit from consuming them via a **single MCP server** that shares authentication, tool governance, and documentation with the rest of the platform troubleshooting surface.
+
+The primary motivation for landing these tools in upstream kubernetes-mcp-server is **productization via downstream sync into openshift-mcp-server**. By first integrating the OVN toolset upstream, OpenShift can ship and support the same upstream code through the established downstream pipeline.


Instead of productization, can we say that keeping all OpenShift-related MCP servers in a single repository is the main motivation? Or we can keep both.

Added a line stating that the ovnk tools can be consumed from the same ocp mcp server.

arghosh93 · 2026-05-08T16:07:36Z

+  kms --> Sync
+```
+
+**Downstream.** openshift-mcp-server consumes kubernetes-mcp-server changes through its normal fork sync or vendor workflow (exact mechanics follow that repository’s documented process).


Don't we want to add more implementation details, such as which exact tools will be added and what purpose they serve?

I have added that all the tools under the ovn and ovs packages will be added to the OCP MCP server, along with some more details about how to approach the implementation. I didn't want to add specific details of the local PoC I did, as that might not be the only way of implementing the integration.

arghosh93 · 2026-05-08T16:16:14Z

+- Add an `ovn-kubernetes` toolset to kubernetes-mcp-server that reuses the existing OVN MCP tool implementations from ovn-kubernetes-mcp, rather than re-implementing equivalent functionality.
+- Enable kubernetes-mcp-server to execute OVN tool commands in-cluster using its existing pod-exec capabilities, with only minor upstream refactoring required in the imported OVN tools.
+- Import the OVN and OVS layers from ovn-kubernetes-mcp incrementally (starting with core OVN/OVS troubleshooting tools), expanding coverage as dependencies and eval coverage mature.
+- Make the toolset available to OpenShift users through openshift-mcp-server via downstream sync from kubernetes-mcp-server.


Is having an automated sync mechanism between ovn-mcp-server, kubernetes-mcp-server and openshift-mcp-server also a goal of this feature?

The current plan is to import the packages from the ovn-kubernetes-mcp repo. Whenever kubernetes-mcp-server needs the latest changes, its go.mod and go.sum files can be updated to refer to the latest ovn-kubernetes-mcp revision. Regarding automation: since kubernetes-mcp-server is a separate upstream repo where we are not maintainers, I'm not sure that adding an automatic sync process as part of this EP would be appropriate. We can figure that part out in the future, if needed. For now, we'll just bump the import, as we do for k8s bumps in the other repos.

Given that must-gather is downstream specific, bringing it into the kubernetes-mcp-server would not be a problem, right?

There's already an existing downstream effort for must-gather. It differs from how must-gather is implemented in the ovn-kubernetes-mcp repo. If we want to integrate the networking bits from the must-gather tool, we'll have to do it in openshift-mcp-server directly, as kubernetes-mcp-server won't have must-gather related tools.

Okay, we can skip the must-gather tool from ovn-kubernetes-mcp and use the existing one. We can try to add the networking bits directly to kubernetes-mcp-server to imitate the behaviour in ovn-kubernetes-mcp. Can we consider this one of the goals?

It will not work on kubernetes-mcp-server as the must-gather implementation is in downstream openshift-mcp-server.

arghosh93 · 2026-05-12T15:45:48Z

+
+- Add an `ovn-kubernetes` toolset to kubernetes-mcp-server that reuses the existing OVN MCP tool implementations from ovn-kubernetes-mcp, rather than re-implementing equivalent functionality.
+- Enable kubernetes-mcp-server to execute OVN/OVS tool commands in-cluster using its existing pod-exec capabilities, with only minor refactoring required in **ovn-kubernetes-mcp** and **kubernetes-mcp-server** to integrate that pod-exec path cleanly.
+- Import the full OVN and OVS handler set from ovn-kubernetes-mcp (`pkg/ovn/mcp` and `pkg/ovs/mcp`) into the `ovn-kubernetes` toolset, while other upstream packages stay excluded per Non-Goals.


As per today's discussion, we should mention the kernel and sosreport tools, which help explore a node's kernel resources.

This is already changed.

arghosh93 · 2026-05-12T15:46:18Z

+- Full parity in the first iteration with every tool category shipped by the standalone ovn-kubernetes-mcp binary (for example kernel diagnostics, optional images such as pwru/tcpdump, must-gather, sosreport) where those require separate dependencies, images, or workflows.
+- New Kubernetes or OpenShift APIs, CRDs, operators, or cluster-side agents solely for this feature.
+- Replacing existing CLI-based troubleshooting; MCP tools are an additional interface.
+- Importing ovn-kubernetes-mcp tools under `kernel` and `network-tools` packages in the first iteration, since those tools depend on a node debugging capability (for example a node-debug tool) that is not currently available in kubernetes-mcp-server.


We probably need to remove this.

This is already changed.

arghosh93 · 2026-05-12T15:54:56Z

+
+### Non-Goals
+
+- Full parity in the first iteration with every tool category shipped by the standalone ovn-kubernetes-mcp binary (for example kernel diagnostics, optional images such as pwru/tcpdump, must-gather, sosreport) where those require separate dependencies, images, or workflows.


Since ovn-kubernetes-mcp is an upstream repo, we can't expect all current and future tools to be applicable to an OpenShift environment.
Given that we plan to import the packages from ovn-kubernetes-mcp repo, how should we control access to tools that may not be supported?

We're only going to call the handlers of the tools which are supported. The import is for the packages where these handlers are defined. Unsupported handlers should not be used.
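
As an illustration of "only call the supported handlers", one option is an explicit allowlist applied at registration time. This is a minimal sketch under stated assumptions: the `Handler` type, the tool names, and `registerSupported` are hypothetical and are not taken from ovn-kubernetes-mcp or openshift-mcp-server.

```go
package main

import (
	"fmt"
	"sort"
)

// Handler is a hypothetical stand-in for an imported tool handler.
type Handler func(args map[string]string) (string, error)

// registerSupported returns only the handlers whose names appear in the
// allowlist. Anything else exported by the imported package is never
// registered, so MCP clients cannot reach it.
func registerSupported(imported map[string]Handler, allow []string) map[string]Handler {
	registered := make(map[string]Handler, len(allow))
	for _, name := range allow {
		if h, ok := imported[name]; ok {
			registered[name] = h
		}
	}
	return registered
}

// names returns the registered tool names in sorted order.
func names(m map[string]Handler) []string {
	out := make([]string, 0, len(m))
	for n := range m {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

func main() {
	noop := func(map[string]string) (string, error) { return "", nil }
	// Hypothetical tool names; "upstream-only" models an unsupported tool.
	imported := map[string]Handler{
		"ovn-show":      noop,
		"ovs-list-br":   noop,
		"upstream-only": noop,
	}
	registered := registerSupported(imported, []string{"ovn-show", "ovs-list-br"})
	fmt.Println(names(registered))
}
```

With this shape, a new upstream handler stays invisible to MCP clients until someone adds it to the allowlist, which keeps the support decision in the downstream repository.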

arghosh93 · 2026-05-13T06:20:52Z

+  - https://redhat.atlassian.net/browse/CORENET-7091
+see-also:
+  - https://github.com/ovn-kubernetes/ovn-kubernetes-mcp
+  - https://github.com/containers/kubernetes-mcp-server


NIT: do we need kubernetes-mcp-server and openshift-mcp-server here?

Yes. Since implementing the EP will impact all of these repos, they all need to be included here.

arghosh93 · 2026-05-13T06:33:26Z

+### User Stories
+
+- As a cluster administrator or platform engineer, I want OVN-Kubernetes MCP troubleshooting tools in the same MCP server I already use for Kubernetes resources, so that I do not have to deploy, operate, or manage authentication for a second MCP server dedicated only to OVN-Kubernetes.
+- As a support engineer, I want MCP clients to expose the full ovn-kubernetes-mcp troubleshooting surface that kubernetes-mcp-server imports—NB/SB inspection and related `ovn-*` workflows (including `get`, `lflow-list`, `trace` where those tools apply), OVS bridge and OpenFlow helpers, and **`kernel`** / **`network-tools`** host and capture tooling—so that assisted troubleshooting matches how other cluster operations are automated without switching servers or credentials mid-incident.


Suggested change

- As a support engineer, I want MCP clients to expose the full ovn-kubernetes-mcp troubleshooting surface that kubernetes-mcp-server imports—NB/SB inspection and related `ovn-*` workflows (including `get`, `lflow-list`, `trace` where those tools apply), OVS bridge and OpenFlow helpers, and **`kernel`** / **`network-tools`** host and capture tooling—so that assisted troubleshooting matches how other cluster operations are automated without switching servers or credentials mid-incident.

- As a support engineer, I want MCP clients to expose the full ovn-kubernetes-mcp troubleshooting surface that kubernetes-mcp-server imports (NB/SB inspection and related `ovn-*` workflows (including `get`, `lflow-list`, `trace` where those tools apply), OVS bridge and OpenFlow helpers, and **`kernel`** / **`network-tools`** host and capture tooling) so that assisted troubleshooting matches how other cluster operations are automated without switching servers or credentials mid-incident.

There's already a bracket between the dashes.

arghosh93 · 2026-05-13T08:10:34Z

+
+**Importing upstream tools into kubernetes-mcp-server.** The OVN troubleshooting MCP tools already exist in ovn-kubernetes-mcp. The integration approach for kubernetes-mcp-server is to add an `ovn-kubernetes` toolset that reuses those implementations as imported packages and exposes them through kubernetes-mcp-server’s tool registration.
+
+**Command execution strategy.** OVN/OVS tools run commands inside OVN-Kubernetes pods via kubernetes-mcp-server’s pod exec. **`kernel`** and **`network-tools`** handlers use the node-level execution contract wired up in the same integration (for example debug pod or node-targeted exec, as the upstream packages require). Imported libraries should delegate all cluster I/O to kubernetes-mcp-server rather than opening separate Kubernetes client connections. Expect **refactoring in ovn-kubernetes-mcp and kubernetes-mcp-server** so each category uses a clear, single host-supplied execution path per invocation.


I understand that kubernetes-mcp-server is building its own node-debug method to allow host access using the kubectl/oc CLI. However, in ovn-kubernetes-mcp we use a different method for node debug in the kernel and other network tools. I wonder how we can use tools from ovn-kubernetes-mcp together with the utility from kubernetes-mcp-server, considering it's downstream of ovn-kubernetes-mcp.

The same way we'll use pod-exec from kubernetes-mcp-server for the OVN/OVS tools. The node-debug function, which the kernel and network-tools handlers call, should have a similar definition in both ovn-kubernetes-mcp and kubernetes-mcp-server: the same argument list and the same return type.

Should we mention this explicitly in the document? From what I understand, kubernetes-mcp-server does not have any node-debug capability so far, so if that needs to be implemented, it is worth calling out in this section.

The effort is ongoing. I had a discussion with Surya, and she mentioned that in the EP we'll assume that the node-debug tool exists. We can expedite the merging of the node-debug tool PR by helping with reviews, so that we can get started on integrating the kernel/network-tools.
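
The shared-signature idea discussed in this thread could look roughly like the sketch below. Everything named here is an assumption for illustration: `PodExecFunc`, `NodeDebugFunc`, `Executors`, the two handlers, and the `nbdb` container name are hypothetical and do not reflect the actual signatures in either repository.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// PodExecFunc and NodeDebugFunc are hypothetical execution contracts. Both
// repositories would agree on the argument list and return type so that the
// host server can inject its own implementation.
type PodExecFunc func(ctx context.Context, namespace, pod, container string, cmd []string) (string, error)
type NodeDebugFunc func(ctx context.Context, node string, cmd []string) (string, error)

// Executors bundles the host-supplied execution paths.
type Executors struct {
	PodExec   PodExecFunc
	NodeDebug NodeDebugFunc
}

// ListLogicalSwitches models an OVN handler: it builds the command line and
// delegates execution to the injected pod-exec function.
func ListLogicalSwitches(ctx context.Context, ex Executors, namespace, pod string) (string, error) {
	// "nbdb" is a placeholder container name.
	return ex.PodExec(ctx, namespace, pod, "nbdb", []string{"ovn-nbctl", "ls-list"})
}

// ShowRoutes models a kernel handler: it delegates to the node-debug function.
func ShowRoutes(ctx context.Context, ex Executors, node string) (string, error) {
	return ex.NodeDebug(ctx, node, []string{"ip", "route", "show"})
}

// fakeExecutors returns stand-ins that echo the command they would have run.
func fakeExecutors() Executors {
	return Executors{
		PodExec: func(_ context.Context, ns, pod, c string, cmd []string) (string, error) {
			return fmt.Sprintf("pod-exec %s/%s[%s]: %s", ns, pod, c, strings.Join(cmd, " ")), nil
		},
		NodeDebug: func(_ context.Context, node string, cmd []string) (string, error) {
			return fmt.Sprintf("node-debug %s: %s", node, strings.Join(cmd, " ")), nil
		},
	}
}

func main() {
	ctx := context.Background()
	ex := fakeExecutors()
	out, _ := ListLogicalSwitches(ctx, ex, "openshift-ovn-kubernetes", "ovnkube-node-abc")
	fmt.Println(out)
	out, _ = ShowRoutes(ctx, ex, "worker-0")
	fmt.Println(out)
}
```

Because the handlers only see function values, the imported package never needs its own Kubernetes client, and the same handlers can be unit-tested with fakes like the ones above.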

arghosh93 · 2026-05-13T08:16:33Z

+
+**Command execution strategy.** OVN/OVS tools run commands inside OVN-Kubernetes pods via kubernetes-mcp-server’s pod exec. **`kernel`** and **`network-tools`** handlers use the node-level execution contract wired up in the same integration (for example debug pod or node-targeted exec, as the upstream packages require). Imported libraries should delegate all cluster I/O to kubernetes-mcp-server rather than opening separate Kubernetes client connections. Expect **refactoring in ovn-kubernetes-mcp and kubernetes-mcp-server** so each category uses a clear, single host-supplied execution path per invocation.
+
+**Scope.** All troubleshooting tools under ovn-kubernetes-mcp **`ovn`**, **`ovs`**, **`kernel`**, and **`network-tools`** belong to this effort (NB/SB inspection, logical flows, OVN trace, OVS bridge and OpenFlow helpers, kernel-oriented diagnostics, and **`network-tools`**-style capture where applicable). Other ovn-kubernetes-mcp surfaces—must-gather, sosreport, and similar—remain out of scope unless separately agreed; see Non-Goals.


Suggested change

**Scope.** All troubleshooting tools under ovn-kubernetes-mcp **`ovn`**, **`ovs`**, **`kernel`**, and **`network-tools`** belong to this effort (NB/SB inspection, logical flows, OVN trace, OVS bridge and OpenFlow helpers, kernel-oriented diagnostics, and **`network-tools`**-style capture where applicable). Other ovn-kubernetes-mcp surfaces—must-gather, sosreport, and similar—remain out of scope unless separately agreed; see Non-Goals.

**Scope.** All troubleshooting tools under ovn-kubernetes-mcp **`ovn`**, **`ovs`**, **`kernel`**, and **`network-tools`** belong to this effort (NB/SB inspection, logical flows, OVN trace, OVS bridge and OpenFlow helpers, kernel-oriented diagnostics, and **`network-tools`**-style capture where applicable). Other ovn-kubernetes-mcp surfaces (must-gather, sosreport, and similar) remain out of scope unless separately agreed; see Non-Goals.

taanyas

lgtm

taanyas · 2026-05-13T09:17:53Z

+
+## Open Questions
+
+- How to structure mcpchecker suites or task labels so OVN/OVS, **`kernel`**, and **`network-tools`** coverage stays maintainable under kubernetes-mcp-server’s pass-rate gates, given differing cluster prerequisites?


For the mcpchecker structure — since kernel and network-tools require privileged node access which may not be available in all CI environments, would it make sense to have separate suites for OVN/OVS and kernel/network-tools so their pass rates are tracked independently?

I am more inclined towards creating a separate suite for each layer of the ovnk MCP server tools. That is, OVN, OVS, kernel, and network-tools would each have a separate eval suite. But we can take a call when working on the evals for the tools.
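
To make the "tracked independently" point concrete, here is a small sketch of per-suite pass rates. The `Result` type is hypothetical and does not mirror mcpchecker's actual result format; it only shows why a failing `kernel` task in an unprivileged CI environment would not lower the OVN or OVS rate.

```go
package main

import "fmt"

// Result is a hypothetical eval outcome tagged with the suite it belongs to.
type Result struct {
	Suite  string // for example "ovn", "ovs", "kernel", "network-tools"
	Passed bool
}

// passRates returns the fraction of passing results for each suite, so that
// every suite can be gated on its own threshold.
func passRates(results []Result) map[string]float64 {
	total := map[string]int{}
	passed := map[string]int{}
	for _, r := range results {
		total[r.Suite]++
		if r.Passed {
			passed[r.Suite]++
		}
	}
	rates := make(map[string]float64, len(total))
	for suite, n := range total {
		rates[suite] = float64(passed[suite]) / float64(n)
	}
	return rates
}

func main() {
	results := []Result{
		{Suite: "ovn", Passed: true},
		{Suite: "ovn", Passed: true},
		{Suite: "ovs", Passed: true},
		{Suite: "kernel", Passed: false},
		{Suite: "kernel", Passed: true},
	}
	for suite, rate := range passRates(results) {
		fmt.Printf("%s: %.2f\n", suite, rate)
	}
}
```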

mattedallo

lgtm

I added some "non blocking" comments.

mattedallo · 2026-05-13T13:40:29Z

+
+OVN-Kubernetes operators and support engineers often need Northbound and Southbound database views (`ovn-nbctl`, `ovn-sbctl`, traces, logical flows), host-oriented diagnostics, and packet or kernel-level capture workflows while investigating connectivity and routing. These tools are already implemented in ovn-kubernetes-mcp, but OpenShift users benefit from consuming them via a **single MCP server** that shares authentication, tool governance, and documentation with the rest of the platform troubleshooting surface.
+
+The primary motivation for landing these tools in upstream kubernetes-mcp-server is **productization via downstream sync into openshift-mcp-server**. By first integrating the OVN toolset upstream, OpenShift can ship and support the same upstream code through the established downstream pipeline. This also lets OpenShift customers consume the OVN-Kubernetes tools from the same MCP server as the rest of the platform troubleshooting surface, openshift-mcp-server, after downstream sync.


Nit: maybe we can expand a bit on the cost we save by using the existing openshift-mcp-server productization pipeline.
That will strengthen the motivation for integrating versus keeping it separate.

Added some more details in the motivation section.

mattedallo · 2026-05-13T15:08:14Z

+
+None. This work adds MCP tools only and does not extend the OpenShift or Kubernetes API surface.
+
+### Topology Considerations


Minor note: the topology section seems written with the local binary deployment model in mind. It might be worth a brief mention that the same considerations apply for in-cluster deployments, or a note that the OVN-K tools inherit whatever cluster-access model kubernetes-mcp-server provides.

The kubeconfig is mentioned specifically for HyperShift, since it has a management cluster and a hosted (guest) cluster with separate kubeconfigs. The deployment is expected to be in-cluster by default, not local.

mattedallo · 2026-05-14T10:43:12Z

+
+**Split of work:** kubernetes-mcp-server decides how each capability is exposed to MCP users (tool names and parameters). ovn-kubernetes-mcp keeps handler logic that validates inputs, builds command lines, and defines execution contracts; kubernetes-mcp-server integrates by calling those libraries and supplying pod exec, node-level debugging, or other supported cluster operations against the target cluster.
+
+```mermaid


On the diagram, a few things tripped me up:

The main call relationship (kubernetes-mcp-server's tool handler calling ovn-kubernetes-mcp's imported handler logic) isn't shown, and that's the core of the integration.

"delegated_in_cluster_execution" sits inside the ovn-kubernetes-mcp box, but the actual execution will happen in kubernetes-mcp-server's client AFAIU. ovn-kubernetes-mcp defines the contract/interface; kubernetes-mcp-server implements it.

The box only shows "OVN_OVS" but kernel and network-tools are also in scope, with a different execution path (node-debug vs pod-exec).

The two subgraphs connected by a dotted arrow could be read as two separate services communicating at runtime, when in practice ovn-kubernetes-mcp will be compiled into kubernetes-mcp-server as an imported Go package.

Would something like this be more accurate? Let me know your thoughts

flowchart TB
  subgraph kms [kubernetes-mcp-server process]
    ToolHandler["Tool handler\n(defines MCP tool name, schema)"]
    subgraph ovnkLib ["ovn-kubernetes-mcp (imported Go package)"]
      HandlerLogic["Handler logic\n(validates inputs, builds commands)"]
    end
    subgraph executor [kubernetes-mcp-server K8s client]
      PodExec["PodExec\n(OVN/OVS tools)"]
      NodeDebug["NodeDebug\n(kernel / network-tools)"]
    end
    ToolHandler -->|"calls imported package"| HandlerLogic
    HandlerLogic -->|"calls injected executor"| PodExec
    HandlerLogic -->|"calls injected executor"| NodeDebug
    PodExec -->|"exec in ovnkube pod"| Cluster["Cluster"]
    NodeDebug -->|"privileged debug pod on node"| Cluster
  end

I have removed the mermaid diagram as it was getting messier. Hope the latest diagram helps in conveying the integration more clearly.

yes it is, thanks!

mattedallo · 2026-05-15T15:17:04Z

+
+None. This work adds MCP tools only and does not extend the OpenShift or Kubernetes API surface.
+
+### Topology Considerations


Minor: I would add a sentence under Topology Considerations to introduce what each section is going to address and to clarify that the OVN-K toolset inherits openshift-mcp-server's existing cluster-access mechanisms.
Something like:

The OVN-Kubernetes toolset uses openshift-mcp-server's existing pod-exec and node-debug capabilities and does not introduce new cluster-access mechanisms or deployment requirements. The considerations below describe topology-specific implications for those underlying capabilities, not for the OVN-K tools themselves.

mattedallo · 2026-05-15T15:48:18Z

+
+#### Hypershift / Hosted Control Planes
+
+The MCP server uses whatever cluster the kubeconfig targets. For HyperShift, that is typically the **hosted cluster** API when troubleshooting workload networking; there is no change to management-plane APIs. Operators must select the correct context (management versus guest) the same way they would for `kubectl exec`.


Nit (Optional): I found this subsection a bit hard to follow without prior HyperShift context. A brief mention of where the OVN-K components live in HyperShift (ovnkube-node on the hosted cluster, control-plane on the management cluster) would help the reader understand why the hosted cluster API is the right target. It would also be useful to note that this is inherited from openshift-mcp-server's existing cluster-targeting behavior rather than something new introduced by this feature.

Totally optional comment, just thinking about readers who aren't deeply familiar with HyperShift topology.

One possible rewording (feel free to ignore or adapt):

The OVN-K toolset inherits openshift-mcp-server's existing cluster-targeting behavior and does not introduce any HyperShift-specific logic.

In HyperShift, ovnkube-node pods -- which contain the per-node OVN NB/SB databases, northd, and ovn-controller -- run on the hosted cluster worker nodes. All OVN-K troubleshooting targets (pod exec into ovnkube-node, node-debug for kernel/network-tools) therefore require the MCP server to reach the hosted cluster API, not the management cluster. The lightweight ovnkube-control-plane on the management cluster is not targeted by any tool in this toolset.

This is the same cluster-selection requirement that applies to any openshift-mcp-server toolset targeting workload-cluster resources. In kubeconfig mode, the operator selects the hosted cluster context; in an in-cluster deployment, the server must be deployed into (or configured to reach) the hosted cluster.

mattedallo · 2026-05-15T15:58:54Z

+
+#### Standalone Clusters
+
+Fully relevant: tools execute against pods on the same cluster the API client reaches.


Minor nit: it is unclear what "Fully relevant:" refers to. Maybe "No special considerations:" is clearer.

arghosh93

One NIT comment. Otherwise, LGTM.

arghosh93 · 2026-05-19T09:02:24Z

+
+## Motivation
+
+OVN-Kubernetes operators and support engineers often need Northbound and Southbound database views (`ovn-nbctl`, `ovn-sbctl`, traces, logical flows), host-oriented diagnostics, and packet or kernel-level capture workflows while investigating connectivity and routing. These tools are already implemented in ovn-kubernetes-mcp, but OpenShift users benefit from consuming them via a **single MCP server** that shares authentication, tool governance, and documentation with the rest of the platform troubleshooting surface.


NIT: This does not mention OVS OpenFlows. I do agree that we have mentioned this later on in the enhancement, and if you want to ignore it, that should be fine.

Updated the motivation.

mattedallo · 2026-05-19T09:17:07Z

lgtm

openshift-ci · 2026-05-19T09:30:49Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: arghosh93, mattedallo, taanyas
Once this PR has been reviewed and has the lgtm label, please assign abhat for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

enhancements/network/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

arkadeepsen · 2026-05-19T09:33:59Z

@tssurya PTAL

Cali0707 · 2026-06-09T14:20:26Z

+
+## Alternatives (Not Implemented)
+
+- **Add the OVN toolset to kubernetes-mcp-server first, then rely on downstream sync into openshift-mcp-server:** Not chosen for this enhancement because OpenShift is landing the integration directly in openshift-mcp-server to ship on product cadence without gating on upstream kubernetes-mcp-server acceptance, release, and fork sync timing. The import-and-delegate pattern remains the same; a future upstream integration could still reduce long-term duplication if both codebases converge.


I'm not sure what the concern around waiting for upstream merge and sync is, as upstream is currently 100% Red Hat. In general whenever a toolset has no hard requirements on openshift APIs we prefer to land upstream. Is the issue that this toolset requires openshift specifics?

Currently the upstream is synced downstream about 1-2 times per week, and can be done more frequently when needed. cc @mattedallo

I have mentioned the other reasons here: openshift/openshift-mcp-server#315 (comment)

Adding the same here for easy readability:

One of the reasons not to add these tools upstream is that some of the ovnk tools need the node-debug functionality, and per existing conversations, adding that tool upstream is not in current plans. Additionally, upstream k8s-mcp-server might want to stay CNI agnostic, whereas for openshift-mcp-server these tools will be very useful, as most customers use ovnk as the CNI. We already have a separate upstream repo for the ovnk MCP server (https://github.com/ovn-kubernetes/ovn-kubernetes-mcp), so adding these tools to k8s-mcp-server would mean that two separate upstream projects have the same tools, which is probably not ideal.

I think it's better if I add them in the EP itself.

Added the same in the EP.

@Cali0707 Is upstream kubernetes-mcp-server planning to stay CNI agnostic, as Kubernetes in general is? I guess yes?
Or are there plans to allow Calico, Cilium, ovn-kubernetes, and other CNIs to add troubleshooting for their stacks? I guess this decision depends on the scope of the kubernetes-mcp-server project.

tssurya

nicely written!

some inline comments/questions

tssurya · 2026-06-22T08:52:14Z

+
+## Motivation
+
+OVN-Kubernetes operators and support engineers often need Northbound and Southbound database views (`ovn-nbctl`, `ovn-sbctl`, traces, logical flows), OVS bridge and OpenFlow inspection (`ovs-ofctl` and related helpers), host-oriented diagnostics, and packet or kernel-level capture workflows while investigating connectivity and routing. These tools are already implemented in ovn-kubernetes-mcp, but OpenShift users benefit from consuming them via a **single MCP server** that shares authentication, tool governance, and documentation with the rest of the platform troubleshooting surface.


nit: All sections call out the benefits of a single MCP server, which is great, but the main thing this enhancement addresses is the missing MCP tools in OCP for troubleshooting networking issues. Let's make that the main intent: providing the existing upstream MCP server tools to support engineers, operators, and end users so they can troubleshoot networking issues. I know it's implied, but let's call that part out.
i.e., the OCP MCP server doesn't expose core networking tools such as the ovn and ovs ctl commands.

tssurya · 2026-06-22T08:55:04Z

+
+### Non-Goals
+
+- Full parity with every tool category shipped by the standalone ovn-kubernetes-mcp binary (for example must-gather, sosreport) where those require separate dependencies, images, or product workflows outside this MCP integration.


I think the reason we don't bring in must-gather and sosreport is that they already exist downstream, right? Not because of any dependencies?

The initial plan is to add the tools for live-cluster debugging. The next step is to add the tools for offline debugging.

The must-gather tool is added in openshift-mcp-server, but it doesn't have the network debugging functionality that is available in the upstream ovn-kubernetes-mcp repo. The dependency here is on the availability of ovsdb-tool, since the network tools use it to get the relevant information.

So, it'll need additional considerations of how these tools can be integrated into openshift-mcp-server and is not part of the current integration effort.

tssurya · 2026-06-22T08:56:17Z

+- Enable openshift-mcp-server to run in-cluster troubleshooting for this toolset: OVN/OVS commands via existing pod-exec into suitable pods, and **`kernel`** / **`network-tools`** flows via whatever node-level debugging or host access path those upstream handlers require, implemented **as part of this same integration** (expect refactoring in **ovn-kubernetes-mcp** and **kubernetes-mcp-server**/**openshift-mcp-server** so execution is delegated cleanly to the host).
+- Import the full handler sets from ovn-kubernetes-mcp **`ovn`**, **`ovs`**, **`kernel`**, and **`network-tools`** into openshift-mcp-server’s OVN-Kubernetes tool registration, subject only to exclusions in Non-Goals.
+- Ship the toolset to OpenShift users in openshift-mcp-server product builds (versioning and packaging follow that repository’s release process).
+


What happens if OCP is running a cluster where the CNI is not ovn-kubernetes? Do we have a way to turn the toolset off in ocp-mcp-server? Is that part of the goals?
For example, our tools shouldn't be exposed if there isn't even an openshift-ovn-kubernetes namespace.

By default, only the core and config toolsets are enabled. Other toolsets have to be explicitly enabled before use: https://github.com/openshift/openshift-mcp-server/blob/main/docs/openshift/user-guide.md#toolsets-and-functionality
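
A minimal sketch of the opt-in behaviour described in this reply. It is not openshift-mcp-server's actual implementation: the registry, the tool names, and the `exposed_tools` helper are all hypothetical, and it assumes that an explicit toolset list replaces the defaults rather than adding to them.

```python
# Hypothetical sketch; none of these names come from openshift-mcp-server.
DEFAULT_TOOLSETS = ("core", "config")  # enabled by default, per the reply above

# Illustrative registry: toolset name -> tool names (tool names are made up).
REGISTRY = {
    "core": ["pods_list", "pods_exec"],
    "config": ["configuration_view"],
    "ovn-kubernetes": ["ovn_show", "ovs_list_bridges"],
}


def exposed_tools(requested_toolsets=None):
    """Return the tool names visible to MCP clients for the given toolset list.

    Assumption: an explicit list replaces the defaults instead of extending them.
    """
    if requested_toolsets is None:
        enabled = DEFAULT_TOOLSETS
    else:
        enabled = tuple(requested_toolsets)
    unknown = [name for name in enabled if name not in REGISTRY]
    if unknown:
        raise ValueError(f"unknown toolsets: {unknown}")
    return [tool for name in enabled for tool in REGISTRY[name]]


default_view = exposed_tools()
opted_in_view = exposed_tools(["core", "config", "ovn-kubernetes"])
```

Under these assumptions, a cluster admin who never lists `ovn-kubernetes` never sees the OVN tools, which covers the non-OVN-Kubernetes CNI case without a namespace probe.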

tssurya · 2026-06-22T08:57:46Z

+
+### Workflow Description
+
+1. An operator configures MCP clients (for example Cursor, other MCP hosts) to use openshift-mcp-server with a kubeconfig that can reach the target cluster and satisfies RBAC for pod read and pod exec where policies allow.


Are there any docs on ocp-mcp-server product usage for end users, since it's already tech preview? I'm curious to see what workflow is outlined for OCP users to install the server and use it.
We might benefit from referencing that here.

The documentation is available in the GitHub repo: https://github.com/openshift/openshift-mcp-server/blob/main/docs/openshift/user-guide.md#deployment-and-architectural-guardrails

I didn't find any docs in docs.redhat.com. I'll cross check with the openshift-mcp-server folks.

tssurya · 2026-06-22T09:05:19Z

+
+### Risks and Mitigations
+
+- **RBAC and privilege:** Pod exec and node-level debugging are sensitive. Mitigation: reuse openshift-mcp-server permission models for `pods/exec`, node-scoped operations, and any debug-pod workflows; document required roles; keep tools read-only where possible.


Out of curiosity, what permission model is ocp-mcp-server using? Any doc links to their design? Speaking of which, if ocp-mcp-server has a design doc, we should include it here.

I didn't find any separate design docs. I'll check with the maintainers.

tssurya · 2026-06-22T09:12:52Z

+
+- **Logs:** API server **audit logs** may record `pods/exec` and node- or debug-related API calls according to cluster policy. **openshift-mcp-server** logs should show handler errors, including which execution path failed (pod exec versus node debug). For node-debug failures, correlate MCP server timestamps with events on the target node and any debug pod namespace the integration uses.
+
+- **Disable:** Disable or unregister the `ovn-kubernetes` toolset in MCP deployment configuration (exact mechanism depends on openshift-mcp-server packaging); no cluster-side toggle is defined here. Disabling the whole MCP server removes all toolsets, including OVN-Kubernetes; there is no per-path cluster toggle for pod exec versus node debug in this enhancement.


do we have docs around this in how ocp-mcp-server does this?

Toolsets are disabled by default. They have to be explicitly enabled. Once enabled, they can be disabled by removing the toolset names: https://github.com/openshift/openshift-mcp-server/blob/main/docs/openshift/user-guide.md#toolsets-and-functionality

tssurya · 2026-06-22T09:14:07Z

+### Dev Preview -> Tech Preview
+
+- Imported OVN-Kubernetes MCP tools (OVN/OVS, **`kernel`**, **`network-tools`**) usable end to end against representative clusters where RBAC and cluster policy allow the required pod and node-level operations.
+- Clear documentation for namespace/pod selection, node or debug-pod selection where applicable, and permissions.


Where are the docs for ocp-mcp-server? Are we working closely with the docs team on what we plan to document as supported tools? I think we are missing a documentation section.

For now I haven't found anything on docs.redhat.com. The user guide is available in the GitHub repo: https://github.com/openshift/openshift-mcp-server/blob/main/docs/openshift/user-guide.md

tssurya · 2026-06-22T09:16:36Z

+
+- **Disable:** Disable or unregister the `ovn-kubernetes` toolset in MCP deployment configuration (exact mechanism depends on openshift-mcp-server packaging); no cluster-side toggle is defined here. Disabling the whole MCP server removes all toolsets, including OVN-Kubernetes; there is no per-path cluster toggle for pod exec versus node debug in this enhancement.
+
+## Infrastructure Needed [optional]


I think we discussed at some point adding open-source models for CI, but we need to check with the downstream ocp-mcp-server team on how they do this and whether we can reuse that upstream as well.

Is offline debugging using must-gather/sosreport not in scope?

The evals are configured to run using Claude, Gemini, and OpenAI. The corresponding API token has to be provided for each of them.

tssurya · 2026-06-22T09:18:35Z

+- **Logs:** API server **audit logs** may record `pods/exec` and node- or debug-related API calls according to cluster policy. **openshift-mcp-server** logs should show handler errors, including which execution path failed (pod exec versus node debug). For node-debug failures, correlate MCP server timestamps with events on the target node and any debug pod namespace the integration uses.
+
+- **Disable:** Disable or unregister the `ovn-kubernetes` toolset in MCP deployment configuration (exact mechanism depends on openshift-mcp-server packaging); no cluster-side toggle is defined here. Disabling the whole MCP server removes all toolsets, including OVN-Kubernetes; there is no per-path cluster toggle for pod exec versus node debug in this enhancement.
+


Do we need some kind of perf/scale section? Even if it only adds some open-ended questions, it's better to think about it than not have it. For example: number of tools, tool callback time in evals (depends on where the model is running, I guess). I'm curious whether the ocp-mcp-server folks had any thoughts on this.

Currently, the requirement for adding any tool is to add evals for it and have those evals pass the minimum criteria. I am not aware of any perf/scale requirement for now. I'll check with the maintainers about this.

tssurya · 2026-06-22T09:27:43Z

+
+- **Unit tests:** Ensure imported tool implementations can be exercised without requiring a live cluster (for example by substituting test doubles for in-cluster command execution and validating command construction and output handling), including **`kernel`** and **`network-tools`** handlers where feasible.
+- **Integration:** Validate the `ovn-kubernetes` toolset end to end in openshift-mcp-server: pod-exec paths for OVN/OVS, and node-level paths for **`kernel`** / **`network-tools`** as implemented for this integration.
+- **Manual:** Run MCP tool calls against a cluster with OVN-Kubernetes installed, verifying OVN/OVS output for a known `ovnkube-node` pod and representative **`kernel`** / **`network-tools`** scenarios supported by the cluster.
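
The unit-test bullet above can be illustrated with a small sketch. It is only an assumption about how such a test might look: the `Executor` signature, `list_logical_switches`, and `FakeExecutor` are hypothetical names, and the canned output assumes `ovn-nbctl ls-list` prints one `<uuid> (<name>)` line per logical switch.

```python
from typing import Callable, List

# Hypothetical executor signature: (namespace, pod, argv) -> stdout.
Executor = Callable[[str, str, List[str]], str]


def list_logical_switches(run: Executor, namespace: str, pod: str) -> List[str]:
    """Build the ovn-nbctl command, run it via the injected executor, parse names."""
    output = run(namespace, pod, ["ovn-nbctl", "ls-list"])
    names = []
    for line in output.splitlines():
        line = line.strip()
        # Assumed line shape: "<uuid> (<name>)"
        if "(" in line and line.endswith(")"):
            names.append(line[line.index("(") + 1 : line.rindex(")")])
    return names


class FakeExecutor:
    """Test double: records each call and returns canned output (no pod exec)."""

    def __init__(self, canned: str):
        self.canned = canned
        self.calls = []

    def __call__(self, namespace: str, pod: str, argv: List[str]) -> str:
        self.calls.append((namespace, pod, list(argv)))
        return self.canned


# Made-up UUIDs, switch names, and pod name, for illustration only.
fake = FakeExecutor(
    "0a1b2c3d-0000-0000-0000-000000000001 (join)\n"
    "0a1b2c3d-0000-0000-0000-000000000002 (node-a)\n"
)
switches = list_logical_switches(fake, "openshift-ovn-kubernetes", "ovnkube-node-abcde")
```

Because the handler only sees the injected callable, the same test can assert both the constructed command line (`fake.calls`) and the parsed result without a live cluster.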


What testing scenarios are we targeting? Are we planning to induce something and then check whether the tools are executed in the right order, follow a top-down flow, etc.?

Each eval provides a prompt, and the response to that prompt must pass a verification step. For now, most of the existing evals use simple scenarios so that the corresponding tools are called and the response is verified.
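
As a rough illustration of the prompt-plus-verification shape described here (not the actual eval format used by the repository), the sketch below models an eval as a prompt, the tools expected to be called, and a verification callable. `EvalCase`, `run_eval`, `stub_agent`, and the tool name `ovn_show` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class EvalCase:
    prompt: str
    expected_tools: List[str]       # tools the agent is expected to call
    verify: Callable[[str], bool]   # verification step applied to the response


def run_eval(case: EvalCase, agent: Callable[[str], Tuple[List[str], str]]) -> bool:
    """Pass only if every expected tool was called and the response verifies."""
    tools_called, response = agent(case.prompt)
    all_called = all(tool in tools_called for tool in case.expected_tools)
    return all_called and case.verify(response)


def stub_agent(prompt: str) -> Tuple[List[str], str]:
    """Stand-in for a model-backed agent; returns a fixed tool trace and answer."""
    return ["ovn_show"], "Logical switch node-a has 3 ports."


case = EvalCase(
    prompt="List the logical switches on this cluster.",
    expected_tools=["ovn_show"],
    verify=lambda response: "node-a" in response,
)
passed = run_eval(case, stub_agent)
```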

openshift-ci · 2026-06-24T14:32:13Z

@arkadeepsen: all tests passed!


openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) May 8, 2026

openshift-ci Bot requested review from abhat and danwinship May 8, 2026 09:40

arkadeepsen force-pushed the ovnk-mcp branch from 6ace90c to b1bf890 Compare May 8, 2026 10:46

arghosh93 reviewed May 8, 2026

arkadeepsen force-pushed the ovnk-mcp branch 2 times, most recently from aaccb39 to bbf81a5 Compare May 12, 2026 15:54

arghosh93 reviewed May 12, 2026

arghosh93 reviewed May 13, 2026

arkadeepsen force-pushed the ovnk-mcp branch from bbf81a5 to 4895c8e Compare May 13, 2026 08:34

taanyas approved these changes May 13, 2026

mattedallo reviewed May 14, 2026

arkadeepsen force-pushed the ovnk-mcp branch from 4895c8e to 7304386 Compare May 15, 2026 07:45

mattedallo reviewed May 15, 2026

arkadeepsen force-pushed the ovnk-mcp branch from 7304386 to 0316210 Compare May 18, 2026 16:58

arghosh93 approved these changes May 19, 2026

mattedallo approved these changes May 19, 2026

arkadeepsen force-pushed the ovnk-mcp branch from 0316210 to 0c31631 Compare May 19, 2026 09:32

mattedallo mentioned this pull request May 19, 2026

CORENET-7134: Add ovn-kubernetes toolset skeleton openshift/openshift-mcp-server#315

Open

Cali0707 reviewed Jun 9, 2026

arkadeepsen force-pushed the ovnk-mcp branch from 0c31631 to 5e11479 Compare June 12, 2026 13:10

tssurya suggested changes Jun 22, 2026

Add enhancement proposal to productize ovn-kubernetes MCP tools

7cea3cd

arkadeepsen force-pushed the ovnk-mcp branch from 5e11479 to 7cea3cd Compare June 24, 2026 14:07



		### Non-Goals

		- Full parity in the first iteration with every tool category shipped by the standalone ovn-kubernetes-mcp binary (for example kernel diagnostics, optional images such as pwru/tcpdump, must-gather, sosreport) where those require separate dependencies, images, or workflows.


		Importing upstream tools into kubernetes-mcp-server. The OVN troubleshooting MCP tools already exist in ovn-kubernetes-mcp. The integration approach for kubernetes-mcp-server is to add an `ovn-kubernetes` toolset that reuses those implementations as imported packages and exposes them through kubernetes-mcp-server’s tool registration.

		Command execution strategy. OVN/OVS tools run commands inside OVN-Kubernetes pods via kubernetes-mcp-server’s pod exec. `kernel` and `network-tools` handlers use the node-level execution contract wired up in the same integration (for example debug pod or node-targeted exec, as the upstream packages require). Imported libraries should delegate all cluster I/O to kubernetes-mcp-server rather than opening separate Kubernetes client connections. Expect refactoring in ovn-kubernetes-mcp and kubernetes-mcp-server so each category uses a clear, single host-supplied execution path per invocation.
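
A minimal sketch of the routing this paragraph describes, assuming a single host-supplied execution path per tool category. The `ExecPath` enum, the `CATEGORY_PATH` table, `run_tool`, and the stub executors are hypothetical names for illustration and do not reflect the actual kubernetes-mcp-server or ovn-kubernetes-mcp code.

```python
from enum import Enum
from typing import Callable, Dict, List


class ExecPath(Enum):
    POD_EXEC = "pod-exec"      # run inside an OVN-Kubernetes pod
    NODE_DEBUG = "node-debug"  # run via a node-level debugging path


# One execution path per category, as the paragraph above describes.
CATEGORY_PATH: Dict[str, ExecPath] = {
    "ovn": ExecPath.POD_EXEC,
    "ovs": ExecPath.POD_EXEC,
    "kernel": ExecPath.NODE_DEBUG,
    "network-tools": ExecPath.NODE_DEBUG,
}

# Host-supplied executor: (target, argv) -> stdout. The handler never opens
# its own Kubernetes client connection; all cluster I/O goes through this.
HostExecutor = Callable[[str, List[str]], str]


def run_tool(category: str, target: str, argv: List[str],
             executors: Dict[ExecPath, HostExecutor]) -> str:
    """Dispatch a command to the single execution path for its category."""
    path = CATEGORY_PATH[category]
    return executors[path](target, argv)


# Stub executors standing in for the host; they only echo what they would run.
stub_executors: Dict[ExecPath, HostExecutor] = {
    ExecPath.POD_EXEC: lambda target, argv: f"pod-exec {target}: {' '.join(argv)}",
    ExecPath.NODE_DEBUG: lambda target, argv: f"node-debug {target}: {' '.join(argv)}",
}

ovn_result = run_tool("ovn", "ovnkube-node-abcde", ["ovn-nbctl", "show"], stub_executors)
kernel_result = run_tool("kernel", "worker-0", ["ip", "route"], stub_executors)
```

Keeping the category-to-path table in one place makes it easy to log which execution path failed, which the proposal also asks for in its support procedures.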

		Scope. All troubleshooting tools under ovn-kubernetes-mcp `ovn`, `ovs`, `kernel`, and `network-tools` belong to this effort (NB/SB inspection, logical flows, OVN trace, OVS bridge and OpenFlow helpers, kernel-oriented diagnostics, and `network-tools`-style capture where applicable). Other ovn-kubernetes-mcp surfaces—must-gather, sosreport, and similar—remain out of scope unless separately agreed; see Non-Goals.


		## Open Questions

		- How to structure mcpchecker suites or task labels so OVN/OVS, `kernel`, and `network-tools` coverage stays maintainable under kubernetes-mcp-server’s pass-rate gates, given differing cluster prerequisites?


		None. This work adds MCP tools only and does not extend the OpenShift or Kubernetes API surface.

		### Topology Considerations


		Split of work: kubernetes-mcp-server decides how each capability is exposed to MCP users (tool names and parameters). ovn-kubernetes-mcp keeps handler logic that validates inputs, builds command lines, and defines execution contracts; kubernetes-mcp-server integrates by calling those libraries and supplying pod exec, node-level debugging, or other supported cluster operations against the target cluster.

		```mermaid


		#### Hypershift / Hosted Control Planes

		The MCP server uses whatever cluster the kubeconfig targets. For HyperShift, that is typically the hosted cluster API when troubleshooting workload networking; there is no change to management-plane APIs. Operators must select the correct context (management versus guest) the same way they would for `kubectl exec`.


		#### Standalone Clusters

		Fully relevant: tools execute against pods on the same cluster the API client reaches.


		## Motivation

		OVN-Kubernetes operators and support engineers often need Northbound and Southbound database views (`ovn-nbctl`, `ovn-sbctl`, traces, logical flows), host-oriented diagnostics, and packet or kernel-level capture workflows while investigating connectivity and routing. These tools are already implemented in ovn-kubernetes-mcp, but OpenShift users benefit from consuming them via a single MCP server that shares authentication, tool governance, and documentation with the rest of the platform troubleshooting surface.


		## Alternatives (Not Implemented)

		- Add the OVN toolset to kubernetes-mcp-server first, then rely on downstream sync into openshift-mcp-server: Not chosen for this enhancement because OpenShift is landing the integration directly in openshift-mcp-server to ship on product cadence without gating on upstream kubernetes-mcp-server acceptance, release, and fork sync timing. The import-and-delegate pattern remains the same; a future upstream integration could still reduce long-term duplication if both codebases converge.
