feat: add npd-prober example showing NPD + NRC integration for node readiness #159

Open
santoshkal wants to merge 1 commit into kubernetes-sigs:main from santoshkal:feat/138
santoshkal commented Mar 10, 2026

Description

Adds a complete npd-prober example demonstrating how to use Node Problem Detector (NPD)
custom plugins with the Node Readiness Controller (NRC) to manage node taints based on
HTTP/TCP health probes.

  • npd-prober binary: Lightweight probe binary using kubelet-style HTTP (200-399 = healthy)
    and TCP semantics, designed as an NPD custom plugin. Includes --allow-non-local-redirects
    flag matching kubelet behavior, and structured logging via klog/v2.
  • NPD integration: ConfigMap-based NPD config that invokes the prober at intervals and
    maps exit codes to a ServiceReadiness node condition.
  • NodeReadinessRule: Watches the NPD-managed condition and applies/removes a readiness.k8s.io/ServiceReady=pending:NoSchedule taint based on probe health.
  • Kind testing guide: Step-by-step instructions to deploy and verify the full
    NPD → condition → NRC → taint pipeline on a local Kind cluster, including failure
    simulation via service port misconfiguration.
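
The probe semantics described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names `isHealthyStatus`, `probeHTTP`, `probeTCP`, and `exitCode` are hypothetical, and redirect handling (the `--allow-non-local-redirects` behavior) is deliberately left out.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// isHealthyStatus applies the kubelet-style rule quoted in the description:
// any HTTP status code from 200 through 399 counts as healthy.
func isHealthyStatus(code int) bool {
	return code >= 200 && code < 400
}

// probeHTTP (hypothetical helper) sends a GET with a timeout and reports
// whether the response status is in the healthy range.
// Note: the default http.Client follows redirects; the PR's redirect
// handling is not reproduced here.
func probeHTTP(url string, timeout time.Duration) bool {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return isHealthyStatus(resp.StatusCode)
}

// probeTCP (hypothetical helper) is healthy if a TCP connection can be opened.
func probeTCP(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// exitCode converts a probe result into the exit code that an NPD custom
// plugin would see: 0 for healthy, 1 for unhealthy.
func exitCode(healthy bool) int {
	if healthy {
		return 0
	}
	return 1
}

func main() {
	// Classify a few status codes without touching the network.
	for _, code := range []int{199, 200, 302, 399, 400, 503} {
		fmt.Printf("status %d -> exit %d\n", code, exitCode(isHealthyStatus(code)))
	}
}
```

A real plugin would call `probeHTTP` or `probeTCP` against the target service and pass the result to `os.Exit(exitCode(...))`.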

Documents NPD's problem-oriented condition semantics (exit 0 = condition False, exit 1 =
condition True) and how this maps to requiredStatus: "False" in the NodeReadinessRule.
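
To make the inverted semantics concrete, here is a small sketch of the mapping from plugin exit code to condition status and on to the taint decision. The function names are hypothetical, treating any other exit code as "Unknown" is an assumption of this sketch, and `taintNeeded` is a simplified reading of how a rule with requiredStatus: "False" behaves, not the controller's actual logic.

```go
package main

import "fmt"

// conditionStatus maps a plugin exit code to the status of the
// problem-oriented condition: exit 0 (no problem) yields "False",
// exit 1 (problem detected) yields "True".
// Assumption: any other exit code is reported as "Unknown".
func conditionStatus(exitCode int) string {
	switch exitCode {
	case 0:
		return "False"
	case 1:
		return "True"
	default:
		return "Unknown"
	}
}

// taintNeeded is a simplified model: the taint stays on the node until the
// observed condition status equals the rule's required status.
func taintNeeded(status, requiredStatus string) bool {
	return status != requiredStatus
}

func main() {
	const requiredStatus = "False" // as in the NodeReadinessRule described above
	for _, code := range []int{0, 1, 2} {
		status := conditionStatus(code)
		fmt.Printf("exit %d -> ServiceReadiness=%s -> taint needed: %t\n",
			code, status, taintNeeded(status, requiredStatus))
	}
}
```

The point of the sketch is that a healthy probe produces a condition of "False", which is why the rule requires "False" rather than "True" before the taint is lifted.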

Related Issue

Fixes #138

Type of Change

/kind feature

Testing

Checklist

  • go test ./examples/npd-prober/ -v — all unit tests pass
  • go vet ./examples/npd-prober/ — no issues
  • Follow testing-npd.md on a Kind cluster to verify end-to-end flow

Does this PR introduce a user-facing change?

NONE


Doc #(issue)

@k8s-ci-robot added the kind/feature label (Categorizes issue or PR as related to a new feature.) Mar 10, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: santoshkal
Once this PR has been reviewed and has the lgtm label, please assign haircommander for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

netlify bot commented Mar 10, 2026

Deploy Preview for node-readiness-controller canceled.

Name Link
🔨 Latest commit b37999d
🔍 Latest deploy log https://app.netlify.com/projects/node-readiness-controller/deploys/69affb3b8ef8070008c009e4

linux-foundation-easycla bot commented Mar 10, 2026

CLA Signed
The committers listed above are authorized under a signed CLA.

  • ✅ login: santoshkal / name: Santosh Kaluskar (b37999d)

@k8s-ci-robot
Contributor

Welcome @santoshkal!

It looks like this is your first PR to kubernetes-sigs/node-readiness-controller 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/node-readiness-controller has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot added the needs-ok-to-test label (Indicates a PR that requires an org member to verify it is safe to test.) Mar 10, 2026
@k8s-ci-robot
Contributor

Hi @santoshkal. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.), cncf-cla: no (Indicates the PR's author has not signed the CNCF CLA.), and cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.) labels, and removed the cncf-cla: no label, Mar 10, 2026
@AvineshTripathi
Contributor

I think we should also expand npd-prober to support gRPC, file, and generic exec/command probes as part of this same effort.

Even though Node Problem Detector natively supports executing scripts, pulling all of these standard Kubernetes-style probe methods (http, tcp, grpc, file, exec) into a single, unified npd-prober binary would make the user experience much cleaner and would avoid redundant code examples in the repo.

I believe including these additions now will cover the vast majority of node-readiness-controller + node-problem-detector integration use cases out-of-the-box. What do you think? cc @ajaysundark

P.S. I'm happy to create a separate issue for this and work on it, or to assist with it.

Labels

  • cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
  • kind/feature (Categorizes issue or PR as related to a new feature.)
  • needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test.)
  • size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: NPD prober-like readiness plugin

3 participants