[WIP]OSDOCS-19990: Resource fair sharing by StephenJamesSmith · Pull Request #113457 · openshift/openshift-docs

StephenJamesSmith · 2026-06-16T14:14:22Z

Admission Fair Sharing (Kueue) Integration for Multi-Tenant Resource Fairness

Version: 4.22+

Jira: https://redhat.atlassian.net/browse/OSDOCS-19990

Preview: https://113457--ocpdocs-pr.netlify.app/openshift-enterprise/latest/ai_workloads/kueue/admission-fair-sharing.html

Dev: @kannon92
QE: @MaysaMacedo @anahas-redhat

openshift-ci-robot · 2026-06-16T14:14:27Z

@StephenJamesSmith: This pull request references OSDOCS-19990 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.


ocpdocs-previewbot · 2026-06-16T14:25:39Z

🤖 Tue Jun 16 14:57:00 - Prow CI generated the docs preview:

https://113457--ocpdocs-pr.netlify.app/openshift-enterprise/latest/ai_workloads/kueue/admission-fair-sharing.html

ocpdocs-vale-bot · 2026-06-16T14:26:45Z

+
+`resourceWeights`:: Assigns weights to resources. The higher the weight, the higher the penalty.
+
+`usageHalfLifeTimeSeconds`:: The time in seconds after which the current usage will decrease by half. In other words, controls how long the past consumption should impact future admission. 


🤖 [error] RedHat.TermsErrors: Use 'for example' or 'that is' rather than 'In other words'. For more information, see RedHat.TermsErrors.
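As a rough illustration of what `usageHalfLifeTimeSeconds` describes in the diff above, here is a minimal sketch that assumes plain exponential half-life decay. The function name and the sample values are hypothetical, and Kueue's actual accounting (for example, how it samples usage at intervals) may differ.

```python
def decayed_usage(usage: float, elapsed_seconds: float, half_life_seconds: float) -> float:
    """Return the usage that still counts after elapsed_seconds.

    Assumed model: usage halves every half_life_seconds (exponential decay).
    """
    if half_life_seconds <= 0:
        raise ValueError("half_life_seconds must be positive")
    return usage * 0.5 ** (elapsed_seconds / half_life_seconds)


HALF_LIFE_SECONDS = 432000  # 5 days; illustrative value only
SECONDS_PER_DAY = 86400

for days in (0, 5, 10):
    remaining = decayed_usage(100.0, days * SECONDS_PER_DAY, HALF_LIFE_SECONDS)
    print(f"after {days:2d} days: {remaining}")
```

Under this assumption, 100 units of recorded usage count as 50 after one half-life and 25 after two, so a shorter half-life lets past consumption fade from admission decisions sooner.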

openshift-ci · 2026-06-16T14:58:38Z

@StephenJamesSmith: all tests passed!


StephenJamesSmith · 2026-06-16T15:00:07Z

+
+Given this upstream limitation, isolating CPU as the sole scoring factor by setting a memory weight of `0` is a reliable approach for deterministic fair sharing behavior.
+
+The following example contains `admissionFairSharing.resourceWeights` settings for mixed CPU, memory, and GPU weights:


@MaysaMacedo Wondering if we should use the example Option A from kubernetes-sigs/kueue#10434?

We can't because that is not implemented yet.

MaysaMacedo · 2026-06-16T14:58:53Z

+. Choose the `configuration` type you want to use: 
+
+* `Default`: Uses {kueue-name} predefined values.
+* `Custom`: Uses {kueue-name} predefined values.


`Custom` allows the user to specify their own desired values.

MaysaMacedo · 2026-06-16T15:00:57Z

+
+[source,yaml]
+----
+config:


Instead, we can recommend that the user apply the default configuration with the following command:

oc patch kueue.kueue.openshift.io/cluster --type=merge -p \
  '{"spec":{"config":{"admissionFairSharing":{"configuration":"Default"}}}}'

There's a small correction, Stephen. The command would be:

oc patch kueue.kueue.openshift.io/cluster --type=merge -p \
  '{"spec":{"config":{"admissionFairSharing":{"configuration":"Default","custom":null}}}}'
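To show why the corrected command adds `"custom": null`, here is a small sketch of generic JSON merge patch semantics (RFC 7396), which is what `--type=merge` requests. The `json_merge_patch` helper and the "existing" object are hypothetical stand-ins. The sketch does not model any validation the Kueue operator itself performs.

```python
def json_merge_patch(target, patch):
    """Apply an RFC 7396 JSON merge patch and return the result.

    A null (None) value in the patch deletes the key. Any other value
    replaces it, recursing into nested objects.
    """
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = json_merge_patch(result.get(key), value)
    return result


# Hypothetical existing object that was previously set to Custom.
existing = {
    "spec": {"config": {"admissionFairSharing": {
        "configuration": "Custom",
        "custom": {"usageHalfLifeTimeSeconds": 432000},
    }}}
}

patch_without_null = {"spec": {"config": {"admissionFairSharing": {
    "configuration": "Default"}}}}
patch_with_null = {"spec": {"config": {"admissionFairSharing": {
    "configuration": "Default", "custom": None}}}}

kept = json_merge_patch(existing, patch_without_null)
cleared = json_merge_patch(existing, patch_with_null)

print(kept["spec"]["config"]["admissionFairSharing"])     # still carries "custom"
print(cleared["spec"]["config"]["admissionFairSharing"])  # "custom" removed
```

Without the explicit null, a merge patch leaves any previously set `custom` block in place next to `configuration: Default`. The null removes it in the same call.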

MaysaMacedo · 2026-06-16T15:03:13Z

+      configuration: Default
+----
+
+* For `Custom` configuration use the following command:


Suggested change

* For `Custom` configuration use the following command:

* For `Custom` configuration you can adapt the following command with your desired values:

Changed to "Use the following command to create a Custom configuration that applies values that you specify:"

MaysaMacedo · 2026-06-16T15:03:22Z

+[source,terminal]
+----
+oc patch kueue.kueue.openshift.io/cluster --type=merge -p \
+


unnecessary space

MaysaMacedo · 2026-06-16T15:06:38Z

+
+[role="_abstract"]
+Use Admission fair sharing to fairly distribute workloads across LocalQueues that share a single ClusterQueue. 
+This feature balances workload admission by prioritizing workloads from tenants that have used fewer resources historically. It tracks usage over time with a configurable decay function and applies admission penalties when workloads are admitted.


Suggested change

This feature balances workload admission by prioritizing workloads from tenants that have used fewer resources historically. It tracks usage over time with a configurable decay function and applies admission penalties when workloads are admitted.

This feature balances workload admission by prioritizing workloads from LocalQueues that have used fewer resources historically. It tracks usage over time with a configurable decay function and applies admission penalties when workloads are admitted.

MaysaMacedo · 2026-06-16T15:09:09Z

+[id="setting-resource-weights_{context}"]
+= Setting resource weights
+
+[role="_abstract"]


Will get back to this.

MaysaMacedo · 2026-06-16T19:12:41Z

+= Setting resource weights
+
+[role="_abstract"]
+Resource weights define the relative importance of different resource types (CPU, memory, GPU) when calculating admission penalties. Queues that consume resources with higher weights receive larger penalties, reducing their priority for future workload admission.


Instead of adding a section specific to resourceWeights, I was thinking we could just add a note where you explained what each field of the configuration is, with something like:

When using Admission Fair Sharing, the resourceWeights for any resource whose Kubernetes quantity is expressed in bytes (such as memory) must be scaled down to compensate for the internal byte representation. Without this adjustment, the raw byte value of these resources numerically dominates human-scale resources, such as CPU cores, by several orders of magnitude, effectively making their weights meaningless. For example, to give memory an effective weight of 1.0, you would instead specify 9.31e-10, which corresponds to 1.0 / 1,073,741,824.

@anahas-redhat @kannon92 let me know what you guys think about it

I agree with removing this, also because the example below uses GPUs, which may be out of context without an explanation of DRA. Added more details here: #113457 (comment)
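The arithmetic in the proposed note can be checked with a short sketch. It assumes, purely for illustration, that a penalty is a weighted sum of raw resource quantities. The helper names and the sample usage are hypothetical and are not taken from Kueue's code.

```python
BYTES_PER_GIB = 1024 ** 3  # 1,073,741,824


def scaled_byte_weight(desired_weight: float) -> float:
    """Scale a per-GiB weight down so it can be applied to raw byte counts."""
    return desired_weight / BYTES_PER_GIB


def weighted_usage(usage: dict, weights: dict) -> float:
    """Assumed scoring model: sum of weight * raw quantity per resource."""
    return sum(weights.get(name, 0.0) * amount for name, amount in usage.items())


# Hypothetical workload: 4 CPU cores and 8 GiB of memory expressed in bytes.
usage = {"cpu": 4.0, "memory": 8 * BYTES_PER_GIB}

naive_weights = {"cpu": 1.0, "memory": 1.0}
scaled_weights = {"cpu": 1.0, "memory": scaled_byte_weight(1.0)}

print(f"{scaled_byte_weight(1.0):.2e}")      # 9.31e-10
print(weighted_usage(usage, naive_weights))   # 8589934596.0, memory swamps CPU
print(weighted_usage(usage, scaled_weights))  # 12.0, 4 cores + 8 GiB
```

With unscaled weights the 4 CPU cores contribute almost nothing to the total, which is the dominance the note warns about.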

MaysaMacedo · 2026-06-16T19:16:26Z

@StephenJamesSmith In the description of the PR you mentioned this is for Version: 4.22+. However, that should be Version: 4.18+. Can you adapt that? Thank you

anahas-redhat · 2026-06-16T19:37:42Z

+Use Admission fair sharing to fairly distribute workloads across LocalQueues that share a single ClusterQueue. 
+This feature balances workload admission by prioritizing workloads from tenants that have used fewer resources historically. It tracks usage over time with a configurable decay function and applies admission penalties when workloads are admitted.
+
+The shared ClusterQueue causes resource starvation between tenants, creating a high risk of resource starvation for the tenants. Admission fair sharing adresses this issue by meeting the following requirements:


"high risk of resource starvation" — redundant phrasing
Suggestion: "When multiple tenants share a single ClusterQueue, some tenants risk resource starvation. Admission fair sharing addresses this by..."

anahas-redhat · 2026-06-16T19:40:09Z

+
+Improve service predictability:: Guarantee each tenant gets a consistent share of resources, reducing latency spikes and preventing starvation.
+
+Enable scalable governance:: Use dynamic, usage-based allocation instead of complex static quotas.


I think this can be changed because Admission fair sharing does not replace quotas. It works alongside ClusterQueue quotas. Maybe something like:
"Complement static quotas with dynamic, usage-based admission ordering that adapts as tenant demand changes."

anahas-redhat · 2026-06-16T19:48:55Z

+    nvidia.com/gpu.count : 50
+----
+
+In this example, ....................


Do you want to add some explanation here?

anahas-redhat · 2026-06-16T20:09:03Z

+
+[source,yaml]
+----
+admissionFairSharing:


This format does not work because time fields are in seconds downstream, the resource name is not valid, and the format is wrong.
If we want to detail about GPUs, the user would need to first create a DeviceClass (considering Nvidia, from your example):

"spec": { "config": { "resources": { "deviceClassMappings": [{ "name": "nvidia-gpus", "deviceClassNames": ["gpu.nvidia.com"] }] } } } }'

And then configure Kueue Operand like this (considering the time in your example):

oc patch kueue.kueue.openshift.io/cluster --type=merge -p '{
  "spec": {
    "config": {
      "admissionFairSharing": {
        "configuration": "Custom",
        "custom": {
          "usageHalfLifeTimeSeconds": 432000,
          "usageSamplingIntervalSeconds": 300,
          "resourceWeights": [
            {"name": "cpu", "weight": "1"},
            {"name": "memory", "weight": "4"},
            {"name": "nvidia-gpus", "weight": "50"}
          ]
        }
      }
    }
  }
}'

I guess we agreed to talk about Admission Fair Sharing on the Kueue + DRA documents, right? Because it may be out of context for the user to add these concepts here.
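For readers who would rather generate the `Custom` patch than hand-edit nested JSON, the following sketch builds the same payload shape as the command in the comment above. The field names are copied from that comment and have not been checked against the operator's CRD. The `build_custom_afs_patch` helper is hypothetical.

```python
import json
import shlex


def build_custom_afs_patch(half_life_seconds: int,
                           sampling_interval_seconds: int,
                           resource_weights: dict) -> dict:
    """Build a merge-patch body for a Custom admissionFairSharing config.

    Weights are emitted as strings, matching the reviewer's example.
    """
    return {"spec": {"config": {"admissionFairSharing": {
        "configuration": "Custom",
        "custom": {
            "usageHalfLifeTimeSeconds": half_life_seconds,
            "usageSamplingIntervalSeconds": sampling_interval_seconds,
            "resourceWeights": [
                {"name": name, "weight": str(weight)}
                for name, weight in resource_weights.items()
            ],
        },
    }}}}


patch = build_custom_afs_patch(
    half_life_seconds=5 * 24 * 3600,    # 432000 seconds (5 days)
    sampling_interval_seconds=5 * 60,   # 300 seconds
    resource_weights={"cpu": 1, "memory": 4, "nvidia-gpus": 50},
)

command = ["oc", "patch", "kueue.kueue.openshift.io/cluster",
           "--type=merge", "-p", json.dumps(patch)]
print(shlex.join(command))  # shell-quoted command line, ready to review
```

Expressing the time fields as arithmetic keeps the intended durations visible, since the downstream fields take plain seconds.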

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 16, 2026

openshift-ci Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 16, 2026


OSDOCS-19990: Resource fair sharing

ddf8f8f

StephenJamesSmith force-pushed the OSDOCS-19990 branch from f4afbb5 to ddf8f8f Compare June 16, 2026 14:48

