2 changes: 2 additions & 0 deletions _topic_maps/_topic_map.yml
Original file line number Diff line number Diff line change
@@ -3511,6 +3511,8 @@ Topics:
File: using-cohorts
- Name: Configuring fair sharing
File: configuring-fairsharing
- Name: Admission fair sharing
File: admission-fair-sharing
- Name: Gang scheduling
File: gangscheduling
- Name: Running jobs with quota limits
29 changes: 29 additions & 0 deletions ai_workloads/kueue/admission-fair-sharing.adoc
@@ -0,0 +1,29 @@
:_mod-docs-content-type: ASSEMBLY
include::_attributes/common-attributes.adoc[]
[id="admission-fair-sharing"]
= Admission fair sharing
:context: admission-fair-sharing

toc::[]

[role="_abstract"]
Use admission fair sharing to distribute workloads fairly across local queues that share a single `ClusterQueue`.
This feature balances workload admission by prioritizing workloads from local queues that have historically used fewer resources. It tracks usage over time with a configurable decay function and applies admission penalties when workloads are admitted.

When multiple tenants share a single `ClusterQueue`, some tenants risk resource starvation. Admission fair sharing addresses this issue by meeting the following requirements:

Enforce multi-tenant fairness (business critical):: Ensure fair distribution of cluster resources across all tenants based on their usage history.

Improve service predictability:: Guarantee each tenant gets a consistent share of resources, reducing latency spikes and preventing starvation.

Enable scalable governance:: Complement static quotas with dynamic, usage-based admission ordering that adapts as tenant demand changes.

include::modules/kueue-configuring-kueue-instance-for-admission-fair-sharing.adoc[leveloffset=+1]

include::modules/kueue-configuring-clusterqueue-for-admission-fair-sharing.adoc[leveloffset=+1]

include::modules/kueue-configuring-localqueue-for-admission-fair-sharing.adoc[leveloffset=+1]

include::modules/kueue-setting-resource-weights.adoc[leveloffset=+1]

include::modules/kueue-verifying-the-admission-fair-sharing-status.adoc[leveloffset=+1]
2 changes: 2 additions & 0 deletions ai_workloads/kueue/release-notes.adoc
@@ -10,6 +10,8 @@ toc::[]

include::modules/kueue-compatible-environments.adoc[leveloffset=+1]

include::modules/kueue-release-notes-1.4.adoc[leveloffset=+1]

include::modules/kueue-release-notes-1.3.1.adoc[leveloffset=+1]

include::modules/kueue-release-notes-1.3.adoc[leveloffset=+1]
@@ -0,0 +1,21 @@
// Module included in the following assemblies:
//
// * ai_workloads/kueue/admission-fair-sharing.adoc

:_mod-docs-content-type: PROCEDURE
[id="configuring-clusterqueue-for-admission-fair-sharing_{context}"]
= Configuring a cluster queue for admission fair sharing

[role="_abstract"]
Set `admissionMode` to `UsageBasedAdmissionFairSharing` in the `admissionScope` section of your `ClusterQueue` object.

.Procedure

* Specify `UsageBasedAdmissionFairSharing` as shown in the following example:
+
[source,yaml]
----
admissionScope:
  admissionMode: UsageBasedAdmissionFairSharing
----
@@ -0,0 +1,66 @@
// Module included in the following assemblies:
//
// * ai_workloads/kueue/admission-fair-sharing.adoc

:_mod-docs-content-type: PROCEDURE
[id="configuring-kueue-instance-for-admission-fair-sharing_{context}"]
= Configuring the {kueue-name} instance for admission fair sharing

[role="_abstract"]
Configure {kueue-name} admission fair sharing by using either the `Default` or the `Custom` configuration. The `Default` configuration uses predefined {kueue-name} values.

.Procedure

. Choose the `configuration` type you want to use:
+
* `Default`: Uses {kueue-name} predefined values.
* `Custom`: Uses {kueue-name} values that you specify.

. Apply your chosen configuration:
+
* Use the following command to create a `Default` configuration:
+
[source,terminal]
----
$ oc patch kueue.kueue.openshift.io/cluster --type=merge -p \
'{"spec":{"config":{"admissionFairSharing":{"configuration":"Default"}}}}'
----
+
.Example output
[source,yaml]
----
config:

Contributor

Instead we can recommend the user to use the following command to apply the default configuration:

oc patch kueue.kueue.openshift.io/cluster --type=merge -p \
  '{"spec":{"config":{"admissionFairSharing":{"configuration":"Default"}}}}'

Contributor Author

done.

There's a small correction Stephen. The command would be:

oc patch kueue.kueue.openshift.io/cluster --type=merge -p \
  '{"spec":{"config":{"admissionFairSharing":{"configuration":"Default","custom":null}}}}'

  admissionFairSharing:
    configuration: Default
----
+
* Use the following command to create a `Custom` configuration that applies values that you specify:
+
[source,terminal]
----
$ oc patch kueue.kueue.openshift.io/cluster --type=merge -p \
'{"spec":{"config":{"admissionFairSharing":{"configuration":"Custom","custom":{"usageHalfLifeTimeSeconds":10,"usageSamplingIntervalSeconds":10,"resourceWeights":[{"name":"cpu","weight":"2.0"}]}}}}}'
----
+
.Example output
[source,yaml]
----
config:
  admissionFairSharing:
    configuration: Custom
    custom:
      resourceWeights:
      - name: cpu
        weight: "2.0"
      usageHalfLifeTimeSeconds: 10
      usageSamplingIntervalSeconds: 10
----
+
`resourceWeights`:: Assigns weights to resources. The higher the weight of a resource, the higher the penalty.

`usageHalfLifeTimeSeconds`:: The time in seconds after which the recorded usage decreases by half. This value controls how long past consumption affects future admission.

`usageSamplingIntervalSeconds`:: The interval in seconds at which {kueue-name} updates `consumedResources` in `FairSharingStatus`.
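
As a rough illustration of how a half-life shapes the usage history, the following Python sketch applies plain exponential decay to a recorded usage value. The `decayed_usage` function and its formula are assumptions made for illustration only. They are not taken from the {kueue-name} implementation, which might smooth usage differently at each sampling interval.

```python
def decayed_usage(usage: float, elapsed_seconds: float, half_life_seconds: float) -> float:
    """Return usage after exponential decay with the given half-life (illustrative only)."""
    if half_life_seconds <= 0:
        raise ValueError("half_life_seconds must be positive")
    return usage * 0.5 ** (elapsed_seconds / half_life_seconds)


# Assumed starting point: 32 CPU cores of recorded usage and a half-life of
# 10 seconds, matching usageHalfLifeTimeSeconds in the Custom example.
samples = [decayed_usage(32.0, t, 10) for t in (0, 10, 20, 30)]
print(samples)  # [32.0, 16.0, 8.0, 4.0]
```

Under this assumption, a short half-life makes past consumption fade quickly, while a long half-life keeps a heavy consumer penalized for longer.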



@@ -0,0 +1,21 @@
// Module included in the following assemblies:
//
// * ai_workloads/kueue/admission-fair-sharing.adoc

:_mod-docs-content-type: PROCEDURE
[id="configuring-localqueue-for-admission-fair-sharing_{context}"]
= Configuring a local queue for admission fair sharing (optional)

[role="_abstract"]
Optionally, you can configure the `fairSharing` section in your `LocalQueue` object to adjust its weight in the fair sharing calculation. The higher the weight, the lower the penalty. For example, specifying a weight of `2` treats the queue as if it had used half as many resources.

.Procedure

* Specify a `weight` value as shown in the following example:
+
[source,yaml]
----
spec:
  fairSharing:
    weight: "2"
----
22 changes: 22 additions & 0 deletions modules/kueue-release-notes-1.4.adoc
@@ -0,0 +1,22 @@
// Module included in the following assemblies:
//
// * ai_workloads/kueue/release-notes.adoc

:_mod-docs-content-type: REFERENCE
[id="release-notes-1.4_{context}"]
= Release notes for {kueue-name} version 1.4

[role="_abstract"]
{kueue-name} version 1.4 is a generally available release that is supported on {product-title} versions 4.18 and later. {kueue-name} version 1.4 uses link:https://kueue.sigs.k8s.io/docs/overview/[Kueue] version 0.16.

[id="release-notes-1.4-new-features_{context}"]
== New features and enhancements

Admission fair sharing::
This release introduces admission fair sharing, which balances workload admission across multiple local queues feeding into a shared `ClusterQueue`. Admission fair sharing:

- Prioritizes workloads based on historical resource consumption
- Tracks usage over time with a configurable decay function
- Applies immediate admission penalties to prevent resource monopolization

For more information, see xref:../../ai_workloads/kueue/admission-fair-sharing.adoc#admission-fair-sharing[Admission fair sharing].

Collaborator

🤖 [error] OpenShiftAsciiDoc.NoXrefInModules: Do not include xrefs in modules, only assemblies (exception: release notes modules).

17 changes: 17 additions & 0 deletions modules/kueue-setting-resource-weights.adoc
@@ -0,0 +1,17 @@
// Module included in the following assemblies:
//
// * ai_workloads/kueue/admission-fair-sharing.adoc

:_mod-docs-content-type: CONCEPT
[id="setting-resource-weights_{context}"]
= Setting resource weights

[role="_abstract"]

Contributor

Will get back to this.

Resources measured in bytes, such as memory, require scaled-down `resourceWeights` values. Kubernetes represents memory in bytes, which produces values that are billions of times larger than CPU core counts. Without scaling, the raw byte values numerically dominate human-scale resources, such as CPU cores, by several orders of magnitude, which effectively makes the weights of those resources meaningless.

For example, to achieve an effective memory weight of `1.0`, specify `9.31e-10` instead, which corresponds to `1.0 / 1,073,741,824`, where 1,073,741,824 is the number of bytes in 1 GiB.
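
The scaling arithmetic can be checked with a short Python sketch. The `weighted_usage` function below assumes that usage is combined as a simple weighted sum across resources. That formula is an assumption for illustration and is not confirmed by this document.

```python
BYTES_PER_GIB = 2**30  # 1,073,741,824 bytes in 1 GiB

# Scaled-down weight that makes 1 GiB of memory count as 1.0.
memory_weight = 1.0 / BYTES_PER_GIB
print(f"{memory_weight:.2e}")  # 9.31e-10


def weighted_usage(cpu_cores: float, memory_bytes: int,
                   cpu_weight: float, mem_weight: float) -> float:
    """Combine CPU and memory usage as a weighted sum (assumed formula)."""
    return cpu_cores * cpu_weight + memory_bytes * mem_weight


# Hypothetical workload: 4 CPU cores and 8 GiB of memory.
unscaled = weighted_usage(4, 8 * BYTES_PER_GIB, 1.0, 1.0)
scaled = weighted_usage(4, 8 * BYTES_PER_GIB, 1.0, memory_weight)
print(unscaled)  # 8589934596.0
print(scaled)    # 12.0
```

With an unscaled memory weight, the 4 CPU cores contribute almost nothing to the total. With the scaled weight, CPU and memory contribute on a comparable scale.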


27 changes: 27 additions & 0 deletions modules/kueue-verifying-the-admission-fair-sharing-status.adoc
@@ -0,0 +1,27 @@
// Module included in the following assemblies:
//
// * ai_workloads/kueue/admission-fair-sharing.adoc

:_mod-docs-content-type: PROCEDURE
[id="verifying-the-admission-fair-sharing-status_{context}"]
= Verifying the admission fair sharing status

[role="_abstract"]
Check the `admissionFairSharingStatus` field in the local queue status to verify admission fair sharing.

.Procedure

* Use the following command to verify the status of admission fair sharing:
+
[source,terminal]
----
$ oc get lq <local-queue-name> -n <local-queue-namespace> -o jsonpath='{.status.fairSharing}'
----
+
.Example output
[source,terminal]
----
{"admissionFairSharingStatus":{"consumedResources":{"cpu":"31999m"},"lastUpdate":"2025-06-03T14:25:15Z"},"weightedShare":0}
----