What would you like to be added:
I want the CodeInterpreterReconciler to actually look at SandboxWarmPool.Status.ReadyReplicas and surface that as a WarmPoolAvailable condition on the CodeInterpreter object, plus emit a Kubernetes Event when the pool drops below a low-watermark (I'm using 50% of desired as the threshold) or hits zero.
Right now the reconciler creates/updates the SandboxWarmPool spec but then completely ignores its status. It also doesn't declare Owns() on the warm pool, so it isn't even re-enqueued when the pool's status changes. The CodeInterpreter just sits there with Ready: true forever, regardless of what's actually happening with the sandboxes underneath it.
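To make the proposed thresholds concrete, here's a minimal sketch of the classification logic the reconciler could run over SandboxWarmPool.Status.ReadyReplicas. The function name, reason strings, and the signal struct are placeholders for illustration, not proposed API; only the 50%-of-desired watermark comes from the request above.

```go
package main

import "fmt"

// warmPoolSignal is a hypothetical carrier for the WarmPoolAvailable
// condition status and the reason string reused for the Kubernetes Event.
type warmPoolSignal struct {
	Available bool   // condition status: true when the pool is healthy
	Reason    string // machine-readable condition/Event reason
}

// classifyWarmPool maps ready vs. desired replicas to a condition signal,
// assuming the 50%-of-desired low watermark proposed in this issue.
func classifyWarmPool(ready, desired int32) warmPoolSignal {
	switch {
	case desired == 0:
		// Pool intentionally scaled to zero: nothing to warn about.
		return warmPoolSignal{Available: true, Reason: "PoolDisabled"}
	case ready == 0:
		return warmPoolSignal{Available: false, Reason: "PoolExhausted"}
	case ready*2 < desired: // strictly below the 50% low watermark
		return warmPoolSignal{Available: false, Reason: "PoolBelowWatermark"}
	default:
		return warmPoolSignal{Available: true, Reason: "PoolReady"}
	}
}

func main() {
	fmt.Println(classifyWarmPool(0, 4).Reason) // PoolExhausted
	fmt.Println(classifyWarmPool(1, 4).Reason) // PoolBelowWatermark
	fmt.Println(classifyWarmPool(3, 4).Reason) // PoolReady
}
```

In the real reconciler this signal would feed meta.SetStatusCondition on the CodeInterpreter's Conditions slice and an EventRecorder call when Available flips to false; sitting exactly at 50% counts as healthy here, since the request is to fire when the pool drops below the watermark.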
Why is this needed:
I ran into this while working on the reconcile-storm fixes. I noticed that kubectl get codeinterpreter always shows ready even when the warm pool is clearly broken: wrong runtimeClassName, image pull failures, quota issues, whatever. There's zero signal on the object itself.
The practical problem is: when the pool is empty, handleSandboxCreate silently falls into cold-start mode. If the cold start fails or takes too long it hits the 2-minute timeout and returns a 500. From the outside it just looks like intermittent failures with no obvious cause. You have to dig through logs across multiple components to figure out the pool was actually empty.
Adding a WarmPoolAvailable condition means operators get an answer from kubectl describe codeinterpreter immediately, and the Event gives a timestamp to correlate against when the 500s started. The Owns(&SandboxWarmPool{}) wiring makes the controller reactive to pool status changes instead of only waking up when the CI spec changes.
No new CRD fields are needed: Conditions []metav1.Condition is already in the status struct, just unused for this case.