Add status reporting to EtcdCluster#350
Conversation
|
Hi @GunaKKIBM. Thanks for your PR. I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/ok-to-test |
|
please resolve the workflow failures. |
c98b771 to
6654687
Compare
| } | ||
|
|
||
| return r.reconcileClusterState(ctx, state) | ||
| result, err := r.reconcileClusterState(ctx, state) |
There was a problem hiding this comment.
please use a deferred function at line 89, so that we don't need to call r.updateStatus in multiple places.
There was a problem hiding this comment.
Updated. I actually moved it to the beginning of the function, because it should ideally be called even after fetchAndValidate function exits.
| quorum := (len(s.memberListResp.Members) / 2) + 1 | ||
| if healthyCount >= quorum { |
There was a problem hiding this comment.
We don't need to do the math ourselves. We just need to issue an lineralibity read, the etcd cluster is still functional as long as we get a response from any member.
We need enhance the health check.
Let's just keep it as it's for now, and consider improvement in followup PRs, thx
There was a problem hiding this comment.
#357 - created this issue to take this up in a separate PR
| // Update or append conditions | ||
| meta.SetStatusCondition(&s.cluster.Status.Conditions, availableCondition) | ||
| meta.SetStatusCondition(&s.cluster.Status.Conditions, progressingCondition) | ||
| meta.SetStatusCondition(&s.cluster.Status.Conditions, degradedCondition) |
There was a problem hiding this comment.
questions: do it mean that we always have three conditions?
There was a problem hiding this comment.
Yes. This works like a deployment or any other k8s controller's status. For example, here's two examples of a deployment, one successful and one failed
Successful
status:
availableReplicas: 2
conditions:
- lastTransitionTime: "2026-06-06T06:28:43Z"
lastUpdateTime: "2026-06-06T06:28:43Z"
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: "True"
type: Available
- lastTransitionTime: "2026-06-06T06:28:31Z"
lastUpdateTime: "2026-06-06T06:28:43Z"
message: ReplicaSet "nginx-deployment-6f9664446b" has successfully progressed.
reason: NewReplicaSetAvailable
status: "True"
type: Progressing
observedGeneration: 1
readyReplicas: 2
replicas: 2
updatedReplicas: 2
Failed:
status:
conditions:
- lastTransitionTime: "2026-06-06T06:30:09Z"
lastUpdateTime: "2026-06-06T06:30:09Z"
message: Deployment does not have minimum availability.
reason: MinimumReplicasUnavailable
status: "False"
type: Available
- lastTransitionTime: "2026-06-06T06:40:10Z"
lastUpdateTime: "2026-06-06T06:40:10Z"
message: ReplicaSet "failing-nginx-deployment-8644746d4f" has timed out progressing.
reason: ProgressDeadlineExceeded
status: "False"
type: Progressing
observedGeneration: 1
replicas: 2
unavailableReplicas: 2
updatedReplicas: 2
There was a problem hiding this comment.
Thanks for the info.
It seems there is no Degraded condition?
There was a problem hiding this comment.
It is upto us to add different types of conditions, the type in the condition is of type string. I added one more and called it Degraded (if one or more members is not healthy). We can rename it.
I can check other K8S resources or code to see what are other preferable naming conventions
Thank you!
|
@GunaKKIBM pls resolve the minor changes so that we can merge this PR. thx |
|
@ahrtr . I will be updating this by tomorrow, along with test results for sure. Sorry about the delay. |
|
pls squash the commits, thx |
Seems fine either way from a docs perspective. What's there now is OK. |
|
Here's how the status looks like when the etcd cluster is deployed |
|
During further testing, I see an issue, rather than fixing it here. I have created a bug |
| } | ||
| }() | ||
|
|
||
| state, res, err := r.fetchAndValidateState(ctx, req) |
There was a problem hiding this comment.
The state here is a new variable. It won't update the state defined above (line 85)
There was a problem hiding this comment.
That's a great catch. Thank you very much. I have declared the variables separately and tested these changes once again.
Please let me know if that looks good. Thanks!
ivanvc
left a comment
There was a problem hiding this comment.
LGTM. Ideally, the dependency bump should be in a different commit, unless it's absolutely required by the code change. Otherwise, it looks good and isn't a blocker. Thanks, @GunaKKIBM
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahrtr, GunaKKIBM, ivanvc The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@GunaKKIBM, please resolve the conflicts with |
Signed-off-by: Guna K Kambalimath <Guna.Kambalimath@ibm.com>
Signed-off-by: Guna K Kambalimath <Guna.Kambalimath@ibm.com> Signed-off-by: Valen Mascarenhas <valen.mascarenhas@ibm.com>
#135