
Conversation

@david-baylibre

Existing Issue

Fixes #373

Contributor Checklist

  • Variables are documented in the README.md
  • Which branch are you merging into?
    • master is for changes related to the current release of the concourse/concourse:latest image and should be good to publish immediately

Reviewer Checklist

This section is intended for the core maintainers only, to track review progress. Please do not
fill out this section.

  • Code reviewed
  • Topgun tests run
  • Back-port if needed
  • Is the correct branch targeted? (master or dev)

Signed-off-by: David Rozé <droze@baylibre.com>
@taylorsilva
Member

I believe this is not needed because each deployment starts with a fresh work dir for Concourse.

If you look at the statefulset you'll see the workdir is a volume that will persist across deployments:

- name: concourse-work-dir
  mountPath: {{ .Values.concourse.worker.workDir | quote }}

and no such volume mount exists for the deployment:

volumeMounts:
  - name: concourse-keys
    mountPath: {{ .Values.worker.keySecretsPath | quote }}
    readOnly: true
  - name: pre-stop-hook
    mountPath: /pre-stop-hook.sh
    subPath: pre-stop-hook.sh
  {{- if and (not (kindIs "invalid" .Values.secrets.workerAdditionalCerts)) (.Values.secrets.workerAdditionalCerts | toString) }}
  - name: worker-additional-certs
    mountPath: "{{ .Values.worker.certsPath }}/worker-additional-certs.pem"
    subPath: worker-additional-certs.pem
    readOnly: true
  {{- end }}

Looking at your issue #373, it sounds like k8s isn't cleaning up the disk space from the crashed worker container. I don't think your PR here would fix that issue.

@david-baylibre
Author

You're right @taylorsilva, it does not fix #373, which also happens on clean upgrades/updates/restarts.

I suspect Kubernetes is not cleaning up the loop mounts; this also happened on some of my static workers running in Docker on bare machines (outside Kube). I'll dig into it...

@taylorsilva
Member

Maybe a cleanup on shutdown would help with that?
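
Something along these lines, purely as a sketch of the idea (nothing like this exists in the chart today; it assumes CONCOURSE_WORK_DIR is set in the worker container, as the worker config already expects):

# hypothetical "cleanup on shutdown" step, e.g. run from a pre-stop hook
# before the worker process is signalled: wipe the worker's local state so
# the next container start begins from a clean work dir
rm -rf "${CONCOURSE_WORK_DIR:?}"/*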

@taylorsilva
Member

taylorsilva commented Aug 26, 2025

@david-baylibre any progress on this? Wondering if this PR should be closed or not?

@david-baylibre
Author

@taylorsilva loop mounts are the actual problem.

From the node itself:

root@gke-ci-cluster-concourse-dev-wk-01-fc3b878d-wvtz:/# losetup -a; df -h /
/dev/loop0: [2049]:785313 (/concourse-work-dir/volumes.img)
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       745G   18G  727G   3% /

Delete the pod:

# k delete po concourse-worker-7798fb4664-4x46h
pod "concourse-worker-7798fb4664-4x46h" deleted

Check again:

root@gke-ci-cluster-concourse-dev-wk-01-fc3b878d-wvtz:/# losetup -a; df -h /
/dev/loop1: [2049]:2580553 (/concourse-work-dir/volumes.img)
/dev/loop0: [2049]:785313 (/volumes.img (deleted))
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       745G   18G  727G   3% /

Every time I delete the pod, I get an extra loop mount and the disk space isn't released.
The good part is that the deleted ones still show up in the new pod:

$ k exec -it concourse-worker-7798fb4664-75wx6 -- losetup -a
/dev/loop1: [2049]:2580553 (/volumes.img (deleted))
/dev/loop2: [2049]:3096649 (/concourse-work-dir/volumes.img)
/dev/loop0: [2049]:785313 (/volumes.img (deleted))

I would suggest:

  • adding umount ${CONCOURSE_WORK_DIR}/volumes to /pre-stop-hook.sh, before the kill -s, so the container shuts down properly
  • detaching the deleted loop devices on startup, as a postStart hook, to reclaim space from previously crashed workers (rough sketch below)
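
Roughly, and only as a sketch of those two suggestions (it assumes the worker container is privileged enough to run umount and losetup, which the chart does not configure on its own):

# in /pre-stop-hook.sh, before the existing kill -s: release the loop-backed
# volumes filesystem so its loop device can be detached
umount "${CONCOURSE_WORK_DIR}/volumes" || true

# on startup, e.g. from a postStart hook: detach loop devices whose backing
# volumes.img has already been deleted, reclaiming the leaked space
losetup -a | awk -F: '/\(deleted\)/ {print $1}' | xargs -r -n1 losetup -d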

@taylorsilva
Member

Ah okay! Thanks for digging into this and figuring it out. I'm a bit busy with other stuff at the moment, but happy to review any PR that fixes this. Not sure if you want to dust this one off or not?

@david-baylibre
Author

@taylorsilva I did not manage to detach the loop device in the pre-stop hook because the filesystem is still in use at that point; then the pod dies once Concourse is killed and it's too late, the pod is gone...
The only option would be a sidecar container (which isn't very elegant) that monitors the Concourse process and detaches the loop device once the main container has exited. And if that detach went into a pre-stop hook as well, a kubectl delete pod would kill the two containers, leaving the loop device still attached.
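
For the record, the sidecar would have to do something like this (only a sketch; it assumes shareProcessNamespace is enabled so the sidecar can see the worker process, that pgrep is available, and that the sidecar is privileged enough to run losetup):

# hypothetical sidecar entrypoint: wait for the worker to exit, then detach
# loop devices whose backing volumes.img has been deleted
while pgrep -x concourse > /dev/null; do
  sleep 5
done
losetup -a | awk -F: '/\(deleted\)/ {print $1}' | xargs -r -n1 losetup -d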
