Bug Description
SOS reports collected by gather_sos and gather_edpm_sos are empty (0 bytes) after must-gather completes. The sosreport tar.xz is downloaded successfully from the nodes, but the extraction step fails with Invalid cross-device link errors on all files, and then the original tar.xz is deleted — resulting in complete data loss.
Root Cause
GNU tar 1.34 (shipped with RHEL 9) has a bug where --one-top-level combined with --strip-components triggers Invalid cross-device link errors on all file types, even when source and destination are on the same filesystem.
# FAILS - 17273 errors, 0 files extracted
tar --one-top-level=/tmp/test --strip-components=1 -Jxf sosreport.tar.xz
# WORKS - 0 errors, 15052 files extracted
tar -C /tmp/test --strip-components=1 -Jxf sosreport.tar.xz
Affected Files
collection-scripts/gather_sos:
tar -i --one-top-level="${SOS_PATH_NODES}/sosreport-$node" --strip-components=1 --exclude='*/dev/null' -Jxf "${sos_file}"
rm "${sos_file}" # deletes the original even if tar failed
collection-scripts/gather_edpm_sos:
tar --one-top-level="${SOS_PATH_NODES}/sosreport-$node" --strip-components=1 --exclude='*/dev/null' -Jxf ${SOS_PATH_NODES}/sosreport-$node.tar.xz
rm "${SOS_PATH_NODES}/sosreport-$node.tar.xz" # deletes the original even if tar failed
Error Messages (from must-gather pod logs)
tar: /must-gather/sos-reports/_all_nodes/sosreport-edpm-compute-01/version.txt: Cannot open: Invalid cross-device link
tar: /must-gather/sos-reports/_all_nodes/sosreport-edpm-compute-01/sos_reports: Cannot mkdir: Invalid cross-device link
tar: /must-gather/sos-reports/_all_nodes/sosreport-edpm-compute-01/chkconfig: Cannot create symlink to 'sos_commands/services/chkconfig_--list': Invalid cross-device link
...
Over 17,000 files failed to extract in our test.
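Because the failure is only visible in the pod logs, a quick count of the error lines can confirm whether a given run was affected. This is a minimal sketch: the function name count_xdev_errors is hypothetical, and it assumes the pod log has already been saved to a local file.

```shell
# Hypothetical helper: count tar "Invalid cross-device link" errors in a saved log file.
count_xdev_errors() {
    # grep -c prints 0 but exits with status 1 when nothing matches;
    # treat that case as success and only propagate real errors (status 2).
    grep -c -F 'Invalid cross-device link' "$1" || [ $? -eq 1 ]
}
```

A non-zero count means at least one file failed to extract during that run.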
Impact
- All SOS reports are affected: both control plane nodes (gather_sos) and EDPM nodes (gather_edpm_sos)
- The sosreport directories exist but are completely empty (0 bytes)
- The original tar.xz archives are deleted after failed extraction, so there is no way to recover the data
- Users are unaware of the failure unless they check the pod logs
Suggested Fix
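- Replace --one-top-level=DIR with -C DIR, which avoids the bug. Unlike --one-top-level, -C does not create DIR, so the scripts must create it first (for example with mkdir -p)
- Check the tar exit code and delete the original archive only if extraction succeeded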
See PR #132.
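The two changes above could be combined roughly as follows. This is a sketch under assumptions, not the code from the PR: the function name extract_sos and its two arguments are hypothetical, and the tar flags mirror the existing gather_edpm_sos invocation with --one-top-level swapped for -C.

```shell
# Hypothetical helper: extract a sosreport archive into a directory and
# delete the archive only when tar reports success.
extract_sos() {
    local archive="$1" dest="$2"

    # -C needs an existing directory (--one-top-level used to create it).
    mkdir -p "$dest" || return 1

    if tar -C "$dest" --strip-components=1 --exclude='*/dev/null' -Jxf "$archive"; then
        rm -f "$archive"
    else
        echo "Extraction of $archive failed; keeping the archive" >&2
        return 1
    fi
}
```

In the scripts this would be called as, for example, extract_sos "${sos_file}" "${SOS_PATH_NODES}/sosreport-$node". If extraction fails, the tar.xz stays on disk and can still be unpacked by hand.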
Environment
- OpenShift 4.18.30
- openstack-must-gather image: registry.redhat.io/rhoso-operators/openstack-must-gather-rhel9:1.0 (sha256:ab86b53a49adf8ad2b9658076cba5c55b3a19552ff17ea178dfe346fc1ac9979)
- openstack-operator v1.0.20
- GNU tar 1.34 (RHEL 9)
Steps to Reproduce
- Run the OpenStack must-gather:
oc adm must-gather --image=registry.redhat.io/rhoso-operators/openstack-must-gather-rhel9:1.0
- Check the sos-reports/_all_nodes/ directories: they will be empty
- Check the must-gather pod logs for Invalid cross-device link errors
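To check the result of a run without opening every directory, something like the following can list the sosreport directories that ended up empty. It is only a sketch: the function name list_empty_sos_dirs is hypothetical, and the argument is assumed to be the local path of the sos-reports/_all_nodes/ directory inside the downloaded must-gather output.

```shell
# Hypothetical helper: print every sosreport-* directory under the given
# root that contains no entries at all.
list_empty_sos_dirs() {
    local root="$1" dir
    for dir in "$root"/sosreport-*/; do
        # Skip the literal pattern when nothing matches the glob.
        [ -d "$dir" ] || continue
        if [ -z "$(ls -A "$dir")" ]; then
            printf '%s\n' "${dir%/}"
        fi
    done
}
```

Any path printed corresponds to a node whose sosreport was lost during extraction.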