Add Azure Linux 4.0 support#698
Conversation
|
Please add a PR description for future reference. |
f0c4779 to
4c26a66
Compare
00fed17 to
2d9408d
Compare
1d5f7c0 to
49ec41b
Compare
94cfd07 to
46a08db
Compare
|
|
||
| options := strings.TrimSpace(value) | ||
| var filteredTokens []string | ||
| for _, token := range strings.Fields(options) { |
There was a problem hiding this comment.
🐻❄️ Karhu deep-review (🔴 Blocking, High confidence — flagged by all 3 skeptic models)
updateBLSEntryOptions mishandles whitespace/quoting — read/write asymmetry corrupts BLS round-trips. Three independent issues:
-
Quoting:
strings.Fields(options)splits on whitespace only. The reader (readKernelCmdlinesFromBLSEntries) usesgrub.TokenizeConfigwhich handles quoting. A value likerd.cmdline="foo bar"gets split into two tokens on write. -
Tab/space detection (lines 303-305, 317-319):
strings.Cut(trimmed, " ")then fallback\t. A line like"options\tfoo=bar"produceskey="options\tfoo=bar"(no space match), theoptionsline is silently skipped, and a new one is appended. Multipleoptionslines per spec means stale verity args coexist with new ones. -
Trailing newline:
strings.Split/strings.Joinon"\n"may add or drop the trailing newline depending on input. Most tools tolerate it; signed BLS entries (UKI+pcrphase scenarios when SB lands) won't.
Fix: unify on grub.TokenizeConfig / ParseCommandLineArgs for both read and write. Use strings.IndexFunc(trimmed, unicode.IsSpace) for line-key detection. Preserve trailing newline.
| return fmt.Errorf("failed to read BLS entry file (%s):\n%w", absPath, err) | ||
| } | ||
|
|
||
| updatedContent, err := updateBLSEntryOptions(string(content), argsToRemove, newArgs) |
There was a problem hiding this comment.
🐻❄️ Karhu deep-review (🔴 Blocking, Medium confidence)
Verity BLS update touches recovery/rescue entries. updateBLSEntriesForVerity iterates every .conf file (line 240) and calls updateBLSEntryOptions here with no recovery filter. Recovery entries get root=/dev/mapper/root and verity args injected. The recovery initramfs may not contain verity dracut modules, breaking the rescue path — defeating its purpose.
Recovery-skip logic exists in the read path (customizeuki.go:790) but not here.
Fix: apply the same recovery filter before mutation. Match both recovery and rescue, case-insensitively (the read-path filter is also too narrow — see separate comment at customizeuki.go:790).
| } | ||
| rootDevice := argMap["root"] | ||
| delete(argMap, "root") | ||
| for name, value := range argMap { |
There was a problem hiding this comment.
🐻❄️ Karhu deep-review (🟡 Non-blocking, Medium confidence)
Non-deterministic kernel arg ordering — reproducibility regression. argMap from ReadNonRecoveryKernelCmdlines is a map[string]string. Go map iteration is randomized per spec. Each osmodifier run produces a different key order in GRUB_CMDLINE_LINUX; grub2-mkconfig reproduces that into the resulting grub.cfg, making images non-reproducible across osmodifier invocations.
The pre-PR code iterated argTokens (deterministic grub.cfg order). Also: the previous len(lines) != 1 safety check that errored on multi-kernel images is gone — the new code silently picks whichever non-recovery entry lands first.
Fix: sort values alphabetically before passing to UpdateKernelCommandLineArgs, or change ReadNonRecoveryKernelCmdlines to return []kernelArg preserving source order.
| } | ||
|
|
||
| if title == "" { | ||
| return nil, fmt.Errorf("BLS entry (%s) is missing 'title' key", absPath) |
There was a problem hiding this comment.
🐻❄️ Karhu deep-review (🟡 Non-blocking, Medium confidence)
title is now unconditionally required — regression vs. pre-refactor behavior. Before the unification in 3187f31d, the equivalent of extractKernelCmdlineFromBLSEntries(skipRecovery=false) only required title when skipRecovery=true. The new code rejects entries missing title for both code paths, even though the BLS spec allows it (systemd-boot falls back to filename).
Any AZL4 / Fedora image with a BLS entry missing title: fails the entire kernel-args read with no fallback. Affects extract / COSI / verity flows.
Fix: when title == "", treat the entry as non-recovery (don't error), or skip with a debug log.
| return nil, fmt.Errorf("BLS entry (%s) is missing 'title' key", absPath) | ||
| } | ||
|
|
||
| if strings.Contains(title, "recovery") { |
There was a problem hiding this comment.
🐻❄️ Karhu deep-review (🟡 Non-blocking, Medium confidence)
BLS recovery filter is substring-only and case-sensitive. strings.Contains(title, "recovery") misses Rescue, Recovery, distro-specific rescue naming. Combined with the missing recovery filter on the write path (see customizeverity.go:275), recovery entries can be both included in reads AND mutated in writes.
Fix: lowercase compare, match both recovery and rescue. Even better: extract a shared helper used by both the read path here and the verity update path.
|
|
||
| return nil | ||
| return fmt.Errorf("package (%s) is not installed:\n"+ | ||
| "this package must be installed to use Uki", systemdBootPackages[0]) |
There was a problem hiding this comment.
🐻❄️ Karhu deep-review (🟡 Non-blocking, Medium confidence)
Error message misnames the AZL4 package. Always reads "package (systemd-boot) is not installed" because it hardcodes systemdBootPackages[0]. On AZL4 the satisfying package is systemd-boot-unsigned; the error directs the operator to install the wrong one.
Fix:
return fmt.Errorf("package (%s) is not installed:\n"+
"this package must be installed to use Uki", strings.Join(systemdBootPackages, " or "))| grubCfgContent, err := file.Read(grubCfgPath) | ||
| // readGrubConfigLinuxArgsBestEffort probes a boot directory for a kernel command-line source | ||
| // when no DistroHandler is available (e.g. os-release was not detectable on a USR-verity image). | ||
| func readGrubConfigLinuxArgsBestEffort(bootDir string) (map[string][]grubConfigLinuxArg, error) { |
There was a problem hiding this comment.
🐻❄️ Karhu deep-review (🟡 Non-blocking, Medium confidence)
readGrubConfigLinuxArgsBestEffort doesn't try BLS. When detectDistroFromRootfs fails (e.g. verity image without os-release on rootfs), distroHandler is nil and this best-effort path probes only grub2/grub.cfg / grub/grub.cfg. AZL4 verity images (BLS-only, no /boot/grub2/grub.cfg) get fs.ErrNotExist and surface a confusing error rather than attempting BLS extraction.
Also: hybrid BLS + inline-linux grub.cfg files (legitimate on Fedora during BLS migration windows) are hard-rejected at line 1138-ish. Could prefer BLS with a debug log.
Fix: also probe boot/loader/entries/*.conf if grub.cfg probes all miss. Tolerate hybrid grub.cfg by preferring BLS.
| requiredRpms := []string{"systemd-boot"} | ||
| systemdBootPackages := []string{"systemd-boot"} | ||
| if distroHandler.GetTargetOs() == targetos.TargetOsAzureLinux4 { | ||
| systemdBootPackages = []string{"systemd-boot", "systemd-boot-unsigned"} |
There was a problem hiding this comment.
🐻❄️ Karhu deep-review (🟡 Non-blocking, Low confidence)
UKI dependency check accepts systemd-boot-unsigned with no Secure Boot guard. systemdBootPackages = {"systemd-boot", "systemd-boot-unsigned"} for AZL4; first match wins. An image with only systemd-boot-unsigned (which will fail Secure Boot verification on a SB-enabled VM) silently passes UKI validation.
Acknowledged in the PR description as a workaround pending signed systemd-boot for AZL4, but there's no assertion that prevents unsigned from leaking to a Secure Boot configuration.
Fix: either warn when only unsigned is present, or hard-error in ValidateConfig if secureBoot is requested AND only systemd-boot-unsigned would be installed.
|
🐻❄️ Karhu deep-review (🔴 Blocking, Medium confidence — verified)
Posting as a general comment because GitHub won't accept a line comment on this brand-new file. The AZL3 stub at Result: an AZL4 image build that enables encrypted root/boot has the encryption request silently dropped at this layer. Image boots, fails to find root, no build-time warning. Same bug reaches Fedora via the asset mismatch flagged at Fix: pick one:
|
|
🐻❄️ Karhu deep-review — suggestions (architectural / non-merge-blocking) These are direction notes from the architect role across all three models. Independent of the line-level findings — none of these block merge, but the first cluster is worth landing before AZL5 work begins. Architecture (post-merge, before AZL5)
CI & Test architecture
Workaround hygiene (cheap, high leverage)
Pre-merge polish
Skipped tests with no "unskip when X" triggerNot bugs, but worth tracking explicitly:
Strengths everyone agreed on
Per-agent reports retained locally — happy to share specific findings if useful. |
| return nil, fmt.Errorf("failed to find os-release file:\n%w", err) | ||
| } | ||
|
|
||
| // mountFstabPartitionReadonly mounts the partition referenced by fstab's mountTarget entry under |
There was a problem hiding this comment.
Duplicated docstring, see line 440 and 443
|
|
||
| // FedoraGrubCfgRelPath is the path to the grub config file on Fedora and Azure Linux systems, | ||
| // relative to the /boot directory. | ||
| FedoraGrubCfgRelPath = "grub2/grub.cfg" |
There was a problem hiding this comment.
Can we re-use the constants under customizeuki?
Download and test azl4 test image in functional and vm test workflows (core-efi only, legacy option unavailable)
Add azl4 to distribution-support table (all yes except --package-snapshot-time and os.packages.shapshotTime: the package snapshot time APIs require tdnf but azl4 uses dnf)
Add azl4 to the developer guide (no link because it hasn't been released yet)
Add azl4 to the README as a supported input image
Add azl4 vm tests (equivalents of the existing azl3 vm tests)
Add azl4 to:
Create baseImageAzureLinuxCoreEfiAll var containing both azl3 and azl4 core-efi images, to migrate all tests that only test testBaseImageAzl3CoreEfi to test baseImageAzureLinuxCoreEfiAll as well (enabling additional tests)
Add azl4 core-efi image to baseImageAzureLinuxAll (enables a ton of tests that currently test azl3)
Split functional tests for AZL in Build (dev) workflow into 4 and 3+2 to enable azl4 testing in main builds
No longer skip TestCustomizeImageOverlaysSELinux/AzureLinux3CoreEfi
Review notes:
Follow up tasks:
Patch ukify (AZL3)step in the pipelines)