grubconfigs: Sync grub.cfg and bootuuid.cfg to all ESPs on multi-device setups#1077
grubconfigs: Sync grub.cfg and bootuuid.cfg to all ESPs on multi-device setups#1077ckyrouac wants to merge 1 commit intocoreos:mainfrom
Conversation
|
Hi @ckyrouac. Thanks for your PR. I'm waiting for a coreos member to verify that this patch is reasonable to test. If it is, they should reply with Tip We noticed you've done this a few times! Consider joining the org to skip this step and gain Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
pushing up what claude generated in case anyone wants to review it while I test/validate locally and add a test. |
There was a problem hiding this comment.
Code Review
This pull request adds support for synchronizing GRUB configuration files across all detected EFI System Partitions (ESPs) on multi-device systems. It refactors the GRUB installation logic into reusable helpers and introduces a synchronization loop that mounts additional ESPs to apply updates. Feedback suggests improving error logging during ESP detection and implementing a panic-safe RAII pattern for the mount/unmount operations.
src/bootupd.rs
Outdated
| for esp_path in &esp_parts { | ||
| log::info!("Syncing GRUB configs to ESP {esp_path}"); | ||
|
|
||
| // Mount the ESP | ||
| std::process::Command::new("mount") | ||
| .arg(esp_path) | ||
| .arg(tmpmnt) | ||
| .run_inherited() | ||
| .with_context(|| format!("Mounting ESP {esp_path}"))?; | ||
|
|
||
| let result = (|| -> Result<()> { | ||
| // Ensure the vendor directory exists under EFI/ | ||
| let efi_vendor = tmpmnt.join("EFI").join(vendor); | ||
| std::fs::create_dir_all(&efi_vendor) | ||
| .with_context(|| format!("Creating {:?}", efi_vendor))?; | ||
|
|
||
| let efidir = openat::Dir::open(tmpmnt.join("EFI")) | ||
| .context("Opening EFI directory on ESP")?; | ||
|
|
||
| crate::grubconfigs::install_to_esp( | ||
| source_root, | ||
| &boot_grub2, | ||
| &efidir, | ||
| vendor, | ||
| write_uuid, | ||
| ) | ||
| })(); | ||
|
|
||
| // Always unmount, even on error | ||
| std::process::Command::new("umount") | ||
| .arg(tmpmnt) | ||
| .run_inherited() | ||
| .with_context(|| format!("Unmounting ESP {esp_path}"))?; | ||
|
|
||
| result?; | ||
| } |
There was a problem hiding this comment.
The current implementation for mounting and unmounting the ESP is not panic-safe. If a panic occurs after mounting but before unmounting (for example, inside crate::grubconfigs::install_to_esp), the umount command will not be executed. This could leave a stale mount point on a temporary directory that is about to be removed.
A more robust approach is to use a RAII guard struct that mounts in its constructor and unmounts in its Drop implementation. This ensures that the ESP is always unmounted, even in the case of a panic.
There was a problem hiding this comment.
We run in a private mountns anyways which avoids global leaks.
There was a problem hiding this comment.
Pull request overview
This PR fixes multi-device (multi-ESP) installs where only the currently-mounted ESP at /boot/efi received EFI/<vendor>/grub.cfg and EFI/<vendor>/bootuuid.cfg, causing boot failures when firmware selects a different disk’s ESP.
Changes:
- Refactors ESP config writing in
grubconfigs::install()into a sharedwrite_esp_configs()helper. - Adds
grubconfigs::install_to_esp()to install GRUB config files into an already-mounted ESPEFI/directory. - Adds
sync_grub_configs_to_all_esps()inbootupd::install()to discover ESP partitions across provided devices, mount each, and sync the GRUB config files.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/grubconfigs.rs |
Extracts shared ESP config writing and adds a helper to install configs to an already-mounted ESP. |
src/bootupd.rs |
Adds multi-device ESP discovery + mount loop to sync GRUB config files to all ESPs after the primary install. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/bootupd.rs
Outdated
| // Collect all ESP partition device paths. | ||
| let esp_parts: Vec<String> = devices | ||
| .iter() | ||
| .filter_map(|dev| dev.find_partition_of_esp().ok().map(|p| p.path())) |
There was a problem hiding this comment.
find_partition_of_esp() errors are silently ignored via .ok() here, which can cause the sync step to become a no-op (e.g. one ESP discovered, one skipped due to a transient probe error) without any indication to the user. Consider matching the earlier EFI-device filtering logic in this file: log a warning per-device when ESP discovery fails, and only skip when it truly has no ESP, so failures are visible and diagnosable.
| .filter_map(|dev| dev.find_partition_of_esp().ok().map(|p| p.path())) | |
| .filter_map(|dev| match dev.find_partition_of_esp() { | |
| Ok(p) => Some(p.path()), | |
| Err(e) => { | |
| log::warn!("Failed to discover ESP on device {:?}: {e:#}", dev); | |
| None | |
| } | |
| }) |
There was a problem hiding this comment.
Right, in the bootc PR I added a new API for this
src/bootupd.rs
Outdated
| // Mount the ESP | ||
| std::process::Command::new("mount") | ||
| .arg(esp_path) | ||
| .arg(tmpmnt) | ||
| .run_inherited() | ||
| .with_context(|| format!("Mounting ESP {esp_path}"))?; | ||
|
|
||
| let result = (|| -> Result<()> { | ||
| // Ensure the vendor directory exists under EFI/ | ||
| let efi_vendor = tmpmnt.join("EFI").join(vendor); | ||
| std::fs::create_dir_all(&efi_vendor) | ||
| .with_context(|| format!("Creating {:?}", efi_vendor))?; | ||
|
|
||
| let efidir = openat::Dir::open(tmpmnt.join("EFI")) | ||
| .context("Opening EFI directory on ESP")?; |
There was a problem hiding this comment.
After mounting the ESP, the code doesn’t validate that the mount is actually a VFAT ESP and writable (unlike the existing logic in efi.rs which checks filesystem type / mountpoint and calls util::ensure_writable_mount). Adding an explicit writability + fstype check here (or reusing the existing EFI mount helper) would prevent confusing failures when the ESP is mounted read-only or the wrong partition is mounted.
src/bootupd.rs
Outdated
| // Collect all ESP partition device paths. | ||
| let esp_parts: Vec<String> = devices | ||
| .iter() | ||
| .filter_map(|dev| dev.find_partition_of_esp().ok().map(|p| p.path())) | ||
| .collect(); | ||
|
|
||
| if esp_parts.len() < 2 { | ||
| log::debug!("Fewer than 2 ESPs found, nothing to sync"); | ||
| return Ok(()); | ||
| } |
There was a problem hiding this comment.
esp_parts may contain duplicate ESP paths (e.g. if the same device is passed twice or if multiple Device entries resolve to the same ESP). Consider de-duplicating the list before iterating to avoid redundant mount/copy/unmount cycles.
cgwalters
left a comment
There was a problem hiding this comment.
One existing problem we had is that I messed up the static configs stuff and didn't wire them in as something updatable in place. If we'd had that, this could have just been part of a general "sync content across ESP".
src/bootupd.rs
Outdated
| log::info!("Syncing GRUB configs to ESP {esp_path}"); | ||
|
|
||
| // Mount the ESP | ||
| std::process::Command::new("mount") |
There was a problem hiding this comment.
Nonblocking but I'd like to use the mount-as-fd new mount API for stuff like this in the future
src/bootupd.rs
Outdated
| for esp_path in &esp_parts { | ||
| log::info!("Syncing GRUB configs to ESP {esp_path}"); | ||
|
|
||
| // Mount the ESP | ||
| std::process::Command::new("mount") | ||
| .arg(esp_path) | ||
| .arg(tmpmnt) | ||
| .run_inherited() | ||
| .with_context(|| format!("Mounting ESP {esp_path}"))?; | ||
|
|
||
| let result = (|| -> Result<()> { | ||
| // Ensure the vendor directory exists under EFI/ | ||
| let efi_vendor = tmpmnt.join("EFI").join(vendor); | ||
| std::fs::create_dir_all(&efi_vendor) | ||
| .with_context(|| format!("Creating {:?}", efi_vendor))?; | ||
|
|
||
| let efidir = openat::Dir::open(tmpmnt.join("EFI")) | ||
| .context("Opening EFI directory on ESP")?; | ||
|
|
||
| crate::grubconfigs::install_to_esp( | ||
| source_root, | ||
| &boot_grub2, | ||
| &efidir, | ||
| vendor, | ||
| write_uuid, | ||
| ) | ||
| })(); | ||
|
|
||
| // Always unmount, even on error | ||
| std::process::Command::new("umount") | ||
| .arg(tmpmnt) | ||
| .run_inherited() | ||
| .with_context(|| format!("Unmounting ESP {esp_path}"))?; | ||
|
|
||
| result?; | ||
| } |
There was a problem hiding this comment.
We run in a private mountns anyways which avoids global leaks.
src/bootupd.rs
Outdated
| // setups we need to sync these files to every other ESP so that | ||
| // any disk can boot independently. | ||
| if let Some(ref vendor) = installed_efi_vendor { | ||
| sync_grub_configs_to_all_esps( |
There was a problem hiding this comment.
I think it'd be a lot cleaner for us to get the list of explicit devices and then do an install on each one.
I think this would require changing the generic crate::grubconfigs::install invocation above instead.
Now I am not sure if we support multi-device BIOS. But basically WDYT about ensuring we gather all devices, and use that in the above?
There was a problem hiding this comment.
updated with this suggestion
…ce setups On multi-device setups (e.g. RAID 1 with an ESP per disk), grubconfigs::install() only wrote grub.cfg and bootuuid.cfg to the single ESP mounted at /boot/efi. If the system booted from a different disk, GRUB would drop to a shell because no grub.cfg existed on that ESP. Extend grubconfigs::install() with an `additional_esp_devices` parameter. When non-empty, each listed ESP is temporarily mounted and receives copies of grub.cfg and bootuuid.cfg via a new write_esp_configs() helper (extracted from the existing inline logic to avoid duplication). In bootupd::install(), discover colocated ESPs via find_colocated_esps() on the first device and pass them through, consistent with the approach already used in the update and validate paths in efi.rs. Callers that do not need multi-ESP support (bios.rs, efi.rs update path) simply pass an empty slice. Fixes coreos#1076 Assisted-by: Claude Code (Opus 4.6) Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Three issues prevented a successful Anaconda-based bootc install onto a
RAID 1 array with per-disk ESP partitions:
1. Bind mount failure during ostreecontainer payload install:
Anaconda mounts the root FS at /mnt/sysimage and then tries to
bind-mount /mnt/sysimage/boot/efi2 and /boot/efi3 into /mnt/sysroot.
However, it does not create the mount point directories for the
secondary ESPs before the bind mount, causing:
mount: /mnt/sysroot/boot/efi2: special device
/mnt/sysimage/boot/efi2 does not exist.
Fix: add a %pre-install script that creates the directories after
the root FS is mounted but before the payload runs.
2. Empty secondary ESPs after installation:
Anaconda only installs the bootloader (shim, GRUB, grub.cfg) to the
primary ESP at /boot/efi. The secondary ESPs (efi2, efi3) are
formatted and mounted but receive no bootloader files. Additionally,
the 'noauto' fsoption causes Anaconda to unmount them before %post.
Fix: add a %post --nochroot script that explicitly mounts each
secondary ESP by device path and copies the full EFI directory tree
from the primary ESP.
3. GRUB cannot find the kernel on the RAID root filesystem:
The GRUB EFI binary shipped with CentOS Stream 10 does not include
the mdraid1x module. GRUB's grub.cfg uses 'search --fs-uuid' to
locate the root filesystem and load the kernel from the ostree
deployment path, but it cannot access the md array.
Fix: add a separate /boot partition (ext4, 1G) on the primary disk.
GRUB can read ext4 natively without mdraid support. The kernel and
initramfs live on /boot, and the initramfs handles RAID assembly
for the root filesystem during boot.
Note: /boot is only on vda (not mirrored), so only disk 1 can boot
independently. Full boot redundancy from any disk would require either
whole-disk RAID (where GRUB reads from the raw mirror) or a GRUB build
with mdraid1x support. This is a known limitation tracked in:
- bootc-dev/bootc#1911
- coreos/bootupd#1077
On multi-device setups (e.g. RAID 1 with an ESP per disk), grubconfigs::install() only writes grub.cfg and bootuuid.cfg to the single ESP currently mounted at /boot/efi. This means that if the system boots from a different disk, GRUB drops to a shell because there is no grub.cfg on that ESP.
Add sync_grub_configs_to_all_esps() which runs after the primary grubconfigs::install(). It discovers all ESP partitions across the provided devices, mounts each one to a temporary directory, and calls the new grubconfigs::install_to_esp() helper to write the static EFI grub.cfg and bootuuid.cfg.
Also refactor the ESP config writing in grubconfigs::install() into a shared write_esp_configs() helper to avoid code duplication.
Fixes #1076
Assisted-by: Claude Code (Opus 4.6)