Workaround for Fedora CoreOS boot partition space errors during upgrades
The default boot partition size for Fedora CoreOS installations is 384 MB. This default is likely no longer enough [1] for hosting new deployments and accommodating new kernels and initrd images in the boot partition's /ostree folder [2].
Eventually, some systems can start to fail when two kernels and initrd images cannot fit into the boot partition, and you may find rpm-ostree status reporting:
error: Installing kernel: regfile copy: No space left on device
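To see how close a system is to this failure, it can help to check how full the boot partition is. The following is a minimal sketch, not part of the original workaround: the helper name usage_pct is made up for illustration, and it only relies on the standard df, du and awk tools.

```shell
#!/bin/sh
# Hypothetical helper: print the used-space percentage (without the % sign)
# of the filesystem that holds the given path.
usage_pct() {
    df -P "$1" | awk 'NR==2 { sub("%", "", $5); print $5 }'
}

# Report on the boot partition and on /sysroot, when they exist.
for path in /boot /sysroot; do
    if [ -d "$path" ]; then
        echo "$path: $(usage_pct "$path")% used"
    fi
done

# Show what each deployment's kernel/initrd folder takes up, if present.
if [ -d /boot/ostree ]; then
    du -sh /boot/ostree/*
fi
```

A boot partition that is more than half full with a single deployment is a hint that a second kernel and initrd pair may not fit.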
In addition, Fedora CoreOS installations default to XFS for the root partition, and XFS supports neither online nor offline shrinking [3]. Safely resizing the boot partition after CoreOS is installed and running with data is therefore essentially not possible; any attempt poses a risk of data loss.
The only options are (a) to resize the partition before the first boot [4, 5], or (b) to clean up the old deployments, if possible.
Fedora CoreOS uses GRUB's Boot Loader Specification (BLS) [6] feature. BLS aims to provide a scheme for different operating systems to cooperatively manage the boot loader configuration, using drop-in files that are turned into menu/boot entries at runtime (i.e., during boot, when GRUB is loaded).
The systemd unit ostree-finalize-staged.service, which runs ostree admin finalize-staged, is responsible for creating the drop-in files in /boot/loader/entries.
Here is the hack I’ve crafted, which is currently keeping some of my old systems affected by this issue up to date and running.
- Mount /sysroot in read-write mode:
  mount -o remount,rw /sysroot
- Move the ostree folder from the boot partition to the root partition:
  mv /boot/ostree /sysroot/boot/
- Link the new location back into /boot:
  ln -s /sysroot/boot/ostree /boot/ostree
- Append the new root to the GRUB configuration:
  echo "set root=hd0,gpt4" >> /boot/grub2/grub.cfg
- Add a new systemd unit to amend the /boot/loader/entries/*.conf files used by BLS with the updated target root and folders for kernel and initrd image:
# /etc/systemd/system/fix-bls-path.service
[Unit]
Description=Fix BLS entries to use /boot/ostree instead of /ostree
DefaultDependencies=no
Conflicts=final.target
# Shutdown ordering
RequiresMountsFor=/boot
# Run ExecStop script after ostree-finalize-staged.service stops
Before=ostree-finalize-staged.service
# Run ExecStop script before ostree-finalize-staged-hold.service and systemd-journal-flush.service stop
After=ostree-finalize-staged-hold.service systemd-journal-flush.service
Wants=ostree-finalize-staged-hold.service
[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/boot
ExecStop=/usr/bin/sh -c 'echo "Patching /boot/loader/entries to use the real path for kernel and initrd (moved in /sysroot)"; sed -i "s, /ostree, /boot/ostree," loader/entries/*.conf; echo "Patched /boot/loader/entries"'
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
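The unit only has an ExecStop command, and systemd runs ExecStop only for a unit that is active, so the unit presumably has to be enabled and started once. That step is not shown above, so the following is a sketch under that assumption. The run wrapper and the DRY_RUN variable are made up for illustration; by default the script only prints the commands, and it executes them when called as root with DRY_RUN=0.

```shell
#!/bin/sh
# Assumption: the unit file above is saved as
# /etc/systemd/system/fix-bls-path.service.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print the command in dry-run mode, run it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

activate_unit() {
    # Make systemd pick up the new unit file.
    run systemctl daemon-reload
    # --now also starts the unit, so that it is active and its ExecStop
    # command runs at the next shutdown.
    run systemctl enable --now fix-bls-path.service
}

plan=$(activate_unit)
echo "$plan"
```

Printing first and executing second is a deliberate choice here: the workaround touches the boot path, so reviewing each command before running it is cheap insurance.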
The symbolic link ensures that ostree can finalize the deployments and store the kernel and initrd images in /sysroot/boot/ostree, where we expect to have more available space.
The systemd unit updates the GRUB entries in /boot/loader every time a staged deployment is finalized, so that the kernel and initrd paths point to /boot/ostree instead of /ostree.
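To see what the sed expression in the unit does, it can be tried on a throwaway copy. The sketch below runs the same substitution on a made-up BLS entry in a temporary directory. The file name, title, UUID and checksum-like strings are invented for illustration and are not taken from a real system. It assumes GNU sed, as the unit itself does with sed -i.

```shell
#!/bin/sh
# Build a scratch directory that mimics /boot/loader/entries.
workdir=$(mktemp -d)
entry="$workdir/loader/entries/ostree-1-example.conf"
mkdir -p "$workdir/loader/entries"

# Made-up sample entry (illustration only).
cat > "$entry" <<'EOF'
title Example OS (ostree:0)
version 1
options root=UUID=0000 rw ostree=/ostree/boot.1/example/abc123/0
linux /ostree/example-abc123/vmlinuz
initrd /ostree/example-abc123/initramfs.img
EOF

# Same substitution as the unit's ExecStop: only " /ostree" (with a leading
# space) matches, so the ostree= kernel argument is left untouched.
sed -i "s, /ostree, /boot/ostree," "$workdir"/loader/entries/*.conf

# Running it a second time changes nothing, because " /boot/ostree" no
# longer contains " /ostree".
sed -i "s, /ostree, /boot/ostree," "$workdir"/loader/entries/*.conf

cat "$entry"
```

After the run, the linux and initrd lines start with /boot/ostree, while the options line is unchanged. Because the substitution is idempotent, it does no harm if the unit patches entries that were already patched at a previous shutdown.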
In the default file-system layout of a Fedora CoreOS installation, (hd0,gpt3) is GRUB's name for the undersized boot partition. The GRUB variable $root is set to hd0,gpt3 at the beginning of /boot/grub2/grub.cfg.
We need that setting to remain in place while the configuration script executes. After blscfg runs, the entries have been loaded from the disk's boot partition and can be shown and run.
Appending set root=hd0,gpt4 to the grub.cfg allows both the configuration loading (from hd0,gpt3) and the pivot to the actual root partition for loading kernel and initrd.
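Since the line is appended with >>, running that step twice would add it twice. A small guard can avoid this. The sketch below is an illustration rather than part of the original procedure: append_once is a hypothetical helper, and the demonstration works on a temporary stand-in file, not on the real /boot/grub2/grub.cfg.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE only if FILE does not already
# contain exactly that line.
append_once() {
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Stand-in for grub.cfg (placeholder content, not a real configuration).
cfg=$(mktemp)
printf '# placeholder for the existing grub.cfg contents\n' > "$cfg"

append_once 'set root=hd0,gpt4' "$cfg"
append_once 'set root=hd0,gpt4' "$cfg"   # second call is a no-op

count=$(grep -cxF 'set root=hd0,gpt4' "$cfg")
echo "occurrences: $count"   # prints "occurrences: 1"
```

On a real system the second argument would be /boot/grub2/grub.cfg. It is worth confirming first that the root partition really is the fourth partition of the first disk, since hd0,gpt4 assumes the default layout described above.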