Adventures in upgrading Proxmox
Running Docker inside LXC is weird. It's containers on top of other containers, and there was a fairly recent AppArmor issue that prevented some functionality from working inside a Docker container, with a very cryptic error. I was trying to deploy Coolify and/or Dokploy in my homelab and hitting all sorts of weird issues. Eventually I found this GitHub issue for runc, and, apparently, it was fixed in a newer version of the pve-lxc package. But I'm still on Proxmox 8, and the new version is seemingly only available in Proxmox 9.
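As an aside, running Docker inside an LXC guest on Proxmox usually requires the nesting feature to be enabled on the container. A minimal sketch of the relevant config (the container ID 100 is a placeholder, and whether you also need keyctl depends on your setup):

```
# /etc/pve/lxc/100.conf -- container ID is an example
features: nesting=1,keyctl=1
```

The same thing can be set from the CLI with `pct set 100 --features nesting=1,keyctl=1`, or via the container's Options tab in the web UI.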
I upgraded one node without much hassle, but the second node, the one that runs my NVR and has the Coral TPU, gave me some grief. Because the Apex drivers are installed as a DKMS module, they failed to rebuild, which interrupted the system upgrade process. I'm not sure exactly how, but after the reboot the system did not come back online. The machine is in the basement, which means I had to grab my USB KVM and make a trip downstairs...
I had also recently added another Zigbee dongle that supports Thread, and it happens to share the same VID:PID combo as the old dongle, so due to how these were mapped into the guest OS, all my light switches stopped working. I had to fix the issue fast.
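One way to disambiguate two dongles that share a VID:PID is a udev rule keyed on the serial number instead, so each device gets a stable symlink you can pass through to the guest. A sketch (the VID:PID shown is the common CP210x one, and the serial and symlink names are made up for illustration):

```
# /etc/udev/rules.d/99-zigbee.rules -- example only; serial values are hypothetical
SUBSYSTEM=="tty", ATTRS{idVendor}=="10c4", ATTRS{idProduct}=="ea60", ATTRS{serial}=="OLD123", SYMLINK+="zigbee-old"
SUBSYSTEM=="tty", ATTRS{idVendor}=="10c4", ATTRS{idProduct}=="ea60", ATTRS{serial}=="NEW456", SYMLINK+="zigbee-thread"
```

You can find the real serial with `udevadm info -a /dev/ttyUSB0 | grep serial`, then map `/dev/zigbee-old` into the guest instead of the raw device node.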
Thankfully I was able to reach the GRUB screen and pick the previous kernel, so I could boot the machine. That was a plus, but booting into the new kernel still caused a panic.
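If you need the machine to keep coming up on the known-good kernel while you debug, Proxmox can pin one so you don't have to catch the GRUB menu on every reboot (the version string below is an example):

```
proxmox-boot-tool kernel list                 # show installed kernels
proxmox-boot-tool kernel pin 6.8.12-4-pve     # example version; boot this one by default
# once fixed:
proxmox-boot-tool kernel unpin
```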
Google suggested that the unable to mount rootfs on unknown-block(0,0) error indicates a missing initrd, which needs to be regenerated with update-initramfs -u -k ${KERNEL_VERSION}. It ran successfully, albeit with a somewhat cryptic no /etc/kernel/proxmox-boot-uuids found message. After a reboot it kernel-panicked again, even though the /boot/initrd-${VERSION} files were present. I guess that message was relevant after all. After another quick Google search I found this Reddit thread, which provided the steps to solve the issue.
lsblk -o +FSTYPE | grep /boot/efi # understand which device the EFI partition is on
umount /boot/efi
proxmox-boot-tool init /dev/${DEVICE} # plug in device from step 1
mount /boot/efi
update-initramfs -u -k all
reboot
This generated the necessary file, and after rebooting the system was able to boot again with the new kernel.
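For reference, you can verify that proxmox-boot-tool is now tracking the EFI partition before rebooting; its status subcommand reads the UUIDs from /etc/kernel/proxmox-boot-uuids, which is exactly the file that warning complained about:

```
proxmox-boot-tool status        # lists ESPs registered in /etc/kernel/proxmox-boot-uuids
proxmox-boot-tool kernel list   # kernels it will sync onto the ESP
```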
While trying to troubleshoot I've also uninstalled the Apex DKMS module, and now I had to re-install it again, but it started failing with errors because of the kernel change.
Apparently some symbols/APIs were removed in the newer kernel, and I had to patch the source code. Upstream seemingly did not have the fix, but I found the necessary changes:
diff --git a/src/gasket_core.c b/src/gasket_core.c
index b1c2726..88bd5b2 100644
--- a/src/gasket_core.c
+++ b/src/gasket_core.c
@@ -1373,7 +1373,9 @@ static long gasket_ioctl(struct file *filp, uint cmd, ulong arg)
/* File operations for all Gasket devices. */
static const struct file_operations gasket_file_ops = {
.owner = THIS_MODULE,
+#if LINUX_VERSION_CODE < KERNEL_VERSION(6,0,0)
.llseek = no_llseek,
+#endif
.mmap = gasket_mmap,
.open = gasket_open,
.release = gasket_release,
diff --git a/src/gasket_page_table.c b/src/gasket_page_table.c
index c9067cb..0c2159d 100644
--- a/src/gasket_page_table.c
+++ b/src/gasket_page_table.c
@@ -54,7 +54,7 @@
#include <linux/vmalloc.h>
#if __has_include(<linux/dma-buf.h>)
-MODULE_IMPORT_NS(DMA_BUF);
+MODULE_IMPORT_NS("DMA_BUF");
#endif
#include "gasket_constants.h"
After doing this and re-running the build process (as outlined in the previous post), the driver installed and I was able to bring Frigate back.
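After patching, the DKMS module has to be rebuilt against the new kernel. A sketch of the sequence, assuming the Coral driver registers under the name gasket with version 1.0 (check `dkms status` for the actual name/version on your system):

```
dkms status                          # confirm module name and version
dkms remove gasket/1.0 --all         # drop the old, broken build
dkms add -m gasket -v 1.0            # re-register the patched source
dkms build -m gasket -v 1.0
dkms install -m gasket -v 1.0
modprobe gasket && modprobe apex     # load the drivers
```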
Big thanks to /u/Dunadan-F for the solution.