Summary of "I Almost Lost My Virtual Machines..."
Overview
This is a technical how‑to for single‑GPU passthrough on an Arch/Manjaro host so you can run Windows inside QEMU/KVM and give the guest exclusive access to a single physical GPU (author used an RTX 3090). The guide covers:
- prerequisites and kernel/boot settings
- checking IOMMU groups
- VBIOS/ROM handling (dump, patch, load)
- configuring libvirt/virt‑manager
- writing libvirt hooks to detach/rebind the GPU at VM start/stop
- workarounds for NVIDIA virtualization checks (Error 43)
The tutorial is practical and step‑by‑step with runnable scripts and examples, but repeatedly warns that setups differ and you’ll likely need to “taste and adjust” per your hardware.
Warning: every system is different — IOMMU groups, drivers, display manager names, and PCI addresses vary. Test carefully and adapt scripts to your hardware. ROM dumping/patching can be risky; back up originals.
Key concepts and benefits
- Single‑GPU passthrough: let a VM (Windows) own the only discrete GPU while the host gives it up on VM start and takes it back on shutdown.
- IOMMU grouping: required to isolate PCI devices safely for passthrough.
- VFIO: kernel driver stack used to bind the GPU to the VM.
- VBIOS / ROM patching: some GPUs (notably NVIDIA) may need a patched ROM passed to the VM to function.
- Libvirt hooks: scripts under /etc/libvirt/hooks/qemu.d/<vm>/prepare and .../release automate unbinding the GPU from host drivers and binding it to vfio-pci (and reversing on shutdown).
- NVIDIA Error 43 workarounds: spoof vendor_id, hide KVM flags, and/or include a patched ROM.
Prerequisites
- Host distro: Manjaro (Arch family). Steps use Arch‑centric tooling (grub, pacman), but concepts apply to other distros.
- Required familiarity: terminal usage, editing /etc/default/grub, editing VM XML in virt-manager, basic root/sudo operations.
- Recommended tools/packages (example pacman list used by the author):
- qemu, libvirt, edk2-ovmf, virt-manager
- ebtables, dnsmasq, libvirt-daemon services
- a hex editor (e.g., bless) and any VBIOS dump/flash tools if you plan to dump/patch ROM locally
Concrete step summary (ordered)
- Enable IOMMU in kernel boot params
  - Edit /etc/default/grub and add intel_iommu=on (or amd_iommu=on for AMD).
  - Rebuild grub with sudo grub-mkconfig -o /boot/grub/grub.cfg, then reboot.
- Verify IOMMU grouping
  - Use dmesg | grep -i iommu or the grouping script from the Arch wiki to confirm IOMMU is enabled and to list PCI IDs/groups.
  - Save the GPU PCI IDs (author saved them to a text file for later use).
- Install virtualization packages
  - Install QEMU/libvirt/OVMF/virt-manager and start/enable libvirt services; enable the default libvirt network.
- VBIOS / ROM handling
  - Options:
    - Download the exact VBIOS from a repository such as TechPowerUp.
    - Dump the ROM from your GPU with appropriate tools (risky).
  - If required, patch the ROM (hex editor like bless). Keep the original dump backed up and save the patched ROM for attaching to the VM.
- Create the VM in virt-manager
  - Use local install media (Windows ISO). Allocate memory/CPU/disk.
  - Use OVMF (UEFI) firmware (ovmf_code.fd); UEFI is required for many single-GPU setups.
  - Configure CPU topology to mirror the host cores/threads (recommended).
  - Ensure CD-ROM boot is enabled for the installer.
- Create libvirt hooks to automate GPU switching
  - Directory structure example:
    - /etc/libvirt/hooks/qemu.d/<VM-name>/prepare/start.sh
    - /etc/libvirt/hooks/qemu.d/<VM-name>/release/revert.sh
  - prepare/start.sh should:
    - stop the display manager (e.g., systemctl stop sddm.service), unbind VT consoles and EFI framebuffer, remove host GPU modules, and bind the GPU to vfio-pci.
  - release/revert.sh should:
    - unload vfio, rebind the GPU to host drivers, restore VT consoles/EFI framebuffer, and restart the display manager.
  - Scripts should source a small config file such as /etc/libvirt/hooks/kvm.conf that contains PCI IDs as variables (e.g., VIRSH_GPU_VIDEO=pci_0000_01_00_0, written with underscores instead of colons and dots).
  - Make the scripts executable: sudo chmod +x /etc/libvirt/hooks/qemu.d/...
- Attach the GPU and ROM to the VM
  - In virt-manager, add “PCI Host Device” entries for the GPU functions (GPU display, GPU audio, and any USB/other functions).
  - Edit the VM XML and add a ROM line under the device so the VM uses the patched ROM, for example: <rom file='/home/<user>/patch.rom'/>
- Spoof vendor_id / hide KVM flags to bypass NVIDIA Error 43
  - Edit the VM XML to add a vendor_id element under <features>/<hyperv>, for example <vendor_id state='on' value='SOME_STRING'/>.
  - Optionally hide KVM CPU flags if needed.
- Test
  - Start the VM and watch the hooks unload host drivers and bind the GPU to VFIO. Windows should detect the GPU in Device Manager.
  - On VM shutdown, the release script should return the GPU to the host cleanly.
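The IOMMU check in the "Verify IOMMU grouping" step can be sketched as a small shell function, loosely modeled on the grouping script published on the Arch wiki. The function name list_iommu_groups and its optional root-directory argument are illustrative assumptions, not something shown in the video; the argument exists only so the function can be pointed at a test directory instead of /sys/kernel/iommu_groups.

```shell
#!/usr/bin/env bash
# Print one line per PCI device, ordered by IOMMU group number.
# The default path is the kernel's sysfs location; it is only populated
# when IOMMU is enabled in firmware and on the kernel command line.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local group dev
    if [ ! -d "$root" ]; then
        echo "no IOMMU groups under $root (is IOMMU enabled?)" >&2
        return 1
    fi
    for group in "$root"/*; do
        [ -d "$group/devices" ] || continue
        for dev in "$group"/devices/*; do
            [ -e "$dev" ] || continue   # skip the unexpanded glob of an empty group
            echo "IOMMU group ${group##*/}: ${dev##*/}"
        done
    done | sort -k3,3n   # numeric sort on the group number
}
```

Called with no argument, the function reads the live system; running lspci -nns on any address it prints describes that device. Devices that share a group with the GPU generally have to be handled together, which is why the grouping is checked before anything else.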
Libvirt hooks — implementation notes
- Hook paths: /etc/libvirt/hooks/qemu.d/<vm>/prepare/start.sh and .../release/revert.sh.
- Common actions in hooks:
  - Stop the display manager (sddm/gdm/lightdm).
  - Unbind consoles / EFI framebuffer (avoid host using the card).
  - modprobe -r (e.g., nvidia) and then bind vfio-pci.
  - On release: unload vfio, reattach host drivers, call nvidia-xconfig --query-gpu-info (or similar) if needed, and restart the display manager.
- Race conditions/timing: the author used sleeps (example: 10s) to avoid races — you may need to tweak waits.
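The hook actions listed above can be sketched as two shell functions. This is a minimal illustration under stated assumptions, not the author's actual scripts: the variable names (VIRSH_GPU_VIDEO, VIRSH_GPU_AUDIO, DISPLAY_MANAGER, SETTLE_SECONDS, DRY_RUN), the example PCI names, and the exact NVIDIA module list are hypothetical or follow common community convention. DRY_RUN defaults to 1 so the functions only print what they would do.

```shell
#!/usr/bin/env bash
# Sketch of prepare/release hook logic. With DRY_RUN=1 (the default here)
# every action is printed instead of executed.

# Optional config file holding the PCI IDs, as described in the summary.
if [ -r /etc/libvirt/hooks/kvm.conf ]; then
    . /etc/libvirt/hooks/kvm.conf
fi

DRY_RUN="${DRY_RUN:-1}"
DISPLAY_MANAGER="${DISPLAY_MANAGER:-sddm.service}"       # assumption: adjust for gdm/lightdm
VIRSH_GPU_VIDEO="${VIRSH_GPU_VIDEO:-pci_0000_01_00_0}"   # assumption: example address
VIRSH_GPU_AUDIO="${VIRSH_GPU_AUDIO:-pci_0000_01_00_1}"   # assumption: example address
SETTLE_SECONDS="${SETTLE_SECONDS:-10}"                   # delay to avoid races

# Run a command, or print it when in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Write a value to a sysfs file; silently skip files that do not exist.
write_sysfs() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ echo $1 > $2"
    elif [ -e "$2" ]; then
        echo "$1" > "$2"
    fi
}

# Hand the GPU to the VM: free it from the host, then load vfio-pci.
prepare_gpu() {
    run systemctl stop "$DISPLAY_MANAGER"
    write_sysfs 0 /sys/class/vtconsole/vtcon0/bind
    write_sysfs 0 /sys/class/vtconsole/vtcon1/bind
    write_sysfs efi-framebuffer.0 /sys/bus/platform/drivers/efi-framebuffer/unbind
    run sleep "$SETTLE_SECONDS"
    run modprobe -r nvidia_drm nvidia_modeset nvidia_uvm nvidia
    run virsh nodedev-detach "$VIRSH_GPU_VIDEO"
    run virsh nodedev-detach "$VIRSH_GPU_AUDIO"
    run modprobe vfio-pci
}

# Give the GPU back to the host after the VM shuts down.
release_gpu() {
    run modprobe -r vfio-pci
    run virsh nodedev-reattach "$VIRSH_GPU_VIDEO"
    run virsh nodedev-reattach "$VIRSH_GPU_AUDIO"
    run modprobe -a nvidia nvidia_modeset nvidia_uvm nvidia_drm
    run nvidia-xconfig --query-gpu-info
    write_sysfs 1 /sys/class/vtconsole/vtcon0/bind
    write_sysfs efi-framebuffer.0 /sys/bus/platform/drivers/efi-framebuffer/bind
    run systemctl start "$DISPLAY_MANAGER"
}
```

In a real setup, start.sh would end with a call to prepare_gpu and revert.sh with a call to release_gpu, both with DRY_RUN=0 and run as root by libvirt. Reading the dry-run output first is a cheap way to confirm the PCI names and display manager before the host screen actually goes dark.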
VBIOS / ROM handling
- Options:
- Download known good ROMs (e.g., TechPowerUp VGA BIOS database).
- Dump ROM locally (dangerous if you try to flash).
- Patch ROM in a hex editor when necessary (NVIDIA may require modifications).
- Always keep an original backup and prefer downloaded ROMs if you’re unsure about dumping/flashing.
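The "keep an original backup" advice can be made mechanical with a short helper. The sketch below is an assumption-laden illustration: the function names backup_rom and rom_has_pci_signature and the .orig suffix are hypothetical. The signature check relies on the fact that a PCI option ROM image begins with the bytes 0x55 0xAA; it only reports whether the file starts that way and does not decide whether your card needs a patched ROM.

```shell
#!/usr/bin/env bash
# Copy a ROM to <file>.orig once, verify the copy, and make it read-only.
# An existing backup is never overwritten, so the first dump stays intact.
backup_rom() {
    local src="$1"
    local backup="$1.orig"
    if [ -e "$backup" ]; then
        echo "backup already exists, leaving it untouched" >&2
        echo "$backup"
        return 0
    fi
    cp "$src" "$backup" || return 1
    cmp -s "$src" "$backup" || return 1   # byte-for-byte verification
    chmod 444 "$backup"
    echo "$backup"
}

# Succeed if the file starts with the PCI option ROM signature 0x55 0xAA.
rom_has_pci_signature() {
    [ "$(od -An -tx1 -N2 "$1" | tr -d ' \n')" = "55aa" ]
}
```

A typical sequence would be backup_rom on the downloaded or dumped file, then editing a separate copy in the hex editor, then rom_has_pci_signature on the result as a quick sanity check before attaching it to the VM.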
NVIDIA Error 43 workarounds
- Spoof vendor_id in the VM XML to hide virtualization from the NVIDIA driver. Example: <vendor_id state='on' value='SOME_STRING'/> inside <hyperv> under <features>.
- Hide KVM CPU flags if necessary.
- Passing a patched ROM to the VM is commonly used to make some NVIDIA cards initialize correctly in a VM.
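For reference, the two XML-level workarounds can be shown as a fragment and checked with grep. This is a sketch, not the author's exact XML: the function names are hypothetical, SOME_STRING is the placeholder from the summary, and the <kvm><hidden state='on'/></kvm> element is libvirt's usual way of hiding the KVM signature, which is assumed here to be what "hide KVM flags" refers to.

```shell
#!/usr/bin/env bash
# Print an example <features> fragment with both workarounds.
# In a real domain these elements are merged into the existing <features>
# block, which usually already contains other children.
print_error43_features() {
    cat <<'EOF'
<features>
  <hyperv>
    <vendor_id state='on' value='SOME_STRING'/>
  </hyperv>
  <kvm>
    <hidden state='on'/>
  </kvm>
</features>
EOF
}

# Succeed if an XML file contains both the vendor_id spoof and the
# hidden-KVM element. This is a plain text match, not an XML parse.
has_error43_workarounds() {
    grep -q "<vendor_id state='on'" "$1" && grep -q "<hidden state='on'/>" "$1"
}
```

One way to use the check is to save the output of virsh dumpxml for the VM to a file and pass that file to has_error43_workarounds; libvirt writes attributes with single quotes, which is what the patterns expect.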
Troubleshooting & notes
- System differences: IOMMU groups, driver names, display manager services, and PCI addresses vary; adapt scripts accordingly.
- ROM dumping/patching risk: back up originals; prefer known downloads if unsure.
- Timing/race conditions: add sleeps/delays in hooks if you see failures or blank screens.
- AMD vs NVIDIA: AMD often needs fewer ROM tweaks; NVIDIA commonly requires vendor_id hiding and more steps.
- If the VM fails to start or the host screen blanks:
- Check hook logs.
- Ensure scripts are executable and the PCI IDs in your config match the actual devices.
- Verify libvirt/virt‑manager XML edits (ROM, vendor_id) are correct.
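Two of the checks above lend themselves to small helpers. Both function names are hypothetical and the sketch assumes hook scripts end in .sh: the first converts a PCI address as printed by lspci into the underscore form used in the hook config, and the second lists hook scripts that are missing the owner execute bit.

```shell
#!/usr/bin/env bash
# Convert a PCI address such as 0000:01:00.0 into the libvirt node device
# name form pci_0000_01_00_0 (colons and dots become underscores).
pci_to_nodedev() {
    printf 'pci_%s\n' "$(printf '%s' "$1" | tr ':.' '__')"
}

# List *.sh files under a hooks directory that are not executable by
# their owner. Empty output means every script has the bit set.
find_non_executable_hooks() {
    find "$1" -type f -name '*.sh' ! -perm -100
}
```

Comparing the output of pci_to_nodedev against the values in the hook config catches copy mistakes, and running find_non_executable_hooks on /etc/libvirt/hooks/qemu.d shows any script that still needs chmod +x. For the hook logs themselves, the libvirt daemon's journal (for example journalctl -u libvirtd, depending on how libvirt is packaged) is a reasonable first place to look.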
Commands & files mentioned (examples)
- Kernel parameter (in /etc/default/grub): intel_iommu=on or amd_iommu=on
- Rebuild grub: sudo grub-mkconfig -o /boot/grub/grub.cfg
- Install example packages (Manjaro/Arch): sudo pacman -S qemu libvirt edk2-ovmf virt-manager ebtables dnsmasq bless
- Libvirt hooks path: /etc/libvirt/hooks/qemu.d/<vm>/prepare/start.sh and /etc/libvirt/hooks/qemu.d/<vm>/release/revert.sh
- Hook config file: /etc/libvirt/hooks/kvm.conf (contains PCI ID variables)
- Make scripts executable: sudo chmod +x /etc/libvirt/hooks/qemu.d/...
- Example VM XML ROM line: <rom file='/home/<user>/patch.rom'/>
- OVMF firmware to use: ovmf_code.fd (UEFI)
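The kernel parameter edit is usually done by hand, but it can also be scripted. The sketch below is one possible approach, not the method shown in the video: add_kernel_param is a hypothetical helper that appends a parameter to the GRUB_CMDLINE_LINUX_DEFAULT line of a grub defaults file and does nothing if the parameter is already there. It assumes that line is double-quoted and sits on a single line.

```shell
#!/usr/bin/env bash
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT unless present.
# Assumes the existing value contains no '|', '&' or backslash characters.
add_kernel_param() {
    local file="$1"
    local param="$2"
    local current new tmp status=0
    if ! grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=".*"$' "$file"; then
        echo "no double-quoted GRUB_CMDLINE_LINUX_DEFAULT line in $file" >&2
        return 1
    fi
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    case " $current " in
        *" $param "*) return 0 ;;   # already present, nothing to do
    esac
    if [ -n "$current" ]; then
        new="$current $param"
    else
        new="$param"
    fi
    tmp=$(mktemp) || return 1
    # Rewrite via a temp file, then copy back so file permissions are kept.
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=\".*\"\$|GRUB_CMDLINE_LINUX_DEFAULT=\"$new\"|" \
        "$file" > "$tmp" && cat "$tmp" > "$file" || status=1
    rm -f "$tmp"
    return "$status"
}
```

Trying it on a copy of /etc/default/grub first is the safer route. On the real file it needs root, and the change only takes effect after the grub rebuild command listed above and a reboot.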
Resources / further reading
- Arch Linux wiki — virtualization and VFIO passthrough pages.
- “Pass-through post” and community VFIO single‑GPU passthrough guides (multiple blog posts and GitLab repos, e.g., matoking and other contributors).
- TechPowerUp VGA BIOS database (for VBIOS/ROM downloads).
- The Arch wiki grouping script and VFIO community guides for checking IOMMU groups and binding devices.
Speaker / sponsor
- Presenter: Mutahar (YouTuber who presented this tutorial).
- Sponsor mentioned at start: Raycon (brief product plug: ~6 hours play time, multiple colors, 45‑day returns).
Category
Technology