Installing and Deploying GPU Passthrough
1) Enable IOMMU on the KVM host
    vi /etc/default/grub

    GRUB_TIMEOUT=5
    GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
    GRUB_DEFAULT=saved
    GRUB_DISABLE_SUBMENU=true
    GRUB_TERMINAL_OUTPUT="console"
    GRUB_CMDLINE_LINUX="rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet amd_iommu=on"
    GRUB_DISABLE_RECOVERY="true"
On an AMD CPU, append amd_iommu=on to the end of GRUB_CMDLINE_LINUX. On an Intel CPU, append intel_iommu=on instead.
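The choice of parameter can be derived from the CPU vendor string. The following is a minimal sketch, assuming the vendor_id field of /proc/cpuinfo reads AuthenticAMD or GenuineIntel. The function names iommu_param_for_vendor and cpu_vendor are illustrative and not part of any tool.

```shell
# Map a CPU vendor_id string to the kernel parameter named in the step above.
# Hypothetical helper; prints the parameter or fails for an unknown vendor.
iommu_param_for_vendor() {
    case "$1" in
        AuthenticAMD) echo "amd_iommu=on" ;;
        GenuineIntel) echo "intel_iommu=on" ;;
        *) echo "unsupported CPU vendor: $1" >&2; return 1 ;;
    esac
}

# Print the vendor_id of the first CPU listed in a cpuinfo-style file.
# Defaults to /proc/cpuinfo when no path is given.
cpu_vendor() {
    awk -F': *' '/^vendor_id/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}
```

On the host, iommu_param_for_vendor "$(cpu_vendor)" prints the parameter to add to GRUB_CMDLINE_LINUX. The sketch only prints the value and leaves editing /etc/default/grub to the operator.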
2) Disable the nouveau driver
    vi /etc/modprobe.d/blacklist-nouveau.conf

    blacklist nouveau
    options nouveau modeset=0
3) Regenerate the GRUB configuration and reboot for the changes to take effect
    grub2-mkconfig -o /boot/grub2/grub.cfg
    reboot

Check that the IOMMU is enabled:

    dmesg | grep -E "DMAR|IOMMU"

Check that nouveau is disabled:

    dmesg | grep -i nouveau
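The two checks can be wrapped so that a script can act on their result. This is a rough sketch that mirrors the same grep patterns: both helpers (illustrative names) read kernel log text on standard input. Matching log lines only suggest that the IOMMU was initialized, so treat the result as a hint and not as proof.

```shell
# Succeeds when the log text on stdin contains DMAR or IOMMU messages.
iommu_enabled() {
    grep -qE 'DMAR|IOMMU'
}

# Succeeds when the log text on stdin never mentions nouveau.
nouveau_disabled() {
    ! grep -qi 'nouveau'
}
```

Typical use after the reboot would be dmesg | iommu_enabled && echo "IOMMU messages found", and likewise dmesg | nouveau_disabled.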
4) Load the vfio-pci driver and bind it to the devices
    modprobe vfio-pci

Every device in the GPU's IOMMU group must be added to /etc/modprobe.d/vfio.conf. List the groups and their devices with the following command:

    for iommu_group in $(ls -dv /sys/kernel/iommu_groups/*/); do
        echo "IOMMU group $(basename "$iommu_group")"
        for device in $(ls -1 "$iommu_group"/devices/); do
            echo -n $'\t'
            lspci -nns "$device"
        done
    done

Find the group that contains the GPU, then add the vendor ID and device ID of each device in it to /etc/modprobe.d/vfio.conf:

    ...
    IOMMU group 2
        00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
        00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
        07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1650 SUPER] [10de:2187] (rev a1)
        07:00.1 Audio device [0403]: NVIDIA Corporation TU116 High Definition Audio Controller [10de:1aeb] (rev a1)
        07:00.2 USB controller [0c03]: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:1aec] (rev a1)
        07:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller [10de:1aed] (rev a1)
    ...

    vi /etc/modprobe.d/vfio.conf

    options vfio-pci ids=10de:2187,10de:1aeb,10de:1aec,10de:1aed,1022:1482,1022:1483

Rebuild the initramfs, reboot, and look for vfio messages in the kernel log:

    dracut --force
    reboot
    dmesg | grep -i vfio

Check that the devices are bound to vfio-pci:

    [root@dev /]# lspci -nnk -d 10de:
    07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1650 SUPER] [10de:2187] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3852]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
    07:00.1 Audio device [0403]: NVIDIA Corporation TU116 High Definition Audio Controller [10de:1aeb] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3852]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
    07:00.2 USB controller [0c03]: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:1aec] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3852]
        Kernel driver in use: vfio-pci
    07:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller [10de:1aed] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3852]
        Kernel driver in use: vfio-pci
        Kernel modules: i2c_nvidia_gpu

Some devices may fail to bind automatically and have to be bound by hand. For example, if the USB controller is still claimed by xhci_hcd, run:

    echo -n "0000:07:00.2" > /sys/bus/pci/drivers/xhci_hcd/unbind
    echo -n "0000:07:00.2" > /sys/bus/pci/drivers/vfio-pci/bind
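Copying IDs by hand is error-prone, so both manual parts of this step can be scripted. The sketch below makes two assumptions: the input is lspci -nn style text in which the vendor:device pair is the last bracketed ID on each line, and the sysfs layout matches the paths used above. The names vfio_ids_line and rebind_to_vfio are hypothetical helpers. The optional third argument of rebind_to_vfio exists only so that the function can be tried against a scratch directory instead of the real /sys.

```shell
# Read `lspci -nn` style lines on stdin and print a single
# "options vfio-pci ids=..." line built from the last [vvvv:dddd] on each line.
vfio_ids_line() {
    vfio_ids=$(sed -nE 's/.*\[([0-9a-f]{4}:[0-9a-f]{4})\].*/\1/p' \
        | awk '!seen[$0]++' \
        | paste -sd, -)
    # Fail when no ID was found, so an empty options line is never emitted.
    [ -n "$vfio_ids" ] || return 1
    echo "options vfio-pci ids=$vfio_ids"
}

# Unbind a PCI device from its current driver and bind it to vfio-pci.
# $1: PCI address such as 0000:07:00.2
# $2: driver currently holding the device, such as xhci_hcd
# $3: sysfs root, defaults to /sys (assumption: only overridden for dry runs)
rebind_to_vfio() {
    rebind_root=${3:-/sys}
    printf '%s' "$1" > "$rebind_root/bus/pci/drivers/$2/unbind" &&
    printf '%s' "$1" > "$rebind_root/bus/pci/drivers/vfio-pci/bind"
}
```

For instance, lspci -nn -d 10de: | vfio_ids_line prints a line for the NVIDIA functions only. Any other devices of the group still have to be added by hand, and the output should be reviewed before it is written to /etc/modprobe.d/vfio.conf. Calling rebind_to_vfio 0000:07:00.2 xhci_hcd as root performs the same two writes as the manual commands above.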