Naive Wang's Nowhere Land

My personal blog

Installation Guide of Arch Linux on Laptops for Deep Learning(NVIDIA)

Notice : this tutorial may be deprecated due to later system updates.

中文注: 本教程不会破坏包管理和配置. 一些国内的安装教程自做聪明的不遵循ArchWiki的安装方法配置系统, 导致pacman包管理依赖树被污染(通过下载并编译源码的方式回滚python-3.7, 并安装在系统环境)以及配置文件部分无效(例如fcitx输入法的不恰当配置会导致一些Qt5窗口(telegram-desktop, libreoffice)无法加载出输入框), 规范在国内圈子里是一个被忽视的问题, 如果你是一个来自中文社区的读者, 请多加注意

This guide is for users who want to install Arch Linux with on laptops with NVIDIA GPU for production of deep learning tool-chains; but issues my be encountered as DE lagging while running models on CUDA; nouveau driver raises FIFO errors; and being reluctant to use Manjaro destro instead. This guide gives instructions of installing Arch Linux with sddm/plasma on your laptop configuring your NVIDIA dGPU for CUDA only, and using intel iGPU for Desktop Envirnment without nouveau.

Passed laptop : Dell Precision 7540 (RTX3000, 32Gx3)

1. Preparation

# dd if=<Arch ISO file> of=/dev/<thumb drive device>

2. Install Arch Linux (UEFI)

Assume you do not have a dual boot and do not want it either.

use fdisk/cfdisk to create two partitions, the first one is for EFI with size ranging from 256M~512M and type 1(EFI), the second one is Linux with default type 0, you may want to create a swap while for a modern laptop with big ram and nvme SSD, this is not recommended.

make EFI:

$ mkfs.vfat -F32 /dev/<your first partition>

make linux: there are several file systems available(ext4, f2fs, xfs, btrfs, etc.), but most time ext4 is the best choice except you want you system on a USB flash(f2fs).

$ mkfs.<file system> /dev/<your linux partition>

If your laptop is wired, dhcpcd is running by default, usually network is avaliable.

$ mount /dev/<linux partition> /mnt

choose mirror:

edit /etc/pacman.d/mirrorlist to config the nearst and fastest mirror at your region, move the line of server to top.

start Installation Stage 1:

You can choose different kernel version like linux-lts with nvidia-lts

$ pacstrap /mnt base linux linux-firmware xf86-video-intel nvidia vi vim sudo dhcpcd net-tools grub efibootmgr

$ genfstab -U /mnt » /mnt/etc/fstab

change root to your newly installed Arch:

$ arch-chroot /mnt

config your host name:

$ echo “your host name” » /etc/hostname

set time:

$ ln -sf /usr/share/zoneinfo/You Zone /etc/localtime

blacklist nouveau

$ echo “blacklist nouveau” » /etc/modprobe.d/blacklist.conf

blacklist nvidia-drm module, which conflicts with your wm, but keep nvidia module for CUDA.

$ echo “blacklist nvidia-drm” » /etc/modprobe.d/blacklist.conf

repack kernel

$ mkinitcpio -P

reboot, all done.

Contention

For a laptop user, if nvidia-drm module is disabled, there is nothing to do with nvidia-settings, which is only avaliable while the card is rendering a screen server. As a result, the power management of your nvidia card is without control, for example , RTX3000 with max power at 80W is draining power from 12W to 30W while idling, with comparing to connecting to a X11 server, drains only about 8W while idling. So there is a bit of misleading at ArchWiki, which claims using NVIDIA card costs more energy comparing to using intel HD graphics, actually it does save more power when you totally shutdown your NVIDIA card with acpi call, but who would do that? I have purchased a high performance dGPU over $300 but not even use it just because the dilemma of my system?

There is a way of using a dummy screen for enable the use of nvidia-settings and to fine tune the power and performance, someone did it for overclocking and tuning fan speed on their mining/deeplearning rigs. But this method is not test on this thread, anyway, it is worth to try.