[HowTo] Nvidia hardware transcoding on OMV 5 in a Plex docker container

  • For OMV 6 proceed to the second post: OMV 6


    It is quite tricky to get Plex hardware transcoding working on OMV in a docker container with nvidia graphics cards.

    It took me some weeks to find out all the little details and I want to share it with you.


    My configuration is tested for:

    - Debian 10 Buster (which is OMV 5)

    - OMV 5.6.13-1 Usul

    - Kernel Linux 5.10.0-0.bpo.8-amd64

    - Docker 5:20.10.8~3-0~debian-buster

    - Portainer 2.6.0

    - Nvidia driver 460.73.01

    - Cuda Version 11.2

    - Nvidia Quadro P1000 graphics card

    - Plex 1.24.0.4897


    First of all and very important:

    Hardware transcoding must work on the Intel hardware! There are several tutorials for that. Mainly it is related to access rights, like changing the "noexec" option to "exec" in the fstab.
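
    For example, the relevant fstab line of the drive that holds the Plex config/transcode folder could look like this after the change (just a sketch with a hypothetical UUID and mount point; only the change from "noexec" to "exec" matters here):

    Code
    UUID=0815-abcd  /srv/dev-disk-by-uuid-0815-abcd  ext4  defaults,nofail,exec  0  2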


    Step 1:

    If you have already tried to install the nvidia driver: purge it first!

    Code
    apt-get purge *nvidia*
    
    apt autoremove
    
    apt autoclean


    Step 2:

    prepare your header files. Very important !

    Code
    apt-get install module-assistant
    
    sudo m-a prepare



    Step 3: installing Nvidia Driver

    follow the instructions given by Nvidia:


    NvidiaGraphicsDrivers - Debian Wiki



    in detail:


    Add buster-backports to your /etc/apt/sources.list, for example:

    Code
    # Buster-backports
    deb http://deb.debian.org/debian buster-backports main contrib non-free


    then:


    Code
    apt update
    apt install -t buster-backports nvidia-driver firmware-misc-nonfree


    IMPORTANT:


    Watch the messages during installation. If there are any error messages, something is wrong and you might not have Debian Buster with backports! You have to solve this issue first before proceeding!



    now we have to configure nvidia:

    Code
    apt install -t buster-backports nvidia-xconfig
    
    sudo nvidia-xconfig


    Since Docker 5:20.10.2 (I think) there was a change in how docker gets access to hardware via cgroups. You need this workaround in the kernel boot parameters:


    Code
    echo 'GRUB_CMDLINE_LINUX=systemd.unified_cgroup_hierarchy=false' > /etc/default/grub.d/cgroup.cfg
    
    update-grub


    now reboot and the nvidia driver should already work.
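
    You can verify this right away with nvidia-smi (it gets installed from backports in Step 4 anyway; it should list your card and the driver version):

    Code
    apt install -t buster-backports nvidia-smi
    nvidia-smi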


    Step 4: Install Nvidia container toolkit

    Follow the Installation guide by Nvidia:

    https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#setting-up-nvidia-container-toolkit


    in detail:


    Install curl if you don't have it already:

    Code
    sudo apt install curl



    Set up the stable repository and the GPG key - all in one command (use the copy button):

    If this doesn't work, please check for any changes under this link: nvidia container

    Code
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
          && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
          && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
                sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
                sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list



    Install the nvidia-docker2 package (and dependencies) after updating the package listing:

    Code
    sudo apt-get update
    
    apt install -t buster-backports nvidia-docker2


    Now install Nvidia encode library and nvidia-smi:

    Code
    apt install -t buster-backports libnvidia-encode1
    
    apt install -t buster-backports nvidia-smi


    Step 5: install Nvidia container runtime:

    Code
    apt install -t buster-backports nvidia-container-runtime


    Step 6: some modifications:

    Change/edit the Daemon configuration file

    /etc/docker/daemon.json :


    Code
    {
        "runtimes": {
            "nvidia": {
                "path": "/usr/bin/nvidia-container-runtime",
                "runtimeArgs": []
            }
        },
        "default-runtime": "nvidia",
        "data-root": "/var/lib/docker"
    }


    and the

    /etc/nvidia-container-runtime/config.toml

    to:
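
    The same settings as shown in the OMV 6 section further down should go here (you can leave out the comment lines starting with #):

    Code
    disable-require = false
    accept-nvidia-visible-devices-envvar-when-unprivileged = true

    [nvidia-container-cli]
    environment = []
    load-kmods = true
    ldconfig = "/sbin/ldconfig.real"

    [nvidia-container-runtime]
    #debug = "/var/log/nvidia-container-runtime.log"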



    Restart docker:

    Code
    sudo systemctl restart docker
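
    To check that docker picked up the nvidia runtime, you can look at the docker info output - it should list nvidia among the runtimes and as the default runtime:

    Code
    docker info | grep -i runtime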



    Step 7: starting Plex (in Portainer)

    in Portainer you have to add the following parameters in the "Env" tab:

    name: NVIDIA_DRIVER_CAPABILITIES
    Value: compute,video,utility

    name: NVIDIA_VISIBLE_DEVICES
    Value: all




    and in the tab "Runtime & Resources":

    change the "Runtime" Value from runc to nvidia !!!


    No Privileged mode and no Init set.




    Step 8: trial and error:

    This is how it worked for me. If you want to check operation, you can display the GPU load with:


    Code
    watch -d -n 0.5 nvidia-smi

    (If nvidia-smi is not installed, do apt install -t buster-backports nvidia-smi)


    or install:


    Code
    apt install nvtop


    and use:


    Code
    nvtop



    You can also try/use the docker command line interface (cli) to start Plex:
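
    For example, a minimal sketch of such a docker run command, assuming the linuxserver/plex image (as used in the OMV 6 section below) and hypothetical paths and IDs that you have to adapt to your system:

    Code
    docker run -d \
      --name=plex \
      --runtime=nvidia \
      --network=host \
      -e PUID=1000 -e PGID=1000 \
      -e VERSION=docker \
      -e NVIDIA_VISIBLE_DEVICES=all \
      -e NVIDIA_DRIVER_CAPABILITIES=compute,video,utility \
      -v /path/to/plex/config:/config \
      -v /path/to/media:/data \
      linuxserver/plex:latest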





    Step 9: get rid of the session limit:

    If you want to disable the session limit (my P1000 had a limit of 3) go ahead with this link. It worked for me also:

    GitHub - keylase/nvidia-patch: This patch removes restriction on maximum number of simultaneous NVENC video encoding sessions imposed by Nvidia to consumer-grade GPUs.


    in detail:


    if you don't have git, install it:


    Code
    apt install git

    then

    Code
    git clone https://github.com/keylase/nvidia-patch.git nvidia-patch


    Patch the nvidia driver:

    Code
    cd nvidia-patch
    bash ./patch.sh


    If you want to rollback:

    Code
    bash ./patch.sh -r


    Step 10: Nvidia Power Management:

    You need these modifications if you are using autoshutdown, as the nvidia driver (or the PCI bus of the card?) is falling off when resuming from hibernate or suspend mode!

    You can read more about that in the nvidia documentation:

    Chapter 22. PCI-Express Runtime D3 (RTD3) Power Management

    Chapter 21. Configuring Power Management Support

    Chapter 29. Using the nvidia-persistenced Utility



    First of all you need the dedicated nvidia scripts for power management and you have to find them in the nvidia driver install package:


    Download the nvidia install package from:

    Unix Drivers | NVIDIA


    or for the 460 driver the direct link:

    Linux x64 (AMD64/EM64T) Display Driver | 460.39 | Linux 64-bit | NVIDIA


    Or on the system:

    Code
    wget http://us.download.nvidia.com/XFree86/Linux-x86_64/460.39/NVIDIA-Linux-x86_64-460.39.run


    Now do not install the driver (!) - just extract it:



    Code
    sh NVIDIA-Linux-x86_64-460.39.run --extract-only



    In the next step you have to search for the following files and copy them to the given directories (I used an ssh client for this)


    Code
    /etc/systemd/system/nvidia-suspend.service
    /etc/systemd/system/nvidia-hibernate.service
    /etc/systemd/system/nvidia-resume.service
    /lib/systemd/system-sleep/nvidia
    /usr/bin/nvidia-sleep.sh
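
    As a sketch of how this can look on the shell - assuming the files sit in the systemd/ folder of the extracted NVIDIA-Linux-x86_64-460.39 directory (the layout may differ between driver versions, so locate the files with find first):

    Code
    cd NVIDIA-Linux-x86_64-460.39
    # locate the power management files inside the extracted package
    find . -name 'nvidia-sleep.sh' -o -name 'nvidia-*.service' -o -path '*system-sleep/nvidia'
    # copy them to the target locations listed above
    cp systemd/system/nvidia-suspend.service /etc/systemd/system/
    cp systemd/system/nvidia-hibernate.service /etc/systemd/system/
    cp systemd/system/nvidia-resume.service /etc/systemd/system/
    cp systemd/system-sleep/nvidia /lib/systemd/system-sleep/
    cp systemd/nvidia-sleep.sh /usr/bin/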



    then enable the services:


    Code
    sudo systemctl enable nvidia-suspend.service
    sudo systemctl enable nvidia-hibernate.service
    sudo systemctl enable nvidia-resume.service


    change nvidia kernel config:


    /etc/modprobe.d/nvidia-kernel-common.conf

    Code
    options nvidia NVreg_PreserveVideoMemoryAllocations=1
    options nvidia NVreg_DynamicPowerManagementVideoMemoryThreshold=100
    options nvidia NVreg_DynamicPowerManagement=0x02
    options nvidia NVreg_EnableMSI=0


    now make nvidia-sleep.sh executable and update modules:


    Code
    chmod a+x /usr/bin/nvidia-sleep.sh
    
    sudo update-initramfs -u


    Remark:

    I still use the script files from the 450 driver version and they still work with 460. So I think they are more general and not driver specific. But if you experience problems with hardware transcoding on a new driver version after a resume, maybe you have to extract them fresh from the new driver package.




    And I don't know if this final part is really needed, but if you still have issues with resume from suspend you can try to disable Active State Power Management (ASPM) on PCIe:


    Change/add kernel parameter pcie_aspm=off in /etc/default/grub to:

    Code
    GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off"


    and after changing grub:


    Code
    sudo update-grub
    
    reboot



    That was a long way :)


    Please let me know if there are still issues with this guide. I'll try to keep it up to date...


    Good luck,


    Chris

  • ******************************* OMV6 & OMV 7**************************************


    If you are having issues with the new docker-compose-Plugin, read the comments regarding that issue.


    Tested and running on:

    - Linux Debian 11, Kernel 5.18.0-0.bpo.4-amd64, 5.18.0-0.deb11.4-amd64, 5.19.0-0.deb11.2-amd64, 5.19.17-1-pve, 6.1.15-1-pve, 6.2.16-11-bpo11-pve, 6.2.16-20-pve, 6.5.13-1-pve

    - OMV 6.9.1-1 (Shaitan), 7.0.1-1 (Sandworm)

    - docker-ce 5:25.0.4-1~debian.12~bookworm

    - docker-compose-plugin 2.24.7-1~debian.12~bookworm

    (- Portainer 2.18.2)

    - nvidia driver: 470.199.02, Cuda 11.4, 525.147.05-7~deb12u1 Cuda 12.0

    - Nvidia Quadro P1000 & T600 graphics card

    - Plex 1.40.1.8173


    Not working: nvidia driver 550.54.14, Cuda 12.4


    First of all and very important:

    Transcoding must work on the Intel hardware! There are several tutorials for that. Mainly it is related to access rights, like changing the "noexec" option to "exec" in the fstab, and the correct settings of the Plex docker container.


    Second advice:

    Check if your graphics card supports the codecs your videos are encoded with. If your card is older, hw-transcoding will not work even if the setup is done properly.

    Here is an overview provided by nvidia: Nvidia codec matrix


    And now let's start:

    Step 1:

    If you have already tried to install the nvidia driver: purge it first!

    Code
    apt-get purge *nvidia*
    
    apt autoremove
    
    apt autoclean


    You can also try this method if the above leads to errors:


    Code
    sudo apt-get remove --purge '^nvidia-.*'
    
    apt autoremove
    
    apt autoclean

    Step 2:

    prepare your header files:


    Code
    apt-get install module-assistant
    
    sudo m-a prepare


    Step 3:

    Instructions from nvidia. In detail:


    Add "contrib" and "non-free" components to /etc/apt/sources.list, for example:

    Code
    # Debian Bullseye
    deb http://deb.debian.org/debian/ bullseye main contrib non-free
    deb-src http://deb.debian.org/debian/ bullseye main contrib non-free


    or Bookworm:


    Code
    # Debian Bookworm
    deb http://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware
    deb-src http://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware



    You should check that there aren't any doubled entries! There might be additional other sources, depending on your setup.
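
    One simple way to list all enabled deb entries and spot duplicates (just a sketch):

    Code
    grep -rh '^deb' /etc/apt/sources.list /etc/apt/sources.list.d/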



    Then:


    Code
    apt update
    
    apt install nvidia-driver firmware-misc-nonfree


    Check that there aren't any error messages.


    If there are no errors, proceed:


    Code
    apt install nvidia-xconfig
    
    sudo nvidia-xconfig



    Since Docker 5:20.10.2 (I think) there was a change in how docker gets access to hardware via cgroups. You need this workaround in the kernel boot parameters:


    Do the following:


    Code
    echo 'GRUB_CMDLINE_LINUX=systemd.unified_cgroup_hierarchy=false' > /etc/default/grub.d/cgroup.cfg
    
    
    update-grub


    now reboot and the nvidia driver should already work.


    Check with


    Code
    apt install nvidia-smi
    
    nvidia-smi


    It should show your nvidia card and driver.

    Step 4: Install Nvidia container toolkit

    Follow the Installation guide by Nvidia:

    Installation Guide — NVIDIA Cloud Native Technologies documentation


    in detail:


    Install curl if you don't have it already:


    Code
    sudo apt install curl



    Set up the stable repository and the GPG key - all in one command (use the copy button):

    If this doesn't work, please check for any changes under this link: nvidia container


    Code
    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
        sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
        sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
    Code
    sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list


    Install the nvidia-docker2 package (and dependencies) after updating the package listing:


    Code
    sudo apt-get update
    
    apt install nvidia-docker2


    Now install Nvidia encode library and nvidia-smi:


    Code
    apt install libnvidia-encode1


    Step 5: install Nvidia container runtime and configure it:

    (the configuration tool seems to be new. I've never used it before)

    Code
    sudo apt-get install -y nvidia-container-toolkit
    
    sudo nvidia-ctk runtime configure --runtime=docker
    
    sudo systemctl restart docker

    *******************************************************************************

    Important Comment if you are using the compose plugin:


    With the new docker compose plugin, every time you reconfigure the OMV settings, the plugin overwrites the daemon.json from Step 6.


    !!! To prevent this you have to go to /#/services/compose/settings in the OMV GUI and leave the entry for "Docker Storage" blank !!!
    (standard setting is /var/lib/docker)



    If overwriting happened, repeat the commands:


    Code
    sudo nvidia-ctk runtime configure --runtime=docker
    
    sudo systemctl restart docker

    ***********************************************************************************


    Step 6: some modifications:

    a.) Configuration for use without compose-Plugin:

    Change/edit the Daemon configuration file   /etc/docker/daemon.json :


    Code
    {
        "runtimes": {
            "nvidia": {
                "path": "/usr/bin/nvidia-container-runtime",
                "runtimeArgs": []
            }
        },
        "default-runtime": "nvidia",
        "data-root": "/var/lib/docker"
    }


    b.) Configuration for use with compose-Plugin:


    Code
    {
        "data-root": "/var/lib/docker",
        "runtimes": {
            "nvidia": {
                "args": [],
                "path": "nvidia-container-runtime"
            }
        }
    }


    c.) I haven't been able to figure out yet whether both configs are needed or whether either one works for both setups. Maybe it is not even necessary to manually edit the daemon.json anymore. I'll keep you updated -> it seems that both configs work the same. I'll keep both in the guide for a while and delete one later.



    and /etc/nvidia-container-runtime/config.toml to: (you can leave out the comment lines starting with #)


    Code
    disable-require = false
    accept-nvidia-visible-devices-envvar-when-unprivileged = true
    
    [nvidia-container-cli]
    environment = []
    load-kmods = true
    ldconfig = "/sbin/ldconfig.real"
    
    [nvidia-container-runtime]
    #debug = "/var/log/nvidia-container-runtime.log"



    Restart docker:


    Code
    sudo systemctl restart docker


    Step 7a: starting Plex (in Portainer)

    (Docker image: linuxserver/plex:latest on Docker Hub)

    in Portainer you have to add the following parameters in the "Env" tab:

    name: NVIDIA_DRIVER_CAPABILITIES
    Value: compute,video,utility

    name: NVIDIA_VISIBLE_DEVICES
    Value: all

    name: VERSION
    Value: plexpass


    and in the tab "Runtime & Resources":

    change the "Runtime" Value from runc to nvidia !!!



    No Privileged mode and no Init set.


    Check that you have set up the PUID and PGID values properly and changed the ownership of the Plex config directory to that user! (also needed for Intel HW transcoding)
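
    For example (a sketch with a hypothetical config path, assuming PUID/PGID 1000):

    Code
    chown -R 1000:1000 /path/to/plex/config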


    Step 7b: starting Plex with compose:

    Go to the compose-plugin and create a new compose-file under the files-section:
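
    A minimal sketch of such a compose file - assuming the linuxserver/plex image and hypothetical paths and IDs that you have to adapt; the environment variables and the nvidia runtime correspond to the Portainer settings from Step 7a:

    Code
    services:
      plex:
        image: linuxserver/plex:latest
        container_name: plex
        runtime: nvidia
        network_mode: host
        environment:
          - PUID=1000
          - PGID=1000
          - VERSION=plexpass
          - NVIDIA_VISIBLE_DEVICES=all
          - NVIDIA_DRIVER_CAPABILITIES=compute,video,utility
        volumes:
          - /path/to/plex/config:/config
          - /path/to/media:/data
        restart: unless-stopped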


    If you are getting a runtime error when starting the container, check if docker is running with nvidia:

    Code
    sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi


    If it is not starting, repeat:


    Code
    sudo nvidia-ctk runtime configure --runtime=docker
    
    sudo systemctl restart docker


    Check that you have set up the PUID and PGID values properly and changed the ownership of the Plex config directory to that user! (also needed for Intel HW transcoding)



    Step 8: check the box

    This is how it worked for me. If you want to check operation, you can display the GPU load with:


    Code
    watch -d -n 0.5 nvidia-smi


    or install:


    Code
    apt install nvtop


    and use:


    Code
    nvtop



    You can also try/use the docker command line interface (cli) to start Plex (see the docker run example in the OMV 5 post above).





    Step 9: get rid of the session limit:

    If you want to disable the session limit (my P1000 had a limit of 3) go ahead with this link. It worked for me also:

    GitHub - keylase/nvidia-patch: This patch removes restriction on maximum number of simultaneous NVENC video encoding sessions imposed by Nvidia to consumer-grade GPUs.


    in detail:


    if you don't have git, install it:


    Code
    apt install git

    then

    Code
    git clone https://github.com/keylase/nvidia-patch.git nvidia-patch


    Patch the nvidia driver:

    Code
    cd nvidia-patch
    bash ./patch.sh


    If you want to rollback:

    Code
    bash ./patch.sh -r


    Please let me know if there are still issues with this guide. I'll try to keep it up to date...


    Good luck,


    Chris

  • Step 10: Nvidia Power Management:

    This issue might be solved in current drivers. But if you have problems with a "fallen off" graphics card after a suspend or hibernate (no hw-transcoding after suspend/hibernate) you can try this.

    You need these modifications if you are using autoshutdown, as the nvidia driver (or the PCI bus of the card?) is falling off when resuming from hibernate or suspend mode!

    You can read more about that in the nvidia documentation:

    http://us.download.nvidia.com/…namicpowermanagement.html

    http://us.download.nvidia.com/…ADME/powermanagement.html

    http://us.download.nvidia.com/…/nvidia-persistenced.html



    First of all you need the dedicated nvidia scripts for power management and you have to find them in the nvidia driver install package:


    Download the nvidia install package from:

    Unix Drivers | NVIDIA


    or for the 470 driver the direct link:

    470.103.01


    Or on the system:

    Code
    wget http://us.download.nvidia.com/XFree86/Linux-x86_64/470.103.01/NVIDIA-Linux-x86_64-470.103.01.run


    Now do not install the driver (!) - just extract it:


    Code
    sh NVIDIA-Linux-x86_64-470.103.01.run --extract-only



    In the next step you have to search for the following files and copy them to the given directories (I used an ssh client for this)


    Code
    /etc/systemd/system/nvidia-suspend.service
    /etc/systemd/system/nvidia-hibernate.service
    /etc/systemd/system/nvidia-resume.service
    /lib/systemd/system-sleep/nvidia
    /usr/bin/nvidia-sleep.sh



    then enable the services:


    Code
    sudo systemctl enable nvidia-suspend.service
    sudo systemctl enable nvidia-hibernate.service
    sudo systemctl enable nvidia-resume.service


    change nvidia kernel config /etc/modprobe.d/nvidia-kernel-common.conf:


    Code
    options nvidia NVreg_PreserveVideoMemoryAllocations=1
    options nvidia NVreg_DynamicPowerManagementVideoMemoryThreshold=100
    options nvidia NVreg_DynamicPowerManagement=0x02
    options nvidia NVreg_EnableMSI=0


    now make nvidia-sleep.sh executable and update modules:


    Code
    chmod a+x /usr/bin/nvidia-sleep.sh
    
    sudo update-initramfs -u


    Remark:

    I still use the script files from the 450 driver version and they still work with 460. So I think they are more general and not driver specific. But if you experience problems with hardware transcoding on a new driver version after a resume, maybe you have to extract them fresh from the new driver package.




    And I don't know if this final part is really needed, but if you still have issues with resume from suspend you can try to disable Active State Power Management (ASPM) on PCIe:


    Change/add kernel parameter pcie_aspm=off in /etc/default/grub to:

    Code
    GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off"


    and after changing grub:


    Code
    sudo update-grub
    
    reboot
