So I’m no expert on passing the Intel GPU through, but based on the “Read-only file system” error, my guess would be that the permissions aren’t being set to allow the container to write through to the device.
Hi, I love your guide, but I ran into a problem after finishing all the steps without any failures.
I can see my GTX 960 in the LXC (Ubuntu) and nvidia-smi works fine in there. But when I play a movie or series and it starts transcoding with HW acceleration, the GPU does nothing (well, the memory fills up a little, but GPU utilization stays around 0-1%).
I know the GPU can handle >20 simultaneous 1080p → 240p transcodes, because I had it working before, but after upgrading my machine I can’t remember the steps I took back then …
I checked on the host as well as in the LXC.
Here are some pictures showing that HW acceleration is (more or less) only working on two streams. While going through your guide, the patch script also reported “Patched!”.
Yesterday I also redid all the steps, but got the same result.
@evomod It’s perfectly normal for the nvidia-smi utility to show 0% GPU activity while transcoding. It is only showing you a point-in-time snapshot of GPU activity. And since the transcoder does not keep a sustained load running on the GPU (because the GPU can transcode several times faster than real time, which is the point), the GPU/hardware transcoder will go idle after it has filled a buffer that matches your ‘Transcoder default throttle buffer’ setting, which typically takes only a few seconds. Then, after some percentage of that buffer is depleted by playback, it will burst again to refill it. If you’d like to watch GPU activity as you start a transcoded stream to see this in action, use this command:
watch -n 2 -d -c 'nvidia-smi -q -d UTILIZATION'
As far as load goes, think about it: if your GPU can handle 20 simultaneous transcodes and you’re only running 2, it’s going to be idle AT LEAST 90% of the time.
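If you just want the utilization percentages out of that query (for logging, say), a grep like this filters them down. The sample input below is illustrative, not real output from this card; with a real GPU you’d pipe `nvidia-smi -q -d UTILIZATION` into the same grep:

```shell
# Keep only the "Gpu" and "Memory" percentage lines from the
# `nvidia-smi -q -d UTILIZATION` report. Sample input is hypothetical.
grep -E '^[[:space:]]+(Gpu|Memory)[[:space:]]+:' <<'EOF'
    Utilization
        Gpu                               : 3 %
        Memory                            : 1 %
EOF
```

You should see brief spikes in those numbers each time the transcoder bursts to refill its buffer.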
The instructions @constiens gave were 100% right.
I can’t say why, but I made a mistake with
ls -l /dev/nv*
The output was “195” for nvidia-modeset and “236” for nvidia-uvm.
But I typed 234 instead of 236 into the LXC config.
I thought I had checked everything twice, but forgot about this command.
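Since the mix-up came from reading the major numbers by hand, a small sketch like this can generate the allow lines straight from the `ls -l /dev/nv*` output instead. The function name and the sample listing here are illustrative, not taken from the thread; on a real host you’d pipe `ls -l /dev/nv*` into the function:

```shell
# Hypothetical helper: read `ls -l` output on stdin, pull the major number
# (field 5, with the trailing comma stripped) from each character device,
# and print one cgroup allow line per unique major.
majors_to_allow() {
  awk '$1 ~ /^c/ { gsub(",", "", $5); print $5 }' | sort -un |
    while read -r major; do
      echo "lxc.cgroup.devices.allow: c ${major}:* rwm"
    done
}

# Example with sample `ls -l /dev/nv*` output (dates/minors made up):
majors_to_allow <<'EOF'
crw-rw-rw- 1 root root 195,   0 Jan  1 00:00 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Jan  1 00:00 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Jan  1 00:00 /dev/nvidia-modeset
crw-rw-rw- 1 root root 236,   0 Jan  1 00:00 /dev/nvidia-uvm
EOF
```

That prints one allow line for major 195 and one for 236, ready to paste into the container config, so there’s no chance of fat-fingering a 234 for a 236.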
@evomod Very cool. I’ve actually had that exact same issue occur on my Proxmox host. Interesting that it showed you it was kind of working. For me it was after a kernel update on the host, and I had to reinstall the nvidia driver.
But for me, the nvidia device ID does not always stay the same. I’m not sure whether it relates to the driver version or the order in which devices are initialized at boot. I’ve seen my GPU bounce between 236 and 237, so my solution has been to put all the device IDs the GPU has shown up as into the LXC config file and leave them all in simultaneously.
example:
lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 236:* rwm
lxc.cgroup.devices.allow: c 237:* rwm
I’ve not seen other devices use those IDs, so this config doesn’t seem to interfere with anything else.
That could have happened to me too.
I thought 234 was correct, so I didn’t look at this again. After a restart, hardware transcoding stopped working because of the change in numbers. I will also add some more numbers to the config file, because in a few weeks I won’t remember this and will hit the same issue again … I never had this problem with Proxmox 5 and my old config, strange.
This is interesting, because I just did the same thing on mine too: included all the numbers I’ve seen. I’m traveling right now, but I’ll update the guide with a note about this suggestion when I get back.
Mine typically ping-pongs between two different IDs.