Random freezing on GNOME with AMDGPU
CToID
2024-07-02 19:40:03 UTC
Hello folks,

I wonder whether any of you using an AMD GPU (especially a newer one)
have encountered the same problem I have.

My problem is that sometimes the screen just freezes entirely, and I
have to switch to another TTY and back in order to get it unstuck. But
the same freeze usually happens again after I get things unstuck.
Restarting my PC doesn't fix the problem.

Here is some information about my system:

- CPU/GPU: AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics
- vainfo output: in attachment `vainfo-output`
- system log: in attachment `sys-log`

I did find a workaround online: setting the `amdgpu.vm_update_mode=3`
kernel parameter seems to fix the problem.
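
For readers who want to try the same workaround, here is a minimal sketch
of one way to make the parameter persistent. It assumes a GRUB-based Debian
install where the kernel command line is set in `GRUB_CMDLINE_LINUX_DEFAULT`
in `/etc/default/grub`; the helper name `add_kernel_param` is made up for
illustration, and a system with a different boot loader needs a different
method.

```shell
#!/bin/sh
# Sketch (assumption: GRUB with GNU sed, as on a default Debian install).
# add_kernel_param is a hypothetical helper, not part of any package.
add_kernel_param() {
    file=$1
    param=$2
    # Do nothing if the parameter is already on the command line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*${param}" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${param}\"/" "$file"
}

# Try it on a scratch copy first (no root needed):
#   cp /etc/default/grub /tmp/grub.test
#   add_kernel_param /tmp/grub.test amdgpu.vm_update_mode=3
# To apply for real (as root): edit /etc/default/grub the same way,
# run update-grub, reboot, then check the value the driver sees:
#   cat /sys/module/amdgpu/parameters/vm_update_mode
```

Working on a copy first makes it easy to diff the result against the
original before touching the real boot configuration.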

I never had this kind of problem on the Linux distros I used before Debian
(Void, OpenSUSE), so if anyone has an idea why this happens, please
give me some advice. Thanks!
--
Best,

ID
Greg Marks
2024-07-03 05:10:01 UTC
My problem is that sometimes the screen just freezes entirely, and I have to
switch to another TTY and back in order to get it unstuck. But the same
freeze will usually happen again after I get the things unstuck. Restart my
PC doesn't fix the problem.
I have a similar problem (with KDE). I upgraded from Debian 10 to
Debian 12.5. Then I discovered that the NVidia driver for my old GF108
graphics card is no longer available.
...
I updated another computer with an NVidia Quadro graphics card. NVidia
says the Debian nvidia-driver package works -- but it's not part of the
default net-install, and apt-get refuses to install it. And it refuses
to install the nvidia-tesla drivers. I gave up on compiling the drivers
I downloaded directly from NVidia, for the same reasons. And a laptop.
So they're running nouveau too (and occasionally freezing).
Does nouveau cause the freezes?
I'm not sure if this is related, but a couple years ago I had multiple
computer freezes possibly caused by nouveau. The screen froze; the
keyboard and mouse were unresponsive. (If I remember correctly, the
mouse pointer could be moved around on the screen, but it wouldn't
do anything.) Sometimes I could ssh to the machine and run certain
commands; often I would have to hold down the power button to force a
shutdown and then start it back up (which produces lots of orphaned
inodes until one does a clean reboot). In the logs, the kernel
would report "general protection fault, probably for non-canonical
address...," with a call trace including lines "nouveau_fence_new..."
and "nouveau_gem_ioctl_pushbuf..." and "nouveau_gem_ioctl_new..." and
"nouveau_drm_ioctl..." In my case, the GPU was a PNY Quadro K620.
Similar issues seem to have been reported here:

https://forums.debian.net/viewtopic.php?t=151941

and here:

https://gitlab.freedesktop.org/drm/amd/-/issues/1585
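
To check whether another machine shows the same signature, a sketch like
the following filters a kernel log for the messages quoted above. The
function name `find_gpu_faults` is hypothetical; `journalctl -k -b -1`
reads the kernel messages of the previous boot and only works if the
journal is kept across reboots.

```shell
#!/bin/sh
# Sketch: keep only the lines of a kernel log (read from stdin) that
# mention a general protection fault or one of the nouveau functions
# named in the call trace above. find_gpu_faults is a made-up name.
find_gpu_faults() {
    grep -E 'general protection fault|nouveau_(fence_new|gem_ioctl_pushbuf|gem_ioctl_new|drm_ioctl)'
}

# Usage, after a forced reboot (assumes a persistent journal):
#   journalctl -k -b -1 --no-pager | find_gpu_faults
# Or against a saved log file:
#   find_gpu_faults < saved-kernel.log
```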

Whether these issues are connected with the OP's problem, and whether
the crashes have to do with nouveau or bad RAM or something else, I
don't know. My "solution" was to stop using software (e.g. certain Web
browsers) that seemed to have been running at the time of the crashes.

Best regards,
Greg Marks
Richard
2024-07-03 11:40:01 UTC
Have the same issue, though it's pretty much impossible to reproduce it
reliably. But it seems to be a general issue with the AMDGPU driver in
Linux 6.1+:
https://bbs.archlinux.org/viewtopic.php?id=292673&p=2

It also seems to already have an official tracker:
https://gitlab.freedesktop.org/drm/amd/-/issues/3163

Best
Richard
George at Clug
2024-07-03 15:00:01 UTC
If it is of any interest...

Two of us are using AMDGPU with a Radeon RX 7700 (one computer) and RX 6600 (four computers). Two computers run XFCE, the others KDE (no GNOME).

The Radeon RX 7700 machine runs Arch Linux, the others Debian 12 (Bookworm), all kept up to date.

A while ago we had a few lockups, which we put down to what we were doing, such as having too many tabs open in Firefox while gaming at 100% CPU utilisation.

However, now that we are reducing the number of open tabs, we have not experienced a lockup for some time. It may have been caused by some of the pages that were connected; issues seemed to happen when certain web sites were opened. It happened so infrequently that it was hard to find a pattern.

I am always concerned about running X11 programs under a Wayland GUI, particularly when screen sharing an X11 game or using OBS Studio, which is also what we were doing.

Since we are no longer experiencing lockups, I wonder if your system has become stable, and if you have tried limiting which software you run concurrently, particularly mixing X11 and Wayland-native programs that interact with each other. I can only imagine the hoops the software is going through; I am amazed it all works.
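
A rough sketch of how one might see which programs are going through the X11 compatibility layer in a Wayland session is below. The function names are made up for illustration; `XDG_SESSION_TYPE` is set by the login session, and `xlsclients` is an X11 utility (in Debian's x11-utils package, as far as I know) that may not be installed.

```shell
#!/bin/sh
# Sketch: report the session type and list X11 clients.
# session_type and list_x11_clients are hypothetical helper names.
session_type() {
    # Typical values are "wayland", "x11" or "tty".
    printf '%s\n' "${XDG_SESSION_TYPE:-unknown}"
}

list_x11_clients() {
    # xlsclients lists clients of the X server named by $DISPLAY;
    # in a Wayland session that server is XWayland.
    if command -v xlsclients >/dev/null 2>&1; then
        xlsclients
    else
        echo "xlsclients not installed" >&2
        return 1
    fi
}

# Usage:
#   session_type        # e.g. prints "wayland"
#   list_x11_clients    # programs currently running through XWayland
```

Comparing that list before and after a lockup might help narrow down whether a particular X11 program is involved.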

George.
Richard
2024-07-03 21:20:01 UTC
I don't even know if I can answer that. Since Debian's firmware is ancient
even in sid, I'm using the files from kernel.org, so old firmware, which the
GitLab entry suggests as the cause, can't really be the issue for me. My
problem was always that it only happened when I least expected it; it was
never reproducible by any means. With the last few firmware updates the
situation has probably improved, but I can't tell for sure whether the issue
is fully fixed. The last logs I saved are from May, but it has certainly
happened since then. The question is whether it has happened since I
installed the firmware files from the June drop, and to be honest I just
can't tell. It happened quite regularly with the March or April firmware
and has gotten much better since. But I'd wait another month or two to make
sure it's actually gone.

Best
Post by George at Clug
Since we are no longer experiencing lockups, I wonder if your system has
become stable, and if you have tried limiting what concurrent software you
are using, particularly mixing X11 and Wayland native programs that are
interacting with each other. I can only imagine the hoops the software is
going through, I am amazed it all works.
Franco Martelli
2024-07-03 13:40:01 UTC
I updated another computer with an NVidia Quadro graphics card. NVidia
says the Debian nvidia-driver package works -- but it's not part of the
default net-install, and apt-get refuses to install it. And it refuses
to install the nvidia-tesla drivers. I gave up on compiling the drivers
I downloaded directly from NVidia, for the same reasons. And a laptop.
So they're running nouveau too (and occasionally freezing).
Does nouveau cause the freezes?
Yes, frequently, with KDE's compositor.
I solved it by disabling the KDE compositor in System Settings and
installing "Picom". Picom itself now crashes on rare occasions, but I can
restart it from the console with the command:

~$ picom -b --config ~/.config/picom.conf

At least the system doesn't freeze anymore.

Cheers
--
Franco Martelli
Charlie Gibbs
2024-07-03 19:40:01 UTC
Post by Greg Marks
I'm not sure if this is related, but a couple years ago I had multiple
computer freezes possibly caused by nouveau. The screen froze; the
keyboard and mouse were unresponsive. (If I remember correctly, the
mouse pointer could be moved around on the screen, but it wouldn't
do anything.) Sometimes I could ssh to the machine and run certain
commands; often I would have to hold down the power button to force
a shutdown and then start it back up (which produces lots of orphaned
inodes until one does a clean reboot).
<snip>
Post by Greg Marks
Whether these issues are connected with the OP's problem, and whether
the crashes have to do with nouveau or bad RAM or something else, I
don't know. My "solution" was to stop using software (e.g. certain
Web browsers) that seemed to have been running at the time of the
crashes.
My solution was to stop using nouveau.
--
***@surfnaked.ca (Charlie Gibbs)