Hi!
I’ve already posted in the Arch Linux community on lemmy.ml but I’m also posting it here for additional visibility. I’d cross-post it but I don’t think PieFed has that option yet. Hopefully it’s okay.
Anyway, a few hours ago, I turned on my computer, chose “Arch Linux” from the list of boot entries in systemd-boot, and the system got stuck at boot, as seen in the image I uploaded.
So far, I’ve tried disabling Overdrive by editing the kernel parameters at boot, and I’ve tried booting an Arch Linux live ISO, both to no avail. The boot gets stuck at the same stage even with the live ISO, which means I can’t really boot into the system.
This happened before, a few months ago. Back then, I either booted with a live ISO and executed mkinitcpio -P, or just did a hard reset, while I waited for a kernel, GPU driver or mesa update. About a month ago, it stopped happening and the system booted fine, until today. I don’t really know what fixed it, sorry.
I’m at a loss as to what to do aside from either reinstalling Arch Linux or installing a different distro. I really don’t want to do that, though, as I haven’t really done any backups of my config files, and I’m generally happy with how I’ve set up my system. The fact that the live ISO didn’t work either also made me think of a hardware problem, namely the GPU, which complicates things even more, as I don’t have a spare one.
Some information about my hardware:
- GPU: Radeon RX Vega 56
- Motherboard: ASUS Prime X470-Pro
- CPU: AMD Ryzen 7 2700X
I ran pacman -Syu (as root) last night, so everything is up to date. Not sure how relevant this is, but I’m using the open-source Radeon drivers.
Hopefully all of this was somewhat clear and if there’s something I missed, please let me know.
Thanks in advance!
EDIT: Moved the GPU to a different PCIe slot and everything’s working fine so far. I’m not celebrating just yet, because when this first happened a few months ago, a hard reset would get everything working again, but if I shut the PC down and left it off for about 12 hours before powering it on again, the problem would reappear. So I’m basically just waiting for tomorrow now.
PieFed isn’t letting me edit the OP due to an unexpected error. The errors keep piling up, haha!
Just wanted to thank all of you wonderful people for all the help you’ve given me. I love each and every one of you (even the ones who skimmed through my post :p). A user on the other thread I created in the Arch Linux community suggested I add the nomodeset kernel parameter, with which I managed to boot into the system. I updated the system and installed linux-lts along with linux-lts-headers. Then I adjusted /boot/loader/entries/arch_linux.conf to switch to the LTS kernel by default and rebooted the PC. Unfortunately, that didn’t work, but I got logs! Here’s the relevant part, I think:

mai 03 11:04:23 arch kernel: amdgpu: [powerplay] Failed message: 0xe, input parameter: 0x0, error code: 0xffffffff
mai 03 11:04:23 arch kernel: amdgpu: [powerplay] Failed message: 0x4, input parameter: 0x2000000, error code: 0xffffffff
mai 03 11:04:23 arch kernel: [drm:resource_construct [amdgpu]] *ERROR* DC: unexpected audio fuse!
mai 03 11:04:23 arch kernel: [drm] Display Core v3.2.316 initialized on DCE 12.0
mai 03 11:04:23 arch kernel: amdgpu 0000:0a:00.0: [drm] *ERROR* No EDID read.
mai 03 11:04:23 arch kernel: amdgpu 0000:0a:00.0: [drm] *ERROR* No EDID read.
mai 03 11:04:23 arch kernel: amdgpu 0000:0a:00.0: [drm] *ERROR* No EDID read.
mai 03 11:04:23 arch kernel: amdgpu 0000:0a:00.0: [drm] *ERROR* No EDID read.
mai 03 11:04:23 arch kernel: [drm] Timeout wait for RLC serdes 0,0
mai 03 11:04:23 arch kernel: [drm] kiq ring mec 2 pipe 1 q 0
mai 03 11:04:23 arch kernel: amdgpu 0000:0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110)
mai 03 11:04:23 arch kernel: [drm:amdgpu_gfx_enable_kcq [amdgpu]] *ERROR* KCQ enable failed
mai 03 11:04:23 arch kernel: [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <gfx_v9_0> failed -110
mai 03 11:04:23 arch kernel: amdgpu 0000:0a:00.0: amdgpu: amdgpu_device_ip_init failed
mai 03 11:04:23 arch kernel: amdgpu 0000:0a:00.0: amdgpu: Fatal error during GPU init
mai 03 11:04:23 arch kernel: amdgpu 0000:0a:00.0: amdgpu: amdgpu: finishing device.
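For anyone who wants to pull similar lines out of their own journal, here is a rough sketch of one way to do it. The helper name amdgpu_errors is made up for illustration, the keyword list is just a guess based on the lines above, and reading the previous boot with journalctl assumes the journal is stored persistently.

```shell
# Hypothetical helper (not part of any package): reads kernel log text on
# stdin and keeps only amdgpu/drm lines that look like errors or failures.
# The keyword list is an assumption based on the log excerpt in this post.
amdgpu_errors() {
    grep -E 'amdgpu|\[drm' | grep -E '\*ERROR\*|[Ff]ailed|Fatal|Timeout'
}

# Possible usage on a systemd machine with a persistent journal
# (kernel messages from the previous boot):
#   journalctl -k -b -1 --no-pager | amdgpu_errors
```

The first grep narrows the log to the graphics driver, and the second keeps only the lines that report a problem, so informational lines such as the Display Core version are dropped.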
I did a search and, judging by the ring errors, it seems like it’s the GPU’s fault. I think. I remembered I have an old NVIDIA GPU lying around, so I’m going to try reseating the current GPU and, if that doesn’t work, try the old one. Not sure if I have to uninstall the AMD drivers or if it’s OK to have both the AMD and NVIDIA drivers installed. If that doesn’t work either, I’m going to go through all the other suggestions y’all gave me to try and pinpoint the problem.
Again, thank you so much!