New ZimaOS install crashing daily

New ZimaOS install on an Aoostar R1 (N150) runs for about a day and then locks up. A display is connected to the unit. When it locks up, the display goes blank, the system does not respond to pings, and the web interface is inaccessible. I have to power cycle the unit to return to normal operation. All of the sleep-related settings I can find in UEFI setup are disabled. The unit has a static IP assigned.
Any ideas on where to start with this?


Grok claims it's likely an i915 GPU lockup issue.

It seems there is a known issue with i915 power saving causing lockups in the Debian kernel, but the usual fixes for it don't seem to work in ZimaOS.

It seems likely that what I need to figure out is how to set i915.enable_psr=0, i915.enable_dc=0 and i915.enable_fbc=0.
This is usually done on the kernel command line via GRUB, but the GRUB layout here seems to be nonstandard and not editable with the normal Debian methods from the terminal.

I’ve had (and am still having) crashes for various reasons; some are fixed by updates, and others aren’t. One I had recently was with the network: the card would overload, drop the connection, and crash the system. I had to find a backup Wi-Fi card, and that solved it. But what I really want to say is that usually the problem is on the computer itself, with something interfering with the network. I don’t know if this is very helpful.

The computer itself seemed pretty stable for several days running Unraid while I was preclearing some drives, and it immediately started crashing daily upon installing ZimaOS, so I'm leaning towards a software issue at the moment. It also seems there is a documented problem with the drivers used on this device and my hardware. At this point I just need to figure out how to turn off the power-saving features for the i915 so I can see if that fixes the issue.

I think I did discover where to edit the settings for the i915 power-saving features.
It appears to be /mnt/boot/cmdline.txt. I appended the parameters to it, and the line now reads:

console=tty1 quiet splash loglevel=3 systemd.show_status=1 rd.udev.log_level=3 net.ifnames=0 biosdevname=0 intel_iommu=on iommu.strict=0 iommu.passthrough=0 vfio_iommu_type1.allow_unsafe_interrupts=1 thunderbolt.host_reset=false nvme_core.default_ps_max_latency_us=0 i915.enable_psr=0 i915.enable_fbc=0 i915.enable_dc=0 i915.enable_guc=0

From what I can see, this adjusts the i915 power-saving features and the NVMe deep power states. I have my fingers crossed that this stops my freezing issues.
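
For anyone repeating this edit, here is a rough Python sketch of the idea: merge the extra parameters into a one-line kernel command line without duplicating any that are already there. The function name `merge_kernel_params` is made up for illustration, and it assumes the file holds a single space-separated line with no quoted values. It only works on strings, so back up /mnt/boot/cmdline.txt yourself before writing anything back.

```python
def merge_kernel_params(cmdline: str, extra: list[str]) -> str:
    """Return cmdline with each parameter in `extra` set exactly once.

    A token whose key (the part before '=') matches an extra parameter
    is replaced in place; extra parameters that were not already present
    are appended at the end. Other tokens are left untouched.
    """
    def key(token: str) -> str:
        return token.split("=", 1)[0]

    wanted = {key(p): p for p in extra}
    merged = []
    seen = set()
    for token in cmdline.split():
        k = key(token)
        if k in wanted:
            if k not in seen:  # drop later duplicates of the same key
                merged.append(wanted[k])
                seen.add(k)
        else:
            merged.append(token)
    merged += [p for k, p in wanted.items() if k not in seen]
    return " ".join(merged)


# Parameters taken from the command line posted above.
EXTRA = [
    "nvme_core.default_ps_max_latency_us=0",
    "i915.enable_psr=0",
    "i915.enable_fbc=0",
    "i915.enable_dc=0",
    "i915.enable_guc=0",
]

if __name__ == "__main__":
    example = "console=tty1 quiet splash i915.enable_psr=1"
    print(merge_kernel_params(example, EXTRA))
```

Running it a second time on its own output changes nothing, which is the main reason to merge rather than blindly append.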

According to some reports, disabling the search function on certain machines can improve stability.

Please see:
https://community.zimaspace.com/t/system-becomes-unresponsive/6384/2?u=zima-giorgio

It’s hard to tell without collecting the actual dump logs and conclusively identifying which package, module or driver caused the exception. Without an actual frame of reference, you could end up having to try tweaking every possible OS and/or ZimaOS configuration, which can scale ad infinitum.

Just food for thought there in case any of those i915 or ZimaOS search settings don’t result in the expected behavior.

In my opinion you’re on the right track. The freezes you described match exactly what happens when the i915 driver or the NVMe controller drops into a deep power state and never wakes up.

I think the parameters you added are correct. Disabling PSR, FBC, and DC for the i915, plus forcing the NVMe into a safe latency profile, often stops these 24-hour lockups on N100/N150 devices.
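
To confirm after a reboot that the parameters actually took effect, one option is to read them back from sysfs. This is only a sketch: `read_module_params` is a name I made up, it assumes the usual /sys/module/<module>/parameters/ layout, and some of these files are readable only by root.

```python
from pathlib import Path
from typing import Dict, List, Optional


def read_module_params(module: str, names: List[str],
                       sysfs_root: str = "/sys/module") -> Dict[str, Optional[str]]:
    """Read current values of kernel module parameters from sysfs.

    Returns None for a parameter that is missing or not readable.
    """
    values: Dict[str, Optional[str]] = {}
    for name in names:
        path = Path(sysfs_root) / module / "parameters" / name
        try:
            values[name] = path.read_text().strip()
        except OSError:
            values[name] = None
    return values


if __name__ == "__main__":
    # The command line the running kernel was actually booted with.
    print(Path("/proc/cmdline").read_text().strip())
    print(read_module_params(
        "i915", ["enable_psr", "enable_fbc", "enable_dc", "enable_guc"]))
    print(read_module_params("nvme_core", ["default_ps_max_latency_us"]))
```

If /proc/cmdline does not show the new parameters, the edit to cmdline.txt was not picked up at boot.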

I suggest giving it 24–48 hours and seeing if the system remains stable. If it does, then it confirms the issue was power-saving features in the GPU or NVMe subsystem. If it still freezes, I think the next place to look would be ASPM or CPU C-states in BIOS.
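
Before going into the BIOS, the kernel's current ASPM policy can be checked from the OS. A small sketch, assuming the standard /sys/module/pcie_aspm/parameters/policy file, where the active policy is shown in square brackets; the function name is hypothetical.

```python
from pathlib import Path
from typing import List, Optional, Tuple

ASPM_POLICY_FILE = "/sys/module/pcie_aspm/parameters/policy"


def parse_aspm_policy(text: str) -> Tuple[Optional[str], List[str]]:
    """Split the sysfs policy string into (active policy, all policies)."""
    active = None
    available = []
    for option in text.split():
        if option.startswith("[") and option.endswith("]"):
            option = option[1:-1]  # the bracketed entry is the active one
            active = option
        available.append(option)
    return active, available


if __name__ == "__main__":
    try:
        raw = Path(ASPM_POLICY_FILE).read_text()
    except OSError as exc:
        print("could not read ASPM policy:", exc)
    else:
        active, available = parse_aspm_policy(raw)
        print("active:", active, "available:", available)
```

If ASPM does turn out to matter, `pcie_aspm=off` on the kernel command line is the usual way to disable it from the OS side.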

But from what you’ve done so far, I believe you’ve likely targeted the real cause.

4 days uptime and still going, so I'm calling this successful. At a minimum it certainly eliminated some issues. Now I can put a load on this thing and see if I can break anything else.