[ZimaBoard 2] NVMe controller errors with Crucial P3 SSDs on dual NVMe adapter - TrueNAS Scale 25.10.2

Hello everyone,

I’m reaching out to the community for help with a persistent issue I’m facing on my ZimaBoard 2. I’ve done quite a bit of troubleshooting already, but the problem still appears occasionally, and I’m hoping someone here might have experience with a similar setup.

Hardware Setup:

  • Board: ZimaBoard 2 (16 GB RAM)
  • Storage Adapter: Zima PCIe 3.0 x4 to Dual NVMe M.2 adapter card
  • SSDs: 2x Crucial P3 1TB NVMe SSDs
  • OS Drive: 128GB SATA SSD (for TrueNAS)
  • Additional Disk: Crucial BX500 1TB SATA SSD (for local snapshots/backups)

Software:

  • OS: TrueNAS Scale 25.10.2 (fresh install)
  • Kernel: 6.12.33-production+truenas

The Problem:

I occasionally see the following errors in the console or logs:

nvme nvme0: controller is down: will reset CSTS=0x3, PCI_STATUS=0x10
nvme nvme0: resetting controller due to persistent internal error

After the error, the system usually recovers (the controller resets), but it’s clearly a sign of instability. The chip on the NVMe adapter gets very hot to the touch (can’t keep a finger on the heatsink), though the SSDs themselves remain cool.
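To put a number on "occasionally," I find it helps to count these events over time rather than eyeball the console. A minimal sketch (assuming `journalctl` access on TrueNAS Scale; adjust the match strings if your log wording differs):

```shell
# Hedged sketch: count NVMe controller-down / reset events in kernel-log text.
# On the box itself you would feed it the journal, e.g.:
#   journalctl -k --since "-7 days" | nvme_reset_count
nvme_reset_count() {
  grep -cE 'nvme nvme[0-9]+: (controller is down|resetting controller)'
}
```

Running this daily (or per boot) makes it easy to tell whether a change like active cooling actually reduced the error rate.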

What I’ve Tried So Far:

  1. Sysctl/kernel parameters:

    • Added pcie_aspm=off (confirmed active in /proc/cmdline)

    • Tried adding nvme_core.default_ps_max_latency_us=0 and pcie_port_pm=off via the TrueNAS web UI (Sysctl with UDEV type), but later discovered these are not proper sysctl variables.

    • Then applied them correctly as kernel_extra_options via the midclt command:

      midclt call system.advanced.update '{"kernel_extra_options": "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off"}'
      

    After a reboot, I verified that all three parameters are now present in /proc/cmdline and that cat /sys/module/nvme_core/parameters/default_ps_max_latency_us returns 0.

    Result: The error frequency has decreased significantly, but it hasn’t disappeared completely.
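For anyone repeating these steps, the verification can be scripted so you don't have to read /proc/cmdline by eye after each reboot. A small sketch (the parameter list matches the midclt call above; `check_params` is just a helper name I made up):

```shell
# Sketch: confirm each kernel parameter actually landed in the command line.
# Pass in the contents of /proc/cmdline, e.g.:
#   check_params "$(cat /proc/cmdline)"
check_params() {
  for p in pcie_aspm=off pcie_port_pm=off nvme_core.default_ps_max_latency_us=0; do
    case " $1 " in
      *" $p "*) echo "OK      $p" ;;
      *)        echo "MISSING $p" ;;
    esac
  done
}
```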

What I Haven’t Tried Yet:

  • Firmware update: I haven’t updated the Crucial P3 firmware. I plan to do it, but I need to find a Windows machine for that.
  • Active cooling: Adding a small fan pointing at the adapter.
  • Testing with a single NVMe drive installed, to isolate a potential power delivery issue.
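Before hunting for a Windows machine, it's worth recording the current firmware revision of each drive so you can compare before/after. A hedged sketch using nvme-cli (package availability may vary on TrueNAS Scale; `fw_rev` is a helper name of my own):

```shell
# Hedged sketch: pull the firmware revision ("fr") field out of `nvme id-ctrl`
# output, so drives can be compared before and after a firmware update.
# Usage on real hardware:  nvme id-ctrl /dev/nvme0 | fw_rev
fw_rev() {
  awk -F: '/^fr[[:space:]]/ { gsub(/[[:space:]]/, "", $2); print $2 }'
}
```

`smartctl -a /dev/nvme0` reports the same value on its "Firmware Version" line if nvme-cli isn't installed.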

My Questions:

  1. Has anyone here experienced similar "controller is down" NVMe errors with a ZimaBoard + dual NVMe adapter + Crucial P3 combo?
  2. Could the adapter chip overheating be a normal behavior, or is it a red flag? Should I prioritize active cooling?
  3. Does anyone know if there’s a known firmware issue with Crucial P3 drives that might cause this? (I’ll update it anyway, but curious if others have seen improvements after updating.)
  4. Could this be a power delivery limitation of the ZimaBoard’s PCIe slot? The adapter draws power from the slot, and two NVMe drives under simultaneous load might exceed what it can reliably supply.
  5. Is there any other kernel parameter or BIOS setting I should try before concluding it’s a hardware issue?

Additional Info:

  • The SSDs are brand new and pass long SMART self-tests (smartctl -t long) without errors.
  • I’m aiming for a mirror pool with these two drives for data redundancy.

Any insights, experiences, or suggestions would be greatly appreciated. Thanks in advance for your help!