NVMe drives recognized but fail with controller errors

I have finally resolved the previous issue where the ZimaCube was killing hard drives (ZimaCube Pro not accepting any drives, possible drive board damage, no reply from support - #5 by parkan) with a backplane replacement. I’m now trying to install the rest of the system to run TrueNAS, which requires a storage pool for applications. I’ve installed a WD Blue SN5000 1TB in the NVMe caddy and it seems to be recognized by the OS. However, trying to create a pool on it or run SMART tests fails with:

smartctl failed for disk nvme0n1:
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.44-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Input/output error

and dmesg shows the dreaded message: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off" and report a bug
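Before adding those parameters, it may help to confirm which of them are already active on the running kernel. Below is a minimal Python sketch, assuming a Linux host where /proc/cmdline is readable. The helper names parse_cmdline and missing_params are made up for illustration, and the plain whitespace split does not handle quoted values.

```python
from pathlib import Path

# The three parameters named in the kernel's hint.
SUGGESTED = {
    "nvme_core.default_ps_max_latency_us": "0",
    "pcie_aspm": "off",
    "pcie_port_pm": "off",
}


def parse_cmdline(text):
    """Split a kernel command line into {name: value}; bare flags map to ''."""
    params = {}
    for token in text.split():  # naive: ignores quoting
        name, _, value = token.partition("=")
        params[name] = value
    return params


def missing_params(text, wanted=SUGGESTED):
    """Return the suggested parameters that are absent or set to another value."""
    active = parse_cmdline(text)
    return {k: v for k, v in wanted.items() if active.get(k) != v}


if __name__ == "__main__":
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.exists():
        for name, value in missing_params(cmdline_file.read_text()).items():
            print(f"not active: {name}={value}")
```

If nothing is printed, all three parameters are already in effect and the I/O errors are probably not a power-state issue.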

I’ve tried changing the port of the drive and re-seating the caddy.

My understanding is that this could be caused by undervoltage on the drive, though one would hope that with only a single slot occupied that wouldn’t be an issue.

What are my options at this point? Is it possible that one of the two ribbon connectors to the backplane is somehow not seated properly? They were incredibly difficult to connect, but the way the connectors fit seems like it should result in a positive connection once it’s in.

Regarding "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off": this device does not seem to be in power-saving mode. You can test whether the SSD’s speed is at, or only slightly below, its normal level to confirm whether the wiring is OK. Also make sure that the point where the flexible cable meets the metal connector lies as flat as possible rather than bent at a 90-degree angle.

It’s not possible to format or mount the drive; it does not work at all, so I don’t know how to test the speed.
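One check that does not need a formatted or mounted drive is the negotiated PCIe link, which Linux exposes in sysfs for each PCI device. Here is a rough Python sketch, assuming the controller still enumerates under /sys/class/nvme. The function names are illustrative, and max_link_speed and max_link_width describe what the device advertises, so a mismatch can also just mean a slower slot rather than bad wiring.

```python
from pathlib import Path

# Standard PCIe link attributes found in a PCI device's sysfs directory.
ATTRS = ("current_link_speed", "max_link_speed",
         "current_link_width", "max_link_width")


def read_link_status(pci_dev):
    """Read the link attributes of one PCI device directory; None if unreadable."""
    status = {}
    for attr in ATTRS:
        try:
            status[attr] = (Path(pci_dev) / attr).read_text().strip()
        except OSError:
            status[attr] = None
    return status


def is_degraded(status):
    """True if the link trained below the advertised speed or width.

    Returns None when any attribute could not be read.
    """
    if None in status.values():
        return None
    return (status["current_link_speed"] != status["max_link_speed"]
            or status["current_link_width"] != status["max_link_width"])


if __name__ == "__main__":
    # /sys/class/nvme/nvmeN/device links to the controller's PCI device.
    for ctrl in sorted(Path("/sys/class/nvme").glob("nvme*")):
        status = read_link_status(ctrl / "device")
        print(ctrl.name, status, "degraded:", is_degraded(status))
```

A link that trained at a lower speed or at fewer lanes than the drive advertises would be consistent with a marginal ribbon cable connection, though it does not prove it.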

I am going to check the wire connection later today.

I’ve attempted to re-seat the ribbon cables to no effect, though they are extremely hard to reach on the motherboard. Could you confirm which of the cables carries the relevant lanes so I can at least focus on the right one?

The error in dmesg is below:

[image: screenshot of the dmesg error]

The drive does not show up under “NVMe Configuration” in the BIOS. Should I expect to see it there?

OK, I moved it to a different slot and now it does show up in the BIOS under “NVMe Configuration”, but it does not show up in the OS.

???

I moved it to the remaining slot on the caddy. Now it shows up in both the BIOS and the OS, but gives the same I/O errors.

Furthermore, when it’s inserted into this (3rd) slot, the BIOS reports a vendor and device ID of FFFF, which is a bad sign. In the 2nd slot the BIOS reports a correct vendor and device ID, but the OS cannot see the drive.
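For context on why FFFF is a bad sign: the vendor ID sits in the first two bytes of PCI configuration space and the device ID in the next two, both little-endian, and a read that no device answers comes back as all ones. The same check can be sketched from the OS side in Python, assuming Linux exposes the controller's raw config space under /sys/class/nvme/nvmeN/device/config. The function names and the sample IDs in the usage note are hypothetical.

```python
import struct
from pathlib import Path


def decode_ids(config):
    """Decode (vendor_id, device_id) from the first 4 bytes of PCI config space."""
    vendor, device = struct.unpack_from("<HH", config, 0)
    return vendor, device


def looks_absent(config):
    """True if the vendor ID reads as 0xFFFF, i.e. nothing answered the read."""
    vendor, _ = decode_ids(config)
    return vendor == 0xFFFF


if __name__ == "__main__":
    for ctrl in sorted(Path("/sys/class/nvme").glob("nvme*")):
        try:
            with open(ctrl / "device" / "config", "rb") as f:
                header = f.read(4)
        except OSError as exc:
            print(ctrl.name, "config space unreadable:", exc)
            continue
        if len(header) < 4:
            print(ctrl.name, "short read from config space")
            continue
        vendor, device = decode_ids(header)
        state = "not responding" if looks_absent(header) else "responding"
        print(f"{ctrl.name}: vendor={vendor:04x} device={device:04x} ({state})")
```

For example, decode_ids(bytes([0x34, 0x12, 0x78, 0x56])) gives (0x1234, 0x5678), while four 0xFF bytes decode to FFFF for both IDs, matching what the BIOS shows for the 3rd slot.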

This is a brand new drive from WD.

Have not received any updates from support after emailing over a week ago :grimacing:

I should also add that I’ve tried swapping the system’s (known good) NVMe drive into the caddy, and it does not work either.

Given the lack of response here and via email, and after almost 3 months without a working system, I will unfortunately have to request a refund/return. I have emailed support for the fourth time, with no replies to date. Extremely disappointed in this experience.

Hello, Parkan. I consulted the IceWhale after-sales staff, and it seems that they have just found your email and will reply as soon as possible. If you have not received a reply by the end of today, please send me a private message.