RAID1 disk failed, but the working drive is not accessible

R0yk3, thanks for describing the symptoms. Based on the behaviour so far (no disks showing in lsblk, plus the monitor dropping out and coming back at low resolution), this looks more like system-level instability than a RAID configuration issue.

However, we can’t confirm the root cause without the actual error output. The kernel will log the reason (SATA link resets, CRC/I/O timeouts, power events, etc.).

Can you please paste these right after it happens:

dmesg -T | tail -200        # kernel ring buffer with readable timestamps; SATA/link/driver errors show up here
journalctl -xb | tail -200  # journal for the current boot, including service-level failures
lsblk                       # which block devices the kernel currently sees

Once we have those logs we can stop speculating and identify exactly what’s failing.

R0yk3 — noted.

“I have various Debian instalations and messed up big time…”

In that case, a clean reinstall is the correct next step. With that much prior experimenting, it’s impossible to separate OS vs hardware without resetting to a known baseline first.

I now have a clean install.

First command, output:

root@ZimaBoard2:/root ➜ # dmesg -T | tail -200
[Sat Jan 17 22:47:26 2026] i915 0000:00:02.0: [drm] ERROR AUX A/DDI A/PHY A: receive error (status 0x6e20023f)
(a lot of the same lines here)
[Sat Jan 17 22:48:36 2026] EDID has corrupt header

second command:

root@ZimaBoard2:/root ➜ # journalctl -xb | tail -200
Jan 17 22:48:12 ZimaBoard2 kernel: i915 0000:00:02.0: [drm] ERROR AUX A/DDI A/PHY A: receive error (status 0x6e20023f)
(the above line repeats many times)

Jan 17 22:48:12 ZimaBoard2 zimaos-user[1964]: {"time":"2026-01-17T22:48:12.65145596Z","id":"","remote_ip":"127.0.0.1","host":"127.0.0.1:37815","method":"GET","uri":"/v1/users/status","user_agent":"Go-http-client/1.1","status":200,"error":"","latency":19508222,"latency_human":"19.508222ms","bytes_in":0,"bytes_out":90}
Jan 17 22:48:22 ZimaBoard2 zimaos-user[1964]: {"time":"2026-01-17T22:48:22.690518945Z","id":"","remote_ip":"127.0.0.1","host":"127.0.0.1:37815","method":"GET","uri":"/v1/users/status","user_agent":"Go-http-client/1.1","status":200,"error":"","latency":22590580,"latency_human":"22.59058ms","bytes_in":0,"bytes_out":90}
Jan 17 22:48:32 ZimaBoard2 zimaos-user[1964]: {"time":"2026-01-17T22:48:32.729422447Z","id":"","remote_ip":"127.0.0.1","host":"127.0.0.1:37815","method":"GET","uri":"/v1/users/status","user_agent":"Go-http-client/1.1","status":200,"error":"","latency":21559202,"latency_human":"21.559202ms","bytes_in":0,"bytes_out":90}
Jan 17 22:48:35 ZimaBoard2 kernel: i915 0000:00:02.0: [drm] ERROR AUX A/DDI A/PHY A: receive error (status 0x6e30023f)
Jan 17 22:48:35 ZimaBoard2 kernel: EDID has corrupt header

Jan 17 22:48:42 ZimaBoard2 zimaos-user[1964]: {"time":"2026-01-17T22:48:42.769172338Z","id":"","remote_ip":"127.0.0.1","host":"127.0.0.1:37815","method":"GET","uri":"/v1/users/status","user_agent":"Go-http-client/1.1","status":200,"error":"","latency":21643034,"latency_human":"21.643034ms","bytes_in":0,"bytes_out":90}
Jan 17 22:48:52 ZimaBoard2 zimaos-user[1964]: {"time":"2026-01-17T22:48:52.807964157Z","id":"","remote_ip":"127.0.0.1","host":"127.0.0.1:37815","method":"GET","uri":"/v1/users/status","user_agent":"Go-http-client/1.1","status":200,"error":"","latency":22108086,"latency_human":"22.108086ms","bytes_in":0,"bytes_out":90}
Jan 17 22:48:54 ZimaBoard2 systemd[1]: virtsecretd.service: Deactivated successfully.
Jan 17 22:48:54 ZimaBoard2 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=virtsecretd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 17 22:48:54 ZimaBoard2 systemd[1]: virtnetworkd.service: Deactivated successfully.
Jan 17 22:48:54 ZimaBoard2 systemd[1]: virtnetworkd.service: Unit process 1123 (dnsmasq) remains running after unit stopped.
Jan 17 22:48:54 ZimaBoard2 systemd[1]: virtnetworkd.service: Unit process 1124 (dnsmasq) remains running after unit stopped.
Jan 17 22:48:54 ZimaBoard2 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=virtnetworkd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

Jan 17 22:49:01 ZimaBoard2 crond[626]: exit status 127 from user root job sys-hourly
Jan 17 22:49:01 ZimaBoard2 crond[3817]: unable to exec /usr/sbin/sendmail: cron output for user root job sys-hourly to /dev/null
Jan 17 22:49:02 ZimaBoard2 zimaos-user[1964]: {"time":"2026-01-17T22:49:02.847018067Z","id":"","remote_ip":"127.0.0.1","host":"127.0.0.1:37815","method":"GET","uri":"/v1/users/status","user_agent":"Go-http-client/1.1","status":200,"error":"","latency":22281106,"latency_human":"22.281106ms","bytes_in":0,"bytes_out":90}
Jan 17 22:49:06 ZimaBoard2 zimaos-message-bus[1144]: {"time":"2026-01-17T22:49:06.700413694Z","id":"","remote_ip":"192.168.28.223","host":"192.168.28.209","method":"GET","uri":"/v2/message_bus/socket.io/?EIO=4&transport=websocket","user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36","status":200,"error":"","latency":569562,"latency_human":"569.562µs","bytes_in":0,"bytes_out":0}
Jan 17 22:49:06 ZimaBoard2 zimaos-user[1964]: {"time":"2026-01-17T22:49:06.724367412Z","id":"","remote_ip":"192.168.28.223","host":"192.168.28.209","method":"GET","uri":"/v1/users/name","user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36","status":401,"error":"code=401, message=invalid or expired jwt, internal=code=401, message=Unauthorized","latency":139092,"latency_human":"139.092µs","bytes_in":0,"bytes_out":37}

(a lot of the same lines here)

Jan 17 22:51:53 ZimaBoard2 zimaos-user[1964]: {"time":"2026-01-17T22:51:53.483079833Z","id":"","remote_ip":"127.0.0.1","host":"127.0.0.1:37815","method":"GET","uri":"/v1/users/status","user_agent":"Go-http-client/1.1","status":200,"error":"","latency":23234661,"latency_human":"23.234661ms","bytes_in":0,"bytes_out":90}
root@ZimaBoard2:/root ➜ #

I have no idea what the output should be, but it does not look right.

The complete log is too much to paste here.

R0yk3, thanks, this is helpful.

The errors you pasted are i915/DRM display (HDMI/EDID) errors, not RAID/SATA errors:

  • i915 ... receive error
  • EDID has corrupt header

That explains the black screen and the 640x480 fallback resolution. It’s usually caused by the monitor, the HDMI cable/adapter, or the EDID handshake failing.
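If you want to sanity-check the display side on its own, the kernel exposes the connector state and the raw EDID via sysfs. A rough sketch (connector names differ per board, so list them first; card0-HDMI-A-1 below is only an example):

grep -H . /sys/class/drm/card*-*/status                # connected/disconnected per connector
hexdump -C /sys/class/drm/card0-HDMI-A-1/edid | head   # raw EDID bytes; empty output means no valid EDID arrived

A connector that reports connected but returns no readable EDID usually points at the cable/adapter rather than the panel.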

To check the actual disk/RAID issue, please run and paste:

dmesg -T | egrep -i "ata|sata|crc|reset|link|I/O error|md0|mdadm" | tail -200
lsblk
cat /proc/mdstat

That will show whether there are any SATA link resets or I/O errors affecting the drives.

Date: Saturday, January 17, 2026 | Uptime: up 0 minute

root@ZimaBoard2:/root ➜ # dmesg -T | egrep -i "ata|sata|crc|reset|link|I/O error|md0|mdadm" | tail -200
egrep: warning: egrep is obsolescent; using grep -E
[Sat Jan 17 23:24:10 2026] Command line: BOOT_IMAGE=(hd0,gpt2)/bzImage root=PARTUUID=8d3d53e3-6d49-4c38-8349-aff6859e82fd rootwait zram.enabled=1 zram.num_devices=3 net.naming-scheme=v250 systemd.machine_id=345baacdddcd4b4db81c5bbb59a5a360 fsck.repair=yes console=tty1 quiet splash loglevel=3 systemd.show_status=1 rd.udev.log_level=3 net.ifnames=0 biosdevname=0 intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 thunderbolt.host_reset=false rauc.slot=A
[Sat Jan 17 23:24:10 2026] BIOS-e820: [mem 0x00000000733a3000-0x0000000073483fff] ACPI data
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x0000000000000000-0x000000000009dfff] usable
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x000000000009e000-0x000000000009efff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x000000000009f000-0x000000000009ffff] usable
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x0000000000100000-0x000000006bb5e017] usable
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x000000006bb5e018-0x000000006bb6fe57] usable
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x000000006bb6fe58-0x000000006bb70017] usable
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x000000006bb70018-0x000000006bb81e57] usable
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x000000006bb81e58-0x00000000702a2fff] usable
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x00000000702a3000-0x00000000733a2fff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x00000000733a3000-0x0000000073483fff] ACPI data
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x0000000073484000-0x0000000073543fff] ACPI NVS
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x0000000073544000-0x0000000073efefff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x0000000073eff000-0x0000000073efffff] usable
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x0000000073f00000-0x0000000079ffffff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x000000007aa00000-0x000000007abfffff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x000000007b000000-0x00000000803fffff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x00000000c0000000-0x00000000cfffffff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x00000000fed20000-0x00000000fed7ffff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[Sat Jan 17 23:24:10 2026] reserve setup_data: [mem 0x0000000100000000-0x000000047fbfffff] usable
[Sat Jan 17 23:24:10 2026] NODE_DATA(0) allocated [mem 0x47fbfbc80-0x47fbffdff]
[Sat Jan 17 23:24:10 2026] Kernel command line: BOOT_IMAGE=(hd0,gpt2)/bzImage root=PARTUUID=8d3d53e3-6d49-4c38-8349-aff6859e82fd rootwait zram.enabled=1 zram.num_devices=3 net.naming-scheme=v250 systemd.machine_id=345baacdddcd4b4db81c5bbb59a5a360 fsck.repair=yes console=tty1 quiet splash loglevel=3 systemd.show_status=1 rd.udev.log_level=3 net.ifnames=0 biosdevname=0 intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 thunderbolt.host_reset=false rauc.slot=A
[Sat Jan 17 23:24:10 2026] Register File Data Sampling: Mitigation: Clear Register File
[Sat Jan 17 23:24:10 2026] Memory: 16109956K/16513288K available (20480K kernel code, 2684K rwdata, 7500K rodata, 3368K init, 4236K bss, 397968K reserved, 0K cma-reserved)
[Sat Jan 17 23:24:10 2026] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[Sat Jan 17 23:24:10 2026] audit: initializing netlink subsys (disabled)
[Sat Jan 17 23:24:10 2026] ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20240827/psobject-220)
[Sat Jan 17 23:24:10 2026] ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20240827/psobject-220)
[Sat Jan 17 23:24:10 2026] ACPI: PCI: Interrupt link LNKA configured for IRQ 0
[Sat Jan 17 23:24:10 2026] ACPI: PCI: Interrupt link LNKB configured for IRQ 1
[Sat Jan 17 23:24:10 2026] ACPI: PCI: Interrupt link LNKC configured for IRQ 0
[Sat Jan 17 23:24:10 2026] ACPI: PCI: Interrupt link LNKD configured for IRQ 0
[Sat Jan 17 23:24:10 2026] ACPI: PCI: Interrupt link LNKE configured for IRQ 0
[Sat Jan 17 23:24:10 2026] ACPI: PCI: Interrupt link LNKF configured for IRQ 0
[Sat Jan 17 23:24:10 2026] ACPI: PCI: Interrupt link LNKG configured for IRQ 0
[Sat Jan 17 23:24:10 2026] ACPI: PCI: Interrupt link LNKH configured for IRQ 0
[Sat Jan 17 23:24:10 2026] libata version 3.00 loaded.
[Sat Jan 17 23:24:10 2026] aic94xx: Adaptec aic94xx SAS/SATA driver version 1.0.3 loaded
[Sat Jan 17 23:24:10 2026] ahci 0000:00:17.0: AHCI vers 0001.0301, 32 command slots, 6 Gbps, SATA mode
[Sat Jan 17 23:24:10 2026] ata1: SATA max UDMA/133 abar m2048@0x80a02000 port 0x80a02100 irq 124 lpm-pol 4
[Sat Jan 17 23:24:10 2026] ata2: SATA max UDMA/133 abar m2048@0x80a02000 port 0x80a02180 irq 124 lpm-pol 4
[Sat Jan 17 23:24:10 2026] ata2: SATA link down (SStatus 4 SControl 300)
[Sat Jan 17 23:24:10 2026] ata1: SATA link down (SStatus 4 SControl 300)
[Sat Jan 17 23:24:10 2026] Write protecting the kernel read-only data: 28672k
[Sat Jan 17 23:24:10 2026] Freeing unused kernel image (rodata/data gap) memory: 692K
[Sat Jan 17 23:24:11 2026] EXT4-fs (mmcblk0p7): mounted filesystem d3de3a2e-6d1b-4a43-a4f5-5fc626428d2a r/w with ordered data mode. Quota mode: none.
[Sat Jan 17 23:24:14 2026] systemd[1]: Expecting device /dev/disk/by-label/casaos-data…
[Sat Jan 17 23:24:14 2026] systemd[1]: Listening on Network Service Netlink Socket.
[Sat Jan 17 23:24:15 2026] igc 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x1 link)
[Sat Jan 17 23:24:16 2026] EXT4-fs (mmcblk0p8): mounted filesystem 4c29ac33-0535-4bff-a0fa-bdc14bc2b495 r/w with ordered data mode. Quota mode: none.
[Sat Jan 17 23:24:16 2026] igc 0000:02:00.0: 4.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x1 link)
[Sat Jan 17 23:24:17 2026] Adding 5320060k swap on /DATA/.swapfile. Priority:-2 extents:28 across:43900928k SS
[Sat Jan 17 23:24:17 2026] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[Sat Jan 17 23:24:24 2026] igc 0000:02:00.0 eth1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[Sat Jan 17 23:24:28 2026] Initializing XFRM netlink socket
root@ZimaBoard2:/root ➜ # lsblk
NAME         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
loop0          7:0    0  1.1M  1 loop
loop1          7:1    0  208K  1 loop
loop2          7:2    0  4.8M  1 loop
loop3          7:3    0  1.1M  1 loop
loop4          7:4    0  4.8M  1 loop
loop5          7:5    0  208K  1 loop
nbd0          43:0    0    0B  0 disk
nbd1          43:32   0    0B  0 disk
nbd2          43:64   0    0B  0 disk
nbd3          43:96   0    0B  0 disk
nbd4          43:128  0    0B  0 disk
nbd5          43:160  0    0B  0 disk
nbd6          43:192  0    0B  0 disk
nbd7          43:224  0    0B  0 disk
mmcblk0      179:0    0 57.7G  0 disk
├─mmcblk0p1  179:1    0   32M  0 part /mnt/boot
├─mmcblk0p2  179:2    0   24M  0 part
├─mmcblk0p3  179:3    0    6G  0 part /
├─mmcblk0p4  179:4    0   24M  0 part
├─mmcblk0p5  179:5    0    6G  0 part
├─mmcblk0p6  179:6    0    8M  0 part
├─mmcblk0p7  179:7    0   96M  0 part /var/lib/rauc
│                                     /var/lib/zerotier-one
│                                     /mnt/overlay
└─mmcblk0p8  259:0    0 45.5G  0 part /var/log
                                      /var/lib/libvirt
                                      /var/lib/icewhale
                                      /var/lib/extensions
                                      /var/lib/docker
                                      /var/lib/casaos
                                      /var/lib/bluetooth
                                      /opt
                                      /media
                                      /DATA
                                      /var/lib/casaos_data
mmcblk0boot0 179:8    0    4M  1 disk
mmcblk0boot1 179:16   0    4M  1 disk
zram0        252:0    0    0B  0 disk
zram1        252:1    0    0B  0 disk
zram2        252:2    0    0B  0 disk
nbd8          43:256  0    0B  0 disk
nbd9          43:288  0    0B  0 disk
nbd10         43:320  0    0B  0 disk
nbd11         43:352  0    0B  0 disk
nbd12         43:384  0    0B  0 disk
nbd13         43:416  0    0B  0 disk
nbd14         43:448  0    0B  0 disk
nbd15         43:480  0    0B  0 disk
root@ZimaBoard2:/root ➜ # cat /proc/mdstat

I also tried with other hard drives, but the outcome is the same. The drives are spinning, so they get power.

I will fall back to the original cables, because it worked with those.

This log is conclusive.

Your SATA controller loads, but both ports show no device connected:

  • ata1: SATA link down
  • ata2: SATA link down

And lsblk confirms it: only the eMMC (mmcblk0) is detected, no HDDs at all.

So this is not RAID and not filesystem corruption — the HDDs are simply not being detected at the SATA link level (hardware layer).

That points to:

  • drives not powered
  • power splitter/adapter issue
  • loose SATA data/power connection
  • or a SATA board/controller hardware issue

Also: cat /proc/mdstat returning nothing is expected here, because no disks detected = no md RAID to show.
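For reference, once an array exists and the disks are visible, cat /proc/mdstat for a healthy two-disk RAID1 looks roughly like this (device names and block count are illustrative):

Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630464 blocks super 1.2 [2/2] [UU]

unused devices: <none>

[UU] means both members are up; [U_] would mean one of them has dropped out.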

Next step is hardware isolation: connect one drive only (direct power + SATA), then re-check lsblk and dmesg.

Note: a drive spinning only confirms power, not SATA data link. SATA link down means the controller can’t establish the data connection.
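A quick re-check after each cabling change could look like this (the grep pattern is just a suggestion):

lsblk -d -o NAME,SIZE,MODEL,TRAN                  # whole disks only; TRAN shows sata/nvme/mmc
dmesg -T | grep -Ei 'ata[0-9].*link' | tail -20   # did ata1/ata2 negotiate a link this time?

A successful attach shows something like “SATA link up 6.0 Gbps” followed by the drive model, instead of “SATA link down”.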

I pulled a 2.5-inch drive and another 3.5-inch drive from another machine and hooked them up to the ZimaBoard.

Now I see the 2.5-inch drive, but not the 3.5-inch drive? Power?? I bought the heavy power adapter from the Zima store just to prevent this kind of stuff..

So I fell back to the original power adapter from Zima instead of the 120W 12V/10A GaN Power Adapter for ZimaBoard & ZimaBoard 2 (ZBA070).

Now I see the drives, and I’m going to stress-test the setup.

Damn power adapters.. I hope I get it refunded or a good new one sent, because I don’t know if the original adapter is powerful enough for two 3.5-inch drives and an NVMe PCIe drive.

Even the monitor is working again now? The only failed message I see flashing by is something about the NVIDIA daemon.

Thank you for your patience and advice!!

😀

That result points strongly to power stability, not RAID.

It’s very common that:

  • a 2.5" drive is detected (lower spin-up demand)
  • but a 3.5" drive is not detected (high 12V spin-up/inrush current)

Even if the 3.5" drive “spins”, it can still brown-out/reset during startup, and the SATA link never fully comes up (which matches your earlier SATA link down logs).

If switching to a different adapter immediately makes the 3.5" drives appear again, that’s a strong indicator the first adapter/cable isn’t handling spin-up load reliably.
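Once the drives are visible again, a quick SMART check can confirm the brown-outs didn’t leave any damage behind, assuming smartctl is available on ZimaOS (/dev/sda is a placeholder for your actual drive):

smartctl -H /dev/sda                                                # overall health self-assessment
smartctl -A /dev/sda | grep -Ei 'reallocated|pending|power.?cycle'  # attributes worth watching after power problems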

Your plan now is correct:

  • stay on the stable adapter
  • stress test with heavy read/write for a few hours (see the sketch after this list)
  • if stable, you’ve found the root cause.
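For the read/write load itself, fio works well if it’s available on ZimaOS; otherwise a plain dd read loop does the job. Replace sda with the device name lsblk reports (both variants below are read-only, so they’re non-destructive):

fio --name=hdd-read --filename=/dev/sda --rw=read --bs=1M --direct=1 --time_based --runtime=7200   # 2 hours of sustained sequential reads
dd if=/dev/sda of=/dev/null bs=1M status=progress   # simpler alternative; status=progress needs GNU dd

Run one per drive (in parallel if you want worst-case load) and watch dmesg for resets while it runs.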

Nice. The fact that the monitor is stable again after switching adapters supports the same conclusion: the issue was power delivery stability affecting multiple components, not RAID.

The NVIDIA daemon message is usually unrelated unless you’re actively using a GPU.
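To identify that message, listing failed units should show which service it is; the i915 errors earlier confirm this board runs Intel graphics, so an nvidia-related service failing to start is harmless:

systemctl --failed                      # units that failed this boot
journalctl -b | grep -i nvidia | tail   # the exact message you saw flashing by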

If you want one final confirmation during the stress test, keep an eye out for any resets/I/O errors:

dmesg -T | grep -Ei "ata|reset|crc|I/O error" | tail -50

If that stays clean under load, you’re sorted.

Looks good, one issue:

“Expecting device /dev/disk/by-label/casaos-data”,

but it has no negative effect (for now). Remounting might fix it, but first I’ll let it run.

That message is expected when ZimaOS can’t see the DATA disk.

Expecting device /dev/disk/by-label/casaos-data… simply means ZimaOS is waiting for the drive/partition labelled casaos-data (your main DATA/app storage). When the SATA disks weren’t detected, that label didn’t appear, so ZimaOS kept waiting.

So it’s another confirmation the issue at that moment was disk detection/power/SATA link, not RAID itself.
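Now that the disks are detected, you can confirm the label is visible, which also tells you whether that wait will clear on its own after a remount or reboot:

lsblk -o NAME,SIZE,TYPE,LABEL,MOUNTPOINTS   # the casaos-data label should now appear on the DATA partition
blkid | grep -i casaos-data                 # alternative view via blkid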