ZFS and cksum errors (Samsung 870 EVO raidz-1)

    I have been "using" a ZFS raidz1 system with three Samsung 870 EVOs (4 TB), non-ECC RAM, and an AMD SATA controller since July 23. Before that I used an mdadm RAID 10 with HDDs and had no problems. After the migration I had hard resets caused by a broken PSU.

    With a new PSU the hard resets are gone, and I was able to use the ZFS system until I noticed cksum errors on all disks. Three weeks ago the system stopped working and was only usable in read-only mode, with some broken data. Thousands of cksum errors, no read/write errors. No chance to repair.

    I searched some forums. It is a known bug with Samsung SSDs on AMD SATA controllers. Some people had success by disabling NCQ and TRIM, switching SATA cables, and limiting the link to 3 Gbit/s, but that didn't work for me. So I bought a Marvell controller and created a new ZFS. But the problem still exists. I cannot see any errors in dmesg related to SATA/AHCI. smartctl is fine (except for the hard resets at the beginning).

    Any ideas?

  • KM0201

    Approved the thread.
  • You've created a "new zfs" raidz1 with the same drives?

    I *think* if you have errors across all 3 then all is lost, because you effectively need to issue a replace for all 3 disks. There are snapshots that allow a rewind, but I've never enabled them (and thus never used them). I'm curious as to how you have so many checksum errors; could all of them be for just 1 file?
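
    To see whether the checksum errors really are spread evenly over all three disks, something like the following rough sketch could tally the CKSUM column per device. It assumes the usual `NAME STATE READ WRITE CKSUM` table that `zpool status` prints and plain integer counts (I believe `zpool status -p` gives exact numbers, but check your version). The pool name, device names, and counts in the sample are made up for illustration and are not from this thread.

```python
# Hypothetical sample in the layout `zpool status` prints; pool name, device
# names, and error counts are invented for illustration only.
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

    NAME        STATE     READ WRITE CKSUM
    tank        DEGRADED     0     0     0
      raidz1-0  DEGRADED     0     0     0
        sda     DEGRADED     0     0  1204
        sdb     DEGRADED     0     0   873
        sdc     DEGRADED     0     0   911

errors: 3 data errors, use '-v' for a list
"""

HEADER = ["NAME", "STATE", "READ", "WRITE", "CKSUM"]


def cksum_by_device(status_text):
    """Return {name: cksum_count} for every row of the config table."""
    counts = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == HEADER:
            # The device table starts on the line after this header.
            in_table = True
            continue
        if in_table:
            if not fields:
                # A blank line ends the table.
                break
            # Column 5 is CKSUM; skip rows whose count is not a plain integer
            # (for example abbreviated values such as "1.2K").
            if len(fields) >= 5 and fields[4].isdigit():
                counts[fields[0]] = int(fields[4])
    return counts


counts = cksum_by_device(SAMPLE)
for name, n in counts.items():
    print(f"{name}: {n}")
```

    In practice you would feed it the real output instead of SAMPLE. The list of affected files that `zpool status -v` prints at the end would then show whether everything points at a single file.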

