Less than 30 MB/s on ZFS RAIDZ2 - analysis

  • Hi,

    probably a topic a lot of people have. Yes, it's not the newest system, but I still think (and hope) there is something wrong:


    - Intel(R) Xeon(R) CPU E3-1245 v5 @ 3.50GHz

    - 32GB ECC memory, no min/max ZFS settings

    - MSI C236A, latest BIOS

    - SATA controller ASM1166 - also tried the onboard connectors only, but it was the same

    I started with a ZFS pool consisting of SMR + CMR disks -> I thought this was the problem, as a lot of people say to get rid of SMR. So I replaced the SMR disks with CMR-only drives.


    Now the "error" is still the same:

    - e.g. copying 37 GB (3 GB files each) from one dataset to another: it starts fast, drops to 110 MB/s, then continues at 30 MB/s with some drops to << 10 MB/s. Sometimes it stalls for multiple seconds. Overall it took 21 minutes (~30 MB/s)

    - can't see any SMART errors (all green)

    - SMART Temperature below 30 °C

    - I used a script to measure before (with the 2 SMR disks (WD 4 TB EFAX)). With them replaced by ST4000VN008... disks it got better, but there is still a major drop after a few GB

       /usr/bin/time -f "%e" sh -c 'dd if=/dev/zero of=/srv/NAS-Daten/20GB.img bs=20M count=1000 2> /dev/null'
    with the SMR+CMR mix: 196 s, CMR only: 64 s
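Note that dd from /dev/zero first lands in the page cache / ARC, so the early gigabytes can look faster than the disks really are. A sketch of a variant that includes the final flush in the timing (same target path as above; the 4 GiB size is just an example):

```shell
# Time a 4 GiB sequential write including the final flush to disk
# (conv=fdatasync makes dd sync the data before exiting).
TARGET=/srv/NAS-Daten/bench.img   # example path on the pool
/usr/bin/time -f "%e s" \
    dd if=/dev/zero of="$TARGET" bs=1M count=4096 conv=fdatasync 2>/dev/null
rm -f "$TARGET"
```

Since compression is off on this pool, the all-zero input is fine here; with compression enabled, zeros would compress away and inflate the result.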

    - no dmesg errors

    - no compression activated, no dedup

    - no hot temperatures according to sensors

    - the same happens if I copy a large amount of data over 1 Gb Ethernet

    - bad SATA cables - I don't think so, because the transfer starts fast

    - bad cooling: the HDD cages have fans, CPU + case too
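Quick sanity check of the average quoted above (assuming the 37 Gig are GiB):

```shell
bytes=$((37 * 1024 * 1024 * 1024))  # 37 GiB of test data
secs=$((21 * 60))                   # 21 minutes
echo "$((bytes / secs / 1000000)) MB/s"   # prints "31 MB/s"
```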

    -> How do I determine the problem?

    As the system is quite low on free space, I am considering buying new disks soon**. But as long as I don't know what the problem with the system is, I'll postpone this.

    - Mainboard issue: the data transfer starts promising but collapses after 10 GB or so: so no mainboard/CPU/bridge problem?

    - SATA controller issue: I also used the onboard SATA without any major change. It would be strange if both the onboard and the PCIe controller showed the same strange behavior

    - temperature problem: not the disks, I think (SMART values OK)

    - replace the RAIDZ2... no idea

    - **get rid of the RAIDZ2 and replace it with a mirror of two Seagate Exos X20, 18 TB

    iostat averaged over 60 s while copying the 37 GB of test data, taken directly after starting the copy

    bonnie++ with the ZFS arrangement above + iostat during the bonnie++ run

  • votdev

    Approved the thread.
  • Hi,

    I did some further tests:

    copying from an NVMe to an SSD using either onboard or the PCIe card: close to 500 MByte/s

    copying from the NVMe to an SMR disk (the one I replaced) -> 130 MByte/s

    -> no HW issue

    -> yes, looks like wrong ZFS settings

    I am running a scrub right now, after the replace + resilver.

    After a few minutes it says

    2.02T scanned at 4.46G/s, 120G issued at 265M/s, 11.4T total

    265 MB/s? Let's see later.

    But for now


    Get the disks' sector size:
    sudo smartctl -a /dev/sda | grep 'Sector Size'
    Same output for all members (sda/sdb/sdd/sde):
    512 bytes logical, 4096 bytes physical
    -> 2^12
    zpool get all | grep ashift
    NAS-Daten  ashift                         12                             local

    So the ashift setting for the pool seems to be OK.
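Sanity check of the derivation above: ashift is the base-2 logarithm of the sector size the pool aligns to, and 2^12 is indeed 4096:

```shell
# ashift 12 corresponds to 2^12-byte blocks:
echo $((1 << 12))   # prints 4096, matching the 4096-byte physical sectors
```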

    "Common" settings

    NAS-Daten  atime                 off                                   local
    NAS-Daten  xattr                 sa                                    local
    Settings user access,...
    NAS-Daten  xattr                 sa                                    local
    NAS-Daten  acltype               posix                                 local
    NAS-Daten  aclinherit            passthrough                           local


    root@omv-neu:~# zfs get all NAS-Daten | grep compress
    NAS-Daten  compressratio         1.00x                                 -
    NAS-Daten  compression           off                                   default
    NAS-Daten  refcompressratio      1.00x                                 -

    I set it to off. With compression, data transfer should be better. I am trying to get closer to 1 GBit/s = ~100 MB/s; as long as I am that far away from it, I don't want to copy all the data again just so it gets compressed. Also, my data is mostly pictures and videos, so I don't expect a huge effect from compression.
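For reference, lz4 can be switched on per dataset at any time; it only applies to newly written blocks, so it would not rewrite the existing pictures/videos. A sketch (dataset name as used above):

```shell
# lz4 costs very little CPU; data that is already on disk stays
# uncompressed until it is rewritten, so this change is non-destructive.
zfs set compression=lz4 NAS-Daten
zfs get compression,compressratio NAS-Daten
```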

    Cache/log devices: not for now

    Kernel for OMV

    - i am using Debian GNU/Linux, with Linux 6.1.15-1-pve

    - I once wanted to use the server for VMs as well, but I do that on another PC now

    Memory: the more the better. But does it help in this case?

    -> ARC min/max untouched

    /sys/module/zfs/parameters/zfs_arc_max -> 0
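For reference: zfs_arc_max = 0 means the built-in default (on Linux, roughly half of physical RAM, so about 16 GiB on this box). Pinning it explicitly would look like the sketch below; the 16 GiB value is just an example, not a recommendation:

```shell
# Persist an explicit ARC cap as a ZFS module option (example: 16 GiB);
# it takes effect after the zfs module is reloaded or the host reboots.
echo "options zfs zfs_arc_max=$((16 * 1024 * 1024 * 1024))" \
    > /etc/modprobe.d/zfs.conf
update-initramfs -u
```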

    ZFS sync - often mentioned as something that can slow a pool down; left untouched

    NAS-Daten  sync                  standard                              default
  • maltejahn

    Added the Label resolved
  • Hi,

    after a few days I wanted to share the solution to my performance problem. I tried what most people suggested... avoid the onboard SATA. The problems were, at least in part, because I had mixed onboard and additional SATA controller ports.

    After I connected all of the disks to the ASM1166 6-port controller, I got really good write values. One remaining issue was a "write" error on one disk when writing a lot of data, which degraded the ZFS pool. It seems it was the cable (as also mentioned in what Google turns up).

    For now, with the 4 disks I get (only tested a few times, 8 MP4 files, 37 GB total, no compression):

    NVMe -> ZFS: 250 MB/s

    NVMe -> ZFS with 1 MB recordsize: 370 MB/s

    SSD 870 -> ZFS with 1 MB recordsize + NVMe log (for testing; removed afterwards): 390 MB/s

    For the movie dataset I will probably set the recordsize to 256K, as there are also a lot of small files for every movie, and with 1 MB I think there would be a lot of overhead/wasted space.

    For pictures I will do the same; the rest stays as it is.
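For reference, the recordsize change would look like the sketch below ("movies" and "pictures" are example dataset names). It only applies to files written after the change, and since recordsize is an upper bound, files smaller than it are stored in a single block sized to the file anyway:

```shell
# recordsize is a per-dataset property and affects only newly written
# files; existing files keep their block size until copied/rewritten.
zfs set recordsize=256K NAS-Daten/movies
zfs set recordsize=256K NAS-Daten/pictures
zfs get recordsize NAS-Daten/movies NAS-Daten/pictures
```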
