SMB writes fast, but NFS writes at almost a third of the speed.

    • OMV 3.x (stable)
    • Resolved


    • I'm having a nightmare of a time here. I have an ESXi 6.5 all-in-one with one VM for my media and one VM for my OMV. It's all running on a J3455B-ITX with 16 GB of RAM and four 3 TB WD Reds through an LSI 9211-8i. Since the LSI HBA (flashed to IT mode) is passed through to the OMV VM, I have 4 GB of RAM dedicated to it. Both VMs are set to use 4 CPUs of the quad core. The MTU is set to 9000 on every NIC in ESXi as well as in the interface settings for each VM OS (both Debian).

      The drives are set up using SnapRAID and mergerfs.
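
      For context, the mergerfs pool is the usual branches-in-fstab setup; the branch paths and options below are assumptions rather than the exact config:

      Source Code

      # /etc/fstab (sketch -- branch paths and options are assumptions)
      # Pool the four data disks into a single mount point that OMV shares out
      /srv/disk1:/srv/disk2:/srv/disk3:/srv/disk4  /srv/pool  fuse.mergerfs  defaults,allow_other,use_ino,category.create=epmfs  0  0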

      The test is to copy a 1 GB file from my non-VM Windows 10 desktop workstation to my OMV VM. Another test is to copy a 1 GB file from my media box VM to my OMV VM.

      These are the baffling results:
      1. SMB shares will write a 1 GB file at around 105-110 MB/s using Windows copy
      2. SMB shares will read a 1 GB file at around 110-115 MB/s using Windows copy
      3. NFS shares will write a 1 GB file at around 40-45 MB/s using dd
      4. NFS shares will read a 1 GB file at around 120-125 MB/s using dd
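
      For reference, the dd side of the test was along these lines (the mount point, file name, size, and block size are assumptions):

      Source Code

      # Write test: push a ~1 GB file onto the NFS mount and flush it to disk
      dd if=/dev/zero of=/mnt/omv/testfile bs=1M count=1024 conv=fdatasync

      # Read test: pull the same file back, discarding the data
      dd if=/mnt/omv/testfile of=/dev/null bs=1M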
      NFS server options are secure,async. I have tried sync,no_subtree_check,insecure,no_acl,no_root_squash,wdelay,crossmnt,fsid=1, and every combination thereof.
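
      Spelled out, the export line looks something like this (the share path and allowed subnet are assumptions):

      Source Code

      # /etc/exports (sketch -- share path and client subnet are assumptions)
      /export/pool 192.168.1.0/24(rw,secure,async,no_subtree_check)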

      The client mounts with rsize/wsize=524288. I have tried vers=3, 4, and 4.1. I have also tried rsize/wsize=65536 and 1048576, as well as hard,intr,sync,actimeo=0,fsc,nosharecache,nolock,noatime,nodiratime, and every combination thereof.
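
      A representative mount from the media VM is roughly the following (the export path and mount point are assumptions):

      Source Code

      # Mount the OMV export on the media VM (paths are assumptions)
      mount -t nfs -o vers=3,proto=tcp,rsize=524288,wsize=524288,hard 192.168.1.13:/export/pool /mnt/omv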

      I have tried creating a separate port group for the OMV NIC. I've tried the E1000E instead of the VMXNET3 driver.

      CPU and RAM loads on OMV are minimal in all cases. They are not the bottleneck.

      I've run iperf tests and they look fine:

      Source Code

      ------------------------------------------------------------
      Client connecting to 192.168.1.13, TCP port 5001
      TCP window size: 1.84 MByte (default)
      ------------------------------------------------------------
      [  3] local 192.168.1.11 port 55193 connected with 192.168.1.13 port 5001
      [ ID] Interval       Transfer     Bandwidth
      [  3]  0.0-10.0 sec  2.36 GBytes  2.03 Gbits/sec
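
      That came from a plain iperf2 run with defaults, along these lines (the exact invocation is a sketch):

      Source Code

      # On the OMV VM (192.168.1.13): start the iperf2 server
      iperf -s

      # On the media VM (192.168.1.11): run a 10-second TCP test against it
      iperf -c 192.168.1.13 -t 10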
      I just can't figure out why a separate machine on the network would have much faster writes over Samba than another VM in the same box has over NFS. Note that the OMV NFS share isn't even mounted within ESXi yet; this is straight VM to VM. When I do mount the OMV NFS export in ESXi, I get the same slow write speed.
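
      (For the ESXi case, the export is added as a plain NFSv3 datastore from the host shell, roughly like this; the share path and datastore name are assumptions:)

      Source Code

      # Add the OMV export as an NFSv3 datastore on the ESXi host
      # (share path and datastore name are assumptions)
      esxcli storage nfs add -H 192.168.1.13 -s /export/pool -v omv-nfs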

      I've also tried OMV 2.1.


      Any help would be greatly appreciated.


    • I'm working my way through everything I can find on troubleshooting performance and I think I might have found the culprit. After reading and writing a 1 GB file, the output of mountstats gives me this:

      Source Code

      NFS mount options: rw,vers=3,rsize=524288,wsize=524288,namlen=255,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.13,mountvers=3,mountport=48940,mountproto=udp,local_lock=none
      NFS server capabilities: caps=0x3fef,wtmult=4096,dtsize=4096,bsize=0,namlen=255
      NFS security flavor: 1 pseudoflavor: 0
      NFS byte counts:
        applications read 512000000 bytes via read(2)
        applications wrote 512000000 bytes via write(2)
        applications read 0 bytes via O_DIRECT read(2)
        applications wrote 0 bytes via O_DIRECT write(2)
        client read 512000000 bytes via NFS READ
        client wrote 512000000 bytes via NFS WRITE
      RPC statistics:
        1971 RPC requests sent, 1971 RPC replies received (0 XIDs not found)
        average backlog queue length: 0
      GETATTR:
        4 ops (0%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 91 avg bytes received per op: 112
        backlog wait: 0.000000 RTT: 5.000000 total execute time: 5.000000 (milliseconds)
      SETATTR:
        1 ops (0%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 140 avg bytes received per op: 144
        backlog wait: 0.000000 RTT: 121.000000 total execute time: 121.000000 (milliseconds)
      LOOKUP:
        1 ops (0%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 104 avg bytes received per op: 228
        backlog wait: 0.000000 RTT: 4.000000 total execute time: 4.000000 (milliseconds)
      ACCESS:
        4 ops (0%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 102 avg bytes received per op: 120
        backlog wait: 0.000000 RTT: 2.000000 total execute time: 2.250000 (milliseconds)
      READ:
        977 ops (49%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 116 avg bytes received per op: 524181
        backlog wait: 0.016377 RTT: 18.697032 total execute time: 19.113613 (milliseconds)
      WRITE:
        977 ops (49%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 524177 avg bytes received per op: 136
        backlog wait: 4610.141249 RTT: 491.918117 total execute time: 5102.354145 (milliseconds)
      READDIRPLUS:
        1 ops (0%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 116 avg bytes received per op: 448
        backlog wait: 0.000000 RTT: 4.000000 total execute time: 4.000000 (milliseconds)
      FSINFO:
        2 ops (0%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 84 avg bytes received per op: 80
        backlog wait: 0.000000 RTT: 0.500000 total execute time: 1.000000 (milliseconds)
      PATHCONF:
        1 ops (0%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 84 avg bytes received per op: 56
        backlog wait: 0.000000 RTT: 1.000000 total execute time: 1.000000 (milliseconds)
      COMMIT:
        1 ops (0%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 116 avg bytes received per op: 128
        backlog wait: 0.000000 RTT: 0.000000 total execute time: 0.000000 (milliseconds)

      Notice the lines:

      Source Code

      WRITE:
        977 ops (49%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 524177 avg bytes received per op: 136
        backlog wait: 4610.141249 RTT: 491.918117 total execute time: 5102.354145 (milliseconds)
      A backlog wait of 4610 ms seems excessively high. I'm looking at ways to lower it, but I'm not finding much. I found this article, but I don't have a subscription.
    • I've made some progress based on the high backlog wait times. I found the following tunable:

      Source Code

      echo -e "options sunrpc tcp_slot_table_entries=128\noptions sunrpc udp_slot_table_entries=128" > /etc/modprobe.d/sunrpc.conf
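
      To confirm the larger slot tables are actually in use after a reboot (or after reloading the sunrpc module), the values can be read back from /proc; these are the standard sysctl paths, so this is just a quick check:

      Source Code

      # Read back the RPC slot table sizes the kernel is currently using
      cat /proc/sys/sunrpc/tcp_slot_table_entries
      cat /proc/sys/sunrpc/udp_slot_table_entries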

      This increases my speed by almost 25%, bringing it up to around 55-60 MB/s. I'm not sure why this works, as according to this article these values (even udp?) are dynamically managed by the server. mountstats now shows:

      Source Code

      WRITE:
        977 ops (98%) 0 retrans (0%) 0 major timeouts
        avg bytes sent per op: 524177 avg bytes received per op: 136
        backlog wait: 3191.623337 RTT: 321.162743 total execute time: 3513.044012 (milliseconds)
    • One last post to close out the thread. If you are thinking of using SnapRAID as a datastore for VM storage, don't. SnapRAID is for files that don't change much, and VM files change all the time. I decided to go with ZFS instead. In doing so, I've been able to drop mergerfs, as the zpool is now presented as one filesystem. In this configuration I get speeds of 140-150 MB/s over NFS. It's a much better solution. Cheers.
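
      For anyone following along, the switch was roughly the following; the pool name, device paths, and raidz level here are assumptions rather than the exact layout used:

      Source Code

      # Build a single raidz pool from the four WD Reds (device ids are placeholders)
      zpool create tank raidz \
          /dev/disk/by-id/ata-WDC_RED_1 /dev/disk/by-id/ata-WDC_RED_2 \
          /dev/disk/by-id/ata-WDC_RED_3 /dev/disk/by-id/ata-WDC_RED_4

      # One filesystem for the VM store; the NFS export itself is still
      # managed through OMV / /etc/exports as before
      zfs create tank/vmstore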
    • NFS is generally stateless, so it's probably issuing many more commands, which greatly increases the mergerfs overhead (since most of the overhead is in the non-read/write commands). I'm unfamiliar with SMB, but it sounds like it might be issuing fewer commands to the underlying FS, which would lead to better overall performance.

      And yes, SnapRAID is for write-once, read-many kinds of layouts. As is mergerfs.
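
      One way to see the call mix the NFS server is handing down to mergerfs is the per-operation counters in nfs-utils; a rough sketch, assuming nfsstat is available on the OMV VM:

      Source Code

      # Dump the NFSv3 server-side operation counts (getattr, access, lookup, read, write, ...)
      # Run once before and once after a copy and compare, to see how many
      # non-read/write calls accompany the data transfer.
      nfsstat -s -3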