High IO-Wait while copying files with Samba


    • High IO-Wait while copying files with Samba

      Hi Folks,

      I'm running OMV 2 (I haven't upgraded since installing, because back then there was a bug that broke the system when automated updates were enabled).

      Source Code

      1. ii openmediavault 2.2.4 all Open network attached storage solution
      2. ii openmediavault-downloader 2.1 all OpenMediaVault downloader plugin
      3. ii openmediavault-extplorer 1.2 all OpenMediaVault eXtplorer plugin
      4. ii openmediavault-forkeddaapd 2.0 all OpenMediaVault forked-daapd (DAAP server) plugin
      5. ii openmediavault-keyring 0.4 all GnuPG archive keys of the OpenMediaVault archive
      6. ii openmediavault-omvextrasorg 2.13.2 all OMV-Extras.org Package Repositories for OpenMediaVault
      7. ii openmediavault-openvpn 1.1 all OpenVPN plugin for OpenMediaVault.
      8. ii openmediavault-remoteshare 1.1 all remote share plugin for OpenMediaVault.
      9. ii openmediavault-virtualbox 1.3 all VirtualBox plugin for OpenMediaVault.





      That's my system.

      Here is a picture of the CPU usage from the OMV GUI:


      As you can see, IO-wait makes up most of the CPU load.

      The system is an HP ProLiant MicroServer Gen8.

      I do not use the onboard soft-RAID controller for the data drives; only the OMV OS runs off a single 2.5" disk that is connected to the onboard SATA (RAID) controller.
      For the data drives I use an HP P420 hardware RAID controller with 1 GB of cache RAM, driving 2x 10 TB drives plus 2x 1 TB SSDs as cache.
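
      For reference, the controller and cache module state can be dumped with HP's CLI tool. This is only a sketch: it assumes the hpssacli utility from HP's repository is installed and that the P420 sits in slot 0 (check "ctrl all show" for the real slot number).

      Source Code

      # list all Smart Array controllers, their cache modules and logical drives
      hpssacli ctrl all show config detail

      # quick health check of the controller, cache board and battery/capacitor
      hpssacli ctrl slot=0 show status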

      I've upgraded the stock HP MicroServer with the biggest CPU it takes and 16 GB of RAM.

      The load you see in the picture was caused by three different SMB copy jobs, all of them only reading from the OMV server.

      So I ran iotop and found that jbd2/sdb1-8 is causing the load.
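
      jbd2/sdb1-8 is the ext4 journal thread for /dev/sdb1, so even a mostly-read SMB workload can feed it, for example through access-time (atime) updates. A quick diagnostic sketch to check the mount options and journal setup behind it, with the device name taken from the iotop output below:

      Source Code

      # which filesystem and mount options sit behind sdb1
      findmnt -o TARGET,SOURCE,FSTYPE,OPTIONS /dev/sdb1

      # journal-related features and default mount options of that filesystem
      tune2fs -l /dev/sdb1 | grep -Ei 'features|journal|default mount'

      Mounting with noatime is one common way to keep pure reads from generating journal writes for access-time updates.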

      Also, while VMs are online, VirtualBox causes very high IO-wait even though the guest clients are idle ("sleeping").

      Source Code

      1. Total DISK READ: 47.90 M/s | Total DISK WRITE: 794.67 K/s
      2. TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
      3. 289 be/3 root 0.00 B/s 249.08 K/s 0.00 % 20.09 % [jbd2/sdb1-8]
      4. 62287 be/4 Manne 47.90 M/s 0.00 B/s 0.00 % 5.02 % smbd -D
      5. 1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init [2]
      6. 2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd]
      7. 3 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/0]

      I also see a flush-8:16 thread:

      Source Code

      1. Total DISK READ: 46.99 M/s | Total DISK WRITE: 1419.60 K/s
      2. TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
      3. 1390 be/4 root 0.00 B/s 0.00 B/s 0.00 % 80.02 % [flush-8:16]
      4. 62287 be/4 Manne 46.99 M/s 0.00 B/s 0.00 % 5.27 % smbd -D
      5. 1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init [2]
      6. 2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd]
      7. 3 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/0]
      8. 6 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/0]
      9. 7 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [watchdog/0]
      10. 8 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/1]


      And, although I was not able to capture it, while all VirtualBox clients were off I had jbd2/sdb1-8 causing a load of 99.9% in line 1 and smbd at 99.9% in line 2,

      and that for just four SMB file transfers, all read-only from OMV.
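
      For what it's worth, flush-8:16 is the kernel writeback thread for /dev/sdb (block device major 8, minor 16). A small sketch to watch how much dirty data is queued for writeback while such a transfer runs; nothing here is OMV-specific:

      Source Code

      # dirty and in-flight writeback data, refreshed every second
      watch -n 1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'

      # current writeback thresholds (percent of RAM allowed to be dirty before flushing starts)
      sysctl vm.dirty_background_ratio vm.dirty_ratio vm.dirty_expire_centisecs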

      This is what top gives me.

      Source Code

      1. top - 12:17:52 up 10 days, 15:15, 2 users, load average: 0,36, 0,58, 1,16
      2. Tasks: 166 total, 2 running, 163 sleeping, 0 stopped, 1 zombie
      3. %Cpu(s): 0,7 us, 0,7 sy, 0,0 ni, 96,5 id, 2,0 wa, 0,0 hi, 0,2 si, 0,0 st
      4. KiB Mem: 16428680 total, 16241696 used, 186984 free, 356996 buffers
      5. KiB Swap: 12683260 total, 0 used, 12683260 free, 15503696 cached
      6. PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      7. 62287 Manne 20 0 125m 29m 18m R 5,7 0,2 78:48.63 smbd
      8. 32822 root 20 0 54964 11m 3740 S 1,0 0,1 0:21.66 iotop
      9. 26607 openmedi 20 0 105m 27m 2848 S 0,7 0,2 0:21.37 php5-fpm
      10. 38865 openmedi 20 0 87580 7084 3028 S 0,7 0,0 0:00.63 php5-fpm
      11. 54 root 20 0 0 0 0 S 0,3 0,0 2:02.28 kworker/0:2
      12. 2699 root 20 0 26120 1876 1016 S 0,3 0,0 17:08.72 cmasm2d
      13. 3736 vbox 20 0 275m 15m 7176 S 0,3 0,1 19:34.01 VBoxSVC
      14. 1 root 20 0 10656 784 648 S 0,0 0,0 0:06.46 init
      15. 2 root 20 0 0 0 0 S 0,0 0,0 0:00.00 kthreadd
      16. 3 root 20 0 0 0 0 S 0,0 0,0 5:07.50 ksoftirqd/0
      17. 6 root rt 0 0 0 0 S 0,0 0,0 0:17.36 migration/0
      18. 7 root rt 0 0 0 0 S 0,0 0,0 0:03.81 watchdog/0
      19. 8 root rt 0 0 0 0 S 0,0 0,0 0:05.20 migration/1
      20. 9 root 20 0 0 0 0 S 0,0 0,0 0:00.00 kworker/1:0
      21. 10 root 20 0 0 0 0 S 0,0 0,0 0:09.02 ksoftirqd/1

      Any suggestions?

      Thanks
      Manne
    • HP P420 RAID cards have very picky settings, especially with SSDs, that can kill performance. I would look into that.
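      A hedged sketch of where those settings live, assuming the hpssacli CLI is available and the controller is in slot 0; the values shown are only examples, not recommendations:

      Source Code

      # review the current cache configuration
      hpssacli ctrl slot=0 show detail | grep -i cache

      # typical knobs people adjust on a P420
      hpssacli ctrl slot=0 modify cacheratio=50/50   # split the FBWC between read and write cache
      hpssacli ctrl slot=0 modify dwc=enable         # physical drive write cache; risky without FBWC battery/UPS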
    • Thanks for that hint,
      I found this:

      T_VBO wrote (Jun 2, 2016 at 8:52 AM):

      Hello,
      this is why it is better to use SSDs without an "SSD internal cache", or, most simply, server (datacenter) SSDs like the SM/PM863, DC S3510, ...
      I did a check with an HP P420 and Samsung 850 Pro SSDs.
      The best performance is with the Smart Array cache (FBWC) enabled AND the hard drive cache enabled too.
      I had the physical drive write cache disabled.
      And for some reason the HP SmartCache for my spinning discs was off; now it is activated again. Maybe that happened when I swapped an SSD once and then forgot to reactivate it while waiting for the migration. I'll have a look at how it performs now. If you don't hear back from me, this fixed it.

      Ah, I recall: I added two 10 TB discs and changed from RAID 1 to RAID 10, and that rebuild took plenty of time. So I completely forgot to change my disc size from 9 to 18 TB and to reactivate my cache.

      Thanks anyway, folks, since the initial report was made with HP SmartCache active, as it was before I added those two new discs.
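
      To double-check that the cache really is back on the logical drive after the rebuild, the same hpssacli and slot assumptions as above apply; this is only a sketch:

      Source Code

      # per-logical-drive size, status and caching/acceleration method
      hpssacli ctrl slot=0 logicaldrive all show detail | grep -iE 'size|status|caching|acceleration'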


    • First tests indicate that a 100% write cache is not the optimal configuration.

      My HP Smart Array cache is 2x 500 GB SSDs in RAID 0.

      I do see IO-wait when reading from OMV at 75 MB/s.
      I see almost no IO-wait when writing to OMV at 103 MB/s.

      I also tested this today with a client that has a spinning disc.

      The test file was an 8 GB video file.

      I just repeated it with my SSD-equipped quad-core notebook.

      Perfmon shows an almost idle notebook pulling data in at about 83 MB/s from my spinning 18 TB array (4x 10 TB in RAID 10).

      Pushing the data back to the OMV array runs at about 90% of 1 Gbit link speed, at 103 MB/s.

      The IO-wait increases dramatically if I pull several different large files from the array at the same time (not one copy after another, but several copy jobs at the very same moment).
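
      Several parallel sequential reads turn into a seeking workload on spinning disks, which is exactly what drives IO-wait up. One way to confirm it while the copies run, assuming the sysstat package is installed and sdb is the array device:

      Source Code

      # per-device throughput, request size, queue depth and latency, every 2 seconds
      iostat -dxm sdb 2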

      But when I push data from two clients to my OMV, I now see speeds close to 115 MB/s, which is about the maximum SMB speed for a 1 Gbit link. (I need to get my 2 Gbit link back, but the Netgear switches have a firmware bug.)
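
      That figure matches the theoretical ceiling; a quick back-of-the-envelope estimate, not a measurement:

      Source Code

      # 1 Gbit/s = 125 MB/s raw on the wire
      # minus roughly 6-8% Ethernet/IP/TCP/SMB framing overhead
      # => about 113-118 MB/s of usable payload
      # so ~115 MB/s from two writers means the link is saturated, not the disks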

      And writing now comes with almost no IO-wait, but instead I see about 8-10% soft-IRQ...?
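
      Soft-IRQ time at those transfer rates is usually just the network stack doing its work. A sketch for seeing which cores handle it, assuming sysstat is installed and the classic ethX interface naming used on this system:

      Source Code

      # per-CPU utilisation including the %soft column, every 2 seconds
      mpstat -P ALL 2

      # which CPU services the NIC's interrupts (interface name may differ)
      grep -i eth /proc/interrupts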

      There is also a CPU load spike when the jobs finish; see the picture.



      Cheers, Manne
