Errors/Slowness when transferring over ethernet

  • My NAS system is a HP Z400 desktop with the following specs:
    - Xeon (Dual Core) 2.53 GHz processor
    - 16 GB RAM
    - 2 x 2 TB WD Red HDs
    - 2 x 2 TB WD Black HDs
    - 3ware 9650SE-8LPML RAID controller


    All four HDs are in a RAID 5 config. I'm booting ESXi 6.0 off a thumb drive. My OMV installation is a thin-provisioned VM that I gave one core and 5 TB of disk space.


    I have several shares created. For my first big transfer I dropped about 60 GB into one of my shares via SMB, from an external HD hooked up to my Ubuntu machine. The transfer started at 20 MB/s, then the speed dropped to 0, went up to 5, down to 0, up to 5 again, and then I got a timeout error. I tried this with several specific smaller folders and got timeout errors. I tried transferring via SCP, SFTP and FTP. Same problem every time: connection errors.


    In an attempt to troubleshoot further, I tried transferring via SMB from a Win 10 machine and got an "Error 0x8007045D: The request could not be performed because of an I/O device error." Trying a transfer via SCP (using WinSCP), I got timeout errors again.


    I directly attached the HD to the machine and everything's transferring fine but I'd like to get this network transfer issue figured out. Can anyone shed some light on what's going on here?


    Any help is appreciated.

  • Several of these:

    • Sep 19 16:04:15 HoHomv monit[813]: 'HoHomv.local' loadavg(5min) of 4.5 matches resource limit [loadavg(5min)>1.0]
    • Sep 19 16:04:15 HoHomv monit[813]: 'HoHomv.local' loadavg(1min) of 4.9 matches resource limit [loadavg(1min)>2.0]
    • Sep 19 16:04:12 HoHomv ool openmediavault-webgui: Unknown or unsupported timezone [tz=US/Eastern]
    • Sep 19 16:04:14 HoHomv omv-engined[21377]: Unknown or unsupported timezone [tz=US/Eastern]
    • Sep 19 16:02:38 HoHomv collectd[3309]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/cpu-0/cpu-user.rrd, [1474315346:110356], 1) failed with status -1.
    • Sep 19 16:02:38 HoHomv collectd[3309]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
  • After doing some Googling I'm still not sure what the "Built-in target 'write'" error is. It looks like the "resource limit" error may be linked to not having enough room for the OS, though I'm not sure how that's possible, since the OS partition is 42 GB and only 1.18 GiB of it is in use. I just changed my timezone to a specific city-based time zone, so hopefully that clears up the timezone error.


    Could this be an ethernet driver issue?
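
    For anyone reading along: as far as I understand monit, the "matches resource limit" lines compare the system load average against a configured threshold, so they say something about CPU and I/O load, not about free disk space. A minimal Python sketch for pulling the numbers out of those lines follows; the regular expression is my own guess based only on the two lines quoted above.

```python
import re

# Pattern modelled on the monit lines quoted in this thread (an assumption,
# not an official monit log format specification).
MONIT_LOAD = re.compile(
    r"loadavg\((?P<window>\d+)min\) of (?P<value>[\d.]+) "
    r"matches resource limit \[loadavg\(\d+min\)>(?P<limit>[\d.]+)\]"
)


def parse_monit_load(line):
    """Return (window_minutes, value, limit), or None if the line does not match."""
    m = MONIT_LOAD.search(line)
    if m is None:
        return None
    return int(m.group("window")), float(m.group("value")), float(m.group("limit"))


sample = [
    "Sep 19 16:04:15 HoHomv monit[813]: 'HoHomv.local' loadavg(5min) of 4.5 "
    "matches resource limit [loadavg(5min)>1.0]",
    "Sep 19 16:04:15 HoHomv monit[813]: 'HoHomv.local' loadavg(1min) of 4.9 "
    "matches resource limit [loadavg(1min)>2.0]",
]

for line in sample:
    window, value, limit = parse_monit_load(line)
    print(f"{window}-minute load average {value} is above the limit of {limit}")
```

    On a VM with a single core, a load average well above 1 means processes are queuing for CPU or waiting on I/O, which would fit the stalls described here.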

  • votdev, here's my results:

    root@omv# omv-firstaid
    Checking all RRD files. Please wait ...
    All RRD database files are valid.
    Action failed -- Other action already in progress -- please try again later


    I'm researching what that means.

  • So after I ran omv-firstaid, my logs are now getting spammed with these two lines:


    Sep 20 13:12:16 omv collectd[3309]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
    Sep 20 13:12:16 omv collectd[3309]: rrdcached plugin: rrdc_connect (unix:/var/run/rrdcached.sock) failed with status 2.


    When I say spammed I mean creating several pages of log files per minute. I'll be looking for a way to fix this or stop the log file ballooning.
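
    One quick check for the rrdc_connect message is whether anything is actually listening on the socket collectd is configured to use. The sketch below assumes (I have not confirmed this) that the failure means the socket file is missing or that no daemon is accepting connections on it. The path is the one from the log line; the function name is made up for illustration.

```python
import os
import socket


def rrdcached_socket_status(path="/var/run/rrdcached.sock"):
    """Classify a Unix socket path as 'missing', 'refused' or 'ok'."""
    if not os.path.exists(path):
        return "missing"
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(2.0)
    try:
        s.connect(path)
    except OSError:
        # The path exists but nothing accepts connections on it
        # (stale socket file, wrong file type, or permission problem).
        return "refused"
    finally:
        s.close()
    return "ok"


if __name__ == "__main__":
    print(rrdcached_socket_status())
```

    If this prints "missing" or "refused" while collectd is running, restarting rrdcached before collectd would be the first thing I'd try.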

  • OK, just for fun I decided to run sudo omv-firstaid again and the process ran through fine this time. Surprised at this result, I ran monit restart collectd and watched the syslog file. No new instances of the rrdcached error message came up. Then I accessed a share via Samba and transferred 2 GB of music files, and that ran quickly. Pleasantly surprised, I dropped in 6 GB of music files. After about 1.5 GB the transfer slowed to a crawl, then the connection timed out.


    Checking syslog I'm getting the following:


    Sep 20 15:22:40 omv monit[813]: 'omv.local' cpu wait usage of 97.3% matches resource limit [cpu wait usage>95.0%]
    Sep 20 15:22:40 omv monit[813]: 'omv.local' loadavg(5min) of 3.0 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:22:40 omv monit[813]: 'omv.local' loadavg(1min) of 2.8 matches resource limit [loadavg(1min)>2.0]
    Sep 20 15:22:45 omv monit[813]: 'nginx' failed protocol test [HTTP] at INET[127.0.0.1:80] via TCP -- HTTP: Error receiving data -- Resource temporarily unavailable
    Sep 20 15:22:45 omv monit[813]: 'nginx' trying to restart
    Sep 20 15:22:45 omv monit[813]: 'nginx' stop: /bin/systemctl
    Sep 20 15:22:52 omv collectd[17586]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/rrdcached/operations-receive-flush.rrd, [1474399359:200], 1) failed with status -1.
    Sep 20 15:22:52 omv collectd[17586]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
    Sep 20 15:22:52 omv collectd[17586]: rrdcached plugin: rrdc_update (/var/lib/rrdcached/db/localhost/rrdcached/operations-write-updates.rrd, [1474399359:45], 1) failed with status -1.
    Sep 20 15:22:52 omv collectd[17586]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
    Sep 20 15:22:52 omv monit[813]: 'nginx' start: /bin/systemctl
    Sep 20 15:23:22 omv monit[813]: 'omv.local' 'omv.local' cpu wait usage check succeeded [current cpu wait usage=28.7%]
    Sep 20 15:23:22 omv monit[813]: 'omv.local' loadavg(5min) of 2.9 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:23:22 omv monit[813]: 'omv.local' loadavg(1min) of 2.1 matches resource limit [loadavg(1min)>2.0]
    Sep 20 15:23:22 omv monit[813]: 'nginx' connection succeeded to INET[127.0.0.1:80] via TCP
    Sep 20 15:23:52 omv monit[813]: 'omv.local' loadavg(5min) of 2.6 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:23:52 omv monit[813]: 'omv.local' 'omv.local' loadavg(1min) check succeeded [current loadavg(1min)=1.3]
    Sep 20 15:24:22 omv monit[813]: 'omv.local' loadavg(5min) of 2.4 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:24:52 omv monit[813]: 'omv.local' loadavg(5min) of 2.1 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:25:22 omv monit[813]: 'omv.local' loadavg(5min) of 1.9 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:25:53 omv monit[813]: 'omv.local' loadavg(5min) of 1.7 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:26:23 omv monit[813]: 'omv.local' loadavg(5min) of 1.6 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:26:53 omv monit[813]: 'omv.local' loadavg(5min) of 1.4 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:27:23 omv monit[813]: 'omv.local' loadavg(5min) of 1.3 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:27:53 omv monit[813]: 'omv.local' loadavg(5min) of 1.2 matches resource limit [loadavg(5min)>1.0]
    Sep 20 15:28:23 omv monit[813]: 'omv.local' 'omv.local' loadavg(5min) check succeeded [current loadavg(5min)=1.1]


    So more resource limit error messages, a different rrdcached error message, and a bunch of loadavg messages. It feels like there's a buffer that fills up, and once it hits the limit everything times out.
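
    The "cpu wait usage of 97.3%" line is, as far as I can tell, the share of CPU time spent waiting on I/O. A rough way to watch that number directly is to take two samples of the aggregate "cpu" line in /proc/stat and compare them. This is only a sketch: the helper names are made up, and the field order is the one documented for Linux.

```python
def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of integer counters."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples, in percent."""
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted in user and nice, so sum only the first 8.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        return f.readline()


# Made-up sample values, chosen only to show the arithmetic.
before = "cpu 100 0 50 800 50 0 0 0 0 0"
after = "cpu 110 0 60 810 1020 0 0 0 0 0"
print(f"iowait: {iowait_percent(before, after):.1f}%")
```

    On the OMV VM you would call read_cpu_line() twice, a few seconds apart, during a transfer. A consistently high value points at the storage path (controller, datastore, virtual disk) and not at the network.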

  • Here's my collectd.conf if that will help:


    PIDFile "/var/run/collectd.pid"
    Hostname "localhost"
    FQDNLookup true
    LoadPlugin syslog
    <Plugin syslog>
    LogLevel info
    </Plugin>
    LoadPlugin rrdcached
    <Plugin rrdcached>
    DaemonAddress "unix:/var/run/rrdcached.sock"
    DataDir "/var/lib/rrdcached/db/"
    CreateFiles true
    CollectStatistics true
    </Plugin>
    LoadPlugin unixsock
    <Plugin unixsock>
    SocketFile "/var/run/collectd.socket"
    SocketGroup "root"
    SocketPerms "0660"
    </Plugin>
    LoadPlugin cpu
    LoadPlugin df
    <Plugin df>
    MountPoint "/"
    # RAID 5 datastore
    MountPoint "/media/57c40fe1-f0fc-4e43-934f-7c2cfa55bd99"
    # External hard drive
    MountPoint "/media/9444843044841760"
    IgnoreSelected false
    </Plugin>
    LoadPlugin interface
    <Plugin interface>
    Interface "eth0"
    IgnoreSelected false
    </Plugin>
    LoadPlugin load
    LoadPlugin memory

  • A thought: is it because the CPU on the OMV machine maxes out even though it's nowhere near maxed out on my ESXi host? I guess I could see OMV "maxing out" its CPU (I only gave the VM 1 of the 2 cores I have on my ESXi host) and that killing a transfer after a short period of time?


    I'm just spitballing here. I have a little bit of general Linux under my belt, but I'm a total OMV noob.

  • The mystery deepens. I tried rsync'ing from my Ubuntu machine to OMV, since it was the only method I hadn't attempted yet.


    First attempt:
    rsync -avp 2013\ Tapes/ user@IP address:/media/57c40fe1-f0fc-4e43-934f-7c2cfa55bd99/Music
    user@IP address's password:
    Could not chdir to home directory /home/user: No such file or directory
    sending incremental file list
    rsync: failed to set times on "/media/57c40fe1-f0fc-4e43-934f-7c2cfa55bd99/Music/.": Operation not permitted (1)
    Some files transferred

    sent 492,271,376 bytes received 1,713 bytes 16,687,223.36 bytes/sec
    total size is 1,964,934,730 speedup is 3.99
    rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1183) [sender=3.1.1]


    Second attempt:
    rsync -avp --info=progress2 2013\ Tapes user@IP address:/media/57c40fe1-f0fc-4e43-934f-7c2cfa55bd99/Music
    user@IP address's password:
    Could not chdir to home directory /home/user: No such file or directory
    sending incremental file list
    All files transferred
    sent 1,965,443,038 bytes received 5,441 bytes 10,710,890.89 bytes/sec
    total size is 1,964,934,730 speedup is 1.00
    ------------------------------------------------------

    I added the --info=progress2 flag and dropped the trailing slash on the source so that the folder itself gets copied over, and that was enough to transfer with no errors. Something I did notice is that my transfer speed went down as the rsync job progressed. I started at 77 MB/s and ended at around 10 MB/s. Also, there were no entries in the syslog file the entire time the transfer was running.
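
    The two rsync summaries can be cross-checked with a bit of arithmetic. rsync's "speedup" is the total size divided by the bytes sent plus received, so a value near 4 on the first attempt suggests only about a quarter of the data actually crossed the wire (presumably the rest was already on the destination), while the second attempt sent everything. A small Python sketch, using only the numbers quoted above:

```python
import re

# Matches the two closing lines of rsync's output as quoted in this post.
SUMMARY = re.compile(
    r"sent ([\d,]+) bytes\s+received ([\d,]+) bytes\s+([\d,.]+) bytes/sec\s+"
    r"total size is ([\d,]+)\s+speedup is ([\d.]+)"
)


def to_number(text):
    """Convert '1,234.5' style numbers to float."""
    return float(text.replace(",", ""))


def summarize(rsync_tail):
    """Recompute average rate, speedup and fraction sent from an rsync summary."""
    m = SUMMARY.search(rsync_tail)
    if m is None:
        raise ValueError("no rsync summary found")
    sent, received, rate, total, reported = (to_number(g) for g in m.groups())
    return {
        "rate_mb_s": rate / 1e6,                 # average over the whole run
        "speedup": total / (sent + received),    # should agree with rsync's own figure
        "reported_speedup": reported,
        "fraction_sent": sent / total,
    }


first = """sent 492,271,376 bytes received 1,713 bytes 16,687,223.36 bytes/sec
total size is 1,964,934,730 speedup is 3.99"""
second = """sent 1,965,443,038 bytes received 5,441 bytes 10,710,890.89 bytes/sec
total size is 1,964,934,730 speedup is 1.00"""

for name, tail in (("first", first), ("second", second)):
    s = summarize(tail)
    print(f"{name}: {s['rate_mb_s']:.2f} MB/s average, speedup {s['speedup']:.2f}, "
          f"{100 * s['fraction_sent']:.0f}% of the data sent")
```

    Note that the bytes/sec figure is an average over the whole run, so an average of roughly 10.7 MB/s on the second attempt is consistent with a transfer that starts fast and spends most of its time slow.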

  • So it looks like everything but rsync is giving timeout errors, and the rsync transfer speed isn't staying steady. That last test was from the HD on my Ubuntu machine to my OMV VM. Next I'll try my USB 3.0 external drive on a machine that supports USB 3.0, then rsync that over.


    Ideally I'd like more options than rsync but my Googling the syslog error messages hasn't gotten me anywhere.

  • Here's my test data:
    - One folder roughly 2 GB in size
    - 15 subfolders (music albums)
    - 279 files


    Here are my test conditions:
    Test 1: Transfer the data via rsync from the HD on my Ubuntu machine to my OMV VM
    Test 2: Transfer the data via rsync from USB 3.0 HD plugged into another machine to my OMV VM

    Results:
    Test 1: Starting speed around 70 MB/s, ending speed around 10 MB/s
    Test 2: Starting speed around 55 MB/s, ending speed around 15 MB/s


    Conclusion: I'm guessing the slowdown has to do with the way the data is written to the disk, not the protocol. I'm also guessing this could be tuned. That's something I'd like to tackle, but I'd like the ability to share via SCP and Samba first.


    Any suggestions or solutions for any of these issues are appreciated.
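
    One way to test the "disk, not protocol" guess is to take the network out of the picture and time sequential writes on the OMV VM itself. The Python sketch below writes random data in chunks, forces each chunk to disk with fsync, and reports the rate per chunk. The function name and sizes are made up for illustration, and the demo writes to a temporary directory; to test the array you would point it at a path on the RAID 5 mount instead.

```python
import os
import tempfile
import time


def timed_write(path, total_mb=64, chunk_mb=8):
    """Write total_mb of data in chunk_mb pieces and return the MB/s of each piece."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # random data, so it cannot be compressed away
    rates = []
    with open(path, "wb") as f:
        for _ in range(total_mb // chunk_mb):
            start = time.perf_counter()
            f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # push the chunk to the device so the page cache does not hide stalls
            elapsed = max(time.perf_counter() - start, 1e-9)
            rates.append(chunk_mb / elapsed)
    return rates


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for i, rate in enumerate(timed_write(os.path.join(d, "probe.bin"), total_mb=8, chunk_mb=2)):
            print(f"chunk {i}: {rate:.1f} MB/s")
```

    If the per-chunk rate collapses after the first few hundred megabytes in the same way the network transfers do, the bottleneck is on the storage side of the VM and tuning Samba or SCP will not help much.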

  • So after the rsync transfers I just decided to try scp...and that worked. And so did ftp and samba...so I have no idea what's going on. I mean I'm happy that it seems to be working now, but I don't know what was blocking it from working in the first place so if it breaks again I won't know how to fix it.


    During the samba transfer and the scp transfer I'm still getting the 'computer name' loadavg (Xmin) of X.X matches resource limit [loadavg(Xmin)>X.X] error messages but it finishes. Also, during the Samba transfer the transfer speed would fluctuate between 20 MB/s and under 1 MB/s.


    My two questions remain. Question #1: Why was I getting all of these error messages when transferring with any method other than rsync, and why would it be working now? Question #2: What transfer speeds should I be seeing? Are the speed fluctuations I'm seeing normal?

  • JanN, what are you thinking as far as ESXi problems go? Is this an issue with storage or maybe a network driver?

    There is obviously a malfunction that generates a lot of log entries, which eats resources. Normally even a VM on the described hardware should be able to do this job without nearly freezing. That still makes me think that it is an ESXi problem with I/O, the NIC driver, CPU scheduling, or something similar.
    Did you try a bare-metal installation of OMV on your machine? If that works as expected, you can use the VirtualBox plugin to run other VMs on the box with OMV (Debian) as the host system. I run two VirtualBox VMs with Windows 7 on my much slower N54L MicroServer this way...


    BR
    Jan
