Bonded network problem

  • Hi folks.


    I initially set up my OMV with a single Ethernet NIC (the one on the motherboard), but I also installed an Intel twin-NIC card as well. I have set the two NICs on this card to be bonded in round-robin mode. On my switch the two ports for that card are set up as a LAG. (It in fact came from a Windows Server setup where the LAG worked fine.)


    Both the single NIC and the bonded twin NICs have been allocated fixed IPs.


    With all NICs connected I can connect to either IP.



    If I disconnect the single NIC, the bonded pair cannot be accessed either.


    Could someone help me figure out where I have gone wrong? I really want to disconnect the single NIC and just have the bonded pair.

  • Re,


    Seems to me that you mixed up your config on the bonded Ethernet ports - one side with round-robin (rr), the other side with 802.3ad. Both sides have to match, so on your Linux box you should use mode 4 (802.3ad) ...


    So the best guess without seeing your config is that your IP connection works through the single link, while the bond isn't working at all ...


    But: how does the bond look on your switch? Is the LAG up or down? On your Linux box you can check it with:
    cat /proc/net/bonding/bond0 (please provide the output)
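
    For reference, a mode-4 bond in Debian's /etc/network/interfaces would look roughly like the sketch below - OMV normally writes this file itself via the WebGUI, and the interface names and addresses here are only placeholders:

      auto bond0
      iface bond0 inet static
          # placeholder address - use the fixed IP you want on the bond
          address 192.168.1.10
          netmask 255.255.255.0
          gateway 192.168.1.1
          # the two Intel ports (example names)
          bond-slaves eth1 eth2
          # mode 4 / LACP - must match the LAG on the switch
          bond-mode 802.3ad
          # link monitoring interval in ms
          bond-miimon 100
          # request fast LACPDUs
          bond-lacp-rate 1
          bond-xmit-hash-policy layer2+3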


    Anyway, I only hope you have the right "environment" for bonding - under normal circumstances (private/home use case) it's not worth the time ...


    Sc0rp

  • the right "environment" for bonding

    10+ clients, but only if the correct algorithm is used on both server and switch (I've seen very expensive and stupid implementations using even 4 NICs where all users ended up on a single link, with maximum transfer speeds that every crappy Banana Pi used as a server can easily exceed)
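
    A quick way to check whether clients really spread across the links is to watch the per-slave byte counters while a few transfers are running - the interface names below are just examples:

      # RX/TX byte counters of each slave; compare before and after a transfer
      ip -s link show eth1
      ip -s link show eth2

      # or watch them continuously (refresh every second)
      watch -n 1 'ip -s link show eth1; ip -s link show eth2'

    If only one counter grows while several clients are copying, the hash algorithm is putting everyone on the same link.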

  • Re,

    10+ clients, but only if the correct algorithm is used on both server and switch

    Naaa, if you use a correctly configured 802.3ad (LACP) bond, it will truly work/scale ... on NFS.
    Samba up to v3 doesn't benefit from link aggregation ... (or is it "didn't benefit"?) ... at all :(
    (Unfortunately I didn't test SSH/SCP or FTP ... on my temporarily set up 4x1GBit LACP bond)



    Sc0rp

  • Hi gents, I'll take a look tomorrow and post back what I have.


    I believe I have set the two Intel NICs as the bond, because when I set the device up I only had the single NIC on the motherboard connected; therefore, when I created the new bonded interface, the only two available were the NICs on the card.


    As a bit of background, the two cables used for the bond are currently set to LAG and worked fine with Windows Server 2012. That said, I am a little hazy on all the choices in OMV; on Windows it was just a case of adding the two NICs to the LAG, and I do not recall choices for round robin and all that stuff.


    The reason I wanted it (and perhaps it's not so important on OMV) is that on Windows a single NIC would max out at 110 MB/s but would constantly go up and down, whereas with the two NICs it would hold a constant 110 MB/s, leading to faster transfers.


    I don't suppose it's the end of the world!

  • if you use a correctly configured 802.3ad (LACP) bond, it will truly work/scale ... on NFS

    Huh? How's that?


    I know that I can create a 'bond' in rr fashion directly between two Linux hosts with some preparations to get nice synthetic benchmark numbers (that are trashed in every normal 'NAS scenario' with more than 2 peers involved) or to accelerate a point-to-point connection between exactly two Linux hosts... but besides that... how should this be possible especially with 802.3ad (LACP)?

  • on Windows a single NIC would max out at 110 MB/s but would constantly go up and down, whereas with the two NICs it would hold a constant 110 MB/s, leading to faster transfers.

    You should diagnose this behaviour, since what you're trying now won't work (IMO). With a single GbE link you get 110 MB/s SMB performance at the application layer; if it's below that, then something is wrong (these days). You mentioned a Win2012 server, which uses SMB Multichannel by default. Bandwidth in such a situation, with 2 NICs on each side and no LACP configured (no bonding, just individual GbE links accessible between both hosts), should exceed 200 MB/s.
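
    To rule out a basic network problem first, I would measure the raw TCP throughput of a single link with iperf3 before looking at SMB at all - the address below is just a placeholder:

      # on the OMV box
      iperf3 -s

      # on the client
      iperf3 -c 192.168.1.10

    A healthy GbE link should report roughly 940 Mbit/s; anything well below that points at cabling, driver or switch issues rather than at SMB.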


    What you observed is most probably already the result of misconfiguration, and it won't get better now that you're trying to use LACP (the only reasonable choice if more than one OS is involved).

  • Re,

    Huh? How's that?

    Put the switch either into "slave" mode, or use a switch which provides more options regarding "frame distribution", and on the Linux box do some "magic" with the "transmit hash policy to use for slave selection" ... article about the algorithms: reference (German).
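
    As an example of that "magic": the bonding driver exposes the hash policy via sysfs (bond0 as in the check above; whether it can be changed on a live bond depends on the kernel, otherwise set it in the bond config and restart the interface):

      # current transmit hash policy of the bond
      cat /sys/class/net/bond0/bonding/xmit_hash_policy

      # switch to layer3+4 hashing, i.e. distribute by IP address and port (run as root)
      echo layer3+4 > /sys/class/net/bond0/bonding/xmit_hash_policy

    Note that layer3+4 is not strictly 802.3ad-compliant, but it usually spreads multiple flows better than the default layer2 policy.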


    It is not easy, needs a lot of knowledge, and ... of course ... it will work on 1-to-1 links too (even server to server). We tried it only once for a customer, because he was totally sure about gaining 800 MiB/s+ on an 8x1GbE bond ... I think you can imagine how that failed ...
    Another approach is using VLANs (without LACP); it's tricky too, but it separates the clients the same way as the other hashes, I think (never tried it).



    But the best alternative is 10GbE ... once the switches become affordable for home users ... I think.


    For bonding/link aggregation you simply need more clients; maybe virtualization could be an approach?


    (Personally I gave up using bonds in my home network, even though you can mix fiber and copper links - for now I simply schedule my transfers for the night over single 1GbE links when I have masses of data ...)


    Sc0rp

    You should diagnose this behaviour, since what you're trying now won't work (IMO). With a single GbE link you get 110 MB/s SMB performance at the application layer; if it's below that, then something is wrong (these days). You mentioned a Win2012 server, which uses SMB Multichannel by default. Bandwidth in such a situation, with 2 NICs on each side and no LACP configured (no bonding, just individual GbE links accessible between both hosts), should exceed 200 MB/s.
    What you observed is most probably already the result of misconfiguration, and it won't get better now that you're trying to use LACP (the only reasonable choice if more than one OS is involved).

    Hi tkaiser, could you expand on this a little bit? Sorry, I am the most dangerous kind of PC guy - someone who has half a clue!


    On Windows Server I was able to aggregate (I think that is the term) two NICs, which then showed as a 2 Gb connection rather than 1.


    This was also done on the switch.


    The net result was a very consistent throughput speed.


    I am not sure I understand what you mean regarding different OSes; indeed I have OMV, Windows 10 and OS X on the same network?


    I have never seen 200 MB/s on my network. The switches in use are gigabit.
