Posts by micdon

    Happy to share - hopefully it helps someone squeeze out more performance.

    I'm using a small SSD (120GB) to boot an ESXi host and to store a VM called NAS01 running the latest version of OMV. This virtual machine has native access (PCI passthrough) to two AHCI cards presenting 10 physical disks and 4 SSDs. Everything else is pretty normal - OMV has an NFS export that is mounted by the ESXi hosts as a datastore. To tune this for optimum performance I did the following:
    - Dedicated virtual switch (backbone)
    - Dedicated 10GbE network cards for storage
    - Dedicated physical 10GbE switch

    I had been running a similar setup for almost 8 years - I just rebuilt and tuned things after freeing up more HDDs to throw into the mix.

    - MTU 9000 on all 10GbE interfaces
    - Parameters for the NFS export: "async,no_subtree_check,insecure"
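    For reference, such an export in /etc/exports looks roughly like this - the path and subnet are placeholders, not my actual values:

    ```
    /export/datastore 10.0.10.0/24(rw,async,no_subtree_check,insecure)
    ```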

    On the OMV side I applied the following fine-tuning options, basically treating the VMXNET adapter like a physical Mellanox adapter:

    # Disable timestamps
    sysctl -w net.ipv4.tcp_timestamps=0

    # Selective acks
    sysctl -w net.ipv4.tcp_sack=1

    # Increase maximum length of processor input queues
    sysctl -w net.core.netdev_max_backlog=250000

    # Increase the TCP maximum and default buffer sizes
    sysctl -w net.core.rmem_max=4194304
    sysctl -w net.core.wmem_max=4194304
    sysctl -w net.core.rmem_default=4194304
    sysctl -w net.core.wmem_default=4194304
    sysctl -w net.core.optmem_max=4194304

    # Increase memory thresholds to prevent packet dropping:
    sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 4194304"

    # Enable low latency mode for TCP:
    sysctl -w net.ipv4.tcp_low_latency=1

    # Buffer split evenly between TCP Window and Applications
    sysctl -w net.ipv4.tcp_adv_win_scale=1
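    One caveat: settings applied with sysctl -w are lost on reboot. To make them persistent, the same values can go into a sysctl drop-in file (the filename below is just an example):

    ```
    # /etc/sysctl.d/90-10gbe-tuning.conf (example filename)
    net.ipv4.tcp_timestamps = 0
    net.ipv4.tcp_sack = 1
    net.core.netdev_max_backlog = 250000
    net.core.rmem_max = 4194304
    net.core.wmem_max = 4194304
    net.core.rmem_default = 4194304
    net.core.wmem_default = 4194304
    net.core.optmem_max = 4194304
    net.ipv4.tcp_rmem = 4096 87380 4194304
    net.ipv4.tcp_wmem = 4096 65536 4194304
    net.ipv4.tcp_low_latency = 1
    net.ipv4.tcp_adv_win_scale = 1
    ```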

    I have also attached a pic of the setup and the final test results with 10 x 1 TB 7200 RPM disks in RAID 10 (mixed vendors, all more than 5 years old).



    Just set up 4 x SSDs as RAID 10....
    I didn't do anything different than with hard disks.
    So no need for anything fancy like "overprovisioning and trim" stuff?
    I'll stress test this setup for a bit before I move anything productive onto it.... basically it's a datastore for ESXi served via NFS.
    10GbE Ethernet - currently only one port used, but I might use them as a twin (2 x 10GbE) with network teaming / LAG if I find the time.

    I think I figured things out.... mostly.
    I was able to get local speeds with peaks around 1500 MB/s with SSDs in RAID 10 :-)
    Over-the-wire speeds to another host in the same cluster have been close to 1100 MB/s, and I think that's about all there is to get from 10GbE.
    Happy camper - this is definitely good enough for home use....
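    For anyone wondering whether ~1100 MB/s really is close to the ceiling - a quick back-of-envelope of my own (assuming plain IPv4/TCP without header options):

    ```python
    # Theoretical payload rate of a single 10GbE link with MTU 9000,
    # assuming plain IPv4/TCP headers with no options (my assumption).
    mtu = 9000                     # bytes of IP payload per Ethernet frame
    tcp_payload = mtu - 20 - 20    # minus IPv4 and TCP headers
    frame_on_wire = mtu + 38       # + Ethernet header, FCS, preamble, interframe gap
    efficiency = tcp_payload / frame_on_wire
    line_rate_mbps = 10e9 / 8 / 1e6            # 1250 MB/s raw line rate
    print(round(efficiency * line_rate_mbps))  # ~1239 MB/s of usable payload
    ```

    So a measured 1100 MB/s through NFS and the whole virtual switch chain is not far off the wire maximum.
    
    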


    I'm curious what speeds other people are seeing if OMV is serving a datastore to ESXi via NFS over a single 10GbE link?

    My setup seems to hit a limit somewhere around 500 MB/s write and 700 MB/s read (left picture).
    If I remove the 10GbE limit (right picture) I see writes around 700 MB/s and reads hitting more than 1 GB/s.
    OMV 5.x, ConnectX 10GbE cards, Dell switch with jumbo frame support, MTU 9000 - everything else default.

    Many Thanks


    • Capture.JPG


    You need one controller for ESXi and another one that you can pass through. E.g. on my DL165 G7 the onboard HP controller is used by ESXi to access a Kingston 240GB SSD. All standard until here...
    One of the PCIe slots is used for a SAS 2008 controller (cross-flashed IBM M1015). I then configured a new VM for OMV with 4 CPUs and 8GB RAM and assigned the SAS 2008 controller to the VM. That's all that needs to be done...
    From inside the VM you will see sda as "Virtual Disk" and all other disks as physical disks, including SMART attributes and without any difference to a physical machine. I'm even running a StorageWorks D2700 enclosure as part of the mix, and I have full control over all LEDs, all sensors, positions of disks in bays and so on. It took me an hour of reading through the SES-2 wiki and putting together a few lines of Python code.
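    One way to do that kind of SES-2 LED control from Python is to wrap the sg_ses tool from sg3-utils. A minimal sketch of my own - not necessarily how the original script works, and the device path and slot numbers are placeholders:

    ```python
    # Hypothetical sketch: drive SES-2 enclosure LEDs via sg_ses (sg3-utils).
    import subprocess

    def sg_ses_cmd(enclosure, slot, led, on):
        """Build an sg_ses command that sets or clears a per-slot LED.

        led is "locate" (blue) or "fault" (red), both standard SES-2
        array-device element controls that sg_ses understands.
        """
        action = ("--set=" if on else "--clear=") + led
        return ["sg_ses", "--index=" + str(slot), action, enclosure]

    def set_led(enclosure, slot, led, on=True):
        # needs root and the enclosure's SES device node, e.g. /dev/sg4
        subprocess.run(sg_ses_cmd(enclosure, slot, led, on), check=True)

    # e.g. light the blue locate LED on drive bay 3:
    # set_led("/dev/sg4", 3, "locate")
    print(sg_ses_cmd("/dev/sg4", 3, "locate", True))
    ```
    
    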

    I have added some details about the configuration... This is one out of two instances - both run well, and one will shortly reach 1 PB of data moved without troubles :)
    The reason why I throw so many CPUs and so much RAM into the pot is that I'm running a 10GbE backend and some tuning to speed up the end-to-end chain.... E.g. I see up to 1200 MB/s within VMs accessing storage from OMV, which allows me to boot a Windows 10 VM in less than 5 seconds. I run pretty much everything virtual - including gaming and GPU number crunchers...


    Does your server support PCI passthrough? I'm also running OMV instances on top of ESXi as virtual machines, and one of my hosts is a DL165 G7. Here is the trick:
    1.) You need an additional SAS or SATA controller in infrastructure (non-RAID) mode
    2.) Use PCI passthrough to hand the SAS controller directly over to your OMV VM
    This setup gives you native access to all hard disks, including SMART, which is important. In addition it is as fast as the native machine would be without ESXi in the middle...

    The best part is that you can use the VM with OpenMediaVault to provide super fast NFS storage back to the same ESXi host....

    This setup seems to be very reliable.... My ESXi hosts boot from USB, use a small 240GB SSD to host the OMV VM, and everything else works via NFS.


    Check out the Athena 1U and 2U products... They are 80 Plus certified, they do spin down the fans, and they are not too expensive.
    In most cases you will have a hard time finding something that fits perfectly and you will have to do some improvisation. I deployed a 2U Athena to replace the triple-redundant PSU in a Supermicro case and it worked out pretty well.

    Best Regards

    Your array (RAID 5) normally needs 3 disks but one has failed. Very likely your replacement disk wasn't empty when you added it, which can create a lot of trouble.

    To evacuate the data (that's the low-risk approach), set the array to read-only prior to copying the data to another disk set.
    To do this you would unmount your file system:
    umount /dev/md127
    Stop the array
    mdadm --stop /dev/md127
    And start it again in readonly mode:
    mdadm --readonly /dev/md127
    And mount it again with
    mount /dev/md127
    Now all of your data should be accessible, but of course read-only. You can't rebuild in this mode, but you can copy data to an evacuation disk.

    If you want to recover / rebuild the array you need to set it to writable first - use the same steps as above (unmount, stop, start, mount) but replace --readonly with --readwrite:
    mdadm --readwrite /dev/md127

    Once your array is set to read-write it will accept the new disk and it will start to rebuild itself. Prior to adding the new disk please make sure that it is completely empty. You can wipe it with the OMV GUI.
    After this is done you can also add the disk with the OMV GUI under Raid Management / Recovery. This should fully do the job.

    If you want detailed status messages you can SSH into your box and run:
    watch -n 1 cat /proc/mdstat
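    If you'd rather script against it than eyeball the watch output, the progress figure is easy to pull out of /proc/mdstat - a small sketch of my own (the sample line is made up):

    ```python
    # Pull the rebuild percentage out of /proc/mdstat.
    import re

    def rebuild_progress(mdstat_text):
        """Return the recovery/resync percentage, or None if no rebuild is running."""
        m = re.search(r"(?:recovery|resync)\s*=\s*([\d.]+)%", mdstat_text)
        return float(m.group(1)) if m else None

    # In real use: with open("/proc/mdstat") as f: print(rebuild_progress(f.read()))
    sample = "[=>...................]  recovery =  9.4% (123456/1310720)"
    print(rebuild_progress(sample))  # 9.4
    ```
    
    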

    Hopefully this gets you back to a healthy setup :-)
    Good Luck

    The yearly graph isn't great for judging performance.... Here are some benchmarks inside VMs.

    I'm quite happy with the performance of the system... Of course it fluctuates if a lot of stuff is going on, such as replication of data.

    To quickly comment on the settings question:
    I'm running two OMV servers as VMs on two different ESXi 5.5 hosts. Both VMs are equipped with 4 vCPUs and 8 GB vRAM and use PCI passthrough to get native access to an LSI 2008 SAS controller.
    There is a third ESXi host that operates diskless and just consumes storage via NFS. All three hosts interconnect via 10GbE on a dedicated storage VLAN on a Dell X1052 switch. Disk-wise I stick to either RAID 10 or RAID 6 depending on the task. To get the crazy spikes (1200 MB/s) you have to optimize the entire chain for speed and add enough spindles :-)
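    To put a rough number on "enough spindles" - a back-of-envelope sketch with assumed per-disk speeds, not measured values:

    ```python
    # Why sequential throughput scales with spindle count in RAID 10
    # (idealized back-of-envelope, assumed per-disk speeds).
    def raid10_seq_read(disks, per_disk_mbps):
        # reads can be served by all members in parallel
        return disks * per_disk_mbps

    def raid10_seq_write(disks, per_disk_mbps):
        # every write goes to both halves of a mirror pair,
        # so only half the spindles add write bandwidth
        return disks // 2 * per_disk_mbps

    # e.g. 10 spindles at an assumed ~120 MB/s sequential each:
    print(raid10_seq_read(10, 120), raid10_seq_write(10, 120))  # 1200 600
    ```
    
    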

    I posted the chart in my original post only for the quantity of data, not for the speed of the setup... If you are interested I can copy/paste some parameters related to NFS, kernel, and 10GbE optimization...

    SSH into your server and post the output of this command:
    mdadm --detail /dev/md127

    Assuming that md127 is your broken RAID 5 array...

    One thing that you can try as well is to stop the array and mount it again in read-only mode:
    mdadm --readonly /dev/md127

    GUI question: I echo that it's in most cases not a good idea to run a GUI... If you really want one, "apt-get install gnome-core" will give you a GUI that isn't too bloated.
    Drive question: The web GUI effectively just executes commands underneath. Everything is open and transparent and 100% Debian based.
    VLAN question: You can configure VLANs via SSH / console as mentioned by ryecoaaron - it's all Debian underneath, and there is a wiki for VLAN setup somewhere.
    RAID 1 question: If I got this right, you want to create a RAID 1 setup with only one disk?
    OMV is using mdadm underneath, and mdadm supports stuff like this, but you have to do it via SSH:
    "mdadm --create --verbose /dev/md0 --level=mirror --raid-devices=2 /dev/sdb missing"
    This creates a mirror using only one real disk (/dev/sdb) and "missing". It should show up as a degraded RAID 1 in the web GUI, and once you add your drive it can be synchronized.

    I hope this helps :-)


    Just out of curiosity.... How much data have you moved with your OMV server (traffic by year)?
    Well, I don't think I'll see the petabyte unless I do some crazy stuff....

    Most traffic is generated by the 15 VMs that are running 24x7 in my home setup...
    OMV clearly does an excellent job to provide NFS storage to ESXI.


    Is there actually something like Crashplan in Germany, too?

    For about 60 dollars a year you can run unlimited backups there....
    The software is brilliant and works on OMV without problems. Of course it needs a decent uplink (I have 75 Mbit symmetric). The initial backup (about 10 TB) took weeks :-)
    The incrementals (every 30 minutes) are super fast because not that much really changes.

    Greetings from NJ

    Thanks for the info :-) That's good to know that mix and match should work. I ordered a 12-pack of 500 GB disks, trays and 4 x 60mm Noctua fans to complete the project.

    Quick update:
    I managed to get the SES stuff going. I can see all temperature probes, all status messages from the box, the power supplies plus SAS expanders, and I can fully control the 50 LEDs (2 per drive) to e.g. flash the red LED in case of SMART failures or the blue one to locate a drive. The only thing that doesn't work is setting the fan speed lower - I have tried all options / variants without success. Today the silent PWM fans should come in, and eventually I'll drop in a picture of the unmodified and modified fan pods. In addition I will create a Nagios Python script to fully monitor the box, including RPMs and temperatures, to see if the noise reduction works out without overheating....

    Some update on the noise optimization of the fan pods:
    1.) The stock fans are Delta 12V 1.68 Ampere 60x60x38 fans. Each fan pod has 2 of them and there are 2 fan pods. This adds up to 12 x 1.68 x 4 = ~80 Watts of cooling. Absolutely not required for 2.5" desktop drives with 7200 RPM that are designed to run without active cooling in notebooks and other devices....
    2.) There aren't many PWM fan options in 60x60x38 - the Noctuas that I ordered are a joke for this application - expensive and with almost zero airflow.
    3.) I switched the pods from dual-fan into single-fan mode, which worked out fairly well and at zero cost. The blue wire on the proprietary PWM connector is the speed signal. Basically I cut the leads from FAN 2 and only connected the blue wire with a Y connector to the blue wire on FAN 1. As a result the box is quieter and draws less power without raising an alarm. (FAN 1 and 2 report the same speed, and FAN 3 and 4 report the same speed.)
    4.) Unplugging both fan pods doesn't bother the disks at all, but the temperature of the SAS I/O modules slowly goes up until it reaches 48 C. If someone doesn't mind the fault LEDs this might be an OK way to go. I can't stand fault LEDs, so I'm still investigating to find the best possible compromise: full stock functionality with less noise and power.
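    For reference, the power arithmetic from step 1, spelled out:

    ```python
    # Cooling-power arithmetic from step 1: each stock Delta fan is rated
    # 12 V x 1.68 A, and there are 2 pods x 2 fans = 4 of them in total.
    volts, amps, fan_count = 12, 1.68, 4
    per_fan_watts = volts * amps             # ~20.2 W per fan
    total_watts = per_fan_watts * fan_count  # ~80.6 W, the ~80 W quoted above
    print(per_fan_watts, total_watts)
    ```
    
    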


    I'm adding a second NAS instance to my network, and after quickly zapping through the alternatives I decided to set up OMV number 2 :-). Similar to my primary OMV installation, this box will be virtualized on top of ESXi 5.5 with 4 vCPUs, 8 GB RAM and a dedicated M1015 cross-flashed to IT mode (P20 firmware). The hardware underneath is a ProLiant DL165 G7 with 96GB RAM, 2 x 12-core CPUs and a dual 10GbE card. The server has a small internal SSD to provide a datastore for the OMV system disk. All the data drives will live externally in a D2700 enclosure.

    About the D2700 Enclosure:
    I had a hard time finding these facts on the internet - so I thought I'd share:
    - Dual 6 Gbps ports on two separate I/O cards
    - Redundant 80 Plus Gold power supplies and fan pods
    - Heavy, solid and built like a tank
    - Idle power consumption: 49 Watts, which is what I was expecting for newer gear
    - Noise level: The only significant source of noise are the two fan pods, each equipped with dual PWM fans. Overall really not too noisy, but I might give it a try to swap the PWM fans for lighter-duty low-noise fans.
    - Cost: I got mine for 40 bucks, and it looks like new and came with all cables, which is a real steal.

    SAS vs SATA?
    I'm not yet sure what I will throw in, but as far as I know I can't mix SAS and SATA on a SAS channel. I got 2 SAS 10K drives just to test this thing, and they are drawing 12 Watts on the Kill A Watt, which is a lot. Once the box is fully hooked up I will try throwing in various SAS and SATA drives, including SSDs, and I will post my test results here....

    The 10,000 RPM SAS disks (single channel) performed poorly and did not provide SMART data; both effects might be related to the fact that I'm not using a Smart Array controller and / or that the firmware on this box isn't up to date. I got these two 72GB disks for free and I'm disposing of them :-)
    I used the frames to hot-plug a Kingston SSDNow V300 that was on my shelf; it does report SMART and it runs at the expected speeds (average read 235 MB/s / write 395 MB/s).
    I also tested a Toshiba 500GB SATA drive that also provided full SMART data and measured an average read of 115.4 MB/s and a write of 83.5 MB/s.
    Interestingly, adding these two disks didn't increase the idle power consumption, which settles SATA as the way forward. I'm glad that my offer for 10K SAS disks expired :-)

    As a next step I'm shopping for some disks and frames, and I'll try to hack the fan pods - they are a bit too noisy for me...

    I hope this adds some valuable information


    Here is what I'm doing - it's a slightly different philosophy:
    Hardware: Gigabyte 990FXA-UD5, 32GB HyperX, FX-8350, which gives you a 4 x SATA and a 6 x SATA controller onboard. I'm using the 4 x SATA controller to run an SSD which is used as a datastore for vSphere. On top are 3 VMs called omv (with the 6 SATA ports assigned), plex and owncloud. One could argue that this setup should be slower than running everything on bare metal, but interestingly it is not. These machines interconnect with unlimited bandwidth if they are on the same host, and you can snapshot / restore each of them individually, which is a great advantage to test stuff and revert back in seconds if something goes wrong. It also gives you the headroom to clone / copy machines or to quickly set up something to test and play with.

    Just to share my conclusions after running OMV for one month 24x7 as a central storage platform providing disk space to 2 x ESXi nodes and lots of VMs.

    • The concept of virtualizing OMV worked out very well, and vSphere 5.5 and newer provides some options to keep VMs alive even if central storage goes down (important for applying patches)
    • Running 10GbE requires a lot of specific tuning (adapter settings, TCP/IP settings, vSwitch settings, NFS configuration), but once this is done remote storage is just as fast as locally attached storage
    • A single SSD on a SATA port can't saturate a 10GbE link even if everything is configured correctly... A good test to see where the bottlenecks are is to use a ramdisk on both systems (source and destination)
    • When mdadm and 10GbE play together you need to provide enough CPU horsepower. I managed to create a load peak of 90 while putting the system under stress (datastore migration of 20 VMs in parallel to different mdadm arrays).
      Pretty mean stuff, but I really like to test things before I trust them

    Here are some statistics:

    I really like your write-ups, and they saved me a lot of time when I was implementing OMV a few weeks ago. If you do DEFCON this year, drop me a PM and I'll buy you some beers.
    Out of curiosity: did you measure the VPN throughput (iperf or similar) with the Pi 2?

    It's a pretty cool device for stuff like this. Some months ago I started to use a Pi for the exact same thing and recently replaced it with a Pi 2 because USB on the old Pi really sucked badly. My Pi controls multiple UPS devices and does intelligent power management to maintain a meaningful shutdown / startup sequence of my ESXi nodes and virtual machines. It communicates with my Nagios server and it can respond to and fix stuff like services or devices that fail. I used to travel a lot, and the Pi plus some core devices share a car battery, which gave me 2 days of remote capability - even if everything else should be out of order. I envy those people that make time to finish and publish their projects. I'm simply too busy, so I rather donate to great projects than share my own stuff...