Posts by crashtest

    Is it that catastrophic? It's not recommended to be used in RAID5, but it's the default filesystem for both Fedora and Suse (openSUSE and SLES) and has been for years

    Workstation / client use of a filesystem (Suse / Fedora) is a bit different. Much as it is with Windows, client problems are frequently written off to the user and to the idea that client hardware is often bottom of the barrel. Given the complexity of client software, viruses, etc., clients need to be rebuilt from time to time. (Sometimes as often as every two or three years.) Rebuilding a client is a broadly accepted solution to client problems.

    Servers are another matter. Servers tend to concentrate irreplaceable data in one place. Accordingly, long term stability is ranked much higher. Two additional factors of significance, when it comes to Linux noobs, are that there's this attraction to RAID5 (without a good understanding of what it is) AND that, rather than giving thought to backup, a noob will invest their money in a single platform with large drives. Since large drives are still relatively expensive, noobs tend to shy away from spending additional money on drives, for the needed data backup / redundancy, because of their perception of the safety they have in RAID5.

    With all of that noted, whether it's a client or a server, hard drive issues are inevitable. Since server OS's are far more stable, when compared to a client, there's a good chance that the first issues users will experience might be with data drives.

    Now, throw BTRFS RAID5 problems and BTRFS filesystem recoveries (difficult to nearly impossible) into the mix, along with users that have no backup. Yeah, from a forum support perspective, that would be a freaking disaster. It's already bad enough, as it is, with home users insisting on using mdadm RAID without backup and SBC users setting up RAID over USB connections. If even 1 in 100 to 250 users had a problem with BTRFS, the forum would be overrun. The last thing this forum needs, from a support perspective, is the wide adoption of an unstable filesystem on data drives.

    If you wonder about these statements then perhaps this will convince you:…

    Wow. That list is extensive and the issues go far beyond what they were -> advertising several years ago. (There was zero progress, for years at a time, on the problems they would admit to.) While I observed the effects of a couple of those behaviors, I didn't attempt to get to the bottom of "why" it was happening. My thoughts were, if there are obvious issues out in broad daylight with a simple volume, there would almost have to be more that are hidden, waiting to be discovered. (I simply moved on to an FS that could deal with abrupt power loss.)

    Some years ago, there was a forum moderator that was pushing BTRFS on Linux newbies largely because it was integrated into the kernel. I never understood his reasoning or the "unfounded trust". Since most newb's don't have backup, I saw recommending BTRFS as a potential disaster from a forum support perspective. Thankfully, it wasn't widely adopted.

    For those who may be interested in a plain language explanation of the status of BTRFS, as of a few years ago, this -> article might be an interesting read.

    I still wonder about these statements. Didn’t we point out multiple times that BTRFS RAID 5 is as (un)stable as any RAID 5? As far as I know every RAID 5 would have a potential write hole issue in case of power loss. BTRFS devs are just the only ones who flag their RAID 5 as unstable due to it.

    In my case (several years ago) the instability had nothing to do with RAID5. I was working with a plain BTRFS single drive volume. I was testing BTRFS for a mobile application where power disconnects would not be unusual. In theory, power losses should have no effect on a COW volume but that was not the case with BTRFS.

    After a sudden power loss, the filesystem threw an error. I forget exactly what it was (a sequence number for operations that the filesystem found to be out of sequence, or something like that). I delved into BTRFS' utilities and errata to "fix" the issue. On the first occasion, I managed to "patch" the filesystem and bring it out of read-only. On two other occasions, I couldn't recover it and had to move to more drastic measures where the filesystem was lost.

    In recent times, I'm still using the same drive (it's a WD USB external) with BTRFS in a backup role. Lately, I haven't had any issues but, after dealing with those earlier events, my faith in BTRFS has been damaged. While the BTRFS devs may have fixed the issues responsible, based on my own experience, I couldn't recommend BTRFS as a primary data store. (That's just my opinion.)

    - EXT4 with mergerfs and snapraid

    => Easy and simple but, if I understand well, when a disk fail data will be offline during the fix command ?

    Are "rebuild" time as long as raid/zraid ?

    Good data and drives won't be off-line, but you should take them off-line. Since the file data on good drives are part of the "fix" command's parity calculations, for recreating a failed disk, you wouldn't want users altering those files during the fix operation. That might result in unrecoverable errors.
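
    To show why files on the good drives need to stay untouched during a fix, here is a simplified sketch using plain XOR parity. It is only an illustration of the principle, not SnapRAID's actual code, and the "disk" contents are made up for the example.

```python
from functools import reduce

def xor_parity(blocks):
    """Return the byte-wise XOR of several equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity):
    """Recreate the one missing block from the survivors plus the parity."""
    return xor_parity(list(surviving_blocks) + [parity])

# Made-up contents standing in for the data on three drives (equal length).
disk_a = b"family photos"
disk_b = b"tax documents"
disk_c = b"music library"

# Parity as it was calculated at the last sync.
parity = xor_parity([disk_a, disk_b, disk_c])

# disk_b fails. With the good drives unchanged, it is recreated exactly.
recovered = rebuild_missing([disk_a, disk_c], parity)

# If a user alters a file on a good drive before the fix finishes,
# the old parity no longer matches and the rebuilt data is wrong.
altered_a = b"FAMILY photos"
corrupted = rebuild_missing([altered_a, disk_c], parity)

print(recovered == disk_b)
print(corrupted == disk_b)
```

    The second result is the "unrecoverable errors" case: the parity still describes the old file, so the recreated data no longer matches what was on the failed disk.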

    Rebuild times are dependent on many things. In the case of a SnapRAID rebuild, the largest factor would be the number and size of data files to be recreated.

    - BTRFS with raid 5.

    => Best integrated in OMV, data available during replacement (even if take a long time to rebuild)

    I know about the write hole, I have a ups and since my data are not crucial, I maybe could take my chances ?

    I wouldn't do RAID5 in BTRFS but that's your call. While it was years ago and I'm sure things have improved in the meantime, I had a couple of experiences with BTRFS and its utilities that resulted in the loss of the entire filesystem.

    - ZFS with zraid1

    => Could be the answer but isn't that overkill for my need ?

    What downside are to ZFS apart from integration that needs plugin (like mergerfs and snapraid btw) and not upgradable (not a problem for me, won't change the hdd capacity for a while) ?

    While I have 1 server configured with RAIDZ1, I've been using ZFS in Zmirrors, for data integrity, for at least 8 years. I've replaced two drives, in a Zmirror, during that time period without a problem. What I appreciate most is that ZFS gives indicators that a drive is going bad before SMART begins to give warnings. That's far better than taking unrecoverable errors as a drive actually begins to die.

    The plugin, in my opinion, is not even a consideration. That's more a matter of licensing than anything else. ZFS integration in OMV is very good. The plugin provides 95% of the ZFS features most users will need, in the GUI, along with displaying snapshots captured, etc. If something special is needed, there's always the CLI.

    This -> document will guide you through the installation of zfs-auto-snapshot for preconfigured fully automated, self rotating and purging snapshots. The document even covers "unhiding" past snapshots for simple file and folder recoveries using a network client.

    - Some hybrid BTRFS snapraid mergerfs

    => I could have data availability during disk replacement if I got it right. But it would maybe be a too "messy" install (like if I replace a disk, it wouldnt appear in OMV GUI cause I gotta do it in cli)

    While SnapRAID will work with BTRFS, a simpler filesystem like EXT4 or XFS would be better. And while mergerfs does a great job of aggregating disks, understanding the effects of its storage policies is key to understanding how it works and what to expect.

    SnapRAID and MergerFS work well together, but throwing a complex filesystem into the mix seems to be asking for unforeseen issues.


    Zfs is good but does have higher ram requirements (official recommendations for zfs usually state ecc ram and 1GB per TB of storage if I recall correctly).

    I think the 1 to 1 memory requirement is mostly TrueNAS / FreeNAS propaganda. With dedup off (I've never used dedup in any case) I've been running a 4TB mirror with 4GB of ram, on an Atom powered Mobo with zero issues. While I realize that meets their 1GB to 1TB requirement, my little Atom backup server is not utilizing anything near the available ram. Even with page cache included, it's typically using around 1GB, maybe a bit more.

    While ZFS will use a LOT of ram during copy operations, that's simply good utilization of the resource. ZFS gives ram back if other apps need it.

    I'm guessing without logs. Using PuTTY, could you capture the full install log to a file?

    Can you ping OMV's address from a client? If you're not getting a web page, it sounds like the install is still incomplete, not completing, throwing errors, etc. This seems to be the case if you're being forced to manually reboot the install.

    On your local router, what address are you using for DNS? If you're using your ISP for DNS (a blank entry on the LAN side) you might try or

    Realize that you're in some sort of unique situation. OMV is installed by thousands of users (literally) without these issues. Finding what the issue is will be the key.

    Perhaps I could set them to spin down, and only have them spin up once a month? I'm not sure that would help.

    Without actually powering the older drives down, they'll continue to age. Further, since the major stress on a motor is when power is applied, frequent powering on and off will create a good amount of wear and tear as well.

    The balance to be struck is powering them on once every month or two to keep the drive's spindle bearings from drying out while minimizing wear. With those conditions in mind, installing them in a separate device that could be powered off most of the time would provide a tangible benefit as a last resort backup.

    Based on what you've said:
    You have 5 users so, while an outage wouldn't be the end of the world, being able to recover quickly would be a bonus.

    Backing up data, some of which might not be replaceable, is far more important than backing up the OS, so...
    I believe I'd setup rsync between the 4TB internal drive and the 4TB external drive.

    The reasons being:
    - Rsync will create a full backup of the internal drive, to the external drive, on an interval of your choice.
    (Don't set the replication interval to a matter of hours. Disk failures are not like flipping a light switch. You'll need time to discover that the primary drive is failing AND to intercede before corrupted data is replicated to the backup drive. Once a week would be enough.)
    - If (actually when) the internal drive fails, you could quickly fail over to the external drive (10 minutes or so) and use it as your data source until the primary drive is replaced.
    - The above techniques are relatively simple. You don't have to be a Linux expert to implement them.
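
    As a rough sketch of what such a replication job boils down to, the snippet below only builds and prints an rsync command line. The two mount paths are made up for the example, and the linked doc remains the reference for the real setup. The -a (archive), --delete and --dry-run options are standard rsync options.

```python
import shlex

def build_rsync_command(source_dir, backup_dir, dry_run=True):
    """Build an rsync command that mirrors source_dir onto backup_dir."""
    # Trailing slashes make rsync copy the directory contents, not the directory itself.
    src = source_dir.rstrip("/") + "/"
    dst = backup_dir.rstrip("/") + "/"
    cmd = ["rsync", "-a", "--delete"]
    if dry_run:
        # Only report what would change. Nothing is copied or deleted.
        cmd.append("--dry-run")
    cmd += [src, dst]
    return cmd

# Hypothetical mount points for the internal and external 4TB drives.
cmd = build_rsync_command("/srv/internal-4tb", "/srv/external-4tb")
print(shlex.join(cmd))
```

    Note that --delete makes the backup an exact mirror, so files removed or damaged on the source are carried over at the next run. That is the reason for the long interval between runs.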

    The particulars of how to use rsync to replicate a data drive AND how to recover to the backup drive are -> here.

    For the 1 and 2TB drives:
    I'd think about using them in a mergerfs pool for creating a backup server (maybe using an SBC or an old PC?). Assuming that your data store doesn't exceed 3TB, if they're healthy, the age of the drives wouldn't matter much if the backup server replicates the primary once every month or two and is powered down. I have a 4TB USB external drive that is over 10 years old but, since it's only powered on for maybe a day, once every two or three months, it's skipping through time in a healthy state. As little as the drive is in use, it may outlast me.

    If you're interested in the above, creating a full backup server, there's a doc for that as well.

    OS backup is another matter.
    If the OS fails, an OS rebuild using your preexisting data drives would restore data. "But", if you're extensively configured and forgot how you set it up, getting back to where you were might take a while. For OS backup, I believe that cloning thumbdrives is the way to go. It's easy to do, easy to test and it simply works.

    Since you appear to be using another form of media to boot OMV, there are other possibilities for OS backup like the backup plugin, Clonezilla and others. (Search the forum.) However, none of them are as simple as cloning and swapping out an external thumbdrive (or other form of USB connected boot drive).

    One thing I do not understand is what are volumes and filesystems in OMV's ZFS implementations? Which is the equivalent of TrueNAS' dataset? I am already running a simple ZFS pool with 2 mirrored SSDs for my docker apps, where I have created two folders, one for the application data and one for the images. Is there a way to convert those to "datasets" (or their OMV equivalent) so i can snapshot them separately?

    It sounds like once you created a ZFS pool, you're using it by creating Linux folders at the root of the pool. While that can be done, you're missing out on a lot of functionality.

    Once a ZFS pool is created, in the OMV GUI, you would highlight the pool, click on the add(+) button and select "add filesystem" from the popdown. It's called a "filesystem" because the created dataset has the characteristics of a formatted partition. While filesystems can inherit ZFS properties from the parent pool, a filesystem can have its own set of ZFS properties and, as you suggested, filesystems can be snapshotted separately from the pool, with different snapshot intervals and different retention periods.
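
    For anyone who prefers the CLI, the GUI step should correspond to a plain zfs create. The sketch below only prints the commands. The pool name "tank" and the filesystem names are made up for the example, so substitute your own.

```python
import shlex

POOL = "tank"  # hypothetical pool name, not taken from the thread

def create_filesystem_cmd(pool, name):
    """Command that creates a child filesystem (dataset) under the pool."""
    return ["zfs", "create", f"{pool}/{name}"]

def snapshot_cmd(pool, name, label):
    """Command that snapshots one filesystem, independent of its siblings."""
    return ["zfs", "snapshot", f"{pool}/{name}@{label}"]

commands = [
    create_filesystem_cmd(POOL, "appdata"),
    create_filesystem_cmd(POOL, "images"),
    # Each filesystem gets its own snapshots.
    snapshot_cmd(POOL, "appdata", "before-upgrade"),
]

for cmd in commands:
    print(shlex.join(cmd))
```

    There's no in-place conversion of an existing folder into a filesystem. The data would have to be moved out of the old folders and into the newly created filesystems.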

    I think I'm leaning towards a mergerfs of the 1tb and 2tb allowing me to learn how to do it, while making my 4tb the main drive. Any thoughts on that?

    The above brings forth a few questions:

    - About how much data do you plan to store? (I realize, at this point, you might not have even a rough estimate.)
    - Are you planning to backup as much as 7TB?
    - How large is the external USB drive?

    I should backup my OS too... which I currently don't.

    While it's not hard to rebuild OMV from scratch, remembering configuration details and other considerations can make the clean rebuild route a bit more difficult. My preference is booting from a USB thumb drive. It's easy (as in dirt simple) to clone and test a thumb drive.

    1. ZFS, while works in OMV, its not supported out of the box
    2. the plugin offers just the bare minimum of options
    3. when replacing a drive,
    4. scheduling scrubs
    5. snapshots and restoring data from them basically I have to google the zfs commands and do it from the command line.

    1. Whether ZFS is available by default or it's added as a plugin is inconsequential. The only difference is installing it. With the Proxmox kernel, ZFS is well supported by the kernel and trouble free.

    2. Not true. It's a matter of knowing where to find and edit various properties. The following is an example of the numerous editable properties of an individual filesystem. However, setting ZFS attributes should be a one time thing. These selections should be decided before implementing a pool and creating child filesystems. Editing ZFS properties after a filesystem has been created and populated with data, creates folders / files with mixed attributes.

    3. Over the years (at least 8 years), I've replaced two drives in my primary server's Zmirror. Replacing a drive is an area where I'd rather be on the command line.

    4. One scrub, every two weeks, is scheduled when the plugin is installed. For most users a 2 week scrub is fine.

    5. This -> doc would guide you through how to setup and configure zfs-auto-snapshot. The zfs-auto-snapshot package will automate snapshots and purge unneeded snapshots on a configurable schedule. Thereafter, it's on auto-pilot. The document also explains snapshot interval considerations and how to "unhide" snapshots, enabling easy restorations from a network client.
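
    The "self rotating and purging" idea is simple enough to sketch. This is not the zfs-auto-snapshot code, just an illustration of keeping the newest N snapshots of a given interval and purging the rest. The snapshot names and dates are invented.

```python
from datetime import datetime, timedelta

def rotate(snapshots, keep):
    """Split (name, created) pairs into (kept, purged), newest first."""
    ordered = sorted(snapshots, key=lambda snap: snap[1], reverse=True)
    return ordered[:keep], ordered[keep:]

# Ten invented daily snapshots, one per day.
start = datetime(2024, 1, 1)
daily = [(f"daily-{i}", start + timedelta(days=i)) for i in range(10)]

# Keep a week of dailies. Everything older is purged.
kept, purged = rotate(daily, keep=7)

print("kept:  ", [name for name, _ in kept])
print("purged:", [name for name, _ in purged])
```

    A longer retention (a larger keep value) goes further back in time but holds on to more disk space. That's the trade-off behind the interval considerations covered in the doc.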

    2) MergerFS with Snapraid is what I was using during the past years, and while it works well, this is also implemented via external solutions. Also I am quite happy I haven't had to restore any data via Snapraid, I am not sure what my success rate would be (and also it has to be done via the command line with no GUI support)

    Have you checked the tools for both the MergerFS and SnapRAID plugins lately? You'll find that most of the functions that may be required for restorations, drive swaps, etc., are in the GUI. Doc's -> MergerFS -> SnapRAID.

    I recently (a few weeks ago) ran into the same behavior with very similar hardware and 4TB drives. In a Zmirror, during a scrub, one of my drives was showing errors while the other drive was fine. SMART stats were fine, along with clean short drive tests. Scrubs were repairing a couple of MB of data but I've seen that before. There were no uncorrected errors.

    Since all checked out, I used zpool clear. However, after clearing the pool, the next scrub indicated that the same drive had errors. Finally, I got "DEGRADED" and "Too Many Errors".

    In the end, I ordered another drive and replaced the drive ZFS was complaining about. As a follow up, I'll examine the replaced drive with a LONG SMART test and I'll probably run Spinrite against the drive, in another PC, to see if anything turns up.

    The bottom line: if you want to keep your data clean, I'd replace the drive.

    The balance tool's default behavior is to balance a drive pool, to within 2% fill of each member, at the file level.

    **Edit** I found some arguments for the balance tool on the -> mergerfs github page but that doesn't mean the OMV "button" implements them.

    Note that the balance tool's default 2% behavior can be problematic when running tests in a VM, with tiny virtual drives (+/- 4GB) and while using large video files. Achieving a balance of 2%, or less, may not be possible in that scenario. That can result in a ping-pong effect, as file(s) are written, moved, written again, etc.
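
    The ping-pong effect is easy to reproduce in a toy model. The sketch below is not the balance tool's code. It's a simplified assumption of the behavior described above: move one file at a time from the fullest drive to the emptiest until the fill levels are within 2%. Sizes are in MB and invented for the example.

```python
def fill_percent(drive):
    """Percentage of the drive's capacity that is in use."""
    return 100.0 * sum(drive["files"]) / drive["capacity"]

def balance(drives, tolerance=2.0, max_moves=20):
    """Move files from the fullest to the emptiest drive until balanced.

    Returns (moves_made, converged). Gives up after max_moves.
    """
    moves = 0
    while moves < max_moves:
        fullest = max(drives, key=fill_percent)
        emptiest = min(drives, key=fill_percent)
        if fill_percent(fullest) - fill_percent(emptiest) <= tolerance:
            return moves, True
        emptiest["files"].append(fullest["files"].pop())
        moves += 1
    return moves, False

# VM test case: two tiny 4GB virtual drives and one 1.5GB video file.
tiny = [
    {"capacity": 4000, "files": [1500]},
    {"capacity": 4000, "files": []},
]
tiny_moves, tiny_ok = balance(tiny)

# Real world case: two 4TB drives holding a thousand files of the same size.
large = [
    {"capacity": 4_000_000, "files": [1500] * 1000},
    {"capacity": 4_000_000, "files": []},
]
large_moves, large_ok = balance(large, max_moves=1000)

print("tiny drives: ", tiny_moves, "moves, balanced:", tiny_ok)
print("large drives:", large_moves, "moves, balanced:", large_ok)
```

    On the tiny drives, the single file is 37.5% of a drive, so every move just flips which drive is "fullest" and the loop never gets within 2%. On the large drives, each file is a tiny fraction of capacity and the pool settles.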

    Regretfully, I don't have anything to offer. I've set up remote mount at least 100 times through different versions of OMV, to SMB shares hosted by various remote boxes, both bare metal and VMs: Windows, OMV, a couple of Linux Desktops (Debian w/KDE and something else with Xfce as I remember), and haven't had a problem. That's not to say that I've had success with connecting to ALL Linux client distros. That's not the case.

    The key is, the remote client or server must accept the username and password offered, the same as it would if a user(name) and password accessed a remote SMB share over the network. If some other security related credential "extra" is required by the client, that's another matter.

    You're saying "andy" is "root", as in root account was renamed to andy? Or are you saying andy is in the root group? Renaming the root account would almost have to be problematic in that nearly all the Linux software out there assumes that "root" exists. Having the user "andy" in the root group is not the same as user "root".

    As a test:
    Use the username root, and the password for root as it is set on the remote machine, to set up the remote mount.
    (Essentially, use the same administrative username and password that are used for administration of the remote server or PC. If it's a Linux box, it's usually root.)

    I don't get why it is doing fine now but since it still denies my ssh login and web connection,

    "Fine now" doesn't look fine to me. Latency is all over the place.
    The bottom line is that the connection is not stable. There's no way to understand what effects this would have on the installation, as in what packages are not installed, partially installed or corrupted.

    If I redo the installation with ethernet will I still be able to use the wifi after the installation?

    While it's not a good idea to run a server's primary connection over Wifi:

    Yes. However, the installation itself requires a stable connection. Again, the reason why "wired" is specified is because of the build issues associated with wifi. (Bandwidth contention, interference, slow speeds, in your case latency, etc., etc.) Even fluorescent lights can have a significant impact on Wifi.

    Do the setup (all of it) over a wired connection then worry about setting up wifi.
    You may have to manually define the wifi interface, in the GUI. Here's a sub-section from another install document that covers it. -> Wifi.

    Nearly 50% packet loss AND high latency? There appears to be a serious wifi network and / or adapter problem. If the external software portion of the netinst build (from the mirror) was done over this connection, it's no wonder why it took all night. I'm somewhat surprised it completed at all.
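
    For reference, this is roughly how the loss and latency figures in a ping summary are derived. The round trip times below are invented for the example (they're not from the log posted in this thread). None marks a request that got no reply.

```python
from statistics import mean

def summarize_ping(rtts_ms):
    """Return (loss_percent, min_ms, avg_ms, max_ms) for a list of replies."""
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    loss_percent = 100.0 * (len(rtts_ms) - len(replies)) / len(rtts_ms)
    if not replies:
        return loss_percent, None, None, None
    return loss_percent, min(replies), mean(replies), max(replies)

# Invented samples: None is a lost packet, numbers are round trip times in ms.
samples = [12.0, None, 340.0, None, 25.0, 980.0, None, 18.0, None, 410.0]

loss, fastest, average, slowest = summarize_ping(samples)
print(f"loss {loss:.0f}%  min {fastest} ms  avg {average} ms  max {slowest} ms")
```

    A healthy wired LAN connection would show no loss and a small, steady spread between min and max. A wide spread, like the one in this made-up sample, is what "latency is all over the place" looks like in numbers.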

    As noted in prerequisites, "This installation process requires a wired Ethernet connection and Internet access."