My head is spinning... (ZFS? Btrfs? Bitrot? NUC? Pi?)

  • Hello,

    So, I have been researching various ways to improve my backup setup and I have fallen into a bit of a rabbit hole... I am hoping using the sounding board of this forum will help me to climb out! You can see how lost I am by how much text is below...


    Current Backup Setup

    A hodgepodge basically. Despite using computers for years, I didn't really even have any kind of solid backup strategy (aside from "it's in Dropbox/Google Drive" - which I am now painfully aware is not a backup tool, it's a sync tool!) until recently.

    My various laptops around the house are backed up onto a bog standard WD MyPassport 1TB 2.5" external USB hard drive using Borgbackup - each machine has a separate repo on the disk. I connect this disk to my desktop (when I remember - I try to do it once a week) and push backups from clients to desktop over SSH. I then use rclone to mirror all those repos into my Gsuite account (I put them into a hidden "appfolder" so they don't appear in the web interface and I don't bother encrypting them as Borg already does that). So, rather than 3-2-1 backup, I feel it is more sort of 2.5-1... The Borg docs suggest (correctly) that a better way to do this would be to have two repos for each machine - one on the USB disk (that would be the separate physical media onsite to make two copies) plus a Borg repo in the cloud that I push backups directly into (making it a proper third offsite copy).
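    To make the "two independent repos" idea concrete, here is a minimal Python sketch. It assumes both repos have already been initialised with `borg init`; the repo paths, the host name and the function names are placeholders I made up for illustration. The point is that each repo gets its own `borg create` run, so damage in one repo is never mirrored into the other.

```python
import subprocess
from datetime import datetime, timezone

# Hypothetical repo locations: replace with your own disk mount and cloud host.
REPOS = [
    "/mnt/usb-backup/borg/laptop1",                  # onsite USB disk (assumed path)
    "ssh://user@backup.example.com/./borg/laptop1",  # offsite host (placeholder)
]


def build_create_commands(repos, source_dir, archive_name):
    """Return one independent `borg create` command per repository."""
    return [
        ["borg", "create", "--stats", f"{repo}::{archive_name}", source_dir]
        for repo in repos
    ]


def run_backups(repos, source_dir, dry_run=True):
    """Back up source_dir into every repo separately.

    With dry_run=True the commands are only printed, not executed.
    Returns a dict mapping each repo to its exit code (None in dry-run mode).
    """
    archive_name = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H%M%S")
    commands = build_create_commands(repos, source_dir, archive_name)
    results = {}
    for repo, cmd in zip(repos, commands):
        if dry_run:
            print(" ".join(cmd))
            results[repo] = None
            continue
        # A failure in one repo does not stop the other from being attempted.
        results[repo] = subprocess.run(cmd).returncode
    return results
```

    Calling `run_backups(REPOS, "/home/me")` only prints the two commands; pass `dry_run=False` to actually run them. This is just a sketch of the layout, not a replacement for a proper Borg wrapper script.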


    The reason why my setup is currently "2.5-1" is because the onsite and offsite copies are merely just a direct mirror of each other - if something goes wrong with one, it will be replicated with the other. At the time I set this system up, there weren't really many affordable places that offered direct Borg to cloud backup (but that has changed - you can get a 2TB Hetzner Storage Box for about €12 a month and it allows you to push Borg backups directly to it. There is also rsync.net as well as a couple of others). Also, I feel like Google are going to start clamping down on storage - they closed Google Play Music, they are limiting Google Photos in June and they seem like they are closing the "it's 1TB per Gsuite account unless you have five, but in reality we don't check" loophole that people have been using (and abusing). I am still a touch under 1TB, but I might surpass that soon... Might be time to think about another place to put stuff!

    As a second (pretty poor) "failsafe", I run Syncthing on some clients. A limited amount of key files from them (amounting to about >50GB total) are synced to a 1TB 3.5" WD Red on my desktop machine (also running Syncthing). I have versioning activated on Syncthing running on the desktop machine. I then use rclone to mirror these files (with the Syncthing versions) into an Rclone crypt on Mega (it used to be Pcloud, but I only had a free account with 20GB of space and it ran out... Mega is free for up to 50GB, but again, I am probably going to run short on space again soon!)

    Basically, it is merely OK, but I know that I can do better...

    Existing Hardware


    My main concerns at the moment are that there isn't enough redundancy (I really want 3-2-1 properly) and there is absolutely no provision for error checking and bitrot prevention.

    I looked into TrueNAS, but I found the forums and subreddits to be quite unwelcoming, and I just don't think I have the cash (or space, my flat is small!) to stump up for a decent TrueNAS capable rig with ECC memory and lots of drives. OMV seems like a better fit for me (and the forum seems a hell of a lot more friendly!)

    Hardware (drives to come) I have that I think I could use:

    - A desktop with a Core i5 4590, 16GB RAM

    - An Asus Chromebox, Haswell Celeron 2955U, 4GB RAM. It has one m.2 SATA 2242 slot currently occupied with a 16GB SSD. I have flashed a full UEFI BIOS onto it and at various times it has been a LibreElec/Kodi box, a Retroarch/Lakka retro gaming box, and even a router on a stick using OpenWRT and smart switch! Now it is just sat there gathering dust.

    - A Raspberry Pi 3 - running PiHole and a little Ampache music server (to replace Google Play Music - I hate YouTube Music, so I went this way instead). The music is stored on one of those tiny Samsung USB flash drives (not the only copy of those files I have of course!) It just means it can sit in there in a very low profile way serving up my personal music collection (even remotely if I Wireguard back home whilst out).

    Drives I have:

    In the desktop (which I use for day to day desktop stuff and a little light gaming) is:

    1x Crucial MX500 250GB 2.5" SATA SSD running the OS

    1x WD Red (CMR) 1TB 3.5" HDD

    1x Seagate (presumably SMR) 1TB 2.5" HDD (salvaged from a laptop and really only used for downloaded stuff that isn't sentimental and I could easily download again)

    Going spare:

    1x Intel Optane 16GB NVMe (which I believe if put into an NVMe slot can actually just be used as a standard (if small) NVMe SSD)

    1x Samsung CM871a OEM 256GB 2.5" SATA SSD

    1x Samsung OEM 256GB m.2 SATA SSD (not sure of the model number)

    1x Seagate SMR 2TB 2.5" HDD (salvaged from a laptop, wiped and with nothing on it)

    Enclosures/Docks

    1x TeckNet 3.5" UD027 SATA to USB vertical docking station with its own power supply. No S.M.A.R.T. reporting.

    1x Inateck FE2004 2.5" SATA to USB enclosure (bus powered). Has S.M.A.R.T.

    Client machines all have either a single SSD or an SSD and one 2.5" spinning rust drive salvaged from elsewhere (those are laptops I have removed the DVD drive from and used a SATA adapter to add these rust drives alongside the SSD).

  • Current thoughts

    SFF Server


    I buy an oldish office SFF PC and stuff it with drives and install OMV. I can get something like an HP Elite 8300 SFF (i3 3220, 4GB RAM, 4 SATA slots) for about £40. I could maybe even add more SATA slots if I use some kind of reflashed SAS to SATA HBA or just a simpler PCI-e to SATA card (although these might get too hot??) The largest SFF cases (like the HP Elite 8300) have two 3.5" bays, plus a 5.25" bay that I could easily add a converter bracket into to make it into a third 3.5" bay. Then I would either creatively mount the spare 2.5" SATA drive using Command Strips (I have done this before!), or use a PCI-e to m.2 card with the spare m.2 SATA SSD or the Optane drive as the OMV OS drive (or I could even use a USB drive for OMV, either a thumb drive or one of those SSDs in an enclosure hanging off of a USB 3.0 port). Ideally I don't want anything bigger than SFF as I don't have the space...

    My idea would be to put the existing WD Red 1TB from my desktop into it, then:

    1. Buy additional drives so I can create one or more ZFS mirrors (or something with BTRFS? I haven't really read anything about it so not sure, but I hear if you want bitrot detection and repair, it is either ZFS, BTRFS or Snapraid (coming to that) in that order). I know you can run "check" with Borgbackup, but it can only detect bitrot in the backup repos, and whilst it can attempt repair, it certainly isn't as robust or effective as ZFS, BTRFS or Snapraid from what I have read.

    2. Snapraid. Buy additional drives to make up data disks plus parity. Means I can mix disk sizes, but have to run manual checks/scrubs etc.
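    For anyone wondering how "data disks plus parity" can work with mixed sizes, here is a toy Python illustration of single parity using XOR. This is not SnapRAID's actual code or on-disk format, just the general principle, and all the function names are made up for the example. Shorter blocks are padded with zeros, which is why the parity has to be at least as large as the largest data block.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def pad_to(blocks, size):
    """Zero-pad every block to the given size."""
    return [block.ljust(size, b"\0") for block in blocks]


def make_parity(data_blocks):
    """Compute a parity block as large as the largest data block."""
    size = max(len(block) for block in data_blocks)
    return xor_blocks(pad_to(data_blocks, size))


def rebuild(surviving_blocks, parity, original_len):
    """Recover one missing block from the survivors plus the parity block."""
    padded = pad_to(surviving_blocks, len(parity))
    return xor_blocks(padded + [parity])[:original_len]
```

    With three "disks" of different sizes, `make_parity` gives one parity block, and `rebuild` can recover any single lost disk from the other two plus the parity. Lose two at once and single parity cannot help.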

    Is using 2.5" drives a no-no? Even on Snapraid? As each of the 3.5" bays could become 2x2.5 bays with an adapter bracket, making more space.

    I guess I could also just buy additional drives and put them in my desktop (there is space as it is a mini tower). But I don't like the idea of having a Frankenstein rig that I use for desktop usage and gaming alongside using it as a backup server - I would rather have a separate dedicated OMV machine I think.


    Chromebox


    I use the Chromebox with one (or more) external USB drive(s) hanging off of it. Unlike a Raspberry Pi, it has four true USB 3.0 ports that can provide power. Or I could utilise some kind of proper powered JBOD USB enclosure (either 2 bay or 4 bay). I'm not sure about this... From what I read on these forums, the USB-to-SATA route isn't really as reliable as native SATA ports. There isn't really bitrot protection either, unless I run a shonky ZFS/BTRFS setup over USB - and I have read on here that this is a terrible idea!


    NUC(-ish)


    I have seen cheap Foxconn NanoPC AT-7300s online for about £40 (with 4GB RAM and a (presumably crappy old DRAM-less) 120GB SSD). It has a decent (if older) i3 3217u (passively cooled), two SO-DIMM slots that can take up to 16GB and the real fun bit - a proper native 2.5" SATA-III slot. Is it mad to somehow use this? Maybe put the spare 2.5" 2TB Seagate spinner I have into it and use that to run OMV, rsync/rsnapshot my files over to it from all clients, taking Borg backups to an external USB drive? Or maybe even (as they are so cheap) grab two, put a drive in each one and set up rsync between them, or run Borg from one to the other. No bitrot protection here though...


    SBC


    The SBC route - either an Odroid HC2 (or two), an Odroid HC4 or upgrade my Pi 3 for a Pi 4. The Odroids are better as they have proper SATA ports. The HC4 could probably just about do a simple ZFS mirror (or BTRFS?) - but I don't like the toaster case. Also, I am in the UK - an HC4 is basically £100 shipped with PSU, an HC2 is about £70 shipped with PSU. I feel like the Foxconn above is a better bet, especially compared to the HC2... Also, no bitrot protection unless I use the HC4. The Pi4 is just basically a weaker version of using the Chromebox.


    Get OVER it!


    Calm down, stop worrying, put the spare 2TB 2.5" SMR laptop drive in the dock/enclosure (or Foxconn/Odroid) and use it to take incremental rsync/rsnapshot/borg backups. This would either be by connecting it directly via USB to each client, or connecting it up to the Chromebox/Desktop/Foxconn/Odroid and doing backups over SSH. Also no bitrot protection but at least it would be 3-2-1 (if I carried on using my existing Borg setup as well).

    Whatever I did I would continue to take Borg backups as I have been doing - either as I do now, onto a USB mirrored into G Suite with Rclone, or sack off G Suite and pay instead for something like the Hetzner Storage Box that I can push a secondary Borg repo for each client directly to.

    So... what would you do?

    Particularly paging Adoby here as, having read a lot of his posts, I feel like this is something he has also grappled with (especially the "should I ZFS?" and "but what about bitrot?!" dilemmas).

  • I am experimenting, mostly for fun, with writing (in C++) a snapshot-style backup utility that uses checksums to detect bitrot and fix it. To fix the bitrot, it copies the backup copy over the original file if the original has bitrot, and copies the original file over the backup again if the backup copy has bitrot. The utility works fine between locally mounted filesystems. I am testing on EXT4, mergerfs and NFS.


    My thinking is that during a backup the utility has access to previous snapshot backup copies of all files. So that provides the redundancy needed to fix bitrot without any need for parity, mirroring or RAID. All that is needed is backups. But of course, it is not real time. But it should still work OK for large media libraries that are mostly just growing slowly. Video/music/photo archive.
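    A minimal Python sketch of that idea, for illustration only. It is not the C++ utility itself, and it assumes a known-good SHA-256 checksum was recorded for each file at backup time; the function names and the returned status strings are invented for the example.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file sequentially in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def check_and_repair(original, backup, expected):
    """Compare both copies against the recorded checksum and repair one.

    `expected` is the checksum stored when the backup was taken (assumed
    to be trustworthy). Returns a short status string.
    """
    original_ok = sha256_of(original) == expected
    backup_ok = sha256_of(backup) == expected
    if original_ok and backup_ok:
        return "ok"
    if backup_ok:
        # The original has rotted: restore it from the good backup copy.
        shutil.copy2(backup, original)
        return "repaired-original"
    if original_ok:
        # The backup has rotted: refresh it from the good original.
        shutil.copy2(original, backup)
        return "repaired-backup"
    return "unrecoverable"
```

    Note that this sketch treats any checksum mismatch as bitrot. A real tool would first have to rule out a legitimate edit, for example by comparing modification times, before overwriting anything.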


    But it is still buggy and too slow. And it uses too much memory. A backup utility should not be buggy... I am currently rewriting it (almost) from scratch for the third time. Sequential file read performance sets a hard limit on the speed of bitrot testing, and I want to come close to that speed.


    No idea when it will be done. I have been working on this in my spare time, off and on, for a few years now...

    Be smart - be lazy. Clone your rootfs.
    OMV 5: 9 x Odroid HC2 + 1 x Odroid HC1 + 1 x Raspberry Pi 4
