Please sanity-check my storage/backup strategy

  • Please review and comment on my storage-and-backup strategy. It's still (almost) entirely on paper.


    I plan on having two shared folders:

    Shared folder "DeviceData" consists of a single SSD.

    Shared folder "Storage" consists of a couple of spinning disks in a union-fs.


    Here's the scheme and tiers:

    • Our "devices" (phones, tablets, laptops etc.) use Syncthing to back up all their essential data to DeviceData.
    • DeviceData is rsynced to Storage every 10 minutes or so.
      Storage is what's used by Plex-or-Emby-or-Kodi-but-lets-not-have-that-fight-here.
    • Storage is backed up to Backblaze or Wasabi (a few times per day, maybe?)

    I have two main philosophies here:

    1: Install as few apps as possible. That's why I've gone for rsync (already in OMV).

    2: Forward as few ports as possible. That's why I've gone for Syncthing. Also, it's pretty awesome.


    Some issues:

    AFAIK, rsync can't monitor a disk/folder for changes. Rather, it has to run on a schedule.

    Is there a better option to back up DeviceData to Storage?

    What happens if an rsync job hasn't finished when the next one starts?


    Looking forward to reading your input and advice.

  • Sounds pretty good to me overall. A few thoughts:


    Rsyncing DeviceData to Storage every 10 minutes is excessive IMHO. Once a day or maybe every 12 hours should be plenty. SSDs are pretty reliable, and your DeviceData already serves as a backup in itself, namely of the data on your phones, etc.


    Similarly, I wouldn't bother backing up Storage more than once a week, maybe every other day. It's a question of "how much is your data going to change between backups, and how much recent data since the last backup can you afford to lose, given the worst-case scenario?"


    I can't think of a simple method to sync two shared folders in realtime right now, but I can answer your other question about rsync - if a new rsync job launches and notices that another one is still running, it will silently exit and not interfere with itself. (Tested this myself.)
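    As far as I know, plain rsync does not coordinate between separate invocations, so the skip-if-running behaviour described above presumably comes from the wrapper script around the scheduled job. Here is a minimal sketch of that kind of guard in Python, assuming a Linux host. The lock path and the function names are hypothetical, not anything OMV ships.

```python
import fcntl
import subprocess

# Hypothetical lock file location; any path writable by the job would do.
LOCK_PATH = "/tmp/devicedata-rsync.lock"


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_job(command, lock_path=LOCK_PATH):
    """Run `command` unless another run holds the lock.

    Returns the command's exit code, or None if the run was skipped.
    """
    lock = try_lock(lock_path)
    if lock is None:
        return None  # previous run still active: skip silently
    try:
        return subprocess.run(command).returncode
    finally:
        lock.close()  # closing the file releases the lock
```

    A scheduled script could then call something like `run_job(["rsync", "-a", "/srv/DeviceData/", "/srv/Storage/DeviceData/"])`, where both directories are placeholders for wherever the shared folders are actually mounted.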

    • Official post

    When using Rsync or similar backup tech that doesn't do incremental backups or versioning:


    The longer the time interval between automated backups is, the more time you'll have to
    [1] Discover there is a data problem (failing hard drive, accidental delete, etc.)
    AND
    [2] Stop any automated backup
    BEFORE
    [3] The problem is replicated to the backup device.


    Just a thought

    • Official post

    This is one reason I do not enable the delete trigger on my automatic rsync jobs. In the event I discover I need to "go back" on a file, I just SSH into my server and go to my backup drive, which will have two copies of the file... I then just copy it back to the main drive and delete the "new" one on my source drive.


    Usually once a month or so, I log into the webUI, enable the delete trigger, run the job manually to bring the two drives completely in sync, then turn it off again.


    It may not be the best or most practical practice, but I've done it forever and it has never caused me an issue and has in fact saved me a few times.
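    Before the monthly run with the delete trigger enabled, it can help to see which files exist only on the backup, since those are the ones a delete-enabled sync would remove. This is a rough sketch in Python, assuming both drives are mounted locally. The function names are made up for illustration. Running rsync itself with `--delete` together with `--dry-run` should give a similar preview.

```python
import os


def relative_files(root):
    """Collect the paths of all files under `root`, relative to `root`."""
    found = set()
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            found.add(os.path.relpath(full, root))
    return found


def orphaned_in_backup(source_root, backup_root):
    """List files present on the backup but no longer on the source.

    These are the candidates a delete-enabled sync would remove.
    """
    return sorted(relative_files(backup_root) - relative_files(source_root))
```

    Anything on that list that was deleted by accident can be copied back to the source before the delete-enabled job runs.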

    • Official post

    Usually once a month or so, I log into the webUI, enable the delete trigger, run the job manually to bring the two drives completely in sync, then turn it off again.


    It may not be the best or most practical practice, but I've done it forever and it has never caused me an issue and has in fact saved me a few times.

    I like that method and it's mentioned in the User Guide. It allows for some accidental delete protection.

    I'm using ZFS Snapshots, providing versioned protection for up to a year. Still, that's local and there's no point in closing off options when replicating to the backup server.

  • I couldn't imagine a work flow requiring a backup every 10 minutes

    You're probably correct. I didn't do my research homework, and thought that rsync could do delta-copy.


    Here's my "imagined workflow" that required the short interval (and assumed delta-copy):

    1. Device 1 takes a photo.
    2. It gets picked up by Syncthing and is copied to DeviceData.
    3. 10 minutes later it's rsynced to Storage.
    4. Device 2 can see the photo in Plex-or-Emby-or-Kodi-or-someOther4-letter-word.


    So the idea is that devices can see photos taken recently by their peers.


    But if rsync can't delta-copy, maybe I could achieve this by giving "Kodi" access to DeviceData as well.


    What's a good alternative to rsync (with delta-copy functionality)?

  • Shared folder "DeviceData" consists of a single SSD.

    considering that:

    - performance of any file server is limited by network between server and user device

    - SSD are quite expensive in relation to HDD per GB


    What benefits would you see from fast server data storage?

    omv 6.9.6-2 (Shaitan) on RPi CM4/4GB with 64bit Kernel 6.1.21-v8+

    2x 6TB 3.5'' HDDs (CMR) formatted with ext4 via 2port PCIe SATA card with ASM1061R chipset providing hardware supported RAID1


    omv 6.9.3-1 (Shaitan) on RPi4/4GB with 32bit Kernel 5.10.63 and WittyPi 3 V2 RTC HAT

    2x 3TB 3.5'' HDDs (CMR) formatted with ext4 in Icy Box IB-RD3662-C31 / hardware supported RAID1

    For Read/Write performance of SMB shares hosted on this hardware see forum here

    • Official post

    But if rsync can't delta-copy,

    What is meant with delta-copy?

    rsync only copies data that have changed.
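    To illustrate what "only copies data that have changed" means at the file level: by default rsync skips any file whose size and modification time are the same on both sides (its "quick check"). The Python sketch below imitates only that selection step and is not rsync's actual implementation. The function names are made up, and comparing timestamps at whole-second precision is a simplifying assumption.

```python
import os


def needs_copy(src, dst):
    """True if `dst` is missing or differs from `src` in size or mtime."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    # Whole-second comparison: an assumption made to keep the sketch simple.
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def changed_files(src_root, dst_root):
    """List relative paths under `src_root` that a sync would have to copy."""
    changed = []
    for folder, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(folder, name)
            rel = os.path.relpath(src, src_root)
            if needs_copy(src, os.path.join(dst_root, rel)):
                changed.append(rel)
    return sorted(changed)
```

    With a check like this, a run that finds nothing new only has to walk the directory tree, which is why a short interval costs little when few files change between runs.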


    If you want to have versioned backup, have a look at rsnapshot. There is a plugin for it from omv-extras.
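    For a feel of how versioned backups of this kind stay small: tools like rsnapshot keep one directory per snapshot and hard-link files that have not changed since the previous snapshot, so only new or modified files take extra space. The following Python sketch shows that idea under the assumption that all snapshots live on one filesystem. It is not rsnapshot's code, and the names are hypothetical.

```python
import os
import shutil


def unchanged(src, old):
    """True if `old` exists and matches `src` in size and whole-second mtime."""
    if not os.path.isfile(old):
        return False
    a, b = os.stat(src), os.stat(old)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def take_snapshot(source_root, snapshot_dir, previous_dir=None):
    """Create `snapshot_dir` as a full-looking copy of `source_root`.

    Files unchanged since `previous_dir` are hard-linked, not copied again.
    """
    for folder, _dirs, files in os.walk(source_root):
        rel_folder = os.path.relpath(folder, source_root)
        target = os.path.normpath(os.path.join(snapshot_dir, rel_folder))
        os.makedirs(target, exist_ok=True)
        for name in files:
            src = os.path.join(folder, name)
            new = os.path.join(target, name)
            old = None
            if previous_dir is not None:
                old = os.path.normpath(os.path.join(previous_dir, rel_folder, name))
            if old is not None and unchanged(src, old):
                os.link(old, new)       # share data with the previous snapshot
            else:
                shutil.copy2(src, new)  # new or changed file: store a fresh copy
```

    Each snapshot directory then looks like a complete copy, and deleting an old snapshot only frees the data that no newer snapshot still links to.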

  • rsync only copies data that have changed.

    Well ... then: hooray!


    Initially I thought it did, and then I read crashtest's first reply in this thread:

    When using Rsync or similar backup tech that doesn't do incremental backups or versioning:

    And then I thought it didn't.


    And now I don't see the problem with a short rsync interval.

  • What benefits would you see from fast server data storage?

    It's not so much the speed, but rather being resistant to abuse.

    With 6-10 devices taking photos, saving documents etc. and almost immediately Syncthing them to the OMV NAS, I thought an SSD would last longer.

  • I'd recommend reading this reliability comparison and considering the key statement:

    "If you write a lot of data to a drive 24/7, HDDs are generally more reliable. SSDs are pretty reliable nowadays for consumers and some server applications, and you would normally replace your SSD by the time you hit its write limit (assuming average use)"

