ZFS setup+long term care, automated snapshots questions

  • Hello! I am looking for guidance when it comes to caring for and getting the most use out of my ZFS pool.


I have been using ZFS for about nine months on my old server, but I did not know much about it when I set it up. I simply installed the PVE kernel, installed the ZFS plugin, created a mirrored pool with two drives (RAID 1 style), and called it a day. I have never had a problem with it.


    Recently, I bought a nice old computer to be my new server. I am taking my time setting it up and moving my data and services, as I want to do it right. I want to take advantage of more features that ZFS offers, such as snapshots.


    This brings me to my question: What are some steps I should take when setting up these drives in the new machine? (I backed everything up, and can wipe if necessary).


I also cannot find a way to automate ZFS snapshot creation and deletion in the webUI. I found posts from back in the OMV4 days saying that it was not possible, and explaining how to do it manually. Is that still the case, or is there a way to do snapshots from the webUI now?


    Thanks for any replies!

    Dual Intel Xeon E5-2690 @ 2.90GHz | 128GB ECC DDR3 1600MHz

    OMV 6 | Proxmox 6.1 kernel | ZFS | docker | SWAG

    Join the openmediavault discord server! https://discord.gg/qcGj2upevS

  • This brings me to my question: What are some steps I should take when setting up these drives in the new machine? (I backed everything up, and can wipe if necessary).

You only need to import the pool (from the shell or from the webGUI -> you will probably need to force the import).


Once imported, you are ready to start sharing folders in the webGUI.


P.S.: You may need to take ownership of the folders as root to get rid of the old user; the new owner should be the user you created in the webGUI, which should be in the "users" group.
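
For illustration, a minimal sketch of the shell route, assuming the pool is named "tank" and with "youruser" as a placeholder for your webGUI user:

Code
zpool import                     # list pools available for import
zpool import -f tank             # force-import the pool by name
chown -R youruser:users /tank    # hypothetical: hand ownership to your webGUI user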

• In this post, RE: ZFS settings, it looks like I need to run commands like

    zfs set aclinherit=passthrough ZFS1

    zfs set acltype=posixacl ZFS1

    zfs set xattr=sa ZFS1

    before there are files on the pool. Would this mean I should just wipe the drives and create a new pool?


Also, does OMV automatically run scrubs on the pool to check it for errors, or do I need to automate that somehow?


  • Okay, I researched scrubbing more, and it looks like OMV automatically scrubs ZFS sometimes.


As far as snapshots go, though: is it even worth automating them, or would it be better to just do them occasionally, before I make a big change to the files I have on the disks?


  • Okay, I researched scrubbing more, and it looks like OMV automatically scrubs ZFS sometimes.


As far as snapshots go, though: is it even worth automating them, or would it be better to just do them occasionally, before I make a big change to the files I have on the disks?

I do it manually because I copy/move only a small amount of data a month. If your scenario is different (you move/copy large amounts of data every month or week), it's better to do it automatically.

• Chipwingg Now that you have a shiny new old server with more cores and RAM than you can shake a stick at, I wonder if it has crossed your mind that you could use KVM within OMV to set up a virtual machine running an instance of OMV. The virtual machine environment gives you a safe place to learn, experiment and get to grips with zfs and what the OMV zfs plugin provides.


Back to your initial question. I don't know what plans you have, in terms of adding disks, beyond just using a zfs pool with a single mirror vdev. For multiple HDDs, zfs begins to shine when used with 4+ drives, and a lot of home users find 6 HDDs plus 2 SSDs offer the combo of performance, capacity and redundancy they want.


A happy zfs experience does mean forward planning and matching pool design to use case. What's suitable for mostly streaming Plex data, for example, may not be suitable if you want to use the zfs pool for virtual machine storage, iSCSI shares, or even just sharing data via NFS.


Things like which properties you set on the pool, to be inherited by all datasets created on it, as opposed to setting properties on individual datasets, are all part of how you intend to create/use the data. A classic example is the difference between datasets suited to downloading torrents and those suited to streaming the stored data.


The zfs plugin allows for manually creating snapshots or scrubs when you choose. The next step is to create a simple "Scheduled Task" for either dataset snapshots or pool scrubs, e.g. an hourly time-stamped dataset snapshot:
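
A minimal sketch of such a task command, assuming a dataset named tank/data1 (the names are placeholders):

Code
zfs snapshot tank/data1@scheduled-$(date +%Y-%m-%d-%H%M)   # time-stamped snapshot of one dataset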



Beyond that, there are several sophisticated scripts available on the web that can be used for snapshots and replication, etc.


I'm not going to give you a list of refs for zfs; some people prefer just to get their hands dirty, while others prefer to read background material first. Ask if you want what I think are helpful references about zfs.

    • Official Post

This -> doc will help you to set up automated ZFS snapshots.

zfs-auto-snapshot is a simple approach. It uses cron scripts with no software dependencies. I've used zfs-auto-snapshot beginning with OMV4 up to the present day. The doc provides guidance on how to install zfs-auto-snapshot, how to set up snapshot intervals, and how to do rollbacks or selective restores.
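
To give a flavour of how it works: the package installs cron scripts that call the tool with a label and a retention count, roughly like this (adjust --keep to taste):

Code
# e.g. /etc/cron.hourly/zfs-auto-snapshot keeps the last 24 hourly snapshots
zfs-auto-snapshot --quiet --syslog --label=hourly --keep=24 //
# '//' targets every dataset not marked com.sun:auto-snapshot=false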

• more cores and RAM than you can shake a stick at

You are right about that! I intend to use some resources for more intensive VMs, like headless docker-OSX and running LSIO Kasm, but a semi-permanent OMV VM sounds like a good idea.


I don't know what plans you have, in terms of adding disks, beyond just using a zfs pool with a single mirror vdev. For multiple HDDs, zfs begins to shine when used with 4+ drives, and a lot of home users find 6 HDDs plus 2 SSDs offer the combo of performance, capacity and redundancy they want.


    A happy zfs experience does mean forward planning and matching pool design to use case.

For now, I am happy with my two 4TB drives. I am nowhere near filling them up; I have roughly 300GB used now. However, I have convinced my mom to store her photos on my server in addition to, or in place of, Google Photos. I do not know how fast that will fill up my storage, and I assumed that when these drives fill up I'll just invest in bigger hard drives to make a new, larger two-drive pool. I haven't thought much about using more than 2 drives, other than in standard RAID configurations.

    Ask if you want what I think are helpful references about zfs.

    Any knowledge you're willing to teach me/send me links for, I appreciate!


zfs-auto-snapshot is a simple approach. It uses cron scripts with no software dependencies. I've used zfs-auto-snapshot beginning with OMV4 up to the present day. The doc provides guidance on how to install zfs-auto-snapshot, how to set up snapshot intervals, and how to do rollbacks or selective restores.

    I appreciate this doc! I can't believe I didn't see it on OMV extras before.


    I have another question that I would like some clarification for:

    I've read about "child filesystems" on ZFS pools. What are they exactly? I currently have a ZFS pool named "tank", and then I created a filesystem on it that I also named "tank". If I am understanding correctly, I could theoretically have the pool named "tank" and then multiple filesystems (tank, tank2, tank3, etc) all accessible from individual mount points (/tank1, /tank2, /tank3). Is this correct?


    • Official Post

    I've read about "child filesystems" on ZFS pools. What are they exactly? I currently have a ZFS pool named "tank", and then I created a filesystem on it that I also named "tank". If I am understanding correctly, I could theoretically have the pool named "tank" and then multiple filesystems (tank, tank2, tank3, etc) all accessible from individual mount points (/tank1, /tank2, /tank3). Is this correct?

Yes, if I'm understanding what you're saying. The analogy is a pool (the parent) with zfs filesystems (children) created under the pool.

    A ZFS "pool" is the collection of hard drives (or block devices if one wants to get technical). In my case I called the pool ZFS1.
     
After the pool is created, run the following attribute settings against the pool name. (In your case the pool name is tank, so I set them accordingly. Compression is optional.)


    zfs set aclinherit=passthrough tank

    zfs set acltype=posixacl tank

    zfs set xattr=sa tank

    zfs set compression=lz4 tank


After this is done, ACLs and extended attributes will be inherited by all files, folders, and ZFS filesystems created under the pool "tank".
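
To double-check, the properties can be read back; a minimal sketch, using the same pool name:

Code
zfs get aclinherit,acltype,xattr,compression tank
# child filesystems report the same values, with SOURCE "inherited from tank"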


From here the pool could be divided with Shared Folders, using standard Linux folders, but I believe that's a mistake.


Once the pool is created and adjusted a bit (above): in OMV6, highlight the pool, click on the + icon, and add a filesystem. Name the filesystem whatever you want. In my case I went with names that I would be using as network shares. (See below.)


    A ZFS file system has individual editable ZFS properties AND it allows you to run customizable snapshots on each individual filesystem. This also means that "rolling back" and file or folder restorations can be done on an individual filesystem basis, versus doing operations on the entire pool. (As noted in the zfs-auto-snapshot doc, it's better to be as granular as possible when doing any kind of roll back or restoration.)
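
As a minimal sketch of what that granularity looks like at the CLI (filesystem and snapshot names are hypothetical):

Code
zfs list -r -t snapshot tank/Documents    # list one filesystem's snapshots
zfs rollback tank/Documents@zfs-auto-snap_daily-2023-01-01-0000
# note: rolling back past the most recent snapshot requires -r, which destroys the newer snapshots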
    ________________________________________________________________________________________

    BTW, since the following is a common snag point:

    When setting up Shared Folders, using ZFS filesystems, you'll have to change the relative path.

    Consider the following:

    This Shared Folder (when SMB is added) will be shared to the network as Backups. Backups is the name I gave it.

Since Backups already exists as a filesystem, I selected the ZFS filesystem Backups from the drop-down menu. The relative path will require modification. In my case it defaults to Backups/. That must be changed to a single /



    _________________________________________________________________________________________

    Lastly, if you choose to set it up, the snapshots created by zfs-auto-snapshot will be available within the GUI.

  • Chipwingg


One part of maintaining a zfs pool not mentioned so far is monitoring your storage devices' "S.M.A.R.T." state. It's just as important with zfs as with any other filesystem, so use it.


You've been given information about snapshots, but it must be emphasised that snapshots on the same host are not backup. So don't ignore proper backups of your data.


It's possible to use zfs on OMV by only using the plugin (plus "scheduled tasks"), but you probably won't pick up any transferable skills, nor appreciate the limits of the plugin, nor learn what might be necessary or more convenient to do via the CLI as root. You certainly wouldn't know why certain pool or dataset properties have recommended settings.


Getting your head round zfs terminology is an early part of understanding the basics of zfs. Of the recent web articles I've seen, this is a good starting place:


ZFS for Dummies: A ZFS cheat sheet for beginners · Victor's Blog (blog.victormendonca.com)


It gives clear explanations of the terms pool, vdev, dataset and zvol, explains the raidz types, and shows how to manage pools and zfs filesystems with the various forms of just two commands, "zpool" and "zfs". The article includes this listing:


    Code
    root@ubuntu-vm:~# zfs list
    NAME                                USED  AVAIL     REFER  MOUNTPOINT
    tank                                253K  9.36G     30.6K  /tank
    tank/dataset1                      30.6K  9.36G     30.6K  /tank/dataset1
    tank/dataset2                      91.9K  9.36G     30.6K  /tank/dataset2
    tank/dataset2/childset2            61.3K  9.36G     30.6K  /tank/dataset2/childset2
    tank/dataset2/childset2/childset2  30.6K  9.36G     30.6K  /tank/dataset2/childset2/childset2


    It illustrates why you might want to adjust your thinking about using /tank /tank1 /tank2 etc.


    That’s probably enough for one day.

• crashtest Your post was very helpful; I appreciate instructions with brief explanations like that (km0201 and you seem to write like that often). The screenshot illustrating the difference between your pool and your filesystems cleared up the remaining questions I had :)


One part of maintaining a zfs pool not mentioned so far is monitoring your storage devices' "S.M.A.R.T." state. It's just as important with zfs as with any other filesystem, so use it.

Good point. I currently do not have SMART tests scheduled, but each drive has a green "good" icon next to it. I will look into scheduling real SMART tests.
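
For reference, a minimal sketch of running a test from the shell (the device name is a placeholder; OMV can also schedule these under Storage > S.M.A.R.T.):

Code
smartctl -t short /dev/sda    # start a short self-test
smartctl -a /dev/sda          # view SMART attributes and the self-test log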


    That link was very informative and helpful! Thanks for sharing.


  • A ZFS file system has individual editable ZFS properties AND it allows you to run customizable snapshots on each individual filesystem. This also means that "rolling back" and file or folder restorations can be done on an individual filesystem basis, versus doing operations on the entire pool.

If I created a snapshot of the entire pool using zfs-auto-snapshot, could I "roll back" only one specific filesystem and leave the rest of the pool at its current state? Theoretical example: I have a pool called tank and two filesystems (data1 and data2). I tell zfs-auto-snapshot to make a snapshot of tank weekly. I accidentally mess up some files on data2. Could I restore only data2 without touching data1, even though they were snapshotted in the same snapshot of tank?


    • Official Post

Pool snapshots are for standard (Linux) files and folders contained within the pool that are not inside a child ZFS filesystem. If you decided to create shared folders using standard Linux folders, all operations (rollbacks or restores) would be at the pool level. I went along this route early on, where all Linux files and folders were shared from the root of the pool, with no ZFS filesystems. It works, but the differences between datasets / filesystems can't be taken into account when setting up snapshots. (Which led me to the following.)

If child filesystems are created from the parent pool, each individual ZFS filesystem will have its own snapshots, which can be set up with their own time capture intervals. For rollback and restoration purposes, individual filesystem snapshots are separate from the parent pool. From filesystem snapshots, that specific filesystem can be rolled back, or individual (Linux) files and folders can be restored.

Theoretical example: I have a pool called tank and two filesystems (data1 and data2). I tell zfs-auto-snapshot to make a snapshot of tank weekly. I accidentally mess up some files on data2. Could I restore only data2 without touching data1, even though they were snapshotted in the same snapshot of tank?

    You can restore Data2's (Linux created) files and folders from Data2's snapshots. Data1, a separate filesystem, would not be involved.

Along these lines, it might be useful to use human-readable names that indicate what's contained in your filesystems (Documents, Pictures, etc.). Your call.
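
As an illustration of a selective restore: every ZFS filesystem exposes a hidden, read-only .zfs/snapshot directory, so single files can simply be copied back out (the snapshot and file names below are hypothetical):

Code
ls /tank/data2/.zfs/snapshot/    # browse available snapshots
cp -a /tank/data2/.zfs/snapshot/zfs-auto-snap_weekly-2023-01-01-0000/photo.jpg /tank/data2/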
    ________________________________________________

BTW: Krisbee is right about backup. I run "zmirrors" (similar to RAID1), but I do not consider a mirror to be "backup" in any way. If you're not interested in server-to-server replication, you might consider an external drive of sufficient size to replicate your pool with rsync. From one of my backup servers, the following is the command line I use for that purpose.

    rsync -av --delete /ZFS1/ /srv/dev-disk-by-label-DATA1/


The above syncs a 4TB pool with a 4TB external drive.
    More details are available on OMV-extras.org (In the OMV6 maint and backup doc)
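
If you want it to run unattended, the same command can be dropped into an OMV "Scheduled Task" or a cron entry; a minimal sketch (paths as above, with -v dropped since no one watches cron output):

Code
# hypothetical crontab line: nightly sync at 02:00
0 2 * * * rsync -a --delete /ZFS1/ /srv/dev-disk-by-label-DATA1/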

  • Chipwingg Something else to consider about backups


The advantage of rsync is that you can easily read data from an ext4-formatted external drive on another Linux system. But unless you are syncing unchanging data, by the time the rsync job is complete it could already be out of sync.


The zfs way is to combine zfs point-in-time snapshots with zfs send & receive, which can work in incremental mode, is resumable, and can take advantage of zfs encryption. The downside (or possible advantage) is you can only read data from your external drive on another system running zfs.
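
A minimal sketch of the idea, assuming a source pool "tank" and an external drive holding a pool named "backup":

Code
zfs snapshot -r tank@weekly-01
zfs send -R tank@weekly-01 | zfs receive -F backup/tank                     # first run: full copy
zfs snapshot -r tank@weekly-02
zfs send -R -i tank@weekly-01 tank@weekly-02 | zfs receive -F backup/tank   # later runs: incremental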

• It's been a while, but now I am going to set up the automatic snapshots. If I understand correctly, snapshots do not cause very much extra storage space to be used?

Along with a backup strategy, make sure you know how to replace a failed disk before you commit all your data to the pool.

I read some articles and threads on this forum, and it looks like I would have to: 1. connect the new drive, 2. "resilver" the pool, and 3. remove the failing drive from the machine.
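
For reference, a minimal sketch of the commands involved, with hypothetical device names (zpool replace starts the resilver and detaches the old disk when it completes):

Code
zpool status tank                              # identify the failing disk
zpool replace tank ata-OLD_DISK ata-NEW_DISK   # resilver onto the new disk
zpool status tank                              # watch the resilver progress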


    Also, having a complete copy of the pool on an ext4 drive would be a good decision in case something went wrong. I intend to set up an rsync as described here:

From one of my backup servers, the following is the command line I use for that purpose.

    rsync -av --delete /ZFS1/ /srv/dev-disk-by-label-DATA1/


The above syncs a 4TB pool with a 4TB external drive.

If child filesystems are created from the parent pool, each individual ZFS filesystem will have its own snapshots, which can be set up with their own time capture intervals. For rollback and restoration purposes, individual filesystem snapshots are separate from the parent pool. From filesystem snapshots, that specific filesystem can be rolled back, or individual (Linux) files and folders can be restored.

If I do not have any Linux folders or anything stored directly on the ZFS pool, and everything is in a child filesystem, should I even enable snapshots for the pool itself, or only on the child filesystems?


The zfs way is to combine zfs point-in-time snapshots with zfs send & receive, which can work in incremental mode, is resumable, and can take advantage of zfs encryption. The downside (or possible advantage) is you can only read data from your external drive on another system running zfs.

    I do not quite understand all of this message. Is this only applicable if I am running ZFS on 2 systems, and want to back one of them up to the other system?


    • Official Post

It's been a while, but now I am going to set up the automatic snapshots. If I understand correctly, snapshots do not cause very much extra storage space to be used?

It depends on "turnover". If a child filesystem is used for downloading, where there are frequent large-file deletes, and auto-snapshot is used, snapshots will retain deleted files for the time specified before they are purged. In a case like that, with a lot of turnover, I'd save only daily snapshots (31 each), so deleted files begin to self-purge after 31 days. That's plenty of time to do an un-delete, if necessary, while automatically doing the housekeeping required to purge older, deleted and unneeded files. (This is mentioned in the zfs-auto-snapshot doc.)

    Other than the possibility mentioned above:
    Since most user and business data is largely static (Documents, Photos, Home Videos, etc.), overhead for snapshots will be less than 25% (generally speaking). Usually, it's far less.
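
To see exactly how much space each snapshot is pinning, a minimal sketch (the filesystem name is a placeholder):

Code
zfs list -r -t snapshot -o name,used tank/Documents    # space held by each snapshot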


If I do not have any Linux folders or anything stored directly on the ZFS pool, and everything is in a child filesystem, should I even enable snapshots for the pool itself, or only on the child filesystems?

I just checked. The only "visible" file that I have at the root of my pool is ZFSversion.txt. Still, zfs-auto-snapshot sets up pool snapshots by default, and there's no harm in letting them run to catch something that might accidentally end up at the root of the pool. If there are no changes, the cost of pool snapshots would be "0" bytes. Further, if some type of malware / ransomware is deposited at the root of the pool, a rollback would remove it.

Still, zfs-auto-snapshot sets up pool snapshots by default, and there's no harm in letting them run to catch something that might accidentally end up at the root of the pool. If there are no changes, the cost of pool snapshots would be "0" bytes. Further, if some type of malware / ransomware is deposited at the root of the pool, a rollback would remove it.

    Good point. I guess I will leave snapshots enabled for the whole pool, and then individually set up snapshots for the filesystems.

