Which energy efficient ARM platform to choose?


  • @ekent I've also thought about buying 1 or 2 Helios4. If you also live in Germany and are interested in the Helios4, we could order together to save on shipping costs. We are not in a hurry because Kobol stated this:

    Well, based on the current traction of the Helios4 3rd campaign, I don't think we will manufacture that many extra kits anymore... we might end up just manufacturing a few dozen more on top. So if you want to be guaranteed to get a Helios4 you should order before the end of the campaign. I must admit this was a complete marketing-newbie mistake on my part, because now potential buyers are in 'wait-and-see' mode.


    To boost sales a bit we are now including a free OLED screen for the first 300 orders ;-) https://blog.kobol.io/2019/03/04/3rd-campaign-update/

  • The Helios4 has a dual-core ARM Cortex-A9 CPU and the Odroid-HC2 has an octa-core CPU, a Samsung Exynos 5422 with Cortex-A15 cores at 2GHz plus Cortex-A7 cores. Now I know that it's not as simple as saying one is dual-core and one is octa-core. But just wondering if someone could outline what the difference is between these two CPU configurations? To me it "sounds" as though the CPU in the Odroid-HC2 is more powerful...

    The HC2 is definitely the more powerful device in terms of 'raw CPU horsepower'. You can check the 7-zip MIPS column when comparing sbc-bench results: https://github.com/ThomasKaise…ch/blob/master/Results.md (the HC2 is roughly 3 times faster when it's about 'server stuff'). But the HC2 somehow needs this horsepower since both networking and storage are USB3-attached here, and as such IRQ processing alone eats up a lot of CPU cycles. And the HC2's SoC is a smartphone SoC while the Helios4 relies on a NAS SoC consisting of one AP (application processor --> the ARM cores) and one CP (communication processor --> a dedicated engine to which as much work as possible is offloaded).


    With a normal NAS use case in mind those theoretical CPU performance differences don't matter since you're limited by Gigabit Ethernet anyway.


    you counseled that SBCs are not really appropriate for RAID as RAID requires absolutely reliable hardware. I suppose that this is still the case with the Helios4?


    Of course. You should think about what separates a self-built OMV installation from an expensive professional NAS box like a NetApp filer with 'built-in expertise' and 'hardware that does not suck'. Such a NetApp appliance you simply attach to your server (or storage network), switch on and can totally forget about. Power losses? Server crashes? They don't matter at all, since this appliance combines sufficient software and hardware to be unaffected by them.


    On the hardware side that's a battery-backed cache/NVRAM inside, able to cope with sudden power losses. No need for a lengthy fsck anymore due to the chosen filesystem, and neither data corruption nor data loss since sufficient hardware is combined with sufficient software (a 'copy-on-write' approach and log/journal functionality, as is common now in the professional storage world). Also, the vendors of these appliances either provide an HCL (hardware compatibility list consisting of tested and approved HDDs, also listing drive firmware versions) or you're only able to buy (approved) HDDs from them anyway.


    OMV users usually are no storage experts; they combine some commodity hardware in random ways and then think that by playing RAID they have done something for data safety. Aside from this misbelief (RAID is only about data availability, and with RAID5/6 also partially about data integrity), those users face two major hardware challenges they're not aware of:

    • users might go with unreliable powering (picoPSUs with x86 setups or external power bricks that are too small/old). Once more than one HDD loses power at the same time (or a voltage dip affects all HDDs at the same time, resulting in drive resets) your whole array is gone. You might be able to recover with a lengthy rebuild (what about 'data availability' now?) but you have to fear data corruption or even data loss, see below.
    • users choose drives that do not correctly implement write barriers (there's a reason HW RAID vendors provide hardware compatibility lists of approved drives they tested against the few SAS/SATA HBAs they use in their appliances). If you have such a drive/controller combination with broken write-barrier behavior, your array is at risk in case of a crash or power failure (though the same is true for every modern filesystem approach like journaled or checksummed filesystems).


    I don't believe novice NAS/OMV users are aware of either problem.


    Let's look closer at the 'write hole' issue since that's also mostly misunderstood or OMV/Linux users are simply not aware of its existence.


    The 'write hole' similarly affects all parity RAID attempts like mdraid 5/6, RAIDz and btrfs' raid56 implementation, just to name what we have with OMV. In case of a crash or power loss you're very likely to end up with corrupted data. What would be needed is some sort of log/journal on an SSD to close the write hole. Such an attempt has been available for mdraid for years but is not directly supported by OMV as of now. You'll find some details here: https://lwn.net/Articles/665299/


    The good news: the whole 'write hole' issue only affects you when you're running a degraded array, since otherwise the existing parity information can be used to reconstruct the data without data corruption. Crashes and power losses are fine as long as the array is intact; it just needs a scrub to repair potential mismatches between data and parity information after an unsafe shutdown.
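    The parity reconstruction mentioned here can be sketched with simple XOR parity, the scheme single-parity RAID5 is built on (a toy model for illustration, not mdraid's actual code):

```python
# Toy model of RAID5 single-parity reconstruction (illustration only,
# not mdraid's implementation). Parity = XOR of all data blocks, so any
# one missing block can be rebuilt from the remaining blocks + parity.

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

data = [b"AAAA", b"BBBB", b"CCCC"]   # three data blocks in one stripe
parity = xor_blocks(data)            # parity block written alongside them

# Simulate losing the second block (failed drive) and rebuilding it:
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == b"BBBB"
```

    This is also why a degraded array loses the safety net: with one block already missing, there is no remaining redundancy to check a partially written stripe against.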


    The bad news: if you're running a degraded array you need to replace the failed drive as soon as possible and then run a rebuild to get an intact array again. Since you would now be affected by power losses, adding a UPS to the mix seems like a good idea. But your system is now also more likely to crash/hang or to run into 'out of memory' errors, especially on SBCs with their low amount of DRAM. I know several installations/people who struggled with interrupted rebuilds and, after they eventually succeeded, had to realize that their mdraid array contained corrupted data and the necessary fsck one layer above sent the vast majority of filesystem data to the lost+found directory (the RAID was successfully recovered at the block-device layer, but above that mostly garbage remained).


    If you run with a reliable AC UPS, power efficiency is gone (so why choose an SBC in the first place?), but unfortunately a UPS doesn't protect you from the 'write hole' problem since crashes/hangs are the major issue. Adding to that, in my professional life within the last 2.5 decades I've seen way more powering problems caused by UPSes than without them, and almost all servers we deploy today have redundant PSUs with one connected to the UPS and the other directly to mains.


    To be sure your mdraid is really able to be rebuilt without data corruption you would need to test this sufficiently, which will cost you days or weeks of your life since you need some old/crappy drives to test with (drive failure is not binary, those things die slowly). Just check the documentation: https://raid.wiki.kernel.org/i…Raid#When_Things_Go_Wrogn


    There's a reason modern attempts like RAIDz or btrfs raid56 have been developed: to overcome the downsides of anachronistic RAID modes like mdraid.

  • @tkaiser thank you so much for your detailed responses! It's really appreciated and there is a lot of information there that will take me some time to sift through and understand, and I can only imagine how much time you spent putting it together.

    @ekent I've also thought about buying 1 or 2 Helios4. If you also live in Germany and are interested in the Helios4, we could order together to save on shipping costs.

    @sirleon I'd be more than happy to do a combined order with you (and thank you for the offer). Unfortunately I live in Australia ;) so I feel it wouldn't work.

  • Hello Everyone,


    I am new to the game, but I have been reading up the various alternatives for a DIY NAS (in particular this thread and comments by @tkaiser in various forums all over the net).


    What I figured so far (mostly because @tkaiser and other people that seem to know very well what they talk about stress these points relentlessly) is:
    1. Do not use 'classical' RAID configurations in a home/small-business setup ever (hardware or software)
    2. Use a checksummed file system (i.e. btrfs or ZFS - I don't think there are many alternatives, bcachefs maybe?)
    3. Use ECC RAM, if possible.
    4. Avoid hard drives connected via USB, and preferably use 'real' SATA connections.


    Now, my list of requirements (or preferences) for a NAS is:
    a) Small form factor (and preferably somewhat decently looking, so I can place it in the living room)
    b) Low energy consumption as the thing should run 24/7
    c) Silent (preferably passive cooling)


    Combining these two lists, leaves me essentially with the two options @ekent already mentioned:
    I. RockPro64 with the NAS case and a decent PCIe SATA adaptor (i.e. using either a Marvell 88SE9235 or ASMedia ASM1062 chip, where the latter is much cheaper at least when ordering from Australia)
    II. Helios4


    My use case for this is a NAS for a home environment where I keep pictures, documents and such. I don't really have movies or music anymore (Spotify and Netflix do the trick). Also, I would like to use it as a backup location for my laptop, using restic, which in itself shares many traits with those next-generation file systems. Last, I plan to back up the NAS to a cloud storage, again using restic (or something similar that integrates better with the filesystem of the NAS). Thus, I will probably end up having one or two 2TB HDDs in the NAS (kind of depending on whether I should keep a local backup or not).
    Last, I should mention that I am not the biggest fan of btrfs. I used it around 6 years ago on various workstations and only had trouble with that setup. (Back then OpenSUSE recommended btrfs in the installer.) The issue was somehow related to the snapshots btrfs automatically did, and I eventually ran out of disk space in /var. As this left my systems several times in an unusable, though recoverable, state, I switched back to ext4.



    My questions/concerns on all this are:
    - Can I use ZFS with either one of the boards mentioned above? I know that the Helios4 does not have the most powerful CPU (and it is 32bit), but there are apparently people that got ZFS running on a Raspberry Pi or Rock64 (see e.g. [1,2,3]). The RockPro64 has a much more powerful 64bit CPU and twice the RAM (4GB vs 2GB for the Helios4), but I am still unclear if that would be enough for ZFS. Obviously, a downside of the RockPro64 is that the RAM is non-ECC.
    - Thus, my 2nd question: In case ZFS can be used on these boards, should I choose ECC RAM over a more powerful CPU?
    - Is btrfs usable (or even recommended) for a NAS nowadays? (As said, my experience with btrfs hasn't been the best a few years back.)
    - I am still unsure of whether adding a 2nd HDD and setting up a ZFS/btrfs mirror is a good idea or just waste of an HDD. Any recommendations?
    - Last, which board to choose, RockPro64 or Helios4 or something else?


    I am aware that ZFS on Linux, and especially on OMV, might mean that some tinkering is required. I don't see this as a problem, and I am fairly well accustomed to a Linux environment.


    Any hint or further insight, especially regarding ZFS on these ARM boards, is much appreciated!


    Cheers,
    Armin


    [1] https://www.raspberrypi.org/forums/viewtopic.php?t=165247
    [2] https://icicimov.github.io/blo…ZFS-NAS-ROCK64-NFS-Samba/
    [3] https://forum.armbian.com/topi…ab=comments#comment-53681

  • Any hint or further insight, especially regarding ZFS on these ARM boards, is much appreciated!

    With ZFS exactly the same is important as with btrfs: you need storage that correctly implements flush/write barriers, which can be a real challenge when running on SBCs since the majority of them, and especially the most popular one (the Raspberry Pi), are the 'most crappy storage setup possible' by definition.


    SBCs with a quirky USB controller combined with random external USB storage are a great recipe for silent data corruption with ancient filesystems like ext4, or total data loss with modern filesystems like ZFS or btrfs (users lost their ZFS pools years ago for the same reasons they still lose their btrfs filesystems today). But with PCIe-attached SATA and the controllers you mentioned, or native SATA on the Helios4, you shouldn't be affected by host controller issues, and when you use modern drives neither filesystem should be an issue.


    Neither filesystem needs ECC RAM (the 'scrub of death' is a myth) but if you love your data you better use it. Wrt the amount of RAM needed there's confusion with ZFS (another myth talks about 1GB RAM per TB of usable storage -- such formulas only apply if you want to use deduplication), but once you set the ARC to a reasonable minimum even 1 GB RAM should be fine for normal operation. Things might be different when running scrubs or a resilver (if you opt for a zmirror) and this is something that needs testing, especially on usual SBCs with their small amounts of DRAM.
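    Setting that 'reasonable minimum' with ZFS on Linux is done via a module parameter; a sketch of what such a cap could look like (the 256 MiB value is just an example to size against your board's DRAM):

```
# /etc/modprobe.d/zfs.conf -- example: cap the ZFS ARC at 256 MiB (value in bytes)
options zfs zfs_arc_max=268435456
```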


    But the latter is even true for ext4, at least when tons of hardlinks are involved (rsnapshot/backup target), since then e2fsck might need huge amounts of memory, and checking/repairing such an ext4 filesystem runs into out-of-memory errors or requires a huge swap partition and takes ages.
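    One mitigation for the e2fsck memory problem: e2fsck can be told to spill its internal data structures to on-disk scratch files via /etc/e2fsck.conf (the directory path is just an example; it must exist and live on a filesystem that is not the one being checked):

```
# /etc/e2fsck.conf -- let e2fsck use on-disk scratch files instead of RAM
[scratch_files]
directory = /var/cache/e2fsck
```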


    Now talking about software support: for 'obvious reasons' ZFS is not part of the Linux kernel and as such you're either on your own building kernel/modules yourself (maybe automating this with DKMS) or you choose a distro that takes care of this. With OMV on x86 there's the 'proxmox kernel' for this purpose (though no idea whether this works 100% flawlessly all the time); with ARM devices you're on your own. You're clearly entering expert territory and need to double check with each and every kernel update whether your ZFS pools will be accessible afterwards or not (and with ARM there's no GRUB fallback to the last known working kernel either).


    For the latter reason I prefer to use btrfs on ARM, taking care to use at least kernel 4.4 and with more sophisticated features at least kernel 4.19 (then also being prepared for potential OMV integration issues and hoping for btrfs becoming a first class citizen in OMV5 and above).

  • I'm sticking to my HC2s with huge hdds and nfs/ext4. Has worked fine so far. But I'm also experimenting with a backup utility that can provide automatic bitrot detection and correction on ext4.

    Be smart - be lazy. Clone your rootfs.
    OMV 5: 9 x Odroid HC2 + 1 x Odroid HC1 + 1 x Raspberry Pi 4

    Neither filesystem needs ECC RAM (the 'scrub of death' is a myth) but if you love your data you better use it. Wrt the amount of RAM needed there's confusion with ZFS (another myth talks about 1GB RAM per TB of usable storage -- such formulas only apply if you want to use deduplication), but once you set the ARC to a reasonable minimum even 1 GB RAM should be fine for normal operation. Things might be different when running scrubs or a resilver (if you opt for a zmirror) and this is something that needs testing, especially on usual SBCs with their small amounts of DRAM.


    But the latter is even true for ext4, at least when tons of hardlinks are involved (rsnapshot/backup target), since then e2fsck might need huge amounts of memory, and checking/repairing such an ext4 filesystem runs into out-of-memory errors or requires a huge swap partition and takes ages.

    @tkaiser Thanks a lot for the detailed explanations! This is very interesting and informative to get such insight.


    So, what I am not yet completely understanding is why I should use ECC RAM. I have read the statement "if you love your data, use ECC RAM" so many times now; it simply quotes Matthew Ahrens talking about ECC memory for ZFS. While that statement is true, it is true for any file system and only refers to the fact that data can be corrupted when there is an undetected error in the RAM.


    What I figured is that these very abstruse scenarios in which people depict the complete loss of all data on a system using ZFS due to an undetected memory error (e.g. in the FreeNAS forum) are just bogus. But also the scenario of a partial data loss where the same file would be loaded over and over again into the same faulty memory location (see e.g. the ZFS on Debian guide) is just unlikely (simply because that is not how memory allocation, at least under Linux, works) and in no way worse than on any other file system. Also, as already mentioned by @tkaiser, the myth that ECC would be a strict requirement for ZFS has been debunked by several people that explain this whole issue in a sane way, see for instance: ZFS and non-ECC RAM and the video Why Scrubbing ZFS Without ECC RAM Probably Won't Corrupt Everything.


    So, from what I understand, there is always a risk that a corrupted file gets written to disk due to an undetected error in the RAM (irrespective of the file system), and this holds true for any file that does not yet exist on the file system (i.e. be it a new file or a file that's opened, modified and then written back to disk). In fact, in case of a NAS system, this memory error doesn't even need to be on the server, it may as well be on the client side - either way a corrupted file will be written to disk and ZFS has no way of detecting this. Now, ECC memory detects and corrects some of these memory errors and thus is a very good thing to have, as it mitigates the risk of corrupted data significantly. However, once a correct file is written to a checksummed file system (e.g. ZFS or btrfs), there is no way a memory error will corrupt that file (as outlined in the links above).
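    That window is easy to demonstrate with a plain SHA-256 checksum, which is conceptually what ZFS/btrfs store per block (a sketch, not either filesystem's actual on-disk format):

```python
import hashlib

def store(data):
    """Write-time: record the checksum alongside the data."""
    return data, hashlib.sha256(data).digest()

def verify(data, checksum):
    """Read-time/scrub: recompute and compare."""
    return hashlib.sha256(data).digest() == checksum

block, csum = store(b"family photo bytes")
assert verify(block, csum)                        # intact data passes

corrupted = bytes([block[0] ^ 0x01]) + block[1:]  # single bit flip on 'disk'
assert not verify(corrupted, csum)                # flip is detected by a scrub

# But if the bit flips in RAM *before* store() runs, the checksum matches
# the already-corrupted data -- exactly the gap ECC RAM closes.
bad_block, bad_csum = store(corrupted)
assert verify(bad_block, bad_csum)
```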


    I guess the issue boils down to evaluating the risk of having some faulty data (i.e. a number of files, but not the entire dataset) stored on disk due to a memory error.


    I am currently leaning towards the Helios4 option (due to the ECC RAM and the fact that the whole SoC is designed for the NAS use-case), but I am afraid I might not be able to use ZFS on that one. I could not find a single mention of ZFS on a Helios4, or any Marvell ARMADA 38x SoC for that matter, anywhere on the net.


    An interesting third alternative is the MACCHIATObin board, which can be used with ECC RAM, but it doesn't seem to be sold with ECC RAM by default (at least it's nowhere mentioned) and is significantly more expensive.

  • But I'm also experimenting with a backup utility that can provide automatic bitrot detection and correction on ext4.

    I have looked into that topic as well and found that there are basically two options:
    1) restic
    2) BorgBackup


    Both of them seem to be doing an excellent job and are very similar in the way they work. I went with restic (though not yet at full scale), which was more of a gut feeling than hard facts. Probably the fact that the original author of restic has done such an excellent job designing the tool, and seems generally very conscious about the design decisions. Also the fact that restic borrows many ideas from git, which is a tool I use extensively every day, has influenced my decision.

  • I took a quick look at the documentation for both. I couldn't see anything about bitrot detection and correction?


  • I took a quick look at the documentation for both. I couldn't see anything about bitrot detection and correction?

    OK, I think I know what you mean. So, no, from what I understand restic would not protect you from bitrot (i.e. a bit flip) of the original data, as opposed to the data in the restic repository. See this issue for details: https://github.com/restic/restic/issues/805


    However, the data stored in a restic repository itself is encrypted and signed. You should be able to easily detect bitrot of the backup data when checking the repository. To correct it, you would then obviously need some kind of data redundancy. This is discussed here: https://github.com/restic/restic/issues/256, but obviously if your repository is stored on a checksummed file system with some kind of redundancy you could correct it that way as well.
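    The 'encrypted and signed' property is what makes repository-side bitrot detectable: any flipped bit fails verification against the stored authentication tag. A generic sketch using HMAC-SHA256 (restic's actual crypto differs; the key and data here are invented):

```python
import hashlib
import hmac

KEY = b"repository-master-key (example only)"

def seal(blob):
    """Store a blob together with its MAC (generic sketch, not restic's format)."""
    return blob, hmac.new(KEY, blob, hashlib.sha256).digest()

def check(blob, tag):
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(hmac.new(KEY, blob, hashlib.sha256).digest(), tag)

blob, tag = seal(b"encrypted pack data")
assert check(blob, tag)            # healthy repository data verifies

rotted = b"Encrypted pack data"    # one flipped bit in storage ('e' -> 'E')
assert not check(rotted, tag)      # bitrot is detected at check time
```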


    @Adoby: Do you have any better suggestion for a backup software?

  • So, what I am not yet completely understanding is why I should use ECC RAM

    Since bit flips happen. We have a lot of servers in our monitoring and some of those with ECC RAM report correctable single-bit errors from time to time -- even those that survived a 72-hour memtest burn-in. Those without ECC RAM have no ability to report bit flips and as such we can only speculate what happened there (if an application crashed, for example).


    The consequences of a bit flip range from 'nothing at all' over silent data corruption to application or kernel crash.

    • Nothing at all if the flipped bit got detected and corrected by higher layers (nearly every protocol uses CRC mechanisms to detect corrupted data since hardware is not perfect)
    • Silent data corruption happens. Users without checksumming filesystems like ZFS or btrfs may never know, or only when it's way too late; with those modern filesystems the next scrub will tell (and if redundancy was used it will be corrected automatically)
    • An application or kernel crash is as bad as it sounds. So ECC RAM is also an investment in 'business continuity' or availability. And such crashes are often accompanied by silent data corruption or even data loss as well

    I could not find a single mention of ZFS on a Helios4, or any Marvell ARMADA 38x SoC for that matter, anywhere on the net

    I did some tests with a RAIDz on my Clearfog Pro (with just 1GB of RAM) back in 2017. Rather underwhelming, especially since my usual ZFS setups are big Xeon boxes with SAS disks and enterprise SSDs for ZIL and L2ARC.


    With a Helios4, 4 SATA disks and data integrity + 'self healing' in mind I would most probably choose a btrfs raid1 with 2 to 4 disks. Please note that the btrfs raid1 mode works totally differently compared to both mdraid1 and a zmirror: https://github.com/openmediava…01#issuecomment-466901395
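    A simplified model of why btrfs raid1 differs: it keeps exactly two copies of every chunk, allocated to the two devices with the most free space, no matter how many devices are in the filesystem (a sketch of the allocation policy, not btrfs source):

```python
# Simplified model of btrfs raid1 chunk allocation: each chunk gets exactly
# two copies, placed on the two devices with the most free space -- no
# matter how many devices the filesystem has (unlike an n-way mdraid1).

def allocate_chunk(free_space, size):
    # Pick the two devices with the most free space (tie-breaking here is
    # an assumption of this sketch, not btrfs' exact behavior).
    a, b = sorted(free_space, key=free_space.get, reverse=True)[:2]
    free_space[a] -= size
    free_space[b] -= size
    return (a, b)

devs = {"sda": 100, "sdb": 100, "sdc": 200, "sdd": 200}
placements = [allocate_chunk(devs, 50) for _ in range(4)]

# With 4 devices you still only get 2 copies per chunk, so losing any two
# devices can lose data -- unlike a 4-way mdraid1 mirror.
assert all(len(set(p)) == 2 for p in placements)
```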

  • @Adoby: Do you have any better suggestion for a backup software?

    No, not today.


    I would like some software that creates backups and at the same time creates checksums of the files. When the backups are updated, the old unchanged files are checked, and if any no longer match their checksum they are automatically replaced if possible: either the original file is replaced by the backup file, or the other way around. However this would be extremely computationally intensive, so the checks would likely have to be partial. Say you check 4% of the files each day so that all files are checked at least once per month. This should greatly reduce the risk of bitrot in sets of files that don't change often, say collections of scanned family photos or private recordings. It would be less efficient for big sets of files that change and are replaced often.
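    The '4% per day' rotation could be scheduled deterministically by bucketing file paths, so every file is verified exactly once per cycle (a hypothetical sketch; the file list and the 25-day cycle length are invented):

```python
import hashlib

CYCLE_DAYS = 25  # ~4% of the files per day -> full coverage about monthly

def due_today(path, day_of_cycle):
    """Deterministically assign each file to one day of the check cycle."""
    bucket = int(hashlib.sha256(path.encode()).hexdigest(), 16) % CYCLE_DAYS
    return bucket == day_of_cycle

files = [f"/srv/photos/img_{i:04d}.jpg" for i in range(1000)]

# Over a full cycle every file is checked exactly once...
checked = [f for day in range(CYCLE_DAYS) for f in files if due_today(f, day)]
assert sorted(checked) == sorted(files)

# ...and the total daily workload averages 1000 / 25 = 40 files.
per_day = [sum(due_today(f, d) for f in files) for d in range(CYCLE_DAYS)]
assert sum(per_day) == 1000
```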


    Filesystems that create checksums can help a lot. But to provide enough redundancy you need to use something like RAID, and then you still need backups, so you're looking at keeping at least three copies of each file. And these filesystems create checksums on the fly, which may slow down the filesystem noticeably in some circumstances.


    Modern hdd/ssd firmware already corrects many errors, and ECC memory can also help.


    But I still would like extra protection, as I described above. Storing data on small power-efficient SBCs with fast networking and SATA drives makes it possible to distribute the work somewhat. For instance a backup-storage SBC with a huge hdd can perhaps daily check more than 4% of its locally stored backup files without any detrimental effect on the data-storage SBC that serves the original files, while the data-storage SBC only checks 2% during the early morning hours. If the system could use backups on the LAN, that would provide enough redundancy to fix errors detected while checking the files. And two copies of the files, together with checksums, would be enough to provide fully automated bitrot protection.


    As far as I know nothing like this exists today. At least not for home users that worry about bitrot eating their scanned family photos.


    Not yet.


  • With a Helios4, 4 SATA disks and data integrity + 'self healing' in mind I would most probably choose a btrfs raid1 with 2 to 4 disks


    Disclaimer: I personally try to keep storage pools as small as possible (maybe because I deal with storage for a living and am constantly confronted with the downsides of really large storage setups like an fsck or RAID rebuild taking days or even weeks and such stuff).


    With btrfs and 32-bit systems like the Helios4 there seems to be an 8TiB limitation, and in general ARM devices with their low amounts of DRAM can run into trouble when dealing with (checking/repairing) huge storage setups.

    Modern hdd/ssd firmware already corrects many errors, and ECC memory can also help


    But this is more or less unrelated to the 'bit rot' problem, since ECC used inside storage devices or ECC RAM only tries to ensure that a 1 is a 1 and a 0 is a 0 right now. Data degradation on (offline) storage is not addressed here at all.


    IMO you should really have a look at btrfs. Set up 2 HC2s with btrfs, use one for the productive data and the other for backups. Then use btrbk to do the backup thing (creating/maintaining/deleting snapshots at source and destination and transferring them efficiently via btrfs send/receive). Periodic scrubs on both HC2s will reveal corrupted data that then needs to be transferred manually from either system to the other.
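    A sketch of what a btrbk.conf for that two-HC2 setup could look like (hostnames, paths and retention values are all invented; check the btrbk documentation for the exact syntax of your version):

```
# /etc/btrbk/btrbk.conf -- sketch only; hostnames/paths are invented
snapshot_preserve_min   2d
snapshot_preserve       14d
target_preserve         20d 10w 6m

volume /mnt/data
  snapshot_dir  btrbk_snapshots
  subvolume photos
    target send-receive ssh://backup-hc2/mnt/backup
```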

  • Disclaimer: I personally try to keep storage pools as small as possible (maybe because I deal with storage for a living and am constantly confronted with the downsides of really large storage setups like an fsck or RAID rebuild taking days or even weeks and such stuff).


    With btrfs and 32-bit systems like the Helios4 there seems to be an 8TiB limitation, and in general ARM devices with their low amounts of DRAM can run into trouble when dealing with (checking/repairing) huge storage setups.

    Yes, running fsck on a 12TB ext4 HDD with a lot of rsync snapshots and lots of hardlinks is not fun on an HC2... I tried. Once. Then I removed the drive and stuck it in my desktop Linux PC. It was done the next morning. Or you can reformat and restore from backups. If you have them...


    Luckily this doesn't happen often.


  • I would like some software that creates backups and that at the same time creates checksums of the files. When the backups are updated the old unchanged files are checked and if any no longer match the checksum, it is automatically replaced if possible. Either the original file is replaced by the backup file, or the other way around. However this would be extremely computionally intensive, so the checks would likely have to be partial. Say you daily check 4% of the files each day so that all files are checked at least once per month. This should greatly reduce the risk of bitrot in sets of files that don't change often. Say collections of scanned family photos or private recordings. It would be less efficient for big sets of files that change and are replaced often.

    OK, I think restic is almost there already. Obviously, some key components to protect from bitrot are missing, but I reckon the largest building blocks are there. In particular, restic is based on calculating sha256 checksums of all data that is backed up, for deduplication. So it knows the checksums of the original data, but on the next backup it skips all files for which the mod time didn't change (again, more details in this github issue).
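    That skip-by-mtime optimization is also exactly why source-side bitrot slips through: a bit flip on disk changes neither size nor mtime, so the file is never re-read or re-hashed. A toy model of the optimization (not restic's actual code or data structures):

```python
import hashlib

index = {}  # path -> (mtime, sha256 hex) recorded at the last backup run

def backup(path, mtime, data):
    """Re-hash only files whose mtime changed (toy model of the optimization)."""
    if path in index and index[path][0] == mtime:
        return "skipped"  # file trusted without re-reading its contents
    index[path] = (mtime, hashlib.sha256(data).hexdigest())
    return "hashed"

assert backup("a.jpg", 1000, b"original") == "hashed"

# Bitrot flips a bit on disk but leaves the mtime untouched:
assert backup("a.jpg", 1000, b"originaL") == "skipped"

# The stale checksum still describes the old, healthy content, so the
# backup tool never notices the corruption:
assert index["a.jpg"][1] == hashlib.sha256(b"original").hexdigest()
```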

  • Well, it seems that currently restic does not provide any bitrot detection or protection at all? Apart from that it looks like a really nice backup system.
    I currently use plain old versioned rsync snapshots between SBC servers on a LAN. Simple and easy. I would prefer something very similar, but with added fully automatic bitrot protection.


    Since bit flips happen. We have a lot of servers in our monitoring and some of those with ECC RAM report correctable single-bit errors from time to time -- even those that survived a 72-hour memtest burn-in. Those without ECC RAM have no ability to report bit flips and as such we can only speculate what happened there (if an application crashed, for example).

    Thanks again for the elaborate explanations, they are much appreciated. I probably should have been a bit more precise in my statement though. I am aware of the advantages of ECC RAM in general; ECC memory is after all essential for my daily work, and I see those error messages occasionally on our local cluster (which of course has ECC RAM). It would be a real pain in the neck if I had to debug random crashes of my code when running on 1000 cores and one of them experiences a memory error.


    My statement was more targeted at my home NAS use case, where I am dealing with a small SBC with 2 GB RAM and 2 TB disk space, which is quite different from the cluster I mentioned above with 2000 GB RAM and 500 TB disk space. Business continuity and availability are not that crucial in my home setup - it's all about the probabilities of a failure. But then again, I like my data, as it's mostly photos and memories, and it would be sad to lose them due to some memory error.



    Disclaimer: I personally try to keep storage pools as small as possible (maybe because I deal with storage for a living and am constantly confronted with the downsides of really large storage setups like an fsck or RAID rebuild taking days or even weeks and such stuff).


    With btrfs and 32-bit systems like the Helios4 there seems to be a 8TiB limitation and in general ARM devices with their low amounts of DRAM can run into trouble when dealing with (checking/repairing) huge storage setups.

    At the moment I don't need more than 2TB (and that is already generous), so two disks, where one of them is there to enable the 'self healing' capabilities of the file system, was what I had in mind. The Helios4 sounds indeed like a good option in that regard - I am only hesitating because my previous experience with btrfs wasn't really the best (as mentioned earlier) and ZFS on that board starts to look less like an option (though I haven't totally given up on that thought).

  • Well, it seems that currently, restic does not provide any bitrot detection or protection at all? Apart from that it looks like a really nice backup system.

    Agreed! Restic is a very nice backup system, but currently has no bitrot protection.

  • btrfs. I have used it around 6 years ago ... OpenSUSE ... the snapshots btrfs automatically did and eventually ran out of disk space in /var

    Btrfs doesn't do snapshots on its own. I believe you're talking about OpenSUSE's snapper instead?


    Anyway: there's no need to create constant snapshots (and if you do snapshots, then choose a tool that automates everything for you and deletes unneeded snapshots based on a retention policy -- I already mentioned btrbk above). And experiences from 6 years ago are pretty much worthless (the work on btrfs started a decade ago!). Same with all the 'reports' on the net blaming btrfs for almost everything that could go wrong with storage (people are fine with ext4 and silent data corruption, but if they use a great filesystem able to spot data corruption they get mad and blame the filesystem instead of crappy hardware or a failed mdraid below or whatever).
