Which energy efficient ARM platform to choose?

JohnStiles · 22. Februar 2019

Zitat von ekent

After reading through this thread I thought that I'd settled on getting an Odroid-HC2, but now I'm thinking I'd like the ability to have 2 or more 3.5" drives (probably up to 4) connected, possibly even in a RAID1 configuration.

Then buy yourself two HC2, or three or even four HC2. Or go towards a normal pc / x86 server .... And if you really like the idea with ARM as a server then Gigabyte has a cool R281-T91

Zitat von ekent

Does that mean that the whole setup SBC and drives would be powered by the one power supply?

Adoby · 22. Februar 2019

Zitat von ekent

Does that mean that the whole setup SBC and drives would be powered by the one power supply? If so, what would that power supply look like?Thanks

I use one 12V 20A PSU, without a fan, to power 6 HC2, a Netgear GS316 GbE switch, a Asus Lyra mesh unit and a Noctua fan. When I bought the stuff I made sure everything ran on 12 volts.

This is the PSU I use: https://www.amazon.de/dp/B01MRSAT39

There are some pics in this thread: My new NAS: Odroid HC2 and Seagate Ironwolf 12TB.

(I bought two PSU, one as a spare, just in case...)

ekent · 23. Februar 2019

Zitat von tkaiser

Can't really help here since

RAID1 is IMO just fooling yourself (talking about mdraid, a zmirror or a btrfs raid-1 is something entirely different)

For RAID in general you need absolutely reliable hardware while SBC setups are usually quite the opposite. Especially all those RAID setups don't deal that great with power losses and undervoltage (which is one of the most common problems with SBC, at least those powered with 5V)

I like to have my backups physically separated from my productive data (that's the amazing thing with those SBC, they are that inexpensive that you can buy more than one and put the backup disks in another location --> another room, building or even area when using a VPN)

Alles anzeigen

I hadn't realised this or really thought about. But it makes a lot of sense, especially with my experience with SBCs up to this point. I guess that's why i'm posting and asking questions here. So thanks for taking the time to respond.

Having backups separated physically from your productive data is interesting and again something I really wasn't considering. Having said that, I like the idea! When you say that you have your backups physically separated from your productive data, how are you handling your backups? Is that an automatic process? Or are you doing it manually from your productive data?

Zitat von tkaiser

I mentioned the SATA HAT with an if clause: 'If I would want to add up to 4 disks to an SBC' (but I really don't want to add a bunch of disks to an SBC, at least not with all disks active at the same time)

Also you should keep in mind that this SATA HAT is brand new and currently not tested by any of us here ( ryecoaaron ordered a kit but I would believe this will take some time). I would better wait for people sharing their experiences.

Thank you for clarifying your points there. I think I took what you said out of context.

ekent · 23. Februar 2019

Zitat von JohnStiles

Then buy yourself two HC2, or three or even four HC2. Or go towards a normal pc / x86 server .... And if you really like the idea with ARM as a server then Gigabyte has a cool R281-T91

I think I will go back to the HC2 option. I was really taken with it initially so didn't need to try too hard to convince myself. BTW that Gigabyte R281-T91 is off the chain!! Super cool, but slightly more than what I need I feel .

ekent · 23. Februar 2019

Zitat von Adoby

I use one 12V 20A PSU, without a fan, to power 6 HC2, a Netgear GS316 GbE switch, a Asus Lyra mesh unit and a Noctua fan. When I bought the stuff I made sure everything ran on 12 volts.

This is the PSU I use: amazon.de/dp/B01MRSAT39

There are some pics in this thread: My new NAS: Odroid HC2 and Seagate Ironwolf 12TB.

(I bought two PSU, one as a spare, just in case...)

Alles anzeigen

Thanks a heap for the information in your post. That's quite the setup you've got there. And you clearly are a fan and believer in the HC2!

tkaiser · 23. Februar 2019

Zitat von ekent

how are you handling your backups? Is that an automatic process? Or are you doing it manually from your productive data?

Manual backup doesn't work in my experience (you always have good reasons to skip backing up prior to a data loss). We're using checksummed filesystems everywhere since data integrity is important and as such btrfs together with btrbk does the job on ARM SBCs and with large x86 installations ZFS combined with znapzend is used (or proprietary solutions like Open-e/Jovian). The differentiation between btrfs on ARM and ZFS on x86 is due to kernel support here and there.

But if you're not a 'storage pro' and really familiar with those contemporary filesystems using the older established variants like XFS or ext4 might be a better idea (then combined with traditional approaches like rsnapshot which integrates nicely with OMV).

JohnStiles · 23. Februar 2019

Zitat von ekent

I think I will go back to the HC2 option. I was really taken with it initially so didn't need to try too hard to convince myself

HC2 is ok. I'm using HC1.

ekent · 23. Februar 2019

Zitat von tkaiser

Manual backup doesn't work in my experience (you always have good reasons to skip backing up prior to a data loss). We're using checksummed filesystems everywhere since data integrity is important and as such btrfs together with btrbk does the job on ARM SBCs and with large x86 installations ZFS combined with znapzend is used (or proprietary solutions like Open-e/Jovian). The differentiation between btrfs on ARM and ZFS on x86 is due to kernel support here and there.

But if you're not a 'storage pro' and really familiar with those contemporary filesystems using the older established variants like XFS or ext4 might be a better idea (then combined with traditional approaches like rsnapshot which integrates nicely with OMV).

Again, thank you.
Yeah, I'm not a storage pro and not familiar with btrfs and ZFS. Seeing as I haven't even ordered the SBC or hdd for my new setup, I've got time to research and investigate btrfs. If I'm struggling to wrap my head around getting it setup, then I can fall back to ext4 (what I'm using now) and use rsnapshot which I can see is available as a plugin in OMV.

sirleon · 25. Februar 2019

Zitat von tkaiser

Manual backup doesn't work in my experience (you always have good reasons to skip backing up prior to a data loss). We're using checksummed filesystems everywhere since data integrity is important and as such btrfs together with btrbk does the job on ARM SBCs and with large x86 installations ZFS combined with znapzend is used (or proprietary solutions like Open-e/Jovian). The differentiation between btrfs on ARM and ZFS on x86 is due to kernel support here and there.
But if you're not a 'storage pro' and really familiar with those contemporary filesystems using the older established variants like XFS or ext4 might be a better idea (then combined with traditional approaches like rsnapshot which integrates nicely with OMV).

Does a checksummed filesystem really replace ECC-RAM? I'm unsure about this point because i read the opposite very often.

Edit: Okay, it's not that risky without ECC, if i got it right:

Externer Inhalt www.youtube.com

Inhalte von externen Seiten werden ohne Ihre Zustimmung nicht automatisch geladen und angezeigt.

Durch die Aktivierung der externen Inhalte erklären Sie sich damit einverstanden, dass personenbezogene Daten an Drittplattformen übermittelt werden. Mehr Informationen dazu haben wir in unserer Datenschutzerklärung zur Verfügung gestellt.

tkaiser · 25. Februar 2019

Zitat von sirleon

Does a checksummed filesystem really replace ECC-RAM?

If you love your data then care about data integrity. That means

Use a checksummed filesystem if possible
Use ECC RAM if possible
Neither is 1) a requirement for 2) nor vice versa

Some FreeNAS guy spread the rumor that using a checksummed filesystem without ECC RAM would kill your data (most probably the try to get more people to use server grade hardware with ECC RAM) but that's not true.

ECC RAM is a bit more expensive, a checksummed filesystem you get for free. But it won't provide any protection if run on really crappy hardware. One example is using a checksummed filesystem like btrfs on a host with quirky USB implementation combined with USB drives that do not support flush/barrier semantics.

sirleon · 26. Februar 2019

Zitat von tkaiser

ECC RAM is a bit more expensive, a checksummed filesystem you get for free. But it won't provide any protection if run on really crappy hardware. One example is using a checksummed filesystem like btrfs on a host with quirky USB implementation combined with USB drives that do not support flush/barrier semantics.

Okay, so the Helios4 is the only SBC with native SATA & ECC i've found so far. It looks like the most users in this thread are happy with the HC2. Is this a sign of an reliable SATA-USB-Bridge implementation which i can use with btrfs?

JohnStiles · 26. Februar 2019

Zitat von sirleon

It looks like the most users in this thread are happy with the HC2. Is this a sign of an reliable SATA-USB-Bridge implementation which i can use with btrfs?

I do not know about HC2, but I'm using HC1 which is almost similar, not observing any problem for now.

tkaiser · 26. Februar 2019

Zitat von sirleon

It looks like the most users in this thread are happy with the HC2. Is this a sign of an reliable SATA-USB-Bridge implementation which i can use with btrfs?

I'm not entirely sure. I'm using various devices with JMS578 (the USB-to-SATA bridge on ODROID HC1 and HC2), JMS567 and ASM1153 without any issues with btrfs (and Samsung Spinpoint 2.5"HDD or various SSDs for testing purposes). Since an USB-to-SATA bridge is involved its firmware could be relevant and also semantics of the SATA drive in question.

The btrfs FAQ is pretty clear about the problem and mentions it at the top for a reason: https://btrfs.wiki.kernel.org/…m._What_does_that_mean.3F

So you need to check for this barrier problem first (and I would strongly suggest to read through the whole btrfs FAQ first prior to using it). Also strongly recommended is to use different mount options than OMV's defaults (OMV relies on default btrfs relatime setting which destroys read performance on shares with a lot of files in it as explained in the btrfs FAQ).

With OMV currently this means manually adjusting the opts entry for your btrfs filesystem of choice in config.xml as @votdev explained here. I use the following options and hope they get new OMV defaults at least starting with OMV 5.

Code

<opts>defaults,noatime,nodiratime,compress=lzo,nofail</opts>

ekent · 27. Februar 2019

Hi all,

After going through the thread Which energy efficient ARM platform to choose? I was really taken by the Helios4, but thought that it was out of my price range. As a result I then was considering the Odroid-HC2 and was very impressed by that machine, but realised that in order to get what I wanted I'd need to get 2-3 of them. Once I realised that, that brought me back to the Helios4. I think that I'm going to get one, and am seriously excited by the prospect (as it offers the opportunity to combine ECC Ram and a checksummed filesystem) and it's performance.

Just wanted to ask a couple of questions though before I commit to it (sorry very nervous Non-pro, first time home NAS setup user):

The Helios4 has a dual core ARM Cortex A9 CPU and the Odroid-HC2 has an Octa Core CPU featuring a Samsung Exynos5422 Cortex-A15 2Ghz and Cortex-A7. Now I know that it's not as simple as saying one is dual core and one is octa core. But just wondering if someone could outline what the difference is between these two CPU configurations? As to me it "sounds" as though the CPU in the Odroid-HC2 is more powerful...
@tkaiser in this thread Which energy efficient ARM platform to choose?, you counseled that SBCs are not really appropriate for RAID as RAID requires absolutely reliable hardware. I suppose that this is still the case with the Helios4? With the Helios4, I'd be keen to utilise it's capability of 4 HDDs down the track and have 1-2 for backing up the other 2 Productive disks. Considering this, would the appropriate method to perform backups be btrbk (for btrfs) or rsnapshot (for ext4)?

Thanks

ekent · 3. März 2019

Zitat von tkaiser

If you love your data then care about data integrity. That means

Use a checksummed filesystem if possible

Use ECC RAM if possible

I'm at the point where I can't decide whether to get:

2 or 3 HC2s (and have backups physically separated from productive data). Positive = easy to multiply
Spend a little more and get a RockPro64 with NAS case (have backups connected to the same SBC and therefore in the same physical location) (but this has downside of limiting to 2 hard drives) or I get another one down the track....
Or to spend a little more again and get a Helios4 (campaign ending in about 3 weeks) (and have backups connected to the same SBC and thus in the same physical location).

I'm planning on using a checksummed filesystem (btrfs). As far as I can tell, Helios4 is the only SBC in this thread that has ECC Ram. Is the Helios4 worth the extra cost for a home NAS? I do care about my data, as does everyone I think .

In the long run, I think I'll be able to live with the extra cost (though it might be painful in the short term).

tkaiser · 3. März 2019

Zitat von ekent

I'm planning on using a checksummed filesystem (btrfs)

The most important thing to know about checksummed filesystems is that they seem to fail in situations where in reality hardware fails. With old/anachronistic filesystems in all these situations you simply get silent data corruption (something the average NAS user can happily live with since silent data corruption will only be noticed way too late).

With btrfs and ZFS you'll be notified about hardware problems almost immediately (at least with the next scrub you run) but an awful lot of people then start to blame the software or filesystem in question instead of realizing that they're about to loose data if they don't fix their hardware.

Another important and related thing (hardware) are the so called 'write barriers'. Back in the old days when we had neither journaled filesystems nor modern approaches like btrfs and ZFS a crash or power loss most of the times led to a corrupted filesystem that needed an fsck to hopefully be repaired at next boot (taking hours up to a day back then, with today's drive sizes and tons of files we might talk about weeks instead).

With all modern (journaling or checksummed) filesystems crashes or power losses are not that much of an issue any more but there is one huge requirement for this to be true: that's correct flush/write barrier semantics available. The filesystem driver needs to be able to trust into the drive really having written data to it if 'the drive' reports as such.

If write barriers are not correctly implemented then a crash or power loss now has different consequences: with old filesystems like a journaled ext4 for example you get silent data corruption but with modern attempts like ZFS or btrfs you're very likely to loose your whole pool at once.

Some more details (not mentioning btrfs since the problem is a very old one that should be well understood in the meantime. But people today loose their btrfs pools for the same reason they lost their ZFS pools almost a decade ago: insufficient hardware with broken write barrier implementations).

https://lwn.net/Articles/283161/ (details about write barriers and journaled filesystems)
https://forum.synology.com/enu/viewtopic.php?t=9150 (NAS improvements related to write cache and barriers)
https://utcc.utoronto.ca/~cks/…aris/ZFSLosingPoolsWaysII (write barriers and loosing ZFS pools)
https://access.redhat.com/docu…n_guide/writebarrieronoff (how different filesystems report this problem)

If you run into this problems with flush/write barriers not correctly implemented you have to fear simple crashes as well as power losses since both can result in your whole filesystem being lost (there's a reason why this issue is mentioned at the top of the btrfs FAQ). Same problem applies to mdraid in general but that's another story.

How has the above an impact on type of storage? Let's take the worst choice first: USB storage. Whether write barriers are in place or not depends on

Host controller (USB host controller in this case)
Host controller driver (therefore OS and OS version matters)
USB-to-SATA bridge used in a drive enclosure
The bridge's firmware (controller firmware version matters and also whether it's a branded one or not)
The drive's own behavior (drive firmware version matters)

With native SATA (as on the Helios4) or PCIe attached SATA (RockPro64 with NAS case and a PCIe HBA) you're only affected by 1), 2) and 5) any more (and the first two are usually not a problem today). So even if I never had any issues in this area with all my USB storage scenarios (using JMS567, JMS578, ASM1153 and VIA VL716 bridges) avoiding USB should be the obvious choice. Please also note that I'm no typical USB storage user since I would never buy 'USB disks' (e.g. from WD or Seagate) but only drive enclosures and drives separately.

I mentioned WD and Seagate for a reason since while their disk enclosures rely on the same USB-to-SATA bridges as above their 'branded' firmwares differ and this alone causes a lot of problems (see this commit comment to fix broken behavior affecting all Seagate USB3 disks used with Linux).

sirleon · 3. März 2019

@tkaiser Thank you for this good overview that answered lots of questions i still had.

@ekent I've also thought about buying 1 or 2 helios4. If you also live in germany and intereseted in helios4, we may order together to save shipping costs. We are not in a hurry because Kobol statet this:

Zitat von Kobol

We are using the same Pre-order approach that we did for previous campaigns. The goal is to reach at least 300 units ordered before starting production. We are planning to manufacture 750 units to have enough inventory for the late buyers, but don’t wait too long since the stock won’t last.

gprovost · 4. März 2019

Zitat von sirleon

@ekent I've also thought about buying 1 or 2 helios4. If you also live in germany and intereseted in helios4, we may order together to save shipping costs. We are not in a hurry because Kobol statet this:

Well based on the current traction of the Helios4 3rd campaign, I don't think we will manufacture that much of extra kit anymore... we might end up just manufacturing few dozens more on top. So if you want to be guaranty to get an Helios4 you should order before the end of the campaign. I must admit this was a complete marketing newbie mistake I did, because now potential buyers are being in 'wait-and-see' mode.

To try to boost a bit the sale we are now including a free OLED screen for the first 300 orders https://blog.kobol.io/2019/03/04/3rd-campaign-update/

tkaiser · 4. März 2019

Zitat von ekent

The Helios4 has a dual core ARM Cortex A9 CPU and the Odroid-HC2 has an Octa Core CPU featuring a Samsung Exynos5422 Cortex-A15 2Ghz and Cortex-A7. Now I know that it's not as simple as saying one is dual core and one is octa core. But just wondering if someone could outline what the difference is between these two CPU configurations? As to me it "sounds" as though the CPU in the Odroid-HC2 is more powerful...

Definitely the HC2 is the more powerful device if it's about 'raw CPU horsepower'. You can check the 7-zip MIPS column when comparing sbc-bench results: https://github.com/ThomasKaise…ch/blob/master/Results.md (HC2 is roughly 3 times faster if it's about 'server stuff'). But HC2 somehow needs this horsepower since both networking and storage are USB3 attached here and as such IRQ processing alone eats up a lot of CPU cycles. And the HC2's SoC is a smartphone SoC while Helios4 relies on a NAS SoC consisting of one AP (application processor --> the ARM cores) and one CP (communication processor --> a dedicated engine where as much stuff as possible is offloaded to).

With normal NAS use case in mind those theoretical CPU performance differences don't matter since you're limited by Gigabit Ethernet anyway.

Zitat von ekent

you counseled that SBCs are not really appropriate for RAID as RAID requires absolutely reliable hardware. I suppose that this is still the case with the Helios4?

Of course. You should think about what separates a self-built OMV installation from an expensive professional NAS box like a NetApp filer with 'built-in expertise' and 'hardware that does not suck'. Such a NetApp appliance you simply attach to your server (or storage network), switch it on and can totally forget about it. Power losses? Server crashes? Don't matter at all since this appliance combines both sufficient software and hardware to be not affected by this.

At the hardware side that's a battery backed cache/NVRAM inside able to cope with sudden power losses. No need for a lengthy fsck any more due to the chosen filesystem and neither data corruption nor data loss since sufficient hardware is combined with sufficient software (a 'copy-on-write' approach and log/journal functionality as it's common now in the professional storage world). Also the vendors of these appliances either provide a HCL (hardware compatibility list consisting of tested and approved HDDs listing also drive firmware version) or you're only able to buy (approved) HDDs from them anyway.

OMV users usually are no storage experts, combine some commodity hardware in random ways and then think by playing RAID they would've done something for data safety. Asides this misbelief (RAID is only about data availability and with RAID5/6 also partially about data integrity) those users face two major hardware challenges they're not aware of:

users might go with unreliable powering (picoPSUs with x86 setups or external power bricks too small/old). Once more than one HDD looses power at the same time (or just a voltage dip affecting all HDDs at the same time resulting in drive resets) your whole array is gone. You might be able to recover with a lengthy rebuild (what about 'data availability' now?) but you have to fear data corruption or even data loss, see below.
users choose drives that do not correctly implement write barriers (there's a reason HW RAID vendors provide hardware compatibility lists with approved drives they tested against their own few SAS/SATA HBA they use in their appliances). If you got such drive/controller combination with broken write barrier behavior your array is at risk in case of a crash or power failure (though same is true for every modern filesystem approach like journaled or checksummed filesystems)

I don't believe novice NAS/OMV users are aware of either problem.

Let's look closer at the 'write hole' issue since that's also mostly misunderstood or OMV/Linux users are simply not aware of its existence.

The 'write hole' similarly affects all partity RAID attempts like mdraid 5/6, RAIDz and btrfs' raid56 implementation just to name what we have with OMV. In case of a crash or power loss you're very likely to end up with corrupted data. What would be needed is some sort of a log/journal on a SSD to close the write hole. Such an attempt is available for mdraid since years but not directly supported by OMV as of now. You'll find some details here: https://lwn.net/Articles/665299/

The good news: the whole 'write hole' issue only affects you when you're running a degraded array since otherwise the existing parity information can be used to reconstruct the data without data corruption. Crashes and power losses are fine as long as the array is intact, it just needs a scrub to repair potential mismatch between data and parity information after an unsafe shutdown.

The bad news: if you're running a degraded array you need to replace the failed drive as soon as possible and then run a rebuild to get an intact array again. Since you would now be affected by power losses adding an UPS to the mix seems like a good idea. But also your system is now more likely to crash/hang or to run into 'out of memory' errors especially on SBC with their low amount of DRAM. I know several installations/people who struggled with interrupted rebuilds and after they eventually succeeded had to realize that their mdraid array contained corrupted data and the necessary fsck one layer above sent vast majority of filesystem data to the lost+found directory (you were able to successfully recover the RAID at the block device layer but above mostly garbage exists any more)

If you run with a reliable AC UPS power efficiency is gone (so why choosing an SBC in the first place) but unfortunately an UPS doesn't protect you from the 'write hole' problem since crashes/hangs are the major issue. Adding to that I've seen in my professional life within the last 2.5 decades way more powering problems caused by UPS than without and almost all servers we deploy today have redundant PSUs with one connected to the UPS and the other directly to mains.

To be sure your mdraid is really able to be rebuilt without data corruption you would need to test this sufficiently which will cost you days or weeks of your life since you need some old/crappy drives to test with (drive failure is not binary, those things die slowly). Just check the documentation: https://raid.wiki.kernel.org/i…Raid#When_Things_Go_Wrogn

There's a reason modern attempts like RAIDz or btrfs raid56 have been developed. To overcome the downside of anachronistic RAID modes like mdraid.

ekent · 4. März 2019

@tkaiser thank you so much for your detailed responses! It's really appreciated and there is a lot of information there that will take me some time to sift through and understand and I can only imagine how much time it spent putting it together.

Zitat von sirleon

@ekent I've also thought about buying 1 or 2 helios4. If you also live in germany and intereseted in helios4, we may order together to save shipping costs.

@sirleon I'd be more than happy to do a combined order with you (and thank you for the offer). Unfortunately I live in Australia so I feel it wouldn't work.

Jetzt mitmachen!