Best Raid option for NAS starting out with 2 drives but wanting to expand?

    • OMV 4.x
    • Best Raid option for NAS starting out with 2 drives but wanting to expand?

      Hi, so I'm looking to build a home server/NAS soon, and I'm planning on using OMV.
      But I'm not sure what type of RAID I should use... I want to have some sort of redundancy, but I'm only going to be starting out with 2 drives, and then slowly expanding as my storage needs increase...
      What's a good solution for me? I'm planning to build a system with an i5-4590, 8 GB DDR3, a Q87 motherboard, and 2x 4 or 6 TB Seagate Barracuda drives.

      Tips?

      Thanks
      Gershy13
    • Gershy13 wrote:

      But I'm not sure what type of RAID I should use...
      If you're not a business, then the best type of RAID for you is NO RAID. It's wasting disks for almost nothing, especially with disks in the TB region. RAID was great two decades ago for businesses and other entities that needed data availability. Times have changed, and parity RAID with large disks is simply fooling yourself.

      If you want to spend effort/money on redundancy, then go for data safety (backup) instead.
    • henfri wrote:

      Use single drives and combine them into one with mergerfs.
      How many stories have we read here of people having lost all their data because one of the ten drives in their pool died... That does not happen with my proposal.
      I don't mind what it is, I'm just trying to figure out the best option that gives me redundancy without literally having another drive mirroring the main one, as that would be a pain when upgrading: I'd need to buy 2 drives each time I want to add more storage.
    • tkaiser wrote:

      If you're not a business, then the best type of RAID for you is NO RAID. If you want to spend effort/money on redundancy, then go for data safety (backup) instead.
      What options are there for me then? I just want something that will protect my data to a certain degree (I will still have backups of the really important data on an external drive...). And yes, I'm not a business, just a standard home user who wants to build a home server...
    • Gershy13 wrote:

      What options are there for me then? I just want something that will protect my data to a certain degree (I will still have backups of the really important data on an external drive...)
      For example having one main drive and doing automated incremental backups to a second backup drive.
    • Morlan wrote:

      For example having one main drive and doing automated incremental backups to a second backup drive.
      Then what would happen when I expand my drives and want to add more storage? Would I just have to get a bigger backup drive, or multiple? Could I switch to RAID 5 or SnapRAID at some point, and is that a bad idea?
    • Morlan wrote:

      For example having one main drive and doing automated incremental backups to a second backup drive.
      Edit: Personally I'm more concerned about backing up my data than its availability (as tkaiser loves to point out in RAID threads)... and I don't run any form of RAID at all.

      The easiest way to do that is to set up your "main" drive (Drive A); this will house all your data for services, etc. Then set up a "backup" drive (Drive B) and create a simple rsync job in the web UI to back up Drive A to Drive B. Getting the timing right can be tricky.

      I run several rsync jobs automatically. One job backs up to a 2nd drive in my NAS and runs every 8 hours. I personally leave the delete trigger off; that way, if something is accidentally deleted, it stays on my 2nd drive. Once or twice a month I'll log in, enable the delete trigger, and run the job manually to bring Drive A and Drive B completely in sync.

      Every day at 3am, a job runs that backs up my NAS's "Drive B" to a remote single-drive OMV box at my parents' house. This job has the delete trigger enabled at all times, so it is always in sync with "Drive B".

      I have another job that runs at 4am and backs up just the movies/TV shows from my mom's box to yet another remote single-drive OMV box at my sister's. That one also has a simple SMB share, and she uses an Android TV box and Kodi to watch whatever she or her kids want.

      It was simple to set up, and once it's up it pretty much runs without any input from me.
      Air Conditioners are a lot like PC's... They work great until you open Windows.
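      The rsync scheme described above can be sketched as cron entries (the paths, hosts, and user names below are made-up placeholders; in OMV you would normally create these as scheduled Rsync jobs in the web UI rather than editing a crontab by hand):

```shell
# Every 8 hours: Drive A -> Drive B, WITHOUT --delete, so files
# accidentally deleted on A survive on B until the next manual full sync.
0 */8 * * * rsync -a /srv/driveA/ /srv/driveB/

# Manual "delete trigger" run, done once or twice a month to bring
# B completely in sync with A:
#   rsync -a --delete /srv/driveA/ /srv/driveB/

# Daily at 3am: Drive B -> remote OMV box, with --delete always on,
# so the remote copy mirrors Drive B exactly.
0 3 * * * rsync -a --delete -e ssh /srv/driveB/ backup@parents-omv:/srv/backup/

# Daily at 4am: media only, pushed on to a second remote box
# (in the post above this job actually runs on the first remote box).
0 4 * * * rsync -a --delete -e ssh /srv/driveB/media/ backup@sister-omv:/srv/media/
```

      Leaving `--delete` off the first job is what makes it behave like a crude "recycle bin": nothing disappears from the backup until you explicitly run the full sync.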

    • KM0201 wrote:

      The easiest way to do that is to set up your "main" drive (Drive A); this will house all your data for services, etc. Then set up a "backup" drive (Drive B) and create a simple rsync job in the web UI to back up Drive A to Drive B.
      So what would happen when I want to expand and add a drive for more storage? Would I need to buy 2 drives instead of 1, and have one as the backup drive and the other as a second main drive?
    • I guess it just depends on the amount of data you have. As long as you have an amount of backup space equal to your data space, you should have no problem coming up with a backup scenario.

      As for adding drives, I don't run RAID, mergerfs, etc., so if I needed to add a drive, I'd install it just like any other drive and then work it into my backup scenario somehow. I'm only dealing with about 3-4 TB of data, though.

    • Gershy13 wrote:

      So what would happen when I want to expand and add a drive for more storage?
      Once you define the meaning of 'redundancy', your options might grow.

      Usually all this redundancy babbling is there to justify something really stupid: parity RAID 5 with large drives. As soon as you define what you want from your storage setup, it gets somewhat easy:

      • data availability (that's this parity RAID thing)
      • data safety (backup)
      • data integrity (modern filesystems or Snapraid)
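      The third item can be had today without any RAID at all: a checksumming filesystem detects silent corruption that a plain ext4 volume would serve back without complaint. A minimal btrfs sketch (device name and mountpoint are placeholders):

```shell
# Single-disk btrfs: every block is checksummed on write.
mkfs.btrfs /dev/sdb
mount /dev/sdb /srv/data

# A periodic scrub re-reads all data and verifies the checksums.
# With only one copy it can detect bit rot but not repair it;
# with a btrfs raid1 profile it would repair from the good copy.
btrfs scrub start /srv/data
btrfs scrub status /srv/data
```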
      Data availability is an important thing for business continuity. If you're using only online storage for your backup, it may take a while to restore it. And having both a local and a remote backup is a pain to implement at home. I know that the right strategy is a local backup and a remote backup and a disaster recovery site and clusters and archiving and bla bla bla :) but it's not easy to implement at home and it costs money.

      I think that deploying such a complex strategy at home is not a good idea for most people. You have to define the strategy, implement it, monitor it... It will work for 2 or 3 months, and then something goes wrong and it's never fixed, because of lack of monitoring, alerting, time...

      Since a remote backup is mandatory anyway (because of theft, fire, ...), why wouldn't RAID 5 (or 6 for a large array) with specialized NAS disks be a right strategy to ensure both safety and availability? You monitor the SMART status of your disks to guess whether one should be replaced as a precaution, and if you accidentally delete a file on your system, you just get it back from your remote backup. Thus, you save yourself the local backup (time, money, annoyances, ...).
      Very easy to implement: mdadm for the RAID part, borg for the remote part, and you keep a KISS approach.

      Otherwise, to answer the OP's question: if you really want to begin a RAID 5 with two disks, I think it's possible with mdadm to build a degraded array. But be warned that you are at risk, because a disk may fail. So either you buy a new one quickly and pray in the meantime, or you have a good backup!
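      A sketch of both parts of that suggestion (all device names and the repo path are placeholders, and the mdadm trick shown is one way to read "degraded array": declare three members but supply only two):

```shell
# Two disks today, real RAID 5 later: create a 3-device RAID 5 with
# one member marked "missing". The array runs degraded (no redundancy!)
# until the third disk arrives.
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb missing

# When the third disk is bought, add it and let the array rebuild:
mdadm --add /dev/md0 /dev/sdc

# Remote part with borg: encrypted, deduplicated, incremental backups.
borg init --encryption=repokey backup@remote:/srv/borg-repo
borg create backup@remote:/srv/borg-repo::'{hostname}-{now}' /srv/data
```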


    • BLaurent wrote:

      why wouldn't RAID 5 (or 6 for a large array) with NAS disks be a right strategy to ensure both safety and availability?
      Because it doesn't work with the large drives NAS users want to use today. Rebuilds take days to weeks (if the array is in use), and if another drive then dies or just develops a bad sector, your whole array is gone -- nice example: 8Tb RAID 5 array is Clean, FAILED. Pls help before I do something to make things worse.

      Adding to this: the parity RAID write hole still exists. That's not a problem if you run server-grade hardware with a UPS, or assembled your RAID manually with a log device, but the OMV users playing with RAID usually aren't aware of it and therefore risk data corruption while rebuilding. And data corruption is an issue with mdraid in general during/after rebuilds. I know people who 'successfully' rebuilt their mdraid after several days, just to realize that the subsequent fsck of the ext4 on top of it moved the vast majority of their data into the bin.

      This whole RAID 5 thing in 2019 is for hobbyists who want the good feeling of having 'done something for redundancy'. It's wasting a disk for nothing.

      If I wanted to use some sort of parity RAID, I would opt for double redundancy and a more intelligent approach (that's RAIDz2, or btrfs raid56 created with '-d raid6 -m raid1').
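      The double-redundancy variants mentioned could be created roughly like this, assuming four placeholder disks (sdb-sde); metadata stays raid1 because the btrfs raid56 code was not considered trustworthy for metadata:

```shell
# btrfs: double-parity data, mirrored metadata.
mkfs.btrfs -d raid6 -m raid1 /dev/sdb /dev/sdc /dev/sdd /dev/sde
mount /dev/sdb /srv/pool    # mounting any member brings up the whole pool

# The ZFS equivalent: a RAIDz2 vdev survives any two disk failures.
zpool create tank raidz2 sdb sdc sdd sde
```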
    • Agreed, there is a chance that the RAID never rebuilds (but you can mitigate this with NAS disks, RAID 6, ...).

      But I was speaking in terms of data availability and business continuity:
      - with a non-RAID solution, you lose time before the backup is restored. Said another way: you stop working and restore, and only then can you continue to work;
      - with RAID, you may continue to work: the spare disk is brought in by the system and the RAID tries to rebuild the array. Everything is transparent; you can keep working and plan a backup restoration only if the rebuild fails.

      The main difference is that with non-RAID the backup restoration cannot be planned, while with RAID you may plan it. And possibly avoid it.

      About the thread 8Tb RAID 5 array is Clean, FAILED. Pls help before I do something to make things worse., I understood that two disks failed: one during normal operation and one during the rebuild. No chance on a 5x2TB RAID 5. In this configuration, the rebuild success chance is 94% with 10^-15 error-rate disks. Moreover, if you look at the disk sizes, they are exactly the same on all disks of the array. That can indicate that the disks are from the same model family and same device model, and probably (this is simply guessing) were bought the same day, so they come from the same factory series. That's the worst case for a safe rebuild!
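      The quoted 94% can be reproduced from the unrecoverable read error (URE) rate: rebuilding a 5x2TB RAID 5 means reading the four surviving disks, i.e. 8 TB = 6.4x10^13 bits, without hitting a single URE:

```latex
P(\text{rebuild succeeds}) = \left(1 - 10^{-15}\right)^{6.4\times 10^{13}}
\approx e^{-6.4\times 10^{13}\,\cdot\,10^{-15}} = e^{-0.064} \approx 0.94
```

      Note that with consumer-class 10^-14 disks the same figure drops to e^{-0.64}, roughly 53%, which is why the URE class of the disks matters so much in this argument.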

      And RAID does not remove the need for backups! I was speaking about RAID AND remote backup, to ensure both availability and safety!


    • BLaurent wrote:

      with a non-RAID solution, you lose time before the backup is restored
      Not me. If data on a NAS is valuable, then at least one other local copy exists, and all that's needed is accessing another host. I also no longer differentiate that much between 'backup', snapshot, and sync, since it's 2019 and we don't need to differentiate that way any more. We have powerful choices like ZFS or btrfs that allow us to sync snapshots between NAS boxes, giving us permanent backups and access to those backups without lengthy restore procedures. With the amount of data even small companies have today, a 'restore from tape/cloud' takes way too long anyway.
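      The snapshot syncing alluded to here looks roughly like this on btrfs (paths, snapshot names, and the host are placeholders; ZFS offers the analogous zfs send/receive):

```shell
# Take a read-only snapshot of the data subvolume.
btrfs subvolume snapshot -r /srv/data /srv/.snap/data-2019-06-01

# First run: ship the full snapshot to the backup box.
btrfs send /srv/.snap/data-2019-06-01 | ssh backupbox btrfs receive /backup

# Later runs: ship only the delta against the previous snapshot.
btrfs subvolume snapshot -r /srv/data /srv/.snap/data-2019-06-02
btrfs send -p /srv/.snap/data-2019-06-01 /srv/.snap/data-2019-06-02 \
    | ssh backupbox btrfs receive /backup
```

      The received snapshots are ordinary read-only subvolumes on the backup box, so "restoring" is just browsing them; no lengthy restore procedure is involved.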

      BLaurent wrote:

      with RAID, you may continue to work
      If (md)raid worked as advertised, I'd be with you. But I have been dealing with 'RAID failing again' for more than two decades now, and I'm not that confident that it simply works, especially given the hardware the average OMV user wants to play RAID on, and given that the majority of users skip backups because of 'redundancy/RAID protects the data' (or other misunderstandings).
    • BLaurent wrote:

      In this configuration, the rebuild chance is 94% with 10^-15 error rate disks.
      This assumes that disk failures are completely independent. Since in reality there is at least one coupling mode in most setups, namely the age and usage time of the disks, not even talking about temperature and vibration, the probability of a disk failing during a rebuild increases significantly.
      In arrays that don't face heavy usage, there is also a good chance that another disk is already broken but simply hadn't been detected yet. I have personally witnessed many disk failures during rebuilds. Statistics with coupling are far more complex, since you need the exact differential coupling equations, which is hard most of the time. That is the sole reason the coupling usually gets neglected and overly optimistic statistics are presented.