I’ve touched on this topic before, but I don’t think I’ve ever done a dedicated entry on it. I came across a blog post from Marc Farley that got me thinking about it again. He talks about a leaked document from EMC that tries to educate their sales force on fighting 3PAR in the field. One of the issues raised is 3PAR’s lack of RAID 6 (never mind that this is no longer true; 2.3.1 introduced RAID 6, aka RAID DP, in early January 2010).
RAID 6 from 3PAR’s perspective was mostly just a check box, because there are customers out there with hard requirements who disregard the underlying technology and won’t even entertain a product unless it meets their own criteria.
What 3PAR did in their early days was really pretty cool: they virtualize the disks in the system, which in turn distributes the RAID across many, many disks. On larger arrays you can have well over 100,000 RAID arrays on the system. This provides a few advantages:
- Evenly distributes I/O across every available spindle
- Parity is distributed across every available spindle – no dedicated parity disks
- No dedicated hot spare spindles
- Provides a many:many relationship for RAID rebuilds
- Which gives the benefit of near zero impact to system performance while the RAID is rebuilt
- Also increases rebuild performance by orders of magnitude (depending on # of disks)
- Only data that has been written to the disk is rebuilt
- Since there are no spare spindles, only spare “space” on each disk, the system can absorb multiple disk failures before the failed disks are swapped (say 10 disks fail over a period of a month and, for whatever reason, the drives were not replaced right away). It will automatically allocate more “spare” space as long as there is available space to write to on the system. On traditional arrays you may find yourself low on or even out of hot spares after multiple disks fail, which will make you much more anxious to replace those disks than you would be on a 3PAR system (or a similarly designed one).
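To put rough numbers on the many:many rebuild and the "only written data is rebuilt" points above, here is a minimal back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function name `rebuild_hours`, the per-disk throughput figures and the disk counts are made up, not measured 3PAR values.

```python
def rebuild_hours(data_tb, per_disk_mb_s, disks_writing):
    """Rough hours to rebuild `data_tb` of data when the rebuild writes
    are spread evenly over `disks_writing` disks."""
    total_mb = data_tb * 1_000_000            # decimal TB -> MB
    aggregate_mb_s = per_disk_mb_s * disks_writing
    return total_mb / aggregate_mb_s / 3600   # seconds -> hours


# Hypothetical traditional array: the whole 2TB disk is rebuilt onto one
# hot spare that manages 30 MB/s while also serving production I/O.
traditional = rebuild_hours(data_tb=2.0, per_disk_mb_s=30, disks_writing=1)

# Hypothetical distributed array: only 1TB was actually written, and 200
# surviving disks each contribute a throttled 5 MB/s to the rebuild.
distributed = rebuild_hours(data_tb=1.0, per_disk_mb_s=5, disks_writing=200)

print(f"traditional: {traditional:.1f} h, distributed: {distributed:.2f} h")
```

The model ignores read-side limits, controller overhead and priority throttling, so treat the output as an order-of-magnitude illustration only.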
So do you need RAID 6?
To my knowledge the first person to raise this question was Robin from Storage Mojo, who a bit over three years ago wrote a blog post about how RAID 5 will have lost its usefulness in 2009. I have been following Robin for a few years (online anyways); he seems like a real smart guy, and I won’t try to dispute the math. I can certainly see how traditional RAID arrays with large SATA disks running RAID 5 are in quite a pickle, especially if there is a large data:parity ratio.
In the same article he speculates on when RAID 6 will become as “useless” as RAID 5.
I think what it all really comes down to is a couple of things:
- How fast can your storage system rebuild from a failed disk?
- For distributed RAID this is determined by the number of disks participating in the RAID arrays and the amount of load on the system, because when a disk fails one RAID array doesn’t go into degraded mode, potentially hundreds of them do, which then triggers all of the remaining disks to help in the rebuild.
- For 3PAR systems at least this is determined by how much data has actually been written to the disk.
- What is the likelihood that a 2nd disk (in the case of RAID 5) or two more disks (RAID 6) will fail during this time?
3PAR is not alone with distributed RAID. As I have mentioned before, others that I know of with similar technology include at least Compellent, Xiotech and IBM XIV. I bet there are others as well.
From what I understand of Xiotech’s technology I don’t *think* that RAID arrays can span their ISE enclosures; I think they are limited to a single enclosure (by contrast I believe a LUN can span enclosures). So, for example, if there are 30 disks in the enclosure and a disk fails, the maximum number of disks that can participate in the rebuild is 30. In reality I think the number is lower given how they build RAID on disk heads, and the number of individual RAID arrays is far fewer than with 3PAR’s chunklet-based RAID.
I’ve never managed to get in-depth info on Compellent’s or IBM XIV’s design with regards to how RAID arrays are constructed, though I haven’t tried any harder than looking at what is publicly available on their web sites.
Distributed RAID really changes the game in my opinion as far as RAID 5’s effective life span (same goes for RAID 6 of course).
Robin posted a more recent entry several months ago about the effectiveness of RAID 6. Besides one of the responders being me, another person replied with a story that made me both laugh and feel sorry for the guy, a horrific experience with RAID 6 on Solaris ZFS with Sun hardware:
Depending on your Recovery Time Objectives, RAID6 and other dual-parity schemes (e.g. ZFS RAIDZ2) are dead today. We know from hard experience.
Try 3 weeks to recover from a dual-drive failure on 8x 500GB ZFS RAIDZ2 array.
It goes like this:
– 2 drives fail
– Swap 2 drives (no hot spares on this array), start rebuild
– Rebuild-while-operating took over one week. How much longer, we don’t know because …
– 2 more drives failed 1 week into the rebuild.
– Start restore from several week old LTO-4 backup tapes. The tapes recorded during rebuild were all corrupted.
– One week later, tape restore is finished.
– Total downtime, including weekends and holidays – about 3 weeks (we’re not a 24xforever shop).
– Shipped chassis and drives back to vendor – No Trouble Found!
On any system that takes longer than, say, 48 hours to rebuild, you probably do want that extra level of protection, whether it is dual parity or maybe even triple parity (something I believe ZFS offers now?).
Add to that disk enclosure/chassis/cage (3PAR term) availability, which means you can lose an entire shelf of disks without disruption. In their S/T class systems 40 disks can go down and you’re still OK (protection against a shelf failing is the default configuration and is handled automatically; it can be disabled at the user’s request, since it does limit your RAID options based on the number of shelves you have). So not only do you need to suffer a double disk failure, but that 2nd disk has to fail:
- In a DIFFERENT drive chassis than the original disk failure
- On a disk that holds portions of RAID set(s) that were also located on the original disk that failed
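The cage-level availability idea, and why it limits RAID options by shelf count, can be sketched in a few lines. This is a toy illustration, not 3PAR's actual placement logic: the helper names `place_raid_set` and `members_lost` and the 4-cage layout are hypothetical.

```python
def place_raid_set(set_width, cages):
    """Choose one disk from each of `set_width` different cages.

    `cages` maps a cage name to the list of disks it holds. A real array
    would also balance load and capacity; this sketch just takes the
    first disk of the first `set_width` cages.
    """
    if set_width > len(cages):
        raise ValueError("cage-level availability needs one cage per set member")
    return [(name, cages[name][0]) for name in sorted(cages)[:set_width]]


def members_lost(raid_set, failed_cage):
    """Count how many members of a RAID set sit in the failed cage."""
    return sum(1 for cage, _disk in raid_set if cage == failed_cage)


# Hypothetical 4-cage system with 40 disks per cage.
cages = {f"cage{c}": [f"cage{c}-disk{d}" for d in range(40)] for c in range(4)}

raid5_3plus1 = place_raid_set(4, cages)   # 3 data + 1 parity fits in 4 cages
worst_case = max(members_lost(raid5_3plus1, name) for name in cages)
print("members lost if any single cage fails:", worst_case)
```

Because every member of a set lives in a different cage, losing a whole cage costs each set at most one member, which RAID 5 can tolerate. Asking for a wider set than there are cages (say 7+1 on this 4-cage layout) raises an error, which is the "limits your RAID options" trade-off mentioned above.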
But if you can recover from a disk failure in say 4 hours even on a 2TB disk with RAID 5, do you really need RAID 6? I don’t know what the math might look like but would be willing to bet that a system that takes 3 days to rebuild a RAID 6 volume has about as much of a chance of suffering a triple disk failure as a system that takes 4 hours (or less) to rebuild a RAID 5 array suffering a double disk failure.
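One way to start putting numbers on that bet is a simple independence model, sketched below. The assumptions are loud ones: every disk fails independently at a constant rate derived from an annual failure rate (AFR), unrecoverable read errors and correlated failures (bad batches, shared enclosures) are ignored, and the RAID 6 case does not model the second rebuild that a second failure would start. The function names and the input values are hypothetical, and `exposed_disks` (how many surviving disks could cause data loss if they fail) is something you have to estimate for your own layout.

```python
import math

HOURS_PER_YEAR = 8760


def p_disk_fails(afr, window_hours):
    """Probability that one disk fails within `window_hours`, assuming a
    constant failure rate chosen so the one-year probability equals `afr`."""
    rate_per_hour = -math.log1p(-afr) / HOURS_PER_YEAR
    return -math.expm1(-rate_per_hour * window_hours)


def p_at_least(k, n, p):
    """P(at least k of n independent disks fail), each with probability p."""
    p_fewer = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - p_fewer


def p_data_loss(parity_disks, exposed_disks, afr, rebuild_hours):
    """Chance that enough additional disks fail during the rebuild window.

    After the first failure, RAID 5 (1 parity) loses data on 1 more
    failure and RAID 6 (2 parity) on 2 more, so `parity_disks` is also
    the number of further failures needed.
    """
    p = p_disk_fails(afr, rebuild_hours)
    return p_at_least(parity_disks, exposed_disks, p)


# Hypothetical inputs: 3% AFR, 7 exposed disks in both cases.
slow_raid6 = p_data_loss(parity_disks=2, exposed_disks=7, afr=0.03, rebuild_hours=72)
fast_raid5 = p_data_loss(parity_disks=1, exposed_disks=7, afr=0.03, rebuild_hours=4)
print(f"RAID 6, 3 day rebuild: {slow_raid6:.3e}")
print(f"RAID 5, 4 hour rebuild: {fast_raid5:.3e}")
```

The result swings a lot with the AFR, the rebuild window and especially `exposed_disks`, so this is a starting point for your own numbers rather than a verdict either way.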
Think about the probability of the two bullet points above on how a 2nd drive must fail in order to cause data loss, combine that with the fast rebuild of distributed RAID, and consider whether or not you really need RAID 6. Do you want to take the I/O hit? Sure, it is an easy extra layer of protection, but you might be protecting yourself against something that is about as likely to happen as a natural disaster taking out your data center.
I mentioned to my 3PAR rep a couple of weeks ago the theory that RAID 6 with “cage level availability” has the potential to protect against two shelves of disks failing (so you can lose up to 80 disks on the big arrays) without impact. I don’t know if 3PAR went this far in engineering their RAID 6; I’ve never seen it mentioned so I suspect not, but I don’t think there is anything that would stop them from offering this level of protection, at least with RAID 6 6+2.
Myself, I speculate that on a decently sized 3PAR system (say 200-300 disks) SATA disks probably have to get to 5-8TB in size before I would really think hard about RAID 6. That won’t stop their reps from officially recommending RAID 6 with 2TB disks though.
I can certainly understand the population at large coming to the conclusion that RAID 5 is no longer useful, because probably 99.999% of the RAID arrays out there (stand alone arrays as well as arrays in servers) are not running on distributed RAID technology. So they don’t realize that another way to better protect your data is to make sure the degraded RAID arrays are rebuilt (much) faster, lowering the chance of additional disk failures occurring at the worst possible time.
It’s nice that they offer the option and let the end user decide whether or not to take advantage of it.
The XIV has similar constraints and advantages as the 3PAR system without RAID6: all drives participate in the rebuild, only written data is rebuilt, if the system isn’t full you can survive additional failures, etc. I believe a fully written system is supposed to be able to survive a module failure plus two or three additional drives. But there’s still the possibility that if you simultaneously lost the ‘right’ two drives in different module/cages you’d be in pretty big trouble; for XIV you’d lose the whole thing, for 3PAR you probably lose several LUNs at the least depending on the chunklet distribution and how full the system is. With the addition of RAID6 you can ensure certain LUNs can survive that unlikely event. Hopefully at some point XIV will add a similar option, e.g. 3-way mirroring for selected LUNs.
Unless things have changed since I last looked into them, I believe the RAID groups for Xiotech are actually constrained to the datapack level (2 datapacks per ISE shelf). LUNs can span datapacks, but the only way to protect against datapack failure is to use ISE mirroring which is what it sounds like.
The biggest drawback is that mirroring becomes a property of the datapacks rather than the LUN, so you have to define which datapacks will be used for mirroring and which will be used for unmirrored LUNs rather than letting all datapacks be available for all LUNs and having the front-end ensure that LUNs requiring datapack/ISE protection have redundancy distributed appropriately. I much prefer the 3PAR approach of defining the protection level for the LUN rather than having to set aside another drive group with a higher protection level.
Comment by Chuck — August 14, 2010 @ 2:31 pm
Wow! That is some outstanding detail, thank you for posting it! Very interesting indeed.
Comment by Nate — August 14, 2010 @ 2:35 pm
[…] They also claim to have some sort of rapid rebuild, by striping volumes over multiple RAID sets, I suppose this is less common in the markets they serve (certainly isn't possible on some of the DotHill models), this of course has been the norm for a decade or more on larger systems. Rapid rebuild to me obviously involves sub disk distributed RAID. […]
Pingback by Real time storage auto tiering? « TechOpsGuys.com — August 23, 2012 @ 9:47 am
[…] Controller failures are generally few and far between. 3PAR does not use the traditional concept of hot spares, so you can in many cases have many disks fail over a period of time and not have to worry about […]
Pingback by 3PAR: The Next Generation « TechOpsGuys.com — December 4, 2012 @ 12:41 am
[…] was told at some point that the 3PAR OS would start requiring RAID 6 on volumes that were on nearline drives at some point – perhaps that point is now(I am not sure). I […]
Pingback by 3PAR: Faster, bigger, better « TechOpsGuys.com — December 9, 2013 @ 6:00 pm