What is RAID 5?
RAID 5 is a redundant array of independent disks configuration that uses disk striping with parity. Data and parity are separated into blocks and striped evenly across three or more disks in the array, so no single disk is a bottleneck.
RAID 5 calculates parity information for all stored data to increase the array's fault tolerance. It is this parity, not striping by itself, that enables data to be reconstructed if one of the drives fails.
How RAID 5 works
RAID 5 uses both disk striping and parity. Striping is the process of storing consecutive segments of data across different storage devices; it enables better throughput and performance. At least three devices or drives are used in a RAID 5 array to stripe data. These drives function as a single logical device or data volume. The total usable capacity of the array is the sum of the capacities of all the drives in the array minus the capacity of one drive, because one drive's worth of space is consumed by parity information. For example, the usable capacity of an array with five 1 terabyte drives is 4 TB.
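The capacity rule above can be expressed as a short calculation. The following Python sketch is illustrative only: the function name `raid5_usable_capacity` is hypothetical, and it assumes that every member drive contributes only as much space as the smallest drive in the array, which may not hold for every implementation.

```python
def raid5_usable_capacity(drive_sizes_tb):
    """Return the usable capacity of a RAID 5 array in terabytes.

    Assumption: each drive contributes only as much space as the smallest
    member, and one drive's worth of space is consumed by parity.
    """
    if len(drive_sizes_tb) < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (len(drive_sizes_tb) - 1) * min(drive_sizes_tb)


# Five 1 TB drives, as in the example above.
print(raid5_usable_capacity([1, 1, 1, 1, 1]))  # 4
```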
In RAID 5, no single drive contains a complete copy of the data; each disk holds only part of it. Disk striping by itself provides neither redundancy nor fault tolerance. It is the combination of striping with parity that gives RAID 5 its redundancy and reliability. For redundancy, RAID 5 uses parity instead of mirroring -- a technique used in RAID 1. When data is written to a RAID 5 array, the system calculates parity for the stripe and writes that parity block to one of the drives.
While mirroring maintains multiple copies of the data so that it remains available even after a drive failure, RAID 5 can rebuild a failed drive using the parity data, which is not kept on a single fixed drive. In a three-drive array, for example, the contents of any two drives can be combined to recreate the data stored on the third, keeping data protected in case of a single drive failure.
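To make the rebuild idea concrete, here is a minimal Python sketch of one stripe on a three-drive array. It assumes parity is the byte-wise XOR of the data blocks in the stripe; the helper name `xor_blocks` and the sample block contents are invented for illustration and do not reflect any particular controller's on-disk format.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a three-drive array: two data blocks plus one parity block.
data_a = b"RAID"
data_b = b"five"
parity = xor_blocks([data_a, data_b])

# Simulate losing the drive that held data_b and rebuild it from the survivors.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)  # True
```

The same call rebuilds whichever block is missing, because XOR-ing all the surviving blocks in a stripe always yields the lost one.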
 
RAID 5 layout
Depending on where the parity blocks are located and in which order the data blocks are written, there can be different types of RAID 5 arrays.
In an array consisting of n disks -- minimum three -- the RAID 5 layout depends on whether the data blocks are written from left to right or right to left, whether the parity block is at the beginning or end of the stripe, and where the first block of a stripe is located with respect to the parity block of the previous stripe.
Synchronicity and asynchronicity describe how the data blocks are ordered relative to the parity block; some implementations call the same layouts symmetric and asymmetric. If the data blocks are written from left to right with the parity block placed at the end of the stripe, and the first block of the next stripe is not on the same disk as the parity block of the previous stripe, the RAID 5 layout is known as left asynchronous RAID 5. On the other hand, if the first data block of the next stripe is written on the same drive as the parity block of the previous stripe, it is known as a synchronous RAID 5 layout.
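The difference between the two left layouts is easier to see when the block positions are printed. The Python sketch below is an illustration under stated assumptions: parity starts on the last disk and moves one disk to the left with each stripe, and the function name `raid5_left_layout` and the `D0`, `D1`, `P` labels are hypothetical. Actual implementations may number or rotate blocks differently.

```python
def raid5_left_layout(n_disks, n_stripes, synchronous):
    """Return one row per stripe, listing the block held by each disk."""
    rows = []
    block = 0
    for stripe in range(n_stripes):
        # Assumption: parity starts on the last disk and rotates leftward.
        parity_disk = (n_disks - 1 - stripe) % n_disks
        row = [None] * n_disks
        row[parity_disk] = "P"
        if synchronous:
            # Data starts on the disk after the parity block and wraps around.
            order = [(parity_disk + 1 + i) % n_disks for i in range(n_disks - 1)]
        else:
            # Data fills the non-parity disks from left to right.
            order = [d for d in range(n_disks) if d != parity_disk]
        for disk in order:
            row[disk] = f"D{block}"
            block += 1
        rows.append(row)
    return rows


for name, sync in (("left asynchronous", False), ("left synchronous", True)):
    print(name)
    for row in raid5_left_layout(4, 4, sync):
        print("  " + " ".join(f"{cell:>3}" for cell in row))
```

With four disks, the second stripe comes out as D3 D4 P D5 in the asynchronous layout and as D4 D5 P D3 in the synchronous one, so only in the synchronous case does D3 land on the disk that held the previous stripe's parity.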
Either way, RAID 5 uses block-level striping with parity information calculated on the full stripe and distributed among all the disks in the array. Striping improves read performance, while parity provides data redundancy and protection. Distributing the parity also eliminates the need for a dedicated parity disk.
Key features of RAID 5
RAID 5 is currently one of the most commonly used RAID methods because it offers numerous features that other configurations lack.
For one, it spreads reads and writes evenly across the drives and provides fast access to data. Furthermore, its read performance is similar to that of RAID 0 due to the use of data striping.
RAID 5 groups have a minimum of three drives and no fixed maximum, although controllers and implementations often impose a practical limit. Because only one drive's worth of capacity goes to parity, RAID 5 has more usable storage than RAID 1 (mirroring) and RAID 10 configurations built from the same number of drives. Because the parity data is spread across all drives, RAID 5 is considered a dependable RAID configuration. Also, drives can be hot-swapped in RAID 5 arrays that support it, which means a failed drive can be removed and replaced with little or no downtime.
Parity information for the stored data plays an important role in facilitating data recovery. Each stripe includes an additional parity block, and these blocks are distributed across all the drives in the RAID array. If data blocks are lost or damaged, the array applies an exclusive-OR, or XOR, operation to the remaining blocks in the stripe to reconstruct them and minimize data losses.
In sum, the key features of RAID 5 are the following:
- Disk striping.
- Parity.
- Data redundancy.
- Fault tolerance.
- Fast read performance.
- Efficient write performance.
- Balance between speed and fault tolerance.
RAID 5 advantages
Fast, reliable read speed is a major benefit of RAID 5. Considered a good all-around RAID system, RAID 5 balances storage efficiency and performance well compared with other RAID configurations. The efficiency stems from the fact that parity consumes only one drive's worth of capacity and is distributed across all the drives in the array, while the high performance is the result of using multiple drives with a data striping mechanism. Another reason for improved performance is that read/write operations on different drives can be performed concurrently.
This RAID configuration also provides higher usable capacity and more efficient capacity utilization than mirrored RAID levels, such as RAID 1 and RAID 10. Additionally, it offers inexpensive data redundancy and fault tolerance. With RAID 5, the failure of a single drive usually doesn't result in data loss. While writes tend to be slower because of the parity data calculation, RAID 5 enables users to access and read data even while a failed drive is being rebuilt.
RAID 5 drawbacks
The need to calculate parity can affect the write performance of drives in a RAID 5 array. The array requires additional time to calculate and write parity information during write operations, in which the corresponding data block and the associated parity information must be updated. This results in what is known as a write penalty, which ultimately affects write performance. A dedicated hardware RAID controller can help to mitigate this problem, although it can increase implementation complexity and cost.
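The write penalty can be illustrated with the read-modify-write sequence commonly described for small RAID 5 writes. The Python sketch below assumes XOR parity and uses hypothetical helper names (`xor_bytes`, `update_parity`) and made-up block contents; real controllers may cache data or write full stripes to avoid some of these steps.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_data, new_data, old_parity):
    """Derive the new parity without reading the other data blocks."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# A stripe with three data blocks and their parity.
blocks = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b"]
parity = xor_bytes(xor_bytes(blocks[0], blocks[1]), blocks[2])

# Overwrite the middle block. The array reads the old data and old parity,
# then writes the new data and new parity: four I/Os for one logical write.
new_block = b"\xff\x00"
new_parity = update_parity(blocks[1], new_block, parity)
blocks[1] = new_block

# The shortcut matches a parity recomputed from the whole stripe.
full_parity = xor_bytes(xor_bytes(blocks[0], blocks[1]), blocks[2])
print(new_parity == full_parity)  # True
```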
Long rebuild times are another major drawback of RAID 5. Rebuilding is a resource-intensive process whose duration depends on the number and size of drives, controller speed and the load on the array. As a result, RAID 5 rebuilds can take a while -- sometimes a full day or even longer. Also, if another disk fails during the rebuild, the data in the array is lost. Data loss may also occur if the surviving disks develop bad sectors or bad blocks during rebuilding.
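A rough, best-case feel for rebuild duration comes from dividing the size of the replacement drive by the rate at which it can be rewritten. The Python sketch below is only an estimate under assumptions: the function name `estimate_rebuild_hours`, the 8 TB drive size and the 100 MB/s sustained rebuild rate are all hypothetical, and a busy array would typically rebuild more slowly.

```python
def estimate_rebuild_hours(drive_size_tb, rebuild_rate_mb_s):
    """Best-case rebuild time: the whole replacement drive must be rewritten."""
    drive_size_mb = drive_size_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return drive_size_mb / rebuild_rate_mb_s / 3600


# Hypothetical example: an 8 TB drive rebuilt at a sustained 100 MB/s.
print(round(estimate_rebuild_hours(8, 100), 1))  # 22.2
```

Under these assumed figures the rebuild already takes close to a full day, and the estimate grows in direct proportion to drive size.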
To maintain data integrity and ensure its availability, it's important to consider the rebuild time. It's equally crucial to keep in mind that there is still a risk of multiple drive failures with RAID 5 since the array is in a vulnerable state during the rebuild. Additionally, while RAID configurations can contribute to business continuity, they are not equivalent to or a replacement for a disaster recovery setup.
If anything, uninterrupted business continuity needs both a data backup strategy and a RAID 5 configuration. Taking regular backups of critical data can help to prevent data losses that may occur due to unforeseen events other than disk failures, such as malware or ransomware infections, software corruption, power outages and natural disasters. Human errors can also result in accidental data deletions and losses.
As a redundancy mechanism, RAID cannot safeguard against such events. Rather, it only provides a safeguard against drive failures. For reliable prevention of data losses, it's advisable to use both RAID and data backups as part of a broader disaster recovery plan.
RAID 5 use cases
RAID 5 is ideal for application and file servers that have a limited number of drives but want greater storage performance and reliability. Furthermore, since it provides fault tolerance via striping and parity checksums, the RAID 5 mechanism is suitable for storing mission-critical data.
RAID 5 can also be a good choice in these situations:
- Multiple users need consistent access to the data stored on multiple servers.
- Access speed is less of a priority than reliable data storage and redundancy.
- A reliable storage mechanism is crucial because data losses may result in serious business consequences, like financial losses or regulatory fines.
RAID 1 vs. RAID 5
RAID 1 writes to two mirrored disk drives and can handle twice the number of reads as a single drive. Because RAID 1 offers strong read performance and maintains data integrity by keeping a complete copy of the data on each drive, it is one of the most favored RAID configurations.
However, RAID 1 also requires more drive space and offers much less usable capacity relative to the total drive capacity. This is because each additional drive holds a copy of the same data rather than new data. Also, a two-drive RAID 1 array, like RAID 5, can withstand only a single drive failure.
And, even though RAID 5 requires drive space to store the parity checksums, it still offers a higher total storage capacity than a RAID 1 array. But its weaknesses notwithstanding, RAID 1 can still be a good choice in settings where data loss is unacceptable.
RAID 5 vs. other types of RAID configurations
All RAID configurations offer benefits and drawbacks. RAID levels such as 2, 3, 4 and 7 are not as commonly used as others, specifically RAID 5, 1, 6 and 10. While RAID 3 could be considered inferior to RAID 5 because it uses a dedicated disk for parity data, which can become a bottleneck, other configurations can hold their own when compared to RAID 5.
Similar to RAID 5, RAID 6 stripes data and parity across multiple drives and offers fast reads. However, because it writes two parity blocks per stripe, RAID 6 requires a minimum of four drives rather than the three required by RAID 5, which can increase system complexity and cost. RAID 6 uses the double parity method, meaning two checksums are created for data redundancy and fault tolerance instead of just the one that's created in RAID 5. As a result, RAID 6 can withstand two drive failures and provide access to all data even while both drives are being rebuilt, so it is considered more fault-tolerant than RAID 5.
With RAID 6, writes are even slower than with RAID 5 because of the additional parity data calculation. As with RAID 5, data is still accessible while a drive is being rebuilt, but rebuilds can take a considerable amount of time. Overall, however, RAID 6 is considered a solid system and may be preferable to RAID 5 in environments where a high number of large drives are used for storage.
RAID 10 is a nested RAID configuration that combines elements of RAID 1 and RAID 0, meaning it uses both disk striping (RAID 0) and data mirroring (RAID 1) to improve read/write speeds and offer high fault tolerance. Also known as RAID 1+0, RAID 10 has a fast rebuild time, thanks to the ability to copy mirrored data to a new drive. This process can take as little as 30 minutes, depending on the drive size.
One drawback of RAID 10 is that half of all storage capacity goes to mirroring, which can speed up rebuilds but can become expensive quickly. Also, a RAID 10 setup can only withstand one drive failure within each mirrored pair of disk drives. If both drives in the same pair fail, the result is usually total data loss.
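The capacity and fault tolerance trade-offs discussed in this comparison can be lined up side by side. The Python sketch below is a simplification under assumptions: it uses equal-size drives, treats RAID 10 as striped two-way mirrors, reports only the number of failures that is always survivable regardless of which drives fail, and uses a hypothetical function name, `raid_summary`.

```python
def raid_summary(level, n_drives, drive_tb):
    """Return (usable_tb, failures_always_survived) for equal-size drives."""
    if level == 1:
        # Every drive holds the same copy of the data.
        return drive_tb, n_drives - 1
    if level == 5:
        # One drive's worth of capacity holds parity.
        return (n_drives - 1) * drive_tb, 1
    if level == 6:
        # Two drives' worth of capacity holds parity.
        return (n_drives - 2) * drive_tb, 2
    if level == 10:
        # Assumption: striped two-way mirrors, so half the capacity is usable.
        return (n_drives // 2) * drive_tb, 1
    raise ValueError(f"unsupported RAID level: {level}")


# Hypothetical arrays built from 1 TB drives.
for level, n_drives in ((1, 2), (5, 4), (6, 4), (10, 4)):
    usable, failures = raid_summary(level, n_drives, 1)
    print(f"RAID {level}: {n_drives} drives, {usable} TB usable, "
          f"survives any {failures} failure(s)")
```

With four 1 TB drives, this gives 3 TB usable for RAID 5 against 2 TB for RAID 6 and RAID 10, which is the capacity advantage that RAID 5 trades against its single-failure tolerance.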
Software vs. hardware RAID
RAID can be in the form of hardware or software depending on where the processing occurs.
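Software RAID is a form of RAID in which the host server's operating system and CPU perform the RAID processing. Because it shares the server's resources with other workloads, software RAID can be slower than hardware RAID, which offloads the processing to a dedicated controller. However, because hardware RAID requires purchasing additional hardware, software RAID costs less.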
 
Trends and future directions
Despite the numerous configurations available, RAID is an aging technology that is facing off with new technologies in the storage space, such as erasure coding. However, many vendors do still use RAID to supplement technologies, like solid-state drives, to provide users with the benefits of data redundancy and fault tolerance. Until a more reliable form of data redundancy becomes available, RAID will likely continue to have a place in the storage market.
While RAID 5 remains popular, other RAID schemes also have their selling points. For example, the ability of RAID 6 to withstand two drives failing makes it an appealing option, and disk vendors are recommending RAID 6 and 10 for larger workloads. Standard Serial Advanced Technology Attachment (SATA) drives are often considered a poor fit for RAID 5 because a read error on a surviving drive during a rebuild can prevent administrators from completing the rebuild after a failure.
When considering the future of RAID 5, storage capacity growth is an important factor to keep in mind. As drive sizes increase, RAID 5 rebuild times will rise, increasing the risk that another drive may fail and lead to irrecoverable data losses. Also, an increase in storage density that isn't met by better performance results in a lengthy rebuild. With so many variations of RAID available to fix the mistakes of earlier configurations, better options are likely to appear down the road.