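Today, I will be talking specifically about RAID 5, since it is one of the most widely used RAID levels (although it is slowly being replaced by RAID 6) and offers a good balance between usable space, read speed, and protection against a single disk failure.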
Image from Wikipedia.
First things first, let's start with how storage space works in RAID 5. Let's assume we have 4 disks for this example, and our disks are 2 TB each. In this case, we would have 6 TB of usable space in this array, which is calculated by the formula X * ( Y - 1 ), where X is the capacity of the smallest disk in the array, and Y is the number of disks. A minimum of 3 disks is needed for a RAID 5 array.
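To make the formula concrete, here is a minimal Python sketch of the capacity calculation. The function name raid5_usable_capacity is just something I made up for this example; it is not part of any RAID tool.

```python
def raid5_usable_capacity(disk_sizes_tb):
    """Usable space of a RAID 5 array: smallest disk * (number of disks - 1)."""
    if len(disk_sizes_tb) < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    # Every disk only contributes as much as the smallest one,
    # and one disk's worth of space goes to parity.
    return min(disk_sizes_tb) * (len(disk_sizes_tb) - 1)


# The example from the text: four 2 TB disks.
print(raid5_usable_capacity([2, 2, 2, 2]))  # 6
```

Note that mixing disk sizes wastes space: with disks of 2, 2 and 4 TB, the same function gives 4 TB, because the extra 2 TB on the largest disk cannot be used.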
As you can see in the picture, the data is striped across the 4 disks, which allows data on different disks to be read simultaneously. This is one of the upsides of a RAID 5 array.
With all that being said, now we can talk about the main point, which is RAID 5 data recovery. In RAID 5, the data is striped across multiple disks like we talked about before, and a parity block is added to every stripe. The parity blocks are not kept on one dedicated disk; they are spread across all the disks in the array. Together they take up one disk's worth of space, which is the reason why we see a decrease in total storage space in the array. For those who don't know what parity is, I will explain really quickly.
Parity in RAID 5, in simplest terms, is extra data calculated by XOR-ing the data blocks of a stripe together. But how does that help to recover lost data in case one of the disks malfunctions? The answer to this is the XOR function. Using the XOR function on the blocks of the undamaged disks (the remaining data plus the parity) gives you the missing block of the damaged drive, and repeating this for every stripe reconstructs all the data on the damaged disk. Since we brought that up, let's also see how XOR works.
XOR (exclusive or) compares two bits: if they are different, the result is 1, and if they are the same, the result is 0. Therefore 1 xor 1 would be 0, 0 xor 0 would be 0, 1 xor 0 would be 1, and 0 xor 1 would be 1.
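If you want to check these four cases yourself, here is a short Python sketch. Python's ^ operator performs XOR on integers; the helper name xor_truth_table is only made up for this example.

```python
def xor_truth_table():
    """Return every (a, b, a xor b) combination for single bits."""
    return [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)]


for a, b, result in xor_truth_table():
    print(a, "xor", b, "=", result)
```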
Since we know how XOR works now, let's see an example.
Drive 1: 010111001
Drive 2: 110011100
Drive 3: 100100101 is found by using the XOR function on Drive 1 and Drive 2, bit by bit. This means the damaged drive's block is 100100101, and repeating this for every block reconstructs the data on the disk. The same method works no matter which disk fails, given that only ONE disk in the array is damaged. If more than one disk is damaged at the same time, there is not enough information left for the XOR function to work, which means RAID 5 cannot recover the data.
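The example above can also be worked through in a few lines of Python. This is only a sketch on 9-bit strings, with a made-up helper called xor_bits; a real controller does the same thing on whole blocks of data.

```python
def xor_bits(*blocks):
    """XOR any number of equal-length bit strings together."""
    width = len(blocks[0])
    result = 0
    for block in blocks:
        result ^= int(block, 2)
    # Pad with leading zeros so the result keeps the original width.
    return format(result, f"0{width}b")


drive1 = "010111001"
drive2 = "110011100"

# Drive 3 is the one we lost; XOR of the two survivors rebuilds it.
drive3 = xor_bits(drive1, drive2)
print("Drive 3:", drive3)  # Drive 3: 100100101

# It works the same way for any single missing drive.
print("Drive 1:", xor_bits(drive2, drive3))  # Drive 1: 010111001
```

Whichever single drive is missing, XOR-ing the other two brings it back. With two drives missing, only one string is left and there is nothing to XOR it with.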
Of course, there are billions of numbers and calculations being made in a real-life case, and this is just a stripped-down example in order to learn how RAID 5 data recovery works.
From what we have talked about so far, you can see that RAID 5 has a big advantage when it comes to data recovery, and it is also a good system if the disks are read more often than they are written to (thanks to the simultaneous reads I mentioned earlier).
RAID 5 also comes with its disadvantages like everything else, and that's no surprise. Writing to a RAID 5 system is slower than reading from it, since the parity has to be recalculated and rewritten every time data changes. Therefore, RAID 5 isn't a great fit for places where the disks are constantly being written to. The parity calculations can also take up some extra memory and processing power, and you lose one disk's worth of your storage space.
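To see where the extra work on writes comes from, here is a rough Python sketch of one common way to update parity after a small write, often called read-modify-write. This is an assumption for illustration only: the function name update_parity is made up, and real controllers may handle writes differently.

```python
def update_parity(old_parity, old_data, new_data):
    """New parity after one data block changes (read-modify-write).

    XOR-ing out the old data and XOR-ing in the new data gives the same
    result as recalculating the parity from every disk in the stripe.
    """
    return old_parity ^ old_data ^ new_data


# The blocks from the earlier example, written as binary numbers.
drive1 = 0b010111001
drive2 = 0b110011100
parity = drive1 ^ drive2

# Overwrite the block on drive 1 with new data.
new_drive1 = 0b000000111
new_parity = update_parity(parity, drive1, new_drive1)

# Same answer as XOR-ing the full stripe again.
print(new_parity == (new_drive1 ^ drive2))  # True
```

Even with this shortcut, one small write turns into two reads (old data, old parity) and two writes (new data, new parity), which is why writes cost more than reads.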
All in all, I believe RAID 5 data recovery is one of the most miraculous things to happen in computer history, since everything is stored on computers in this age, and losing important data could be a matter of life and death for a company or a business owner. I'm sure many of you have heard the term RAID and wondered how it works, so I hope this article helped you understand it better!