What is RAID?
RAID (Redundant Array of Inexpensive Disks) is a method of
getting two or more hard disks to work together as one more
efficient hard disk. The combined disk also provides fault
tolerance, reducing the risk of losing information if a disk
fails. The group of hard disks used together is called a 'disk
array'. The operating system and software see all the hard
disks as a single disk.
RAID is used not only to increase data storage capacity, but
also for economic reasons. The greater the capacity of a hard
disk, the more expensive it is, so RAID is used for operations
that need a lot of storage space, such as a database server.
Compared with using a single large-capacity hard disk, called
SLED (Single Large Expensive Disk), RAID may be more
economical.
Data Striping
What is Data Striping?
Data Striping is a way of splitting data into sections and
storing them across multiple hard disks. Striping is used to
increase the efficiency of reading and writing in the disk
array. Because the hard disks work in parallel, a file can be
read or written faster than with a single hard disk.
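As a rough illustration of striping, the sketch below deals fixed-size
chunks of data out to a set of disks in round-robin order and then reads
them back. The function names `stripe` and `unstripe`, the use of Python
lists to stand in for disks, and the chunk size are all assumptions made
for this example, not part of any real RAID implementation.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list[list[bytes]]:
    """Split data into chunks and deal them out to the disks in round-robin order."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        # Chunk 0 goes to disk 0, chunk 1 to disk 1, and so on, wrapping around.
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Rebuild the original data by reading the chunks back row by row."""
    chunks = []
    rows = max(len(disk) for disk in disks)
    for row in range(rows):
        for disk in disks:
            if row < len(disk):  # the last row may not reach every disk
                chunks.append(disk[row])
    return b"".join(chunks)


disks = stripe(b"ABCDEFGHIJ", num_disks=3, chunk_size=2)
# disks[0] holds AB and GH, disks[1] holds CD and IJ, disks[2] holds EF
restored = unstripe(disks)
```

In a real array the chunks in one row could be transferred at the same
time, which is where the speed gain comes from; this sketch only shows
where each chunk would be placed.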
Below are the different types of RAID. Each has its own
capabilities and is used for different purposes.
RAID 0
RAID 0 links more than one hard disk together in a
non-redundant form. The purpose of RAID 0 is to increase
reading/writing speed, without keeping any backup of the data.
So if any one of the hard disks fails, all of the data becomes
inaccessible. From the picture above, we can see that the data
is divided and stored on different hard disks. The larger the
number of hard disks in the array, the less time is needed to
read or write data. Theoretically, if the disk array is
composed of N disks, the reading/writing speed increases N
times. In reality the gain may be smaller, because the speed
depends on many factors, such as the RAID controller and
differences in the speeds of the individual hard disks.
The advantage of RAID 0 is its fast data processing. Its weak
point is that if any one of the hard disks fails, the whole
array is affected and its data becomes inaccessible.
RAID 1
RAID 1 (also called 'disk mirroring') consists of two hard
disks that hold exactly the same data. In a way, it is similar
to keeping a backup of the data. If either hard disk fails, the
system can still access the data from the other hard disk. With
a well-designed RAID controller, writing to two hard disks at
the same time takes about as much time as writing to only one.
Reading time is reduced, because the RAID controller can read
from either one of the hard disks.
The strength of RAID 1 is the safety of the data, rather than
efficiency or speed as in RAID 0, although RAID 1 does increase
the data reading speed.
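The mirroring behaviour can be sketched in a few lines. The class below
is a toy model only: `MirroredPair`, its methods, and the use of
dictionaries as disks are hypothetical names chosen for this example.
Every write goes to both disks, and a read is served by whichever disk
is still working.

```python
class MirroredPair:
    """Toy model of two mirrored disks; each disk is a dict of block -> data."""

    def __init__(self) -> None:
        self.disks: list[dict[int, bytes]] = [{}, {}]
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # The same data is written to every disk that is still working.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block] = data

    def fail(self, index: int) -> None:
        # Simulate a disk failure: its contents are gone.
        self.failed[index] = True
        self.disks[index].clear()

    def read(self, block: int) -> bytes:
        # Any surviving disk can serve the read.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block]
        raise IOError("both disks have failed")


mirror = MirroredPair()
mirror.write(0, b"payroll")
mirror.fail(0)                    # lose the first disk
data_after_failure = mirror.read(0)   # still served by the second disk
```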
RAID 2
In RAID 2, the data is divided at bit level and stored across
multiple disks. Dedicated hard disks are used for Error
Checking and Correcting (ECC) codes, which lessens the chance
of damaging or losing data. When data is sent to be stored in
the disk array, the ECC disks record the ECC codes for it. If
any one of the hard disks fails, the system can reconstruct all
the data on that hard disk, using the data from the other hard
disks and the ECC codes. ECC makes the hard disk system work
quite hard, and several disks are needed just to store the ECC
codes, which makes RAID 2 not very economical.
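To give a feel for how ECC can repair data, here is a small sketch using
a Hamming(7,4) code, which is the kind of code usually associated with
RAID 2. This is an assumption for illustration: the text above does not
name a specific code, and the function names are made up for this
example. Four data bits get three check bits, so in a RAID 2 style
layout each of the seven bits would sit on its own disk, three of them
holding only ECC.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits as 7 bits laid out as p1, p2, d1, p3, d2, d3, d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(code_bits: list[int]) -> list[int]:
    """Fix at most one flipped bit and return the 4 data bits."""
    c = list(code_bits)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three check results, read as a binary number, give the
    # 1-based position of the bad bit (0 means no error was detected).
    bad_position = s1 + 2 * s2 + 4 * s3
    if bad_position:
        c[bad_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1                      # one "disk" returns a wrong bit
recovered = hamming74_correct(damaged)
```

The three extra bits for every four data bits show why the scheme is
costly in disks.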
RAID 3
RAID 3 is similar to RAID 2, but instead of dividing data at
bit level as in RAID 2, RAID 3 divides data at byte level.
Instead of using ECC, RAID 3 uses parity, which gives RAID 3
more efficient data reading/writing. This is because the data
is striped across the hard disks and only one hard disk is used
for storing parity. But if RAID 3 is used where only small
amounts of data are transferred, a problem called a
'bottleneck' is encountered. The bottleneck arises because RAID
3 has to distribute data across all the hard disks, and time is
wasted because parity must be created no matter how big or
small the data is. With a small piece of data, the data itself
can be stored long before the parity is formed. The whole
system must then wait until the parity has been created before
it can continue working.
RAID 3 is suitable for transferring large amounts of data, as
in video editing.
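The way parity protects data can be shown with a short sketch. It
assumes the parity is a bytewise XOR of the data disks, which is the
usual choice but is not stated in the text above; the helper name
`xor_blocks` and the sample bytes are made up for this example. XOR-ing
the surviving disks with the parity disk gives back the contents of the
lost disk.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks, each holding two bytes of the stripe.
data_disks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]

# The dedicated parity disk stores the XOR of all data disks.
parity = xor_blocks(data_disks)

# Suppose data disk 1 fails: XOR the survivors with the parity to rebuild it.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)
```

Note that every write, however small, changes the parity block, which is
the source of the bottleneck on the single parity disk.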
RAID 4
RAID 4 is very similar to RAID 3, except that data is divided
at block level instead of bit or byte level. This makes random
access to data faster. However, the bottleneck problem can
still be encountered.
RAID 5
RAID 5 divides data at block level, as in RAID 4. But in RAID
5 the parity is not stored on a single disk; it is distributed
throughout the entire array, interleaved with ordinary data.
This reduces the bottleneck problem, which is a major problem
in RAID 3 and RAID 4. Another interesting feature of RAID 5 is
Hot-Swap technology. Hot swapping means replacing a hard disk
when a problem occurs, while the system is still operating.
RAID 5 is appropriate for servers and workstations.
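The idea of spreading parity across the array can be sketched as a
layout table. The rotation rule used here (parity starts on the last
disk and moves one disk to the left on each stripe) is only one possible
scheme and is an assumption for this example, as are the function name
`raid5_layout` and the labels `D` for data blocks and `P` for parity
blocks.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Return one row per stripe, showing what each disk stores in it."""
    layout = []
    block = 0
    for stripe in range(num_stripes):
        # Rotate the parity position so that no single disk holds all parity.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{block}")
                block += 1
        layout.append(row)
    return layout


for row in raid5_layout(num_disks=4, num_stripes=4):
    print("  ".join(f"{cell:>3}" for cell in row))
```

With four disks and four stripes, each disk ends up holding exactly one
parity block, so parity writes are shared between the disks instead of
queuing on one of them.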
RAID 6
RAID 6 uses the basic operation of RAID 5, but adds one more
parity block, which lets the array survive two disk failures
and hot swap two disks at the same time. (RAID 5 can hot swap
only one disk at a time; if two hard disks fail at the same
time, the whole array fails.) The extra parity increases the
system's fault tolerance. RAID 6 is usually used for tasks that
require very high data security and stability.
RAID 7
RAID 7 uses the basic operation of RAID 4 and adds a few more
features that enable each hard disk to operate independently,
preventing the bottleneck problem that is common in RAID 4.
Every data transfer goes through the X-bus, a high-speed bus.
RAID 7 also contains several levels of cache memory in the RAID
controller to allow the disks to function independently. A
real-time operating system is included in the array control
processor and controls the data transfer on the bus.
RAID 7 is suitable for large organizations. It can connect up
to 12 hosts with 48 drives. The price of RAID 7 is rather high
because it is licensed by Storage Computer Corporation, and
users of RAID 7 cannot make any adjustments to the system at
all. This makes RAID 7 not very popular among users.
RAID 10
RAID 10, or RAID 1+0, combines RAID 0 and RAID 1, which makes
access to data faster while also keeping a backup of the data.
The disadvantage of RAID 10 is that it is difficult to add
extra hard disks, because each disk has its own mirror. If we
add extra disks, mirror disks must be added as well. RAID 10 is
suitable for servers that need fast access to data and where
large capacity is not necessary.
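How the two techniques combine can be sketched as follows. The disk
numbering (pair p is made of disks 2p and 2p+1) and the function names
`raid10_locate` and `raid10_survives` are assumptions made for this
example. Blocks are striped across the mirrored pairs as in RAID 0, and
each pair keeps two copies as in RAID 1.

```python
def raid10_locate(block: int, num_pairs: int) -> tuple[int, int]:
    """Return the two physical disks that hold a given logical block."""
    pair = block % num_pairs           # RAID 0 step: stripe across the pairs
    return (2 * pair, 2 * pair + 1)    # RAID 1 step: both disks of the pair


def raid10_survives(failed_disks: set[int], num_pairs: int) -> bool:
    """The array survives as long as no pair has lost both of its disks."""
    return all(
        not {2 * pair, 2 * pair + 1} <= failed_disks
        for pair in range(num_pairs)
    )


location = raid10_locate(block=5, num_pairs=3)            # disks 4 and 5
ok_different_pairs = raid10_survives({0, 2}, num_pairs=2)  # True
ok_same_pair = raid10_survives({0, 1}, num_pairs=2)        # False
```

The sketch also shows the capacity cost: with this layout, half of the
disks hold copies, so adding usable space always means adding disks two
at a time.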
RAID 53
RAID 53 has fairly fast data access, because its basic
operation is based on RAID 0. Like RAID 3, it also has data
protection, but the bottleneck problem still exists. RAID 53
can also hot swap, as RAID 5 can.