Fault tolerance, as the name implies, allows the
overall hardware system to continue operating even if one of its components fails.
This, in turn, requires redundant components and the ability to detect a component
failure and seamlessly integrate the failed component's replacement. The major system
components that should be fault-tolerant include the following:
• Disk drives
• Disk controllers
• CPUs
• Power supplies
• Cooling fans
• Network cards
• System buses
Disk failure is the largest area of exposure for hardware failure, since disks have the
shortest mean times to failure of any of the components in a computer system. Disks
also present the greatest variety of redundant solutions, so discussing that type of
failure in detail should provide the best example of how high availability can be
implemented with hardware.
Disk redundancy
Although the mean time to failure of an individual disk drive is very high, the
ever-increasing number of disks used for today's very large databases results in more
frequent failures.
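The arithmetic behind this observation can be sketched in a few lines of Python. This is a rough illustration only: it assumes that disks fail independently at a constant rate, and the MTTF figure and disk counts are hypothetical values chosen for the example, not vendor data. The function names are likewise invented for this sketch.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def array_mttf_hours(disk_mttf_hours: float, disk_count: int) -> float:
    """Approximate mean time until the first failure among disk_count disks.

    Assumes independent disks with a constant failure rate, so the
    failure rates simply add up and the combined MTTF is MTTF / N.
    """
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    return disk_mttf_hours / disk_count


def expected_failures_per_year(disk_mttf_hours: float, disk_count: int) -> float:
    """Expected number of disk failures per year across the whole set of disks."""
    return HOURS_PER_YEAR / array_mttf_hours(disk_mttf_hours, disk_count)


if __name__ == "__main__":
    disk_mttf = 1_000_000  # hypothetical per-disk MTTF in hours
    for count in (1, 100, 1000):
        print(
            f"{count:>5} disks: first failure expected after about "
            f"{array_mttf_hours(disk_mttf, count):,.0f} hours, "
            f"{expected_failures_per_year(disk_mttf, count):.2f} failures per year"
        )
```

Under these assumptions, a single disk with an MTTF of one million hours would rarely fail, but a database spread over a thousand such disks would see several failures every year. This is why redundancy at the disk level matters more as databases grow.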