Failover is the ability of a surviving node in a cluster to assume the
responsibilities of a failed node. Although failover doesn't directly address the issue
of the reliability of the underlying hardware, automated failover can reduce the
downtime from hardware failure.
The concept is very simple: a combination of software and hardware "watches" over
the cluster. Typically, this monitoring is done by regularly checking a heartbeat,
which is a message sent between machines in the cluster. If Machine A fails, Machine
B will detect the failure through the loss of the heartbeat and will execute scripts to
take over control of the disks, assume Machine A's network address, and restart the
processes that failed with Machine A. From an Oracle database perspective, the
entire set of events is identical to an instance crash followed by an instance recovery.
The instance uses the control files, redo log files, and database files to perform crash
recovery. The fact that the instance is now running on another machine is irrelevant;
the various Oracle files on disk are the key.
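
The heartbeat-and-takeover sequence described above can be sketched in a few lines of Python. This is only an illustrative simulation, not the mechanism of any particular cluster product: the class name HeartbeatMonitor, the 3-second timeout, and the takeover steps (which merely record a message) are all assumptions made for the example. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class HeartbeatMonitor:
    """Hypothetical monitor running on Machine B that watches Machine A."""

    timeout: float                             # seconds of silence tolerated (assumed value)
    takeover_steps: List[Callable[[], None]]   # scripts to run when Machine A is declared failed
    last_seen: Optional[float] = None          # time of the most recent heartbeat
    failed_over: bool = False                  # True once takeover has been executed

    def record_heartbeat(self, now: float) -> None:
        """Note that a heartbeat message arrived from Machine A."""
        self.last_seen = now

    def check(self, now: float) -> bool:
        """Run the takeover steps if the heartbeat has been lost.

        Returns True only for the check that actually triggers failover.
        """
        if self.failed_over or self.last_seen is None:
            return False
        if now - self.last_seen <= self.timeout:
            return False
        for step in self.takeover_steps:
            step()
        self.failed_over = True
        return True


def simulate() -> List[str]:
    """Machine A sends three heartbeats and then goes silent."""
    log: List[str] = []
    steps = [
        lambda: log.append("take over control of the disks"),
        lambda: log.append("assume Machine A's network address"),
        lambda: log.append("restart the failed processes"),
    ]
    monitor = HeartbeatMonitor(timeout=3.0, takeover_steps=steps)

    for t in (0.0, 1.0, 2.0):      # Machine A is healthy
        monitor.record_heartbeat(t)
        monitor.check(t)

    monitor.check(4.0)             # 2 seconds of silence: still within the timeout
    monitor.check(6.0)             # 4 seconds of silence: failover is triggered
    monitor.check(7.0)             # already failed over: nothing happens again
    return log


if __name__ == "__main__":
    for line in simulate():
        print(line)
```

In this sketch the last step stands in for restarting the Oracle instance on Machine B, at which point the instance would perform crash recovery from the control files, redo log files, and database files on the shared disks.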