failover(7M) failover(7M)
NAME
failover - disk device alternate path support
SYNOPSIS
/etc/init.d/failover [init|start]
DESCRIPTION
Failover is the ability to define and manage alternate paths to a single
disk device or lun. At startup, failover automatically detects and
configures alternate paths to SGI Raid devices. To configure primary and
alternate paths to other more generic devices, failover also processes
configuration directives contained within the /etc/failover.conf
configuration file.
Failover works in conjunction with XLV(7m), the logical volume disk
driver, to automatically switch between primary and alternate paths.
Additionally, failover is only possible for devices which utilize
dksc(7m), the scsi disk driver.
Alternate Path Configuration
Primary and alternate paths to devices are defined by two different
mechanisms: automatic detection, and definition within a configuration
file.
Configuration of SGI Raid devices is automatic and happens at the time of
device discovery during the probing of the scsi and fibre channel buses.
Configuration of alternate paths to other disk storage devices is defined
within the /etc/failover.conf configuration file. This file is processed
during XLV startup, and when the /etc/init.d/failover script is executed.
When /etc/init.d/failover is executed with the start parameter, it
automatically calls xlv_assemble(1m). When executed with the init
parameter, the execution of xlv_assemble is skipped.
The entries defining the available paths to a device consist of a single
line which defines an arbitrary group name, a primary path, and up to
three alternate paths. The group name is an arbitrary string of up to 31
characters. Following the group name are the /dev/scsi names associated
with the primary and alternate paths, the primary being the first path
specified.
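The entry format described above can be sketched in Python. This is only
an illustration of the stated rules (a group name of up to 31 characters,
then a primary path and up to three alternates); the function name
parse_entry and its error messages are hypothetical and are not part of
failover itself.

```python
MAX_GROUP_NAME = 31   # group name limit stated in the text above
MAX_PATHS = 4         # one primary path plus up to three alternates


def parse_entry(line):
    """Split one failover.conf entry into (group, primary, alternates).

    Hypothetical helper for illustration; not the parser failover uses.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("entry needs a group name and a primary path")
    group, paths = fields[0], fields[1:]
    if len(group) > MAX_GROUP_NAME:
        raise ValueError("group name longer than 31 characters")
    if len(paths) > MAX_PATHS:
        raise ValueError("more than four paths listed")
    # The first path listed is the primary; the rest are alternates.
    return group, paths[0], paths[1:]


group, primary, alternates = parse_entry(
    "GroupA sc6d50l0 sc7d50l0 sc8d50l0 sc9d50l0")
print(group, primary, alternates)
```

Each path is a name from the /dev/scsi directory, so the sketch treats
the names as opaque strings and does not check that they exist.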
Sample Configuration Entries
The sample file shows eight failover groups, A-H, each consisting of a
primary path and one alternate path. The sample also includes the
comments placed in the sample configuration file which is installed on
the system.
#ident $Revision: 1.1 $
#
# This is the configuration file for table driven failover support.
# The entries within this file consist of a single line per
# failover grouping. These lines describe up to four paths to
# each device. Obviously, these paths should reference the
# same device! Some sanity checking is performed, but the
# software is not foolproof in this area. (If it was foolproof,
# the configuration file would be unnecessary!) It is recommended
# that all paths to a device be listed.
#
# The format of each line consists of a group name followed by
# up to four device names from the "/dev/scsi" directory. If
# it's not in "/dev/scsi", it cannot be configured. But, this
# implementation assumes that if a device is not present, its
# path has failed. Consequently, no error messages will be
# generated for missing paths. This means that you may have
# a failover group that consists of a single primary path with
# no alternates.
#
# Additionally, there is a configuration directive available that
# will cause the program to emit debug information. Placing
# "#verbose" at the start of a line, without the quotes, will
# enable the debug output. The debug output is displayed for
# all subsequent configuration directives. Once enabled, the
# debug output cannot be disabled.
#
# Lines that begin with a "#" are considered comment lines.
# A "#" anywhere within a line signals the beginning of a comment.
# White space must separate the last parameter of a line and the #.
# Blank lines are also considered a comment.
#
# --> Lines that begin with 'sc' will be skipped as it's likely a
# --> configuration error.
#
# Sample configuration. (Remember to omit the "#" at the start of
# the line.)
#
#
# # Name Pri path | alt path | alt path | alt path
#
# GroupA sc6d50l0 sc7d50l0 sc8d50l0 sc9d50l0
# GroupB sc6d51l0 sc7d51l0 sc8d51l0 sc9d51l0 # Some comment
# GroupC sc6d52l0 sc7d52l0 sc8d52l0 sc9d52l0
#
# One word of caution. The switch to an alternate path is
# choreographed by XLV. XLV is a requirement for failover to
# function.
#
# Notes:
#
# It is not possible to change the primary device of a group after
# the initial configuration by reordering the entries on the line
# and running /etc/init.d/failover start. To change the primary,
# scsifo may be used to switch to the next available path.
#
# It is not possible to remove the primary of a group by deleting
# the entry from the group and running /etc/init.d/failover start.
# A reboot is necessary.
#
# To remove the "DOWN" displayed by hinv for the failed path,
# fix the path and reprobe the bus using scsiha -p #.
#
A sc7d1l0 sc8d1l0
B sc7d1l1 sc8d1l1
C sc7d1l2 sc8d1l2
D sc7d1l3 sc8d1l3
E sc7d1l4 sc8d1l4
F sc7d1l5 sc8d1l5
G sc7d1l6 sc8d1l6
H sc7d1l7 sc8d1l7
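The comment and directive rules stated in the sample file can be sketched
as a small reader. This is a rough illustration under stated assumptions,
not the parser failover uses: the function name read_config is
hypothetical, and ignoring leading white space before the 'sc' check is
an assumption the text does not confirm.

```python
def read_config(text):
    """Return (groups, verbose) from failover.conf-style text.

    Hypothetical sketch of the rules quoted in the sample file.
    """
    groups = {}
    verbose = False
    for raw in text.splitlines():
        if raw.startswith("#verbose"):
            verbose = True          # debug output stays on once enabled
            continue
        fields = []
        for word in raw.split():
            if word.startswith("#"):
                break               # rest of the line is a comment
            fields.append(word)
        if not fields:
            continue                # blank line or comment-only line
        if fields[0].startswith("sc"):
            continue                # skipped as a likely configuration error
        # First field is the group name; the remaining fields are paths.
        groups[fields[0]] = fields[1:]
    return groups, verbose


sample = """\
#verbose
# GroupA sc6d50l0 sc7d50l0
A sc7d1l0 sc8d1l0
B sc7d1l1 sc8d1l1   # Some comment

sc7d1l2 sc8d1l2
"""
groups, verbose = read_config(sample)
print(groups, verbose)
```

Because a "#" only starts a comment when white space separates it from
the last parameter, the sketch splits on white space first and stops at
the first word that begins with "#".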
Switching to an Alternate Path
Failover is controlled by XLV. When XLV receives notification of an i/o
error, it requests failover to switch the erring device to an available
alternate path. If the path switch is successful, XLV retries the failed
i/o using the new path.
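The switch-and-retry behavior described above can be pictured with a toy
model. Everything in it is an assumption made for illustration: the
PathGroup class, the do_io function, the rule that a failed path is not
tried again, and the order in which alternates are chosen are all
hypothetical and do not describe how XLV or failover is implemented.

```python
class PathGroup:
    """Toy model of one failover group: a primary plus alternates."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.current = 0            # index 0 is the primary path
        self.failed = set()

    def switch(self):
        """Mark the current path failed and move to the next usable one."""
        self.failed.add(self.paths[self.current])
        for step in range(1, len(self.paths)):
            candidate = (self.current + step) % len(self.paths)
            if self.paths[candidate] not in self.failed:
                self.current = candidate
                return True
        return False                # no alternate path is left


def do_io(group, device_io):
    """Retry an i/o on alternate paths until it succeeds or paths run out."""
    while True:
        path = group.paths[group.current]
        try:
            return device_io(path)
        except IOError:
            if not group.switch():
                raise               # every path has failed


def fake_io(path):
    # Stand-in for a real device request; the path sc7d1l5 is "down".
    if path == "sc7d1l5":
        raise IOError("path down")
    return "ok via " + path


group = PathGroup(["sc7d1l5", "sc8d1l5"])
print(do_io(group, fake_io))
```

In this model the i/o error on the primary triggers the switch, and the
same request is then retried on the alternate path.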
The scsifo(1m) command lets the system administrator manually request a
switch to an alternate path. Although scsifo performs the switch, XLV
does not detect it until XLV receives an i/o error on the current path
because that path is no longer available. XLV then begins using the new
path.
Inventory Display
The hinv(1m) command will display the path status of primary and
alternate paths configured in the /etc/failover.conf configuration file.
The following sample hinv output reflects the above sample configuration
file. Three of the devices have failed over to the alternate path,
perhaps via the scsifo command.
Integral SCSI controller 7: Version Fibre Channel AIC-1160, revision 1
Disk drive: unit 1 on SCSI controller 7 (primary path)
Disk drive: unit 1,lun 1, on SCSI controller 7 (primary path)
Disk drive: unit 1,lun 2, on SCSI controller 7 (primary path)
Disk drive: unit 1,lun 3, on SCSI controller 7 (primary path)
Disk drive: unit 1,lun 4, on SCSI controller 7 (primary path)
Disk drive: unit 1,lun 5, on SCSI controller 7 (alternate path) DOWN
Disk drive: unit 1,lun 6, on SCSI controller 7 (alternate path) DOWN
Disk drive: unit 1,lun 7, on SCSI controller 7 (alternate path) DOWN
Integral SCSI controller 8: Version Fibre Channel AIC-1160, revision 1
Disk drive: unit 1 on SCSI controller 8 (alternate path)
Disk drive: unit 1,lun 1, on SCSI controller 8 (alternate path)
Disk drive: unit 1,lun 2, on SCSI controller 8 (alternate path)
Disk drive: unit 1,lun 3, on SCSI controller 8 (alternate path)
Disk drive: unit 1,lun 4, on SCSI controller 8 (alternate path)
Disk drive: unit 1,lun 5, on SCSI controller 8 (primary path)
Disk drive: unit 1,lun 6, on SCSI controller 8 (primary path)
Disk drive: unit 1,lun 7, on SCSI controller 8 (primary path)
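Output in the form shown above can be scanned for paths marked DOWN. The
sketch below is an illustration only: the regular expression is derived
from the sample lines in this page, the function name down_paths is
hypothetical, and the mapping from controller, unit, and lun to a name of
the form sc<controller>d<unit>l<lun> is an assumption inferred from the
sample configuration.

```python
import re

# Pattern derived from the sample hinv lines in this page.
LINE = re.compile(
    r"Disk drive: unit (\d+)(?:,\s*lun (\d+),)? on SCSI controller (\d+)"
    r"\s*\((primary|alternate) path\)(\s+DOWN)?"
)


def down_paths(hinv_text):
    """Return assumed /dev/scsi-style names for paths marked DOWN."""
    names = []
    for line in hinv_text.splitlines():
        match = LINE.search(line)
        if match and match.group(5):
            unit = match.group(1)
            lun = match.group(2) or "0"   # no lun shown is taken as lun 0
            ctlr = match.group(3)
            names.append("sc%sd%sl%s" % (ctlr, unit, lun))
    return names


sample = """\
Disk drive: unit 1 on SCSI controller 7 (primary path)
Disk drive: unit 1,lun 4, on SCSI controller 7 (primary path)
Disk drive: unit 1,lun 5, on SCSI controller 7 (alternate path) DOWN
"""
print(down_paths(sample))
```

Such a list could be used to decide which buses need attention before
the paths are brought back into service.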
The "DOWN" indicator displayed by hinv can be cleared by using the
scsiha(1m) command to reprobe the bus to which the down device is
connected, provided the device is now responding on the bus.
FILES
/etc/failover.conf
/etc/init.d/failover
/etc/init.d/xlv
SEE ALSO
dks(5m), ds(7m), hinv(1m), scsifo(1m), scsiha(1m), xlv_assemble(1m), and
xlv(7m).
NOTES
The group name specified within the /etc/failover.conf file has no
external visibility. It cannot be correlated to the group number
information displayed by the scsifo command.