⇒ availmon(5) — IRIX 6.5.3f



availmon(5)                                                        availmon(5)



NAME
     availmon - overview of system availability monitoring facilities

DESCRIPTION
     The availability monitor (availmon) is a set of programs that
     collectively monitor and report the availability of a system and the
     diagnosis of system crashes.  For unexpected reboots, availmon identifies
     the cause of the reboot by gathering information from diagnostic programs
     such as icrash(1M), which includes results from the FRU analyzer when
     available, and syslog (see syslog(3C)), and system configuration
     information from versions(1M), hinv(1M) and gfxinfo(1G).

     Availmon can send availability and diagnostic information to various
     locations, depending on configuration; it can provide local system
     availability statistics and reboot history reporting, and it can provide
     limited site-management facilities by collecting availmon information
     from a set of systems into a single log file, and then reporting on the
     composite system availability data.  It also provides immediate
     notification of important system log messages (those logged in
     /var/adm/SYSLOG) by passing them through a syslogd(1M) filter.

     All availmon capabilities are configurable using amconfig(1M).  Availmon,
     by default, will not automatically send availmon reports on reboot.  In
     all cases, amregister(1M) must be run to enable automatic distribution of
     reports.  Apart from automatic distribution, most configurable options
     are enabled by default on high-end platforms/servers and disabled on
     low-end platforms/workstations.

     Availmon reporting centers around event records.  Any system reboot is an
     availmon event, whether a controlled shutdown or an "unscheduled" reboot,
     such as a power interruption or a "crash".  An event record contains the
     time at which the system was previously booted, which starts the event
     period, the time the event occurred, which ends the period of "uptime",
     the reason for the event, and the time that the system was rebooted.  If
     the system stopped as a result of a hang, the exact instant at which it
     stopped is not easily known; this time is estimated by a (configurable)
     ticker daemon (see amtickerd(1M)).
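
     The fields of an event record, and the uptime derived from them, can be
     sketched as follows.  This is only an illustration in Python: the class
     and field names are hypothetical, the on-disk availlog format is not
     shown, and both the downtime calculation and the fallback used when no
     tick was recorded are assumptions rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventRecord:
    """Hypothetical in-memory form of one event (not the availlog format)."""
    prev_start: int   # time the system was previously booted (seconds since 1970)
    event_time: int   # time the event occurred; ends the period of uptime
    reason: str       # reason for the event
    restart: int      # time the system was rebooted after the event

    def uptime(self) -> int:
        """Seconds the system was up before the event."""
        return self.event_time - self.prev_start

    def downtime(self) -> int:
        """Seconds between the event and the following reboot (assumed metric)."""
        return self.restart - self.event_time


def estimate_event_time(last_tick: Optional[int], restart: int) -> int:
    """For a hang, use the last tick recorded by the ticker daemon.

    Assumption: if no tick was recorded, the restart time is the only
    bound available, so it is used instead.
    """
    return last_tick if last_tick is not None else restart


rec = EventRecord(
    prev_start=1_000_000,
    event_time=estimate_event_time(1_086_400, 1_090_000),
    reason="system reset",
    restart=1_090_000,
)
print(rec.uptime(), rec.downtime())
```

     The finer the tick interval, the closer the estimated event time is to
     the moment the system actually stopped.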

     Events are grouped as either "Service Action" events, or "Unscheduled"
     events.  Service Action events are controlled shutdowns, initiated by
     operators through shutdown(1M), halt(1M) and init(1M).  For such
     controlled shutdowns, a (configurable) prompt is given to identify the
     reason for the shutdown.  Unscheduled events include system panics, power
     failures, power cycles, and system resets (usually due to system hangs).
     Panics are identified as due to hardware, due to software, or due to
     unknown reasons.  This distinction is based strictly on the results of
     the FRU analyzer, if present.
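
     The grouping described above can be sketched in Python as a small
     lookup.  The event labels and the form of the FRU analyzer result are
     hypothetical; availmon's real event codes and the analyzer's output
     format are not given here, so this shows only the shape of the
     decision.

```python
from typing import Optional

SERVICE_ACTION = "Service Action"
UNSCHEDULED = "Unscheduled"

# Hypothetical labels for illustration; not availmon's actual event codes.
CONTROLLED = {"shutdown", "halt", "init"}
UNCONTROLLED = {"panic", "power failure", "power cycle", "system reset"}


def event_group(kind: str) -> str:
    """Map an event kind to one of the two event groups."""
    if kind in CONTROLLED:
        return SERVICE_ACTION
    if kind in UNCONTROLLED:
        return UNSCHEDULED
    raise ValueError(f"unknown event kind: {kind!r}")


def panic_cause(fru_result: Optional[str]) -> str:
    """Classify a panic from the FRU analyzer result alone.

    Assumption: fru_result is "hardware", "software", or None when the
    analyzer is absent or reached no verdict.
    """
    if fru_result in ("hardware", "software"):
        return fru_result
    return "unknown"


print(event_group("halt"), "/", event_group("panic"), "/", panic_cause(None))
```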

     Availmon generates three types of reports: availability, diagnosis and
     pager.  Availability reports consist of the system serial number, full
     hostname/internet address, the previous system start time, the time of
     the event, the reason for the event (the event code), uptime, start time
     (following the reboot), and a summary of the reason for the event where
     relevant.

     Diagnosis reports include all data from an availability report, and
     additionally may contain the icrash analysis report, FRU analyzer result,
     important syslog messages, and system hardware/software configuration and
     version information.  Important syslog messages include error messages
     and all messages logged by sysctlrd and syslogd, since the last reboot.
     Duplicated messages are eliminated even if not consecutive; the first
     such message is retained with its time stamp, and the number of
     duplicated messages and the last time stamp are appended.  System
     software version information is limited to version output for the
     operating system and installed patches.
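
     The duplicate-elimination rule for syslog messages can be sketched in
     Python as below.  The function name, the (time stamp, text) input
     form, and the bracketed annotation are all hypothetical; only the rule
     itself (keep the first occurrence, append the duplicate count and the
     last time stamp, even for non-consecutive repeats) comes from the
     description above.

```python
from typing import Dict, List, Tuple


def collapse_duplicates(messages: List[Tuple[str, str]]) -> List[str]:
    """Collapse repeated messages; input is (timestamp, text) in log order."""
    first_seen: Dict[str, str] = {}   # text -> time stamp of first occurrence
    last_seen: Dict[str, str] = {}    # text -> time stamp of latest duplicate
    repeats: Dict[str, int] = {}      # text -> number of duplicates seen
    for stamp, text in messages:
        if text in first_seen:
            repeats[text] += 1
            last_seen[text] = stamp
        else:
            first_seen[text] = stamp  # dicts preserve insertion order
            repeats[text] = 0
    lines = []
    for text, stamp in first_seen.items():
        line = f"{stamp} {text}"
        if repeats[text]:
            # Hypothetical annotation format.
            line += f" [duplicates: {repeats[text]}, last: {last_seen[text]}]"
        lines.append(line)
    return lines


log = [("10:00", "disk error"), ("10:01", "net down"), ("10:05", "disk error")]
for line in collapse_duplicates(log):
    print(line)
```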

     Pager reports are intended for "chatty pagers", and include only the
     system hostname, a brief description of the reason for the event, and the
     summary, if present.

     Availability information for the local system is always permanently
     stored in /var/adm/avail/availlog.  Files in /var/adm/avail are
     maintained by availmon and should not be deleted, modified, or moved.
     The most recent reports are stored as availreport, diagreport and
     pagerreport in the directory /var/adm/crash/.  In addition, reports for
     single-user events are stored under the same names with the suffix .su.
     A copy of syslog messages is stored in /var/adm/avail/AMSYSTEMLOG which
     is rotated at regular intervals via a cron job (see crontab(1)).

CONFIGURATION
     Once availmon is installed, "registration" is required before availmon
     reports are automatically distributed, and configuration of local options
     may also be desired.  The most important configuration option is
     autoemail, which enables automatic distribution.  Normally,
     amregister(1M) is run to accomplish initial email configuration and to
     set autoemail to on.  See also amconfig(1M) for detailed explanation of
     availmon's configurable options and their exact default values.

     Registration of a system can normally be accomplished simply by running:

          amregister -r

     This assumes that the default configuration is acceptable and that the
     local system is a relatively recent platform, where the system serial
     number is machine readable (see amsysinfo(1M) for an exact list).  For
     the case where the serial number is not machine readable, see
     amregister(1M) for configuration details.  The default distribution of
     email reports is to send a diagnosis report to availmon@csd.sgi.com,
     which enters the report into the SGI Availmon Database.

     Applying a common configuration to multiple systems is easily
     accomplished by using amconfig(1M) on a single system to produce the
     desired autoemail.list configuration file, copying the result,
     /var/adm/avail/config/autoemail.list, to all systems, and then running
     amregister -r on each system.  To change any other availmon configuration
     options, run amconfig(1M) appropriately on each system.

     There are several other configuration options that can prove useful.  One
     is to configure sending availmon reports from one or more systems to a
     standard system administrator email alias.  This provides real-time
     notification of system activity.  Another similar option is to configure
     availmon pager reports for real-time notification to "chatty" pagers.
     Or, availmon diagnostic reports may be sent to a local support office, or
     to a system administrator for detailed evaluation.

     The site-management facilities of availmon can be used by configuring
     availmon to send its reports to a "concentrator account".  Such an
     account would be a common email alias on a single system that pipes
     incoming email through amreceive(1M) and then appends it to an aggregate
     site log file.  See the -s option of amreport(1M) for reporting on site
     log files.

     Availmon can also generate periodic status reports that indicate that a
     system is still running and "registered" to send email reports.  This is
     controlled by the statusinterval configuration value, which defaults to
     60 days.  Such reports are sent by the availmon ticker daemon, so they
     are sent only if the tickerd config flag is on.
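
     The two conditions for a status report (the interval has elapsed and
     the ticker daemon is enabled) can be sketched in Python as follows.
     The function and parameter names are hypothetical, and how the daemon
     actually tracks the time of the previous report is an assumption.

```python
SECONDS_PER_DAY = 86400


def status_report_due(now: int, last_report: int,
                      interval_days: int = 60,
                      tickerd_on: bool = True) -> bool:
    """Return True when a periodic status report should be sent.

    now and last_report are seconds since Jan 1, 1970.  The 60-day
    default mirrors the documented default status interval.
    """
    if not tickerd_on:
        # Status reports are sent by the ticker daemon, so none go out
        # when it is disabled.
        return False
    return now - last_report >= interval_days * SECONDS_PER_DAY


print(status_report_due(now=61 * SECONDS_PER_DAY, last_report=0))
```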

     Even where sending of availmon reports is not enabled, local system
     availability data is always maintained, and amreport(1M) can be run to
     produce statistical or event detail reports for the local system.  Such
     reporting can be automated on a regular basis using the -f ("from")
     argument to amreport(1M).  It is also possible to manually send availmon
     reports after any reboot using either amnotify(1M) or amsend(1M).

REPORT VIEWING
     The amreport(1M) program reviews saved availability report information
     and provides statistical and event history reports.  By default, it
     processes the availability data on the local system.  It can also process
     aggregate site log files; that is, an appended accumulation of availmon
     reports from different systems.

     amreport can be run interactively or it can generate statistical or event
     history reports that are written to standard output.  Interactively, it
     presents a statistical summary and allows hierarchical selection and
     display of a list of events or detail on particular events.  Run
     interactively on a site log file, it presents the same statistical or
     event information either on all systems or on each system individually.
     In either case, it can generate statistical, event list, event detail or
     combined reports written to standard output.

     amreport accepts -f "from" and/or -t "to" arguments which can be used on
     the local system to bound the time period which is reported.  This
     capability can be used to generate regular statistical or event
     list/detail reports.
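
     Bounding a report by a "from" and "to" time amounts to filtering events
     on their time stamps, as in this Python sketch.  The event tuple, the
     function name, and the choice of inclusive bounds are assumptions; the
     argument syntax amreport itself accepts is described in amreport(1M).

```python
from typing import Iterable, List, Optional, Tuple

Event = Tuple[int, str]   # (event time in seconds since 1970, reason)


def events_between(events: Iterable[Event],
                   start: Optional[int] = None,
                   end: Optional[int] = None) -> List[Event]:
    """Keep events with start <= time <= end; None leaves that side open."""
    return [
        e for e in events
        if (start is None or e[0] >= start) and (end is None or e[0] <= end)
    ]


history = [(100, "shutdown"), (200, "panic"), (300, "power failure")]
print(events_between(history, start=150))
print(events_between(history, start=150, end=250))
```

     Running such a filter with a moving "from" time (for example, the time
     of the previous report) is one way to produce the regular reports
     mentioned above.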

     Run interactively on the local system, amreport also supports resending
     event data from selected historical events.  This allows recreating prior
     reports and resending them in the case where an email report may have
     been lost. Some information included in the original report may not be
     included in the resent report.  This includes current status information,
     such as hinv(1M) data, which may have changed from the time of the
     original report to the current time (and which is therefore not included)
     or information derived from files in /var/adm/crash/, such as icrash
     files, which may have been removed.

FILES
     /var/adm/avail/config/{autoemail,shutdownreason,tickerd,hinvupdate,livenotification}
               configuration files containing flag values for autoemail,
               shutdownreason, tickerd, hinvupdate, livenotification.
     /var/adm/avail/config/statusinterval
               configuration file containing the value of statusinterval.
     /var/adm/avail/config/autoemail.list
               configuration file containing the autoemail.list address lists.
     /var/adm/avail/availlog
               primary log of availability monitor
     /var/adm/avail/AMSYSTEMLOG
               A copy of system log messages which is maintained by availmon
     /var/adm/avail/lasttick
               uptime in seconds since Jan 1, 1970 (written by tickerd)
     /var/adm/crash/*
               availmon report files:  availreport, diagreport, pagerreport,
               availreport.su, diagreport.su, pagerreport.su
     /etc/init.d/availmon, /usr/etc/amstart
               init scripts that log start/stop and initiate notification

SEE ALSO
     Mail(1), amconfig(1M), amnotify(1M), amparse(1M), amreceive(1M),
     amregister(1M), amreport(1M), amsend(1M), amsysinfo(1M), amsyslog(1M),
     amtickerd(1M), amtime1970(1M), chkconfig(1M), halt(1M), hinv(1M),
     icrash(1M), init(1M), shutdown(1M), versions(1M), syslogd(1M),
     syslog(3C), crontab(1).

Typewritten Software • bear@typewritten.org • Edmonds, WA 98026