⇒ crash(8) — A/UX 0.7



     crash(8)                                                 crash(8)



     NAME
          crash - what to do when the system crashes

     DESCRIPTION
          This entry gives a few clues about what to do if the system
          crashes.  It is not complete.

          In restarting after a crash, always bring up the system
          single-user, as specified in boot(8) modified for your
          installation.  Perform an fsck(1M) on all file systems which
          could have been in use when the system crashed.  If you find
          any serious file system problems, you should repair them.
          When you are satisfied with the health of your disks, check
          and set the date, then come up multi-user.

          To boot Oreo, certain files (and the directories leading to
          them) must be intact.  First, the initialization program
          /etc/init must be present and executable.  For init to work
          correctly, /dev/console, /bin/sh, and /bin/env must be
          present.  If any of these does not exist, the symptom is
          best described as thrashing: init goes into a fork/exec
          loop trying to create a shell with proper standard input and
          output.  The file /etc/rc should also be executable; without
          it the system will come up but will not be fully
          initialized.
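
          The requirements above can be checked mechanically before
          a reboot is attempted.  The following is a minimal sketch
          in Python, not part of the original system; the function
          name check_boot_files and the idea of pointing it at a root
          tree mounted somewhere other than / are assumptions made
          for illustration.  It reports which of the files named
          above are missing or lack execute permission.

```python
import os

# Files named in the text above and the property each must have.
REQUIRED = [
    ("/etc/init", "executable"),
    ("/dev/console", "present"),
    ("/bin/sh", "present"),
    ("/bin/env", "present"),
    ("/etc/rc", "executable"),
]


def check_boot_files(root="/"):
    """Return (path, problem) pairs for the root tree mounted at `root`.

    `root` is a hypothetical mount point, for example a damaged root
    file system mounted on a running backup system.
    """
    problems = []
    for path, need in REQUIRED:
        full = os.path.join(root, path.lstrip("/"))
        if not os.path.exists(full):
            problems.append((path, "missing"))
        elif need == "executable" and not os.stat(full).st_mode & 0o111:
            # No execute bit is set for owner, group, or others.
            problems.append((path, "not executable"))
    return problems


if __name__ == "__main__":
    for path, problem in check_boot_files("/"):
        print(f"{path}: {problem}")
```

          An empty result means only that the files exist with the
          expected permissions; it does not show that their contents
          are intact.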

          If you cannot get the system to boot, you must obtain a
          runnable system from a backup medium.  You may then doctor
          the damaged root file system as an ordinary mounted file
          system, as described below.  If there are any problems with
          the root file system, you should go to a backup system so
          that you are not repairing the root file system while the
          system is running from it.

        Repairing disks
          You should treat an addled disk gently; you shouldn't mount
          it unless necessary, and if it is very valuable but in quite
          bad shape, perhaps you should copy it before trying surgery
          on it.  This is an area where experience and informed
          courage count for much.

          fsck(1M) is adept at diagnosing and repairing file system
          problems.  It first identifies all the files that contain
          bad (out of range) blocks or blocks that appear in more than
          one file.  Any such files are identified by name and fsck
          requests permission to remove them from the file system.
          You should remove files with bad blocks.  When there are
          duplicate blocks, you should remove all the files except the
          most recently modified.  You should check the contents of
          the survivor after repairing the file system to ensure that
          it contains the proper data.  (Note that running fsck -n
          reports all problems without attempting any repair.)
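
          The rule for duplicate blocks (keep only the most recently
          modified file) can be sketched as follows.  This is an
          illustration in Python, not something fsck provides; the
          function name pick_survivor and the file names in the usage
          lines are hypothetical, and the modification times are
          assumed to have been collected by the operator beforehand.

```python
def pick_survivor(mtimes):
    """Choose which file to keep among files that claim the same block.

    `mtimes` maps a file name to its modification time in seconds.
    Returns (survivor, names_to_remove), where the survivor is the most
    recently modified file, as the text above recommends.
    """
    if not mtimes:
        raise ValueError("no files given")
    survivor = max(mtimes, key=mtimes.get)
    to_remove = sorted(name for name in mtimes if name != survivor)
    return survivor, to_remove


if __name__ == "__main__":
    # Hypothetical file names and times, for illustration only.
    keep, remove = pick_survivor({"/usr/a": 100, "/usr/b": 300, "/usr/c": 200})
    print("keep:", keep)
    print("remove:", ", ".join(remove))
```

          The survivor's contents still need to be checked by hand
          after the repair, since the shared blocks may hold data
          written by one of the removed files.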




          fsck also reports on incorrect link counts and requests
          permission to adjust any that are erroneous.  In addition,
          it reconnects to a ``lost+found'' directory any files or
          directories that are allocated but have no file system
          references.  Finally, if the free list is bad (out of range,
          missing, or duplicate blocks), fsck will, with the
          operator's concurrence, construct a new one.

        Why did it crash?
          Oreo types a message on the console typewriter when it
          voluntarily crashes.  Here is the current list of such
          messages, with enough information to offer some hope of a
          remedy.  The message has the form ``panic: ...'', possibly
          accompanied by other information.  Left unstated in all
          cases is the possibility that hardware or software error
          produced the message in some unexpected way.  Not all
          systems produce all of these panics.

          bflush: bad free list.
               A buffer management error occurred during a sync or
               umount system call.

          blkdev
               The getblk routine was called with a nonexistent major
               device as argument.  Definitely hardware or software
               error.

          devtab
               Null device table entry for the major device used as
               argument to getblk.  Definitely hardware or software
               error.

          dpfrelse
               The list of processes currently mapped into the memory
               management unit has been lost (68451 only).

          iinit
               An I/O error reading the super-block for the root file
               system during initialization.

          interrupt stack overflow
               The kernel ran out of stack space on an interrupt.
               Subroutine depth is too great or too many local
               variables.

          kernel memory management error
               Bus error or address error in supervisor mode.  Can be
               a software or hardware problem.

          kernel parity error
               A memory parity error occurred while the cpu was in
               supervisor mode.



          lost vmap segment
               An error occurred opening a shared memory segment.

          no data pages
               A memory management error occurred allocating mmu
               registers for a process's data segment.

          no fs
               A device has disappeared from the mounted-device table.
               Definitely hardware or software error.

          no imt
               Like no fs, but produced elsewhere.

          no clock
               During initialization, neither the line nor
               programmable clock existed.

          no procs
               Process table has been destroyed.

          no text page
               A memory management error occurred allocating mmu
               registers for a process's text segment.

          I/O error in swap
               An unrecoverable I/O error during a swap.  Really
               shouldn't be a panic, but it is hard to fix.

          oops!!! syscall
               Missing the interrupt vector for system calls.

          out of swap space
               A program needs to be swapped out, and there is no more
               swap space.  The swap space has to be increased.  This
               really shouldn't be a panic, but there is no easy fix.

          timeout table overflow
               The timeout table overflowed.  The timeout table is not
               large enough or some routine is starting up too many
               timeouts.

          trap An unexpected trap occurred within the system.  This is
               accompanied by the following information:

               trap type
                    2        bus error
                    3        address error
                    4        illegal instruction
                    5        divide by zero
                    6        CHK instruction
                    7        TRAPV instruction
                    8        privilege violation
                    9        trace
                    10       1010 emulator
                    11       1111 emulator
                    12-255   unexpected interrupt
               virtual address    (for bus/address errors only)
               physical address
               instruction register
               function code
               mmu dump
               program counter
               status register
               program id
               registers

          unexpected kernel trap
               A buserr or similar unexpected exception occurred while
               the cpu was in supervisor mode.

          In some of these cases, hex 1000 may be added to the trap
          type; this indicates that the processor was in user mode
          when the trap occurred.
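
          Decoding a reported trap type by hand is error-prone, so a
          small sketch in Python may help.  It is an illustration
          only: the function name decode_trap_type is hypothetical,
          and it assumes the operator has already converted the
          printed trap type to an integer (this entry does not say
          whether the console shows it in decimal or hex).  The names
          come from the trap type table above.

```python
USER_MODE_FLAG = 0x1000  # "hex 1000", added when the trap came from user mode

# Trap type names as listed in the table above.
TRAP_NAMES = {
    2: "bus error",
    3: "address error",
    4: "illegal instruction",
    5: "divide by zero",
    6: "CHK instruction",
    7: "TRAPV instruction",
    8: "privilege violation",
    9: "trace",
    10: "1010 emulator",
    11: "1111 emulator",
}


def decode_trap_type(value):
    """Return (code, name, user_mode) for a reported trap type."""
    user_mode = value >= USER_MODE_FLAG
    code = value - USER_MODE_FLAG if user_mode else value
    if code in TRAP_NAMES:
        name = TRAP_NAMES[code]
    elif 12 <= code <= 255:
        name = "unexpected interrupt"
    else:
        name = "unknown"  # not covered by the table above
    return code, name, user_mode


if __name__ == "__main__":
    code, name, user_mode = decode_trap_type(0x1002)
    mode = "user" if user_mode else "supervisor"
    print(f"trap type {code}: {name} ({mode} mode)")
```

          Because every listed trap type is at most 255, subtracting
          hex 1000 from a larger value recovers the original code
          without ambiguity.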

     SEE ALSO
          fsck(1M), boot(8).

     Page 4                                        (last mod. 1/15/87)


