PRESTO(4) — DEVICES AND NETWORK INTERFACES
NAME
Prestoserve − pseudo disk driver layer that caches synchronous writes in non-volatile memory
CONFIG
device ps0 at vme16d16 ? csr 0x4c00
device pr0 at vme24d32 ? csr 0x800000
SYNOPSIS
#include <sys/param.h>
#include <sundev/prestoioctl.h>
DESCRIPTION
Prestoserve is a pseudo device driver layered upon other disk drivers. It intercepts the I/O requests for the drivers it is layered on by replacing the entry points of the original driver in the bdevsw and cdevsw tables. Prestoserve caches the intercepted synchronous write requests in its non-volatile memory. Whenever Prestoserve needs to perform actual I/O, as when the cache needs draining (a bursty form of LRU replacement is used), it will call the original driver’s entry points to perform the actual I/O.
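The entry-point replacement described above can be modeled in user space as a table of function pointers. The following is a minimal sketch under stated assumptions, not the driver's code: sw_table merely stands in for bdevsw, and every name in it (sw_entry, accelerate, layer_strategy, layer_up and the counters) is hypothetical.

```c
#define NMAJOR 4                      /* illustrative table size (assumption) */

typedef int (*strategy_fn)(int major, long blkno);

struct sw_entry {                     /* hypothetical stand-in for a bdevsw slot */
    strategy_fn d_strategy;
};

struct sw_entry sw_table[NMAJOR];     /* models the switch table */
strategy_fn orig_strategy[NMAJOR];    /* saved original entry points */
int layer_up;                         /* nonzero while the layer is caching */
long cached_writes;                   /* writes absorbed by the layer */
long physical_writes;                 /* writes that reached the real driver */

/* Models the real disk driver's entry point. */
int disk_strategy(int major, long blkno)
{
    (void)major;
    (void)blkno;
    physical_writes++;
    return 0;
}

/* Entry point installed by the layer in place of the original. */
int layer_strategy(int major, long blkno)
{
    (void)blkno;
    if (!layer_up)                    /* pass through to the original driver */
        return orig_strategy[major](major, blkno);
    cached_writes++;                  /* otherwise absorb the write */
    return 0;
}

/* Save the original entry point and install the layer's in its place. */
void accelerate(int major)
{
    orig_strategy[major] = sw_table[major].d_strategy;
    sw_table[major].d_strategy = layer_strategy;
}
```

Because callers keep going through the same table slot, the device's major and minor numbers do not change; only the function reached through the slot does.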
In an environment with many synchronous writes (such as an NFS server whose clients are paging or modifying files), Prestoserve can substantially improve performance: these writes are performed at memory speeds rather than disk speeds, and a write that hits a dirty cache block supersedes the earlier write to that block, so only the last version is ever physically written. This avoids 1/2 to 2/3 of all physical disk write operations, since every sequential NFS write to a file also causes the inode and the indirect block to be written synchronously.
An accelerated disk device (one which has Prestoserve layered on top of it) uses the same major and minor device numbers it used before it was accelerated. See the presto_chango(8) manual page for information on how to accelerate a particular filesystem with Prestoserve.
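One way to read the 1/2 and 2/3 figures is as simple arithmetic. The sketch below assumes that each data block write is accompanied by a fixed number of synchronous metadata writes (the inode alone, or the inode plus the indirect block) and that all of those metadata writes are absorbed as dirty cache hits. This is an interpretation of the paragraph above, not a measured result, and avoided_fraction is a hypothetical helper.

```c
/* Fraction of physical writes avoided when each data-block write is
 * accompanied by `meta` synchronous metadata writes that the cache absorbs.
 * Assumes meta >= 0; one data write still reaches the disk each time. */
double avoided_fraction(int meta)
{
    return (double)meta / (double)(meta + 1);
}
```

With one metadata write per data write (inode only), the function yields one half; with two (inode and indirect block), it yields two thirds.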
There are two devices that must be found at boot time for Prestoserve to perform its write caching function. The pr device is the Prestoserve non-volatile memory device while the ps device is the Prestoserve status device for reading battery status. The accelerated disk devices can still be opened and used just like a normal disk even if both the Prestoserve devices are not found at boot time (which would happen if the Prestoserve board was removed from the VMEbus). In this case, the Prestoserve driver simply passes all I/O requests through to the appropriate device.
IOCTLS
Prestoserve does not intercept ioctl commands, thus they go directly to the real disk driver. The following ioctl commands may be performed on /dev/pr0. Some affect Prestoserve as a whole, while others only affect a particular accelerated partition.
PRGETSTATUS The argument to ioctl is a pointer to a struct presto_status. This structure contains battery status information, Prestoserve state, current and maximum memory sizes, and various Prestoserve statistics.
PRSETSTATE The argument to ioctl is a pointer to an int. This int can be either PRUP to enable Prestoserve or PRDOWN to disable Prestoserve. On system reboot, Prestoserve is in the PRDOWN state and must be explicitly enabled via an ioctl. This is normally done through use of the presto(1) command from /etc/rc.local. When Prestoserve goes from the PRDOWN state to the PRUP state, the Prestoserve I/O statistics are reset. When Prestoserve goes from the PRUP state to the PRDOWN state, all the Prestoserve buffers are written back to the real disks and invalidated.
PRSETMEMSZ The argument to ioctl is a pointer to an int. This int is the size in bytes of the Prestoserve memory to use.
PRRESET The argument to ioctl is ignored. Like the PRSETSTATE ioctl, PRRESET sets the Prestoserve state to PRDOWN, but in addition, it reinitializes all of Prestoserve memory. If Prestoserve was in the PRERROR state and some Prestoserve buffers could not be written back to the disk because of disk I/O errors, these changes will be lost. Using this ioctl is the only way in software to force Prestoserve to discard data that cannot be written back to the real disk.
PRGETPRTAB The argument to ioctl is a pointer to a struct prtab. On input, the field bmajordev specifies the block device major number of the device whose struct prtab should be returned. The field bmajordev will be set to NODEV if the requested device does not exist or is not accelerated. The struct prtab contains a field enabled which is a bit vector, indexed by a minor device number, which indicates whether that minor device number has Prestoserve caching enabled on it.
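Testing one minor device in a bit vector of this kind can be sketched as follows. The word type and bit order are assumptions (an array of unsigned int with bit 0 of word 0 standing for minor device 0); the actual layout of the enabled field is defined in <sundev/prestoioctl.h> and may differ. minor_enabled is a hypothetical helper.

```c
#include <limits.h>

/* Assumption: the bit vector is an array of unsigned int words, with bit 0
 * of word 0 standing for minor device 0. Returns 1 if the bit for `minor`
 * is set, 0 otherwise. The caller must pass a vector long enough to
 * contain the requested bit. */
int minor_enabled(const unsigned int *vec, unsigned int minor)
{
    unsigned int bits = (unsigned int)(sizeof(unsigned int) * CHAR_BIT);

    return (int)((vec[minor / bits] >> (minor % bits)) & 1u);
}
```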
PRNEXTPRTAB The argument to ioctl is a pointer to a struct prtab. This ioctl returns the struct prtab for the accelerated device with the smallest block device major number which is greater than the bmajordev field of the struct prtab argument. This allows the sequential retrieval of each accelerated device’s struct prtab by specifying the block device major number of the previous one. To get the first accelerated device’s struct prtab, set the bmajordev field to NODEV (defined in <sys/param.h>). Pass the same struct prtab that was returned by the previous call to the next call. When the bmajordev field of the argument struct prtab is greater than or equal to the last accelerated device’s major block device number, the struct prtab returned will have the bmajordev field set to NODEV.
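The iteration protocol can be illustrated with a user-space model. In the sketch below, model_nextprtab is a hypothetical stand-in for the PRNEXTPRTAB ioctl, struct model_prtab holds only the field the loop needs, MODEL_NODEV is assumed to be -1, and the list of accelerated major numbers is invented for illustration. A real program would instead call ioctl on /dev/pr0 with a struct prtab.

```c
#include <stddef.h>

#define MODEL_NODEV (-1)              /* assumed stand-in for NODEV */

struct model_prtab {                  /* hypothetical, reduced struct prtab */
    int bmajordev;
};

/* Invented, sorted list of accelerated block major numbers. */
const int accelerated[] = { 3, 7, 10 };

/* Stand-in for PRNEXTPRTAB: return the entry with the smallest major
 * number greater than pt->bmajordev, or MODEL_NODEV if there is none. */
int model_nextprtab(struct model_prtab *pt)
{
    size_t i;

    for (i = 0; i < sizeof accelerated / sizeof accelerated[0]; i++) {
        if (accelerated[i] > pt->bmajordev) {
            pt->bmajordev = accelerated[i];
            return 0;
        }
    }
    pt->bmajordev = MODEL_NODEV;
    return 0;
}

/* Walk every accelerated device: start from NODEV, pass the same
 * structure back on each call, and stop when NODEV is returned. */
int count_accelerated(void)
{
    struct model_prtab pt;
    int n = 0;

    pt.bmajordev = MODEL_NODEV;
    for (;;) {
        model_nextprtab(&pt);
        if (pt.bmajordev == MODEL_NODEV)
            break;
        n++;
    }
    return n;
}
```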
PRENABLE The argument to ioctl is a pointer to a dev_t. This will enable Prestoserve caching on the specified partition. If PR_BOUNCEIO is not set in the flags field of the prtab structure for this device, a test read of block zero on the given device will be performed to ensure it can access VMEbus space for DMA. If you see the system console message, presto: cannot access block device (%d, %d), or the system crashes immediately after enabling Prestoserve, you may have a disk controller or driver which cannot correctly access VMEbus space. If the device is a Xylogics 450 or 451 controller, configure the controller and the Multibus to VMEbus adapter to support 24 bit addresses. For other controllers or drivers, you should set the flag PR_BOUNCEIO in the flags field of the prtab structure in the Prestoserve device specific stubs file, /sys/sundev/pr[A-Z][A-Z].c (the [A-Z][A-Z] gets replaced with the particular device’s prefix converted to upper case). Using the sd device as an example, its device specific stubs file would be in /sys/sundev/prSD.c. Edit the stub file, and change the SDprtab structure initialization for the third field from 0 to PR_BOUNCEIO.
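The naming rule for the device specific stubs file can be expressed as a small helper. This is an illustrative sketch only: stubs_path is a hypothetical function, and it assumes a two-letter device prefix, as the [A-Z][A-Z] pattern suggests.

```c
#include <ctype.h>
#include <stdio.h>

/* Build the stubs file path for a two-letter device prefix, for example
 * "sd" gives "/sys/sundev/prSD.c". Returns 0 on success, or -1 if the
 * prefix is not exactly two letters or the buffer is too small. */
int stubs_path(const char *prefix, char *buf, size_t len)
{
    int n;

    if (prefix == NULL ||
        !isalpha((unsigned char)prefix[0]) ||
        !isalpha((unsigned char)prefix[1]) ||
        prefix[2] != '\0')
        return -1;

    n = snprintf(buf, len, "/sys/sundev/pr%c%c.c",
                 toupper((unsigned char)prefix[0]),
                 toupper((unsigned char)prefix[1]));
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```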
PRDISABLE The argument to ioctl is a pointer to a dev_t. If all cached data for the specified partition can be successfully written, then Prestoserve caching will be disabled for this partition. The
OPERATION
When Prestoserve is in the PRUP state, Prestoserve will cache all synchronous write requests for enabled partitions to the Prestoserve driver in non-volatile memory, writing back dirty Prestoserve cache data asynchronously to the real disks as needed. When Prestoserve is in the PRDOWN state, then by definition there is no valid data in the Prestoserve cache. When Prestoserve is in the PRERROR state, the only valid data in the Prestoserve cache will be dirty data that cannot be written back because of some real disk error. When Prestoserve is in the PRDOWN state, no data is put into the Prestoserve cache and all disk requests are simply passed through to the corresponding real disk driver.
When the system is shut down cleanly using the reboot(2) system call (normally from shutdown(8), halt(8), or reboot(8)), Prestoserve is put in the PRDOWN state so that all the dirty Prestoserve data is written back to disk. This is done so that if the system is powered down, the Prestoserve board can be safely removed from the system without taking part of the logical disk’s contents with it.
On reboot, any valid dirty data still in the Prestoserve non-volatile memory cache will be written back. Prestoserve will then either be in the PRDOWN state if all the data was written back and the batteries are ok or the PRERROR state if an error occurred. Note that the only way that valid dirty Prestoserve data can be present on reboot of a system is because of a power failure, a kernel crash (from either a software or hardware problem), or a real disk error.
If a disk error from a real disk occurs or all the batteries are too low, Prestoserve will attempt to flush back all dirty data and enter the PRERROR state. While in the PRERROR state, new data written to a block not found in the Prestoserve cache are passed directly through to the real disk driver. If new data is written to a block that is found in the cache, Prestoserve will replace the existing block and synchronously write the block to the real disk driver to see if the error condition on that block still persists. If this write is successful and all Prestoserve data can now be written back, Prestoserve will leave the PRERROR state and return to its previous state.
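The handling of a synchronous write in each state, as described in this section, can be summarized as a small decision function. The sketch is a model only: the enum names are hypothetical and are not the values defined in <sundev/prestoioctl.h>, and passing writes for non-enabled partitions straight through in every state is an assumption.

```c
enum model_state { MODEL_DOWN, MODEL_UP, MODEL_ERROR };

enum model_action {
    PASS_THROUGH,        /* hand the request to the real disk driver */
    CACHE_WRITE,         /* store in non-volatile memory, write back later */
    REPLACE_AND_RETRY    /* replace the cached block, then write it synchronously */
};

/* Decide what happens to a synchronous write.
 *   st       current state of the layer
 *   enabled  nonzero if caching is enabled on the target partition
 *   in_cache nonzero if the target block is already in the cache */
enum model_action route_sync_write(enum model_state st, int enabled, int in_cache)
{
    if (!enabled)                     /* assumption: never cached */
        return PASS_THROUGH;

    switch (st) {
    case MODEL_UP:
        return CACHE_WRITE;
    case MODEL_ERROR:
        return in_cache ? REPLACE_AND_RETRY : PASS_THROUGH;
    case MODEL_DOWN:
    default:
        return PASS_THROUGH;
    }
}
```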
Under no circumstances will Prestoserve ever lose anything that was successfully written to it without being explicitly told to do so by a PRRESET ioctl (normally through using the -R flag to presto(1)). This command should be used only when there is a fatal disk error, the data no longer matters, and you want to discard the persistent data in the Prestoserve cache so that Prestoserve can be enabled again (e.g., you are going to replace a broken disk with a new one which will have newfs(8) run on it before the old file system contents are restored from backup tapes).
FILES
/dev/pr0 generic Prestoserve control device
SEE ALSO
presto(8), prestotool(1), prestoctl_svc(8), presto_chango(8)
DIAGNOSTICS
presto: error on dev (%d, %d)
The disk (%d, %d) had an I/O error on a Prestoserve write back operation.
presto: battery #1 %s, battery #2 %s, battery #3 %s
The status for each battery is reported as either “LOW” or “ok”.
presto: all batteries are low!
All three batteries now have excessively low voltage levels.
presto: disabling...
Prestoserve has disabled itself as a result of an all batteries low condition or from some underlying disk errors during write back.
presto: back online!
The previous disk errors or all batteries low condition has cleared itself and Prestoserve has enabled itself again.
presto: %d dirty buffers found
Dirty buffers were found after rebooting, and will be written to disk as soon as possible, generally when the first I/O request occurs to any accelerated device.
presto: writing dirty buffers
Starting to write the dirty buffers found after rebooting.
presto: dirty buffers written
The dirty buffers found after rebooting have been written to disk.
presto: dirty buffers found for host id 0x%x,
which is different from this host’s id (0x%x)
The Prestoserve board was not shut down cleanly and was previously in a different system. During system initialization, the driver will interactively let you 1) throw away the data, 2) write the data to disk, or 3) halt the machine.
presto: Block device %d not accelerated in this kernel!
The Prestoserve board was not shut down cleanly, and it was previously running under a kernel that had a device accelerated which the current kernel does not. Boot a kernel that accelerates all the devices the previous kernel had accelerated.
presto: cannot access block device (%d, %d)
Verify that controller PLUS adapter (if any) will handle 24 address bits
A test read of block 0 failed when Prestoserve acceleration was enabled on the partition; see the discussion of PRENABLE (and PR_BOUNCEIO) under the IOCTLS section.
ERRORS
EPERM The command is PRSETSTATE, PRSETMEMSZ, PRRESET, PRENABLE or PRDISABLE and the effective uid of the caller is not root.
EBUSY The command is PRSETSTATE, PRSETMEMSZ or PRDISABLE and Prestoserve still has a fatal disk or battery problem.
ENOMEM The command is PRSETMEMSZ and the memory size specified exceeds the maximum size of the non-volatile memory board reported in the presto_status structure.
ENODEV The command is PRENABLE or PRDISABLE and the device specified by the dev_t is not a device initialized for use with Prestoserve.
Prestoserve 1.1 — Last change: September 13, 1989