aio(7) DEVICES AND MODULES aio(7)
NAME
aio - a primitive asynchronous I/O facility for NEWS-OS
Release 5.0U
SYNOPSIS
#include <sys/file.h>
#include <sys/ioctl.h>
#include <sys/async.h>
INTRODUCTION
The AIO (asynchronous I/O) facility gives processes a
primitive way to perform asynchronous reads and writes on a
particular raw disk device. A process may start an I/O
operation and, instead of blocking until the I/O completes,
continue executing while the request is queued, while it is
being performed, and after it has completed. The process is
expected to request the status of all previously sent I/O
requests at some later point.
The process uses the same devices that it would normally use
for raw disk I/O. However, it must use the ioctl(2) mechan-
ism to send the I/O request and to receive the status of
previously sent requests. In order for the AIO facility to
function correctly, the process must lock the part of memory
that is used for transferring to and from disk. The process
should also arrange, by setting the FSACLOSE flag, for the
special asynchronous I/O close routine to be called when the
raw device is closed (to perform cleanup functions).
Otherwise, system resources can be wasted, since the close
routine is then not called until the process exits or execs
another program.
Asynchronous notification of completed AIO requests is also
available, which eliminates the need to poll continuously
for completed requests. If this function is enabled when a
request is issued, a SIGEMT signal is sent to the process
when that request finishes.
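A handler for this notification only needs to record that a
completion occurred; the status itself is still collected with
DKIOCASTAT from the main flow. The following is a minimal sketch,
not NEWS-OS reference code: aio_notify_install and
aio_completion_pending are hypothetical names, and the signal
number is passed as a parameter (SIGEMT on NEWS-OS) so that the
same code also builds on systems that lack SIGEMT.

```c
#include <signal.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t aio_done_flag = 0;

static void aio_on_signal(int signo)
{
    (void)signo;
    aio_done_flag = 1;
}

/* Install the handler for signo (SIGEMT on NEWS-OS). Returns 0 or -1. */
int aio_notify_install(int signo)
{
    return signal(signo, aio_on_signal) == SIG_ERR ? -1 : 0;
}

/* Return 1 and clear the flag if a notification arrived since the last call. */
int aio_completion_pending(void)
{
    if (aio_done_flag) {
        aio_done_flag = 0;
        return 1;
    }
    return 0;
}
```

Because several completions may be reported by a single signal
delivery, a caller that sees aio_completion_pending() return 1
would normally keep collecting status until no further completed
requests are reported.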
The 3 ioctl commands and the fcntl flags that apply to AIO
are:
DKIOCMLOCK   lock a portion of memory for I/O transfer to/from
             disk
DKIOCASTRT   start (send) an asynchronous I/O request
DKIOCASTAT   return the status of previous requests
FSACLOSE     call the asynchronous close routine when the file
             is closed
FSASYNC      cause asynchronous notification of completed
             requests
Here is a very general algorithm describing the proper use
of asynchronous I/O:
1. The process allocates buffer space for the transfer of
data to and from the disk.
2. It opens the raw device and sets FSACLOSE via fcntl.
It can also optionally set the FSASYNC flag for asyn-
chronous notification.
3. It then locks the I/O buffer with the DKIOCMLOCK ioctl
command.
4. The process now issues (several) I/O requests using the
ioctl command DKIOCASTRT and then retrieves the status
of those requests using the DKIOCASTAT ioctl command.
KERNEL CONFIGURATION
The AIO mechanism is configured into the kernel through the
sysgen.xxx file in the master.gen directory in
/usr/src/uts/sony. The xxx stands for your CPU configura-
tion, single or dual. NEWS 32XX, 34XX, and 37XX worksta-
tions are single CPU models and 38XX workstations are dual
CPU models. If the mechanism is to be included, the entry
should read
INCLUDE: saio
To exclude the mechanism, the entry should read
EXCLUDE: saio
The number of async buffers can be changed by editing the
file saio in /etc/master.d. Edit the line defining the
macro NSASYNCBUFS. As distributed, the line reads
#define NSASYNCBUFS 100
DESCRIPTION
In order for a process to perform asynchronous I/O, it is
necessary to set aside a locked portion of memory which the
device drivers access to transfer the data for each request.
This is done to keep those pages from being paged out while
I/O is being performed. Memory is locked via the DKIOCMLOCK
ioctl command:
struct asyncmlock memlock;
ioctl(fd, DKIOCMLOCK, &memlock);
The struct asyncmlock specifies the starting (virtual)
address of the memory to lock and the number of bytes to
lock. Once locked, the locked portion of memory cannot be
unlocked or changed. Any attempted AIO requests referencing
memory outside the locked area fail.
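Since the locked area cannot be changed afterwards, it can help to
check each buffer against it before issuing a request. The sketch
below is illustrative only: aio_in_locked_area is a hypothetical
helper, and struct asyncmlock is redeclared locally from the
listing on this page so that the fragment stands alone (on NEWS-OS
the declaration would come from <sys/async.h> instead).

```c
#include <stdint.h>

/* Local copy of the layout listed from <sys/async.h> on this page. */
struct asyncmlock {
    char    *avaddr;   /* starting virtual address */
    unsigned asize;    /* size of area to be locked */
};

/* Return 1 if [addr, addr + size) lies wholly inside the locked area. */
int aio_in_locked_area(const struct asyncmlock *lock,
                       const char *addr, unsigned size)
{
    uintptr_t lo = (uintptr_t)lock->avaddr;
    uintptr_t a  = (uintptr_t)addr;

    if (a < lo)
        return 0;               /* starts below the locked area */
    if (size > lock->asize)
        return 0;               /* larger than the whole area */
    /* offset of addr must leave room for size bytes */
    return (a - lo) <= (uintptr_t)(lock->asize - size);
}
```

A process that locks a pool of several pages once at startup could
call this helper on every au_maddr it is about to pass to
DKIOCASTRT.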
The process should also set the FSACLOSE flag on the file
descriptor. This can be done with F_SETFL via the fcntl(2)
system call. The FSASYNC flag can also be set at this time
if asynchronous notification of completed requests is
desired.
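When both flags are wanted, they are best set together. Assuming
F_SETFL has its usual semantics of replacing the settable status
flags, two successive F_SETFL calls would leave only the second
flag set. The sketch below shows a read-modify-write helper;
fd_add_status_flags is a hypothetical name, and on NEWS-OS the
flags argument would be FSACLOSE, optionally ORed with FSASYNC.

```c
#include <fcntl.h>

/*
 * OR the given status flags into the descriptor's current flags.
 * The current value is read first so that flags which are already
 * set are preserved. Returns 0 on success, -1 on error.
 */
int fd_add_status_flags(int fd, int flags)
{
    int cur = fcntl(fd, F_GETFL, 0);

    if (cur == -1)
        return -1;
    return fcntl(fd, F_SETFL, cur | flags) == -1 ? -1 : 0;
}
```

A call such as fd_add_status_flags(fd, FSACLOSE | FSASYNC) would
then request both the cleanup on close and the SIGEMT
notification.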
To send an I/O request to the driver, the process issues a
DKIOCASTRT ioctl command:
struct areqbuf reqbuf;
ioctl(fd, DKIOCASTRT, &reqbuf);
The struct areqbuf specifies the command to be performed,
the locations on disk and in core to transfer to and from,
and the size of the transfer. The command may be AU_READ,
AU_WRITE, or AU_ORDWRITE. The difference between AU_WRITE
and AU_ORDWRITE is that the latter, called an ordered write,
causes all subsequent writes to physically occur only after
the ordered write request has finished. (Normally, when a
write request is issued, it is optimally sorted into the
disk drive sort queue; that is, the order of physical writes
is independent of the order in which they were sent.) In the
current implementation, the size of the transfer must be
exactly one page. If no internal AIO buffers are available,
the call fails with the error EWOULDBLOCK.
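The constraints above can be checked before a request is handed
to the driver. The following is a sketch under stated
assumptions: aio_make_request is a hypothetical helper, the
structure and command bits are redeclared locally from the
<sys/async.h> listing on this page, AIO_PAGESIZE takes the 4096
used in the EXAMPLE section, and au_daddr is assumed to hold a
block number that must fit in the 24 bits kept by ADBLK.

```c
#include <stddef.h>

/* Local copies of the definitions listed from <sys/async.h>. */
struct areqbuf {
    long  au_cmd;    /* AU_READ, etc */
    long  au_daddr;  /* destination on disk */
    char *au_maddr;  /* (virtual) memory address */
    long  au_size;   /* bytes to transfer */
};
#define AU_READ     01
#define AU_WRITE    02
#define AU_ORDWRITE 04
#define ADBLK(A)    ((A) & 0xffffff)

#define AIO_PAGESIZE 4096L  /* assumption: value from the EXAMPLE section */

/*
 * Fill *req for a one-page transfer. Returns 0 on success, or -1 if
 * cmd is not exactly one of the three commands, blkno does not fit
 * in the 24 bits kept by ADBLK, or buf is NULL.
 */
int aio_make_request(struct areqbuf *req, long cmd, long blkno, char *buf)
{
    if (cmd != AU_READ && cmd != AU_WRITE && cmd != AU_ORDWRITE)
        return -1;
    if (blkno < 0 || ADBLK(blkno) != blkno)
        return -1;
    if (buf == NULL)
        return -1;

    req->au_cmd   = cmd;
    req->au_daddr = blkno;
    req->au_maddr = buf;
    req->au_size  = AIO_PAGESIZE;  /* current implementation: one page */
    return 0;
}
```

The filled structure would then be passed to
ioctl(fd, DKIOCASTRT, &req), with the call retried later if it
fails with EWOULDBLOCK.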
NOTE If the AU_ORDWRITE command is used, it is up to the
process to order the requests for optimum disk opera-
tion. Incorrect use of this feature can severely
reduce disk performance.
To retrieve the status of up to MAXSTATUS completed
requests, a process uses the DKIOCASTAT ioctl command:
struct asyncstatus iostat;
ioctl(fd, DKIOCASTAT, &iostat);
The struct asyncstatus is filled out by the AIO driver and
consists of a count of the number of requests for which
status is being returned (if no requests have finished, then
the count is zero), and a data structure (struct aiostat)
for each of those completed requests. The aiostat data
structure consists of the status the driver returned for
that particular request and two fields which can be used to
identify the request. These are the size of the request
sent and its in-core memory address (as taken from the
struct areqbuf argument to DKIOCASTRT).
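With several requests outstanding, each returned entry has to be
matched back to the request it belongs to. A minimal sketch of
such a lookup, keyed on the in-core address, is shown below;
aio_find_status is a hypothetical helper, and the structures are
redeclared locally from the <sys/async.h> listing on this page so
that the fragment stands alone.

```c
#include <stddef.h>

/* Local copies of the definitions listed from <sys/async.h>. */
typedef struct aiostat {
    short iostatus;  /* I/O completion status for request */
    short iobsize;   /* size of the request */
    char *iomaddr;   /* in-core address of the request */
} IOSTAT;
#define MAXSTATUS 15
typedef struct asyncstatus {
    long   acount;               /* # of requests being returned */
    IOSTAT astatus[MAXSTATUS];   /* completion status per request */
} ASYNCSTATUS;

/*
 * Find the completion entry whose memory address matches maddr.
 * Returns a pointer into st, or NULL if that request is not among
 * the entries reported by this DKIOCASTAT call.
 */
const IOSTAT *aio_find_status(const ASYNCSTATUS *st, const char *maddr)
{
    long i;
    long n = st->acount;

    if (n < 0)
        n = 0;
    if (n > MAXSTATUS)           /* never read past the array */
        n = MAXSTATUS;
    for (i = 0; i < n; i++)
        if (st->astatus[i].iomaddr == maddr)
            return &st->astatus[i];
    return NULL;
}
```

This lookup relies on each outstanding request using a distinct
buffer address; a NULL result simply means the request has not
been reported yet.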
The following definitions (from <sys/async.h>) describe the
ioctl commands and related data structures necessary for
performing AIO.
/*
* Starting I/O (DKIOCASTRT)
* The parameter passed back to SVR4 is a struct areqbuf.
*/
typedef struct areqbuf {
long au_cmd; /* AU_READ, etc */
long au_daddr; /* destination on disk */
char *au_maddr; /* (virtual) memory address */
long au_size; /* bytes to transfer */
} AREQBUF;
/*
* command bits
* Note: AU_ORDWRITE is treated as AU_WRITE in the kernel
* They exist as separate flags for user convenience only.
*/
#define AU_READ 01 /* read request */
#define AU_WRITE 02 /* unordered write request */
#define AU_ORDWRITE 04 /* ordered write request */
#define AU_CMDMASK 07 /* mask of command bits */
/*
* The disk block to transfer to/from is in the lower 3 bytes
* of au_daddr, the upper byte is ignored by SVR4.
*/
#define ADBLK(A) ((A) & 0xffffff)
/*
* Getting I/O Completion Status (DKIOCASTAT)
* The parameter passed back from SVR4 is a struct asyncstatus.
*/
typedef struct aiostat {
short iostatus; /* I/O completion status for request */
short iobsize; /* verification (from AREQBUF au_size) */
char *iomaddr; /* verification (from AREQBUF au_maddr) */
} IOSTAT;
#define MAXSTATUS 15
typedef struct asyncstatus {
long acount; /* # of requests being returned */
IOSTAT astatus[MAXSTATUS]; /* completion status per request */
} ASYNCSTATUS;
/*
* Locking Memory for I/O (DKIOCMLOCK)
* The parameters are passed to SVR4 as a struct asyncmlock.
*/
typedef struct asyncmlock {
char *avaddr; /* starting virtual address */
unsigned asize; /* size of area to be locked */
} ASYNCMLOCK;
ERRORS
All errors are the same as returned by the normal I/O system
calls (read, write, ioctl, etc). See the relevant manual
pages for information on these errors. It is possible for
the iobsize field in the IOSTAT structure to be modified if
a read request was beyond the end of a partition. A nonzero
iostatus field in the IOSTAT structure signifies that an
error occurred during the I/O: the data from an AU_READ
should not be relied on, and the data for an AU_WRITE or
AU_ORDWRITE should be assumed not to have been written.
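These rules can be folded into a small classification routine.
The sketch below is one possible reading, not driver-defined
behavior: aio_classify and enum aio_result are hypothetical, and
treating an iobsize that differs from the requested size as a
short transfer is an assumption based on the end-of-partition
remark above. struct aiostat is redeclared locally from the
<sys/async.h> listing on this page.

```c
/* Local copy of the layout listed from <sys/async.h>. */
typedef struct aiostat {
    short iostatus;  /* I/O completion status for request */
    short iobsize;   /* size of the request */
    char *iomaddr;   /* in-core address of the request */
} IOSTAT;

enum aio_result {
    AIO_OK,      /* completed, full size */
    AIO_SHORT,   /* completed, but iobsize differs from the request */
    AIO_FAILED   /* nonzero iostatus: data must not be trusted */
};

/* Classify one completion entry against the size that was requested. */
enum aio_result aio_classify(const IOSTAT *s, long requested)
{
    if (s->iostatus != 0)
        return AIO_FAILED;
    if ((long)s->iobsize != requested)
        return AIO_SHORT;
    return AIO_OK;
}
```

The iostatus test comes first so that a failed request is never
reported as merely short.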
SEE ALSO
fcntl(2), ioctl(2)
NOTES
Using ioctls to do IO is not fun!
The one-page-at-a-time restriction should be eliminated; the
driver should handle requests larger than one page.
EXAMPLE
The following is a simple C program that asynchronously
reads the first block from device /dev/rsd000. (For clarity,
some error checking code is not included.)
#include <stdio.h>
#include <sys/types.h>
#include <sys/file.h>
#include <sys/errno.h>
#include <sys/ioctl.h>
#include <sys/async.h>
#include <signal.h>
#define reqstat iostat.astatus[0]
#define COMMAND AU_READ
#define DEVICE "/dev/rsd000"
#define PAGETOREAD 0
#define PAGESIZE 4096 /* disk blk/system page size */
#define do_something()
AREQBUF reqbuf; /* request buffer */
ASYNCSTATUS iostat; /* status buffer */
char buffer[PAGESIZE]; /* transfer buffer */
ASYNCMLOCK memlock; /* mem locking info */
volatile int gotsig; /* set by the SIGEMT handler */
void getsigemt(sig)
int sig;
{
    gotsig = 1;
}
main()
{
    int fd;
    extern int errno;
    if ((fd = open(DEVICE, O_RDONLY)) == -1) {
        perror("open");
        exit(1);
    }
    /* arrange to catch SIGEMT */
    signal(SIGEMT, getsigemt);
    /*
     * Cause the AIO close routine to be called when this file is
     * closed, and a SIGEMT to be sent when I/O is complete. Both
     * flags are set in one call, since a second F_SETFL would
     * replace the flag set by the first.
     */
    if (fcntl(fd, F_SETFL, FSACLOSE | FSASYNC) < 0) {
        perror("fcntl");
        exit(1);
    }
    /* Lock I/O pages in memory */
    memlock.avaddr = &buffer[0];
    memlock.asize = PAGESIZE;
    if (ioctl(fd, DKIOCMLOCK, &memlock) < 0) { /* this is a fatal error */
        perror("ioctl: DKIOCMLOCK");
        exit(1);
    }
    /* Set up the request with the proper parameters */
    reqbuf.au_cmd = COMMAND;
    reqbuf.au_daddr = PAGETOREAD;
    reqbuf.au_maddr = &buffer[0];
    reqbuf.au_size = PAGESIZE;
    /* Start the read; retry while no AIO buffers are available */
    while (ioctl(fd, DKIOCASTRT, &reqbuf) < 0) {
        if (errno != EWOULDBLOCK) {
            perror("ioctl: DKIOCASTRT");
            exit(1);
        }
    }
    /*
     * This is the point where the process normally would do
     * something else; for this example we just call a dummy
     * routine. Note: the process must return at some point to
     * pick up the status of the sent requests.
     */
    while (!gotsig)
        do_something();
    if (ioctl(fd, DKIOCASTAT, &iostat) < 0) {
        perror("ioctl: DKIOCASTAT");
        exit(1);
    }
    /* check for any errors/inconsistencies */
    if (iostat.acount < 1) {
        printf("READ FAILED: no status returned\n");
        exit(1);
    }
    if (reqstat.iobsize != reqbuf.au_size) {
        printf("READ FAILED: iobsize %d, wanted %ld\n",
            reqstat.iobsize, reqbuf.au_size);
        exit(1);
    }
    if (reqstat.iomaddr != reqbuf.au_maddr) {
        printf("READ FAILED: iomaddr 0x%lx, wanted 0x%lx\n",
            (unsigned long)reqstat.iomaddr,
            (unsigned long)reqbuf.au_maddr);
        exit(1);
    }
    if (reqstat.iostatus) {
        printf("READ FAILED: iostatus 0x%x\n", reqstat.iostatus);
        exit(1);
    }
    printf("successful read\n");
    close(fd);
    exit(0);
}