buf(D4) buf(D4)
NAME
buf - block I/O data transfer structure
SYNOPSIS
#include <sys/types.h>
#include <sys/proc.h>
#include <sys/buf.h>
#include <sys/ddi.h>
DESCRIPTION
The buf structure, also called a buffer header, is the basic
data structure for block I/O transfers.
USAGE
Each block I/O or physical (direct) I/O transfer has an
associated buffer header structure. This structure contains
control and status information for the transfer. Buffer
headers are passed to block drivers' strategy(D2) routines.
They may also be allocated by the driver itself using
getrbuf(D3), ngeteblk(D3), or geteblk(D3).
Associated with each buffer header is a data buffer which
holds the data for the I/O. The data buffer may be in kernel
data space, or it may be simply a list of physical memory
pages. It may also be a portion of user data space locked
into memory, in the physical I/O case [see physiock(D3)].
Do not depend on the size of the buf structure when writing a
driver (or any other module which needs binary compatibility).
In particular, this means you must only allocate buf
structures using DDI/DKI routines (for example, getrbuf).
Static allocations are not allowed.
A buffer header may be linked onto multiple lists
simultaneously, and it is passed back to the system when the
driver is done with it. Because of this, the driver cannot
change most members of the buffer header, even while the
header is on one of the driver's work lists.
To understand the rules for using various members of the buf
structure, it is necessary to first understand the various
agents which handle a buffer, and how control of the buffer is
passed among these agents.
Copyright 1994 Novell, Inc. Page 1
The first agent is the creator. The creator agent acquires
the buffer (through interfaces such as getblk, getrbuf, and
ngeteblk) and initializes it. Some of the initialization is
done inside the allocating interface; some is done by the
caller. The creator is often part of a filesystem module or a
kernel filter routine, but it may also be a driver routine
which calls getrbuf, geteblk, or ngeteblk.
Another agent is the I/O handler. The I/O handler is the part
of a device driver, starting from its strategy entry point,
which receives a buffer as an I/O request, handles the
transfer of data and/or errors, and indicates I/O completion
by calling biodone(D3).
An additional type of agent which sometimes gets involved in
the process, often transparently, is a kernel filter routine.
A filter routine receives a buffer which either has already
been passed to an I/O handler or is about to be, performs some
transformations on either the buffer itself or newly created
buffer(s) based on the original buffer, then passes the
transformed buffer(s) on to (or back to) the I/O handler or
the next filter.
Control of a buffer is handed off from one agent to another
over the life of the buffer. At any given time, exactly one
agent has control of the buffer. Only the controlling agent
may do anything with the buffer besides waiting for I/O
completion with biowait(D3) or biowait_sig(D3).
The creator controls the buffer initially. When the driver
strategy routine is called, control transfers to the I/O
handler. Control remains with the I/O handler until it calls
biodone. For synchronous I/O, the creator calls biowait (or
biowait_sig); when biowait returns, control returns to the
creator. If a filter routine is called, control transfers to
the filter routine until it calls the next-level strategy
routine. During I/O completion processing, if an iodone
routine is called (b_iodone was non-NULL), control transfers
to that routine, which is considered to be part of the agent
that set b_iodone in the first place.
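The hand-off rules above can be pictured as a small state machine. The following is only a user-space sketch: the sk_ names, the agent tags, and the simplified completion step (which ignores iodone routines and models only the synchronous case) are assumptions made for illustration and are not part of the DDI/DKI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for the agents named in the text. */
enum sk_agent { SK_CREATOR, SK_FILTER, SK_IO_HANDLER };

/* Stand-in for a buffer header; it only tracks who holds control. */
struct sk_buf {
    enum sk_agent owner;
    int io_done;
};

/* The creator acquires the buffer and controls it initially. */
static void sk_create(struct sk_buf *bp)
{
    assert(bp != NULL);
    bp->owner = SK_CREATOR;
    bp->io_done = 0;
}

/* Calling the next-level strategy routine hands control onward. */
static void sk_call_strategy(struct sk_buf *bp, enum sk_agent next)
{
    assert(bp != NULL);
    assert(bp->owner == SK_CREATOR || bp->owner == SK_FILTER);
    assert(next == SK_FILTER || next == SK_IO_HANDLER);
    bp->owner = next;
}

/* The I/O handler signals completion; for synchronous I/O,
 * control then returns to the creator. */
static void sk_complete(struct sk_buf *bp)
{
    assert(bp != NULL);
    assert(bp->owner == SK_IO_HANDLER);
    bp->io_done = 1;
    bp->owner = SK_CREATOR;
}

/* Nonzero if the given agent currently controls the buffer. */
static int sk_may_modify(const struct sk_buf *bp, enum sk_agent who)
{
    assert(bp != NULL);
    return bp->owner == who;
}
```

In this model exactly one agent owns the buffer at any time, and the assertions reject a hand-off attempted by an agent that does not hold control.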
Many buffer fields may be modified only by certain agents.
These and other restrictions are listed below, in the
descriptions of the individual fields. References to the
driver mean the whole driver, regardless of which agent role
it is playing.
Where a field is described as being preserved by an agent,
this means that either the agent does not change the field, or
that before giving up control of the buffer, the agent
restores the field's value to the value it had when the agent
first acquired control.
Structure Definitions
The buf structure, buf_t, contains the following members which
may be accessed by drivers. Note that some structure members
may not be present in all releases of the UNIX System.
uint_t b_flags; /* Misc flags; see below */
buf_t *b_forw; /* Kernel/driver list link */
buf_t *b_back; /* Kernel/driver list link */
buf_t *av_forw; /* Driver work list link */
buf_t *av_back; /* Driver work list link */
long b_bufsize; /* Size of allocated buffer */
uint_t b_bcount; /* Transfer count (in bytes) */
dev_t b_edev; /* Device major/minor number */
daddr_t b_blkno; /* Block number on device */
ushort_t b_blkoff; /* Byte offset within block on device */
uchar_t b_addrtype; /* Type of address in b_addr */
ushort_t b_scgth_count; /* Number of entries in scatter/gather */
/* list when b_addrtype is BA_SCGTH */
union {
caddr_t b_addr; /* Address of buffer data */
ba_scgth_t *b_scgth; /* scatter/gather list */
} b_un;
proc_t *b_proc; /* Process doing physical I/O */
uint_t b_resid; /* Number of bytes not transferred */
clock_t b_start; /* Request start time */
void (*b_iodone)(); /* Function called by biodone */
/* if non-NULL */
void *b_misc; /* Miscellaneous data (SVR4.2 MP only) */
union {
void *un_ptr;
int un_int;
} b_priv; /* Driver private data (SVR4.2 MP only) */
union {
void *un_ptr;
int un_int;
long un_long;
daddr_t un_daddr;
} b_priv2; /* Driver private data (SVR4.2 MP only) */
void *b_private; /* Driver private data (SVR4MP only) */
int b_error; /* Error number (only on systems which */
/* do not support bioerror) */
The scatter/gather structure ba_scgth_t contains exactly the
following members, in the order shown:
paddr_t sg_base; /* Base physical address */
size_t sg_size; /* Size, in bytes, of this piece */
Buffer Fields
Drivers are only allowed to access certain buffer fields.
Accesses by a driver to any other field are illegal and may
not continue to work in subsequent releases of the UNIX
System.
The following fields may be accessed by the driver:
b_flags This is a bitmask of flag bits which reflect
buffer status and control flags. This field may
not be directly assigned by any agent; it is only
legal to set or clear specific bits.
The driver may only access some of these flag
bits. The following flags may be accessed by the
driver:
B_PAGEIO
If set, the data buffer is represented as a
page list. This means that b_un.b_addr is
not a virtual address but is instead the
offset into the first page of a list of one
or more physical pages. This list of pages
is accessible through the getnextpg(D3)
function. The pages will be in contiguous
device block order, starting from the first
block, given by b_blkno. If B_PAGEIO is not
set, b_un.b_addr is a virtual address; it is
a global kernel virtual address if B_PHYS is
not set, or a user virtual address if B_PHYS
is set. This flag may not be changed except
by kernel utility routines which create,
map, or unmap the buffer. If a driver does
not have D_NOBRKUP set in its devflag, it
will never see a buffer with B_PAGEIO set.
B_PHYS
Indicates that the data buffer is in user
virtual space. b_un.b_addr contains the
starting user virtual address of the data
buffer. The data buffer and its virtual
addresses are locked in memory so that
accesses are guaranteed to succeed. The
user virtual address may not be directly
accessed; it must either be mapped into a
kernel virtual address using bp_mapin(D3),
converted to physical addresses using
vtop(D3) and b_proc, or copied to/from
kernel space using copyin(D3) and
copyout(D3) (the copyin/copyout routines may
only be used in the context of the user
process which initiated the I/O). B_PHYS
and B_PAGEIO will never be set
simultaneously. This flag may not be
changed except by kernel utility routines
which create, map, or unmap the buffer. If
a driver does not call physiock(D3) or does
not have D_NOBRKUP set in its devflag, it
will never see a buffer with B_PHYS set.
B_READ
Indicates that data are to be transferred
from the peripheral device into main memory.
This flag may only be changed by the creator
agent.
B_WRITE
Indicates that data are to be transferred
from main memory to the peripheral device.
B_WRITE is a pseudo-flag that occupies the
same bit location as B_READ. B_WRITE cannot
be directly tested; it is detected only as
the absence of B_READ [for example,
!(bp->b_flags & B_READ)]; it can only be
``set'' by clearing B_READ.
B_ERROR
The driver sets B_ERROR to indicate an error
occurred during an I/O transfer. On systems
where the bioerror(D3) function is
available, drivers should not access this
flag directly.
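The b_flags rules above (set or clear individual bits, never assign the whole word; detect a write as the absence of B_READ; interpret b_un.b_addr according to B_PAGEIO and B_PHYS) might be sketched as follows. This is a user-space illustration only: the sk_ names are hypothetical, the bit values are placeholders rather than the real values from <sys/buf.h>, and the direct B_ERROR access applies only to systems without bioerror.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bit values for illustration; real values come from <sys/buf.h>. */
#define SK_B_READ   0x0001u
#define SK_B_ERROR  0x0004u
#define SK_B_PHYS   0x0010u
#define SK_B_PAGEIO 0x2000u

/* Minimal stand-in for the buffer header; only b_flags is modeled. */
struct sk_buf {
    unsigned int b_flags;
};

/* How b_un.b_addr is to be interpreted. */
enum sk_addr_kind { SK_KERNEL_VIRTUAL, SK_USER_VIRTUAL, SK_PAGE_LIST_OFFSET };

static enum sk_addr_kind sk_classify_addr(const struct sk_buf *bp)
{
    assert(bp != NULL);
    if (bp->b_flags & SK_B_PAGEIO)
        return SK_PAGE_LIST_OFFSET;   /* offset into first page of a page list */
    if (bp->b_flags & SK_B_PHYS)
        return SK_USER_VIRTUAL;       /* locked-down user virtual address */
    return SK_KERNEL_VIRTUAL;         /* global kernel virtual address */
}

/* A read is indicated by B_READ being set. */
static int sk_is_read(const struct sk_buf *bp)
{
    return (bp->b_flags & SK_B_READ) != 0;
}

/* A write is only ever detected as the absence of B_READ. */
static int sk_is_write(const struct sk_buf *bp)
{
    return !(bp->b_flags & SK_B_READ);
}

/* Creator only: choose the direction by setting or clearing B_READ. */
static void sk_set_direction(struct sk_buf *bp, int is_read)
{
    if (is_read)
        bp->b_flags |= SK_B_READ;
    else
        bp->b_flags &= ~SK_B_READ;
}

/* Set one bit without disturbing the others (systems lacking bioerror). */
static void sk_mark_error(struct sk_buf *bp)
{
    bp->b_flags |= SK_B_ERROR;
}
```

Every modification uses |= or &= ~ on a single bit, so flag bits owned by other agents are left untouched.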
b_forw/b_back
These fields can be used to link the buffer onto a
doubly-linked list. They may only be used by the
creator agent (or, if the creator is in a driver,
by the whole driver), and only if the buffer was
created by getrbuf.
av_forw/av_back
These fields can be used to link the buffer onto a
doubly-linked list. The driver may change these
any time it controls the buffer. These fields
must be preserved by any filters.
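As an illustration of a driver work list built on av_forw and av_back, the following user-space sketch keeps a FIFO of pending requests. The sk_ names and the queue structure are hypothetical, and a real driver would also need whatever locking its environment requires, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the buffer header; only the work-list links are modeled. */
struct sk_buf {
    struct sk_buf *av_forw;
    struct sk_buf *av_back;
    int id;                     /* hypothetical tag, for the example only */
};

/* Hypothetical per-device work queue. */
struct sk_queue {
    struct sk_buf *head;
    struct sk_buf *tail;
};

/* Append a request to the tail of the work list. */
static void sk_enqueue(struct sk_queue *q, struct sk_buf *bp)
{
    assert(q != NULL && bp != NULL);
    bp->av_forw = NULL;
    bp->av_back = q->tail;
    if (q->tail != NULL)
        q->tail->av_forw = bp;
    else
        q->head = bp;
    q->tail = bp;
}

/* Remove and return the oldest request, or NULL if the list is empty. */
static struct sk_buf *sk_dequeue(struct sk_queue *q)
{
    struct sk_buf *bp;

    assert(q != NULL);
    bp = q->head;
    if (bp == NULL)
        return NULL;
    q->head = bp->av_forw;
    if (q->head != NULL)
        q->head->av_back = NULL;
    else
        q->tail = NULL;
    bp->av_forw = NULL;
    bp->av_back = NULL;
    return bp;
}
```

The strategy routine would typically enqueue while it controls the buffer, and the completion path would dequeue the next request to start.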
b_bufsize This field contains the size in bytes of the
allocated buffer. The b_bufsize field may not be
changed except by kernel utility routines which
create buffers, or by the creator agent if the
buffer was created by getrbuf.
b_bcount This field specifies the number of bytes to be
transferred. It will be set to the total byte
count (which should always be a multiple of
NBPSCTR) upon initial entry to the I/O handler or
a filter. This field may be changed by the
creator or the I/O handler.
b_edev This field contains the external device number of
the device. Only the creator may change this
member.
b_blkno This field specifies the first logical block on
the device which is to be accessed. One block
equals NBPSCTR bytes. The driver may have to
convert this logical block number to a physical
location such as a cylinder, track, and sector of
a disk. Only the creator may change this member.
b_blkoff This field specifies the byte offset within the
block given by b_blkno of the beginning of the
transfer. This will always be less than NBPSCTR.
Only the creator may change this member. Unless
the driver indicates that it understands b_blkoff
by setting D_BLKOFF in its devflag, this field
will always be zero and may be ignored.
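The arithmetic implied by b_blkno and b_blkoff can be sketched as below. The geometry structure, the sk_ names, and the value 512 used for NBPSCTR are assumptions for illustration; the real NBPSCTR comes from the system headers, and real hardware may number sectors from 1 rather than 0.

```c
#include <assert.h>
#include <stddef.h>

#define SK_NBPSCTR 512u   /* placeholder for NBPSCTR; assumed value */

/* Hypothetical disk geometry. */
struct sk_geom {
    unsigned int heads;     /* tracks per cylinder */
    unsigned int sectors;   /* sectors per track */
};

/* Physical location, with a zero-based sector number. */
struct sk_chs {
    unsigned long cylinder;
    unsigned int head;
    unsigned int sector;
};

/* Convert a logical block number (as in b_blkno) to cylinder/head/sector. */
static struct sk_chs sk_blk_to_chs(unsigned long blkno, const struct sk_geom *g)
{
    struct sk_chs out;
    unsigned long per_cyl;

    assert(g != NULL && g->heads > 0 && g->sectors > 0);
    per_cyl = (unsigned long)g->heads * g->sectors;
    out.cylinder = blkno / per_cyl;
    out.head = (unsigned int)((blkno % per_cyl) / g->sectors);
    out.sector = (unsigned int)(blkno % g->sectors);
    return out;
}

/* Byte position on the device where the transfer begins. */
static unsigned long long sk_byte_offset(unsigned long blkno, unsigned int blkoff)
{
    assert(blkoff < SK_NBPSCTR);   /* b_blkoff is always less than NBPSCTR */
    return (unsigned long long)blkno * SK_NBPSCTR + blkoff;
}
```

For a driver that does not set D_BLKOFF, the second argument is always zero and the byte offset reduces to the block number times the sector size.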
b_addrtype This field specifies the type of address used to
reference the buffer data. It can be contiguous
kernel virtual (BA_KVIRT), contiguous user virtual
(BA_UVIRT), a list of physical pages
(BA_PAGELIST), contiguous physical (BA_PHYS), or
physical scatter/gather list (BA_SCGTH). This
field may not be changed except by kernel utility
routines. Currently, this field is valid only for
strategy routines called by buf_breakup(D3).
b_scgth_count
This field represents the number of entries in the
scatter/gather list when b_addrtype is BA_SCGTH.
This field may not be changed except by kernel
utility routines.
b_un.b_addr This field indicates the start of the data buffer
for all address types except BA_SCGTH. It is
either a virtual address or an offset into the
first page of a page list (see B_PAGEIO, B_PHYS,
and b_addrtype for more details). The creator or
the I/O handler may change this member.
b_un.b_scgth
This field points to an array of b_scgth_count
entries of type ba_scgth_t, when b_addrtype is
BA_SCGTH.
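A walk over a scatter/gather list might look like the following sketch, which totals the sizes of the pieces. The sk_ types are user-space stand-ins (unsigned long is assumed in place of paddr_t, and the union is flattened to a single pointer); comparing the total against b_bcount is a sanity check a driver could choose to make, not something this page requires.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for ba_scgth_t, with members in the documented order. */
typedef struct {
    unsigned long sg_base;   /* base physical address (paddr_t assumed) */
    size_t sg_size;          /* size, in bytes, of this piece */
} sk_scgth_t;

/* Stand-in for the relevant part of the buffer header. */
struct sk_buf {
    unsigned short b_scgth_count;   /* number of list entries */
    sk_scgth_t *b_scgth;            /* models b_un.b_scgth */
};

/* Sum the sizes of all scatter/gather pieces. */
static size_t sk_scgth_total(const struct sk_buf *bp)
{
    size_t total = 0;
    unsigned short i;

    assert(bp != NULL);
    assert(bp->b_scgth_count == 0 || bp->b_scgth != NULL);
    for (i = 0; i < bp->b_scgth_count; i++)
        total += bp->b_scgth[i].sg_size;
    return total;
}
```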
b_proc When B_PHYS is set, b_proc identifies the process
which contains the data buffer pointed to by the
user virtual address in b_un.b_addr. When B_PHYS
is not set, b_proc will be NULL. This field can
thus be used as the second argument to vtop to
convert user virtual addresses from b_un.b_addr
into physical addresses. When B_PAGEIO is set,
b_proc is undefined and should be ignored. This
field may not be changed except by kernel utility
routines which create buffers.
b_resid This field indicates the number of bytes not
transferred, usually because of an error (a value
of zero indicates a successful complete transfer).
The I/O handler must set this member before it
calls biodone.
b_start This field is used to hold the time the I/O
request was started (as obtained by
drv_getparm(LBOLT)). The I/O handler may set it
and use it upon I/O completion to compute response
time metrics.
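The bookkeeping for b_resid and b_start can be sketched as follows. This is a user-space illustration with hypothetical sk_ names: the tick values are passed in by the caller instead of being obtained from drv_getparm, and an unsigned long stands in for clock_t so that the subtraction stays well defined if the tick counter wraps once.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the relevant part of the buffer header. */
struct sk_buf {
    unsigned int b_bcount;    /* bytes requested */
    unsigned int b_resid;     /* bytes not transferred */
    unsigned long b_start;    /* tick count at request start */
};

/* Record the start time when the I/O handler begins the request. */
static void sk_start(struct sk_buf *bp, unsigned long now_ticks)
{
    assert(bp != NULL);
    bp->b_start = now_ticks;
}

/*
 * Record the outcome just before completion would be signalled.
 * Sets b_resid and returns the elapsed time in ticks.
 */
static unsigned long sk_finish(struct sk_buf *bp, unsigned int bytes_moved,
                               unsigned long now_ticks)
{
    assert(bp != NULL);
    assert(bytes_moved <= bp->b_bcount);
    bp->b_resid = bp->b_bcount - bytes_moved;   /* zero means complete */
    return now_ticks - bp->b_start;             /* modular, survives one wrap */
}
```

A b_resid of zero after sk_finish corresponds to a fully successful transfer; any other value reports how much of the request was left undone.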
b_iodone This field identifies a specific routine to be
called when the I/O has completed. If it is non-
NULL, biodone will call *b_iodone instead of doing
the normal completion processing. Any agent may
change this member, but it must restore the
previous value before calling biodone (and thus
relinquish control of the buffer); for the
creator, the previous value will always be NULL.
This protocol allows for ``stacking'' of iodone
routines (particularly useful for filters). Each
agent saves the old value, sets b_iodone to its
iodone routine, and hands off the buffer to the
next filter or the I/O handler. On completion,
the I/O handler calls biodone, and each iodone
routine performs its final processing, restores
b_iodone to the saved value, and calls biodone
again, thus invoking the next iodone routine.
b_misc This is a miscellaneous field for use by the
controlling agent. One common use is in
conjunction with b_iodone, to help in saving the
previous b_iodone value. Typically, the previous
values of b_iodone and b_misc would be saved in a
structure, b_misc set to point to this structure,
and b_iodone set to point to the new iodone
routine. This field may only be used by the
controlling agent. If the controlling agent is
the creator, it may modify b_misc directly;
otherwise it must preserve the original value
before returning the buffer to another agent.
This field is valid on SVR4.2 MP systems only.
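The stacking protocol for b_iodone, with b_misc used to reach the saved values, might be sketched as follows. This is a user-space model, not kernel code: sk_biodone is a stand-in that only mimics the documented behavior of biodone (call b_iodone if non-NULL, otherwise do normal completion), malloc and free stand in for kernel memory allocation, and all sk_ names plus the trace array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SK_TRACE_MAX 4

/* Stand-in for the relevant part of the buffer header. */
struct sk_buf {
    void (*b_iodone)(struct sk_buf *);
    void *b_misc;
    int done;                    /* models normal completion processing */
    int trace[SK_TRACE_MAX];     /* example only: order of iodone calls */
    int ntrace;
};

/* Values an agent saves before installing its own iodone routine. */
struct sk_saved {
    void (*old_iodone)(struct sk_buf *);
    void *old_misc;
    int tag;                     /* identifies the agent in the trace */
};

/* Stand-in for biodone: call b_iodone if set, else complete normally. */
static void sk_biodone(struct sk_buf *bp)
{
    assert(bp != NULL);
    if (bp->b_iodone != NULL) {
        bp->b_iodone(bp);
        return;
    }
    bp->done = 1;
}

/* An agent's iodone routine: finish up, restore, and pass completion on. */
static void sk_agent_iodone(struct sk_buf *bp)
{
    struct sk_saved *s = bp->b_misc;

    if (bp->ntrace < SK_TRACE_MAX)
        bp->trace[bp->ntrace++] = s->tag;
    bp->b_iodone = s->old_iodone;    /* restore previous values */
    bp->b_misc = s->old_misc;
    free(s);
    sk_biodone(bp);                  /* invokes the next routine, if any */
}

/* Save the old values and install this agent's iodone routine. */
static int sk_agent_push(struct sk_buf *bp, int tag)
{
    struct sk_saved *s = malloc(sizeof *s);

    if (s == NULL)
        return -1;
    s->old_iodone = bp->b_iodone;
    s->old_misc = bp->b_misc;
    s->tag = tag;
    bp->b_misc = s;
    bp->b_iodone = sk_agent_iodone;
    return 0;
}
```

If two agents push in turn and the I/O handler then signals completion, the routines run in reverse order of installation, and the buffer ends with b_iodone and b_misc restored to their original values.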
b_priv/b_priv2
These fields are private fields for use by the
driver. No other agents interpret or change them.
These fields are valid on SVR4.2 MP systems only.
b_private This field is a private field for use by the
driver. No other agents interpret or change it.
The b_private field is valid on SVR4MP systems
only. Note that in other releases of the UNIX
System, this field is reserved for use by the
kernel, or may not exist at all, and therefore
should not be used by the driver.
b_error If B_ERROR is set, this field holds the errno
which indicates the type of error that occurred.
On systems where the bioerror function is
available, drivers should not access this field
directly.
Warnings
Buffers are a shared resource within the kernel. Drivers
should only read or write the members listed in this section
in accordance with the rules given above. Drivers that
attempt to use undocumented members of the buf structure risk
corrupting data in the kernel and on the device.
DDI/DKI-conforming drivers may only use buffer headers that
have been allocated using geteblk, ngeteblk, or getrbuf, or
have been passed to the driver strategy routine.
REFERENCES
biodone(D3), bioerror(D3), bioreset(D3), biowait(D3),
biowait_sig(D3), brelse(D3), clrbuf(D3), freerbuf(D3),
geteblk(D3), geterror(D3), getrbuf(D3), iovec(D4),
ngeteblk(D3), physiock(D3), strategy(D2), uio(D4)
NOTICES
Portability
All processors