FCNTL(2) FCNTL(2)
NAME
fcntl - file and descriptor control
C SYNOPSIS
#include <unistd.h>
#include <fcntl.h>
int fcntl (int fildes, int cmd, ... /* arg */);
DESCRIPTION
fcntl provides for control over open descriptors. fildes is an open
descriptor obtained from a creat, open, dup, fcntl, pipe, socket, or
socketpair system call.
The commands available are:
F_DUPFD Return a new descriptor as follows:
Lowest numbered available descriptor greater than or equal to
the third argument, arg, taken as an object of type int.
Refers to the same object as the original descriptor.
Same file pointer as the original file (i.e., both file
descriptors share one file pointer).
Same access mode (read, write or read/write).
Same descriptor status flags (i.e., both descriptors share the
same status flags).
Shares any locks associated with the original file descriptor.
The close-on-exec flag, FD_CLOEXEC, associated with the new
descriptor is cleared to keep the file open across calls to the
exec(2) family of functions.
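As a minimal sketch of the duplication command described above, the
following wraps it in a helper. The name dup_at_least is an
illustrative assumption, not part of this interface.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate fd onto the lowest available
 * descriptor numbered >= min_fd.  Returns the new descriptor,
 * or -1 with errno set on failure. */
int dup_at_least(int fd, int min_fd)
{
    return fcntl(fd, F_DUPFD, min_fd);
}
```

A caller might use dup_at_least(fd, 10) to keep low-numbered
descriptors free; the copy starts with close-on-exec cleared.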
F_GETFD Get the file descriptor flags associated with the descriptor
fildes. If the FD_CLOEXEC flag is 0 the descriptor will remain
open across exec, otherwise the descriptor will be closed upon
execution of exec.
F_SETFD Set the file descriptor flags for fildes. Currently the only
flag implemented is FD_CLOEXEC. Note: this flag is a
per-process and per-descriptor flag; setting or clearing it for
a particular descriptor will not affect the flag on descriptors
copied from it by a dup(2) or F_DUPFD operation, nor will it
affect the flag on other processes' instances of that
descriptor.
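The get and set commands for descriptor flags are usually paired so
that only the close-on-exec bit changes. A minimal sketch, assuming
a helper named set_cloexec (hypothetical):

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: turn close-on-exec on (on != 0) or off
 * for one descriptor, leaving any other descriptor flags alone.
 * Returns 0 on success, -1 with errno set on failure. */
int set_cloexec(int fd, int on)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    if (on)
        flags |= FD_CLOEXEC;
    else
        flags &= ~FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags) == -1 ? -1 : 0;
}
```

Because the flag is per-descriptor, a copy made earlier with dup(2)
keeps its own setting when set_cloexec is applied to the original.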
F_GETFL Get file status flags and file access modes. The file access
modes may be extracted from the return value using the mask
O_ACCMODE.
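The mask can be applied as in this small sketch; access_mode is a
hypothetical helper name.

```c
#include <fcntl.h>

/* Hypothetical helper: return O_RDONLY, O_WRONLY or O_RDWR for
 * an open descriptor, or -1 with errno set on failure. */
int access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return flags & O_ACCMODE;
}
```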
F_SETFL Set file status flags to the third argument, arg, taken as an
object of type int. Only the following flags can be set [see
fcntl(5)]: FAPPEND, FSYNC, FDSYNC, FRSYNC, FNDELAY, FNONBLK,
FLCFLUSH, FLCINVAL, FDIRECT, and FASYNC. Since arg is used as
a bit vector to set the flags, values for all the flags must be
specified in arg. (Typically, arg may be constructed by
obtaining existing values by F_GETFL and then changing the
particular flags.) FAPPEND is equivalent to O_APPEND; FSYNC is
equivalent to O_SYNC; FDSYNC is equivalent to O_DSYNC; FRSYNC
is equivalent to O_RSYNC; FNDELAY is equivalent to O_NDELAY;
FNONBLK is equivalent to O_NONBLOCK; FLCFLUSH is equivalent to
O_LCFLUSH; FLCINVAL is equivalent to O_LCINVAL; and FDIRECT is
equivalent to O_DIRECT. FASYNC is equivalent to calling ioctl
with the FIOASYNC command (except that with ioctl all flags
need not be specified). This enables the SIGIO facilities and
is currently supported only on sockets.
Since the descriptor status flags are shared with descriptors
copied from a given descriptor by a dup(2) or F_DUPFD
operation, and by other processes' instances of that
descriptor, an F_SETFL operation will affect those other
descriptors and other instances of the given descriptors as
well. For example, setting or clearing the FNDELAY flag will
logically cause an FIONBIO ioctl(2) to be performed on the
object referred to by that descriptor. Thus all descriptors
referring to that object will be affected.
Flags not understood for a particular descriptor are silently
ignored except for FDIRECT. FDIRECT will return EINVAL if used
on other than an EFS, XFS or BDS file system file.
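Since every status flag must be supplied when setting them, the
usual pattern is read, modify, write. The sketch below toggles
O_NONBLOCK only; set_nonblock is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set (on != 0) or clear O_NONBLOCK while
 * preserving every other status flag.  Returns 0 or -1. */
int set_nonblock(int fd, int on)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    flags = on ? (flags | O_NONBLOCK) : (flags & ~O_NONBLOCK);
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

Because status flags are shared, a descriptor made with dup(2) from
the same open file would be expected to show the same O_NONBLOCK
setting after the call.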
F_FREESP Alter storage space associated with a section of the ordinary
file fildes. The section is specified by a variable of data
type struct flock pointed to by the third argument arg. The
data type struct flock is defined in the <fcntl.h> header file
[see fcntl(5)] and contains the following members: l_whence is
0, 1, or 2 to indicate that the relative offset l_start will be
measured from the start of the file, the current position, or
the end of the file, respectively. l_start is the offset from
the position specified in l_whence. l_len is the size of the
section. An l_len of 0 frees up to the end of the file; in
this case, the end of file (i.e., file size) is set to the
beginning of the section freed. Any data previously written
into this section is no longer accessible. If the section
specified is beyond the current end of file, the file is grown
and filled with zeroes. The l_len field is currently ignored,
and should be set to 0.
F_ALLOCSP This command is identical to F_FREESP.
F_FREESP64
This command is identical to F_FREESP except that the type of
the data referred to by the third argument arg is a struct
flock64. In this version of the structure, l_start and l_len
are of type off64_t instead of off_t (64 bits instead of 32
bits).
F_ALLOCSP64
This command is identical to F_FREESP64.
F_FSSETDM Set the di_dmevmask and di_dmstate fields in an XFS on-disk
inode. The only legitimate values for these fields are those
previously returned in the bs_dmevmask and bs_dmstate fields of
the bulkstat structure. The data referred to by the third
argument arg is a struct fsdmidata. This structure's members
are fsd_dmevmask and fsd_dmstate. The di_dmevmask field is set
to the value in fsd_dmevmask. The di_dmstate field is set to
the value in fsd_dmstate. This command is restricted to root
or to processes with device management capabilities. Its sole
purpose is to allow backup and restore programs to restore the
aforementioned critical on-disk inode fields.
F_DIOINFO Get information required to perform direct I/O on the specified
fildes. Direct I/O is performed directly to and from a user's
data buffer. Since the kernel's buffer cache is no longer
between the two, the user's data buffer must conform to the
same type of constraints as required for accessing a raw disk
partition. The third argument, arg, points to a data type
struct dioattr which is defined in the <fcntl.h> header file
and contains the following members: d_mem is the memory
alignment requirement of the user's data buffer. d_miniosz
specifies block size, minimum I/O request size, and I/O
alignment. The size of all I/O requests must be a multiple of
this amount and the value of the seek pointer at the time of
the I/O request must also be an integer multiple of this
amount. d_maxiosz is the maximum I/O request size which can be
performed on the fildes. If an I/O request does not meet these
constraints, the read(2) or write(2) will return with EINVAL.
All I/O requests are kept consistent with any data brought into
the cache with an access through a non-direct I/O file
descriptor. See also F_SETFL above and open(2).
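The direct I/O constraints can be checked before issuing a request.
The sketch below is portable arithmetic only: struct dio_limits is a
hypothetical stand-in for the three struct dioattr members, with
assumed field widths, and dio_request_ok is an illustrative name.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the three struct dioattr members named above.
 * The type name and field widths are assumptions; real code
 * would fill a struct dioattr with the direct I/O info command. */
struct dio_limits {
    unsigned d_mem;      /* buffer address alignment, in bytes */
    unsigned d_miniosz;  /* request size and offset granularity */
    unsigned d_maxiosz;  /* largest single request, in bytes */
};

/* Return 1 if a request satisfies the stated constraints, 0 if
 * read(2) or write(2) would be expected to fail with EINVAL. */
int dio_request_ok(const struct dio_limits *lim, const void *buf,
                   long long offset, size_t len)
{
    if (lim->d_mem == 0 || lim->d_miniosz == 0)
        return 0;                   /* limits not filled in */
    if ((uintptr_t)buf % lim->d_mem != 0)
        return 0;                   /* buffer misaligned */
    if (len == 0 || len % lim->d_miniosz != 0)
        return 0;                   /* below minimum or not a multiple */
    if (offset < 0 || offset % lim->d_miniosz != 0)
        return 0;                   /* seek pointer misaligned */
    if (len > lim->d_maxiosz)
        return 0;                   /* request too large */
    return 1;
}
```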
F_GETOWN Used by sockets: get the process ID or process group currently
receiving SIGIO and SIGURG signals; process groups are returned
as negative values.
F_SETOWN Used by sockets: set the process or process group to receive
SIGIO and SIGURG signals; process groups are specified by
supplying arg as negative, otherwise arg is interpreted as a
process ID.
F_FSGETXATTR
Get extended attributes associated with files in XFS file
systems. The arg points to a variable of type struct fsxattr.
The structure fields include: fsx_xflags (extended flag bits),
fsx_extsize (nominal extent size in file system blocks),
fsx_nextents (number of data extents in the file), fsx_uuid
(file unique id). Currently the only meaningful bits for the
fsx_xflags field are bit 0 (value 1), which if set means the
file is a realtime file, and bit 1 (value 2), which if set
means the file has preallocated space. A nonzero fsx_extsize
value returned indicates that a preferred extent size was
previously set on the file; a fsx_extsize of 0 indicates that
the defaults for that filesystem will be used.
F_FSGETXATTRA
Identical to F_FSGETXATTR except that the fsx_nextents field
contains the number of attribute extents in the file.
F_FSSETXATTR
Set extended attributes associated with files in XFS file
systems. The arg points to a variable of type struct fsxattr,
but only the following fields are used in this call:
fsx_xflags and fsx_extsize. The fsx_xflags realtime file bit,
and the file's extent size, may be changed only when the file
is empty.
F_GETBMAP Get the block map for a segment of a file in an XFS file
system. The arg points to an array of variables of type struct
getbmap. All sizes and offsets in the structure are in units
of 512 bytes. The structure fields include: bmv_offset (file
offset of segment), bmv_block (starting block of segment),
bmv_length (length of segment), bmv_count (number of array
entries, including the first), and bmv_entries (number of
entries filled in). The first structure in the array is a
header, and the remaining structures in the array contain block
map information on return. The header controls iterative calls
to the F_GETBMAP command. The caller fills in the bmv_offset
and bmv_length fields of the header to indicate the area of
interest in the file, and fills in the bmv_count field to
indicate the length of the array. If the bmv_length value is
set to -1 then the length of the interesting area is the rest
of the file. On return from a call, the header is updated so
that the command can be used again to obtain more information,
without re-initializing the structures. Also on return, the
bmv_entries field of the header is set to the number of array
entries actually filled in. The non-header structures will be
filled in with bmv_offset, bmv_block, and bmv_length. If a
region of the file has no blocks (is a hole in the file) then
the bmv_block field is set to -1.
F_GETBMAPA
Identical to F_GETBMAP except that information about the
attribute fork of the file is returned.
F_RESVSP This command is used to allocate space to a file. A range of
bytes is specified with the struct flock. The blocks are
allocated, but not zeroed, and the file size does not change.
It is only supported on XFS and BDS file systems.
F_RESVSP64
This command is identical to F_RESVSP except that the type of
the data referred to by the third argument arg is a struct
flock64. In this version of the structure, l_start and l_len
are of type off64_t instead of off_t (64 bits instead of 32
bits).
F_UNRESVSP
This command is used to free space from a file. A range of
bytes is specified with the struct flock. Partial filesystem
blocks are zeroed, and whole filesystem blocks are removed from
the file. The file size does not change. It is only supported
on XFS and BDS file systems.
F_UNRESVSP64
This command is identical to F_UNRESVSP except that the type of
the data referred to by the third argument arg is a struct
flock64. In this version of the structure, l_start and l_len
are of type off64_t instead of off_t (64 bits instead of 32
bits).
F_FSYNC Sync data in a range of the ordinary file fildes. The section
is specified by a variable of data type struct flock pointed to
by the third argument arg. The data type struct flock is
defined in the <fcntl.h> header file [see fcntl(5)]. If field
l_type is set to 1, the call behaves like fdatasync(2). If
field l_type is set to 0, the call behaves like fsync(2).
fdatasync(2) syncs only the inode state required to ensure that
the data is permanently on the disk. fsync(2) syncs everything
that fdatasync(2) flushes but also syncs out the other state
associated with the file such as the current timestamps,
permissions, owner, etc. l_start specifies the start of the
range in the file to be sync'ed. l_len specifies the size of
the range. An l_len of 0 flushes everything up to the end of
the file. The remaining fields are ignored and should be set
to 0.
F_FSYNC64 This command is identical to F_FSYNC except that the type of
the data referred to by the third argument arg is a struct
flock64. In this version of the structure, l_start and l_len
are of type off64_t instead of off_t (64 bits instead of 32
bits).
F_SETPRIO
F_GETPRIO These two commands are intended for use along with direct I/O
to indicate to the system that certain file I/O should be given
preference over others. An I/O priority can be associated with
a file through the set interface by providing the priority (a
"short" value) as the third argument arg. To clear the
preference status, the set interface should be invoked with the
third argument arg set to 0. The get interface can be used to
query the I/O priority associated with a file. The set and get
interfaces work on all types of files, but have an effect only
on XFS files.
F_GETBIOSIZE
This command gets information about the preferred buffered I/O
size used by the system when performing buffered I/O (e.g.
standard Unix non-direct I/O) to and from the file. The
information is passed back in a structure of type struct
biosize pointed to by the third argument arg. The data type
struct biosize is defined in the <fcntl.h> header file [see
fcntl(5)]. biosize lengths are expressed in log base 2. That
is, if the value is 14, then the true size is 2^14 (2 raised to
the 14th power). The biosz_read field will contain the current
value used by the system when reading from the file. Except at
the end-of-file, the system will read from the file in
multiples of this length. The biosz_write field will contain
the current value used by the system when writing to the file.
Except at the end-of-file, the system will write to the file in
multiples of this length. The dfl_biosz_read and
dfl_biosz_write will be set to the system default values for
the opened file. The biosz_flags field will be set to 1 if the
current read or write value has been explicitly set. The
F_GETBIOSIZE fcntl is supported only on XFS filesystems.
F_SETBIOSIZE
This command sets the preferred buffered I/O size used by the system
when performing buffered I/O (e.g. standard Unix non-direct
I/O) to and from the file. The information is passed in a
structure of type struct biosize pointed to by the third
argument arg. Using smaller preferred I/O sizes can result in
performance improvements if the file is typically accessed
using small synchronous I/Os or if only a small amount of the
file is accessed using small random I/Os, resulting in little
or no use of the additional data read in near the random I/Os.
To explicitly set the preferred I/O sizes, the biosz_flags
field should be set to 0 and the biosz_read and biosz_write
fields should be set to the log base 2 of the desired read and
write lengths, respectively (e.g. 13 for 8K bytes, 14 for 16K
bytes, 15 for 32K bytes, etc.). Valid values are 13-16
inclusive for machines with a 4K byte pagesize and 14-16 for
machines with a 16K byte pagesize. The specified read and
write values must also result in lengths that are greater than
or equal to the filesystem block size. The dfl_biosz_read and
dfl_biosz_write fields are ignored.
If biosizes have already been explicitly set due to a prior use
of F_SETBIOSIZE, and the requested sizes are larger than the
existing sizes, the fcntl call will return successfully and the
system will use the smaller of the two sizes. However, if
biosz_flags is set to 1, the system will use the new values
regardless of whether the new sizes are larger or smaller than
the old.
To reset the biosize values to the defaults for the filesystem
that the file resides in, the biosz_flags field should be set
to 2. The remainder of the fields will be ignored in that
case.
Changes made by F_SETBIOSIZE are transient. The sizes are
reset to the default values once the reference count on the
file drops to zero (e.g. all open file descriptors to that file
have been closed). See fstab(4) for details on how to set the
default biosize values for a filesystem. The F_SETBIOSIZE
fcntl is supported only on XFS filesystems.
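The log base 2 encoding of biosz_read and biosz_write can be
computed as in the sketch below. It is arithmetic only and does not
call fcntl. The helper name biosize_log2 is hypothetical, and the
range check assumes the 4K byte pagesize case quoted above.

```c
#include <stddef.h>

/* Hypothetical helper: convert a byte count to the log-base-2
 * form used by biosz_read/biosz_write.  Returns the exponent,
 * or -1 if bytes is not a power of two or the exponent falls
 * outside the 13-16 range given for a 4K byte pagesize. */
int biosize_log2(size_t bytes)
{
    int exp = 0;

    if (bytes == 0 || (bytes & (bytes - 1)) != 0)
        return -1;                  /* not a power of two */
    while (bytes > 1) {
        bytes >>= 1;
        exp++;
    }
    return (exp >= 13 && exp <= 16) ? exp : -1;
}
```

For instance, 8K bytes maps to 13 and 32K bytes maps to 15, matching
the values listed in the text.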
The following commands are used for record-locking. Locks may be placed
on an entire file or on segments of a file.
F_GETLK Get the first lock which blocks the lock description given by
the variable of type struct flock pointed to by arg. The
information retrieved overwrites the information passed to
fcntl in the flock structure. If no lock is found that would
prevent this lock from being created, then the structure is
passed back unchanged except that the lock type will be set to
F_UNLCK and the l_whence field will be set to SEEK_SET. If a
lock is found that would prevent this lock from being created,
then the structure is overwritten with a description of the
first lock that is preventing such a lock from being created.
The returned structure will also contain the process ID and the
system ID of the process holding the lock. This command never
creates a lock; it tests whether a particular lock could be
created.
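A lock test along these lines can be sketched as follows. The helper
name lock_is_available and its return convention are assumptions
made for illustration.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: ask whether this process could place a
 * lock of the given type (F_RDLCK or F_WRLCK) on len bytes
 * starting at start.  Returns 1 if no lock is in the way, 0 if
 * one is (its owner's pid is stored through holder when holder
 * is not NULL), or -1 with errno set on failure. */
int lock_is_available(int fd, short type, off_t start, off_t len,
                      pid_t *holder)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;
    if (fl.l_type == F_UNLCK)
        return 1;               /* nothing would block the lock */
    if (holder != NULL)
        *holder = fl.l_pid;     /* process holding the blocking lock */
    return 0;
}
```

The answer is only a snapshot: another process may take the lock
between the test and a later attempt to set it.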
F_SETLK Set or clear a file segment lock according to the variable of
type struct flock pointed to by arg [see fcntl(5)]. The cmd
F_SETLK is used to establish read (F_RDLCK) and write (F_WRLCK)
locks, as well as remove either type of lock (F_UNLCK). If a
read or write lock cannot be set, fcntl will return immediately
with an error value of -1.
F_SETLKW This cmd is the same as F_SETLK except that if a read or write
lock is blocked by other locks, the process will sleep until
the segment is free to be locked.
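The difference between the immediate and the sleeping form can be
sketched with two helpers. The names try_lock and wait_lock are
hypothetical; the error handling relies on the EACCES and EINTR
conditions listed among the errors for this call, and treating
EAGAIN as a conflict is an added assumption for other systems.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill in a struct flock describing len bytes at start,
 * measured from the beginning of the file. */
static void describe(struct flock *fl, short type, off_t start,
                     off_t len)
{
    memset(fl, 0, sizeof *fl);
    fl->l_type = type;
    fl->l_whence = SEEK_SET;
    fl->l_start = start;
    fl->l_len = len;
}

/* Hypothetical helper: non-blocking attempt.  Returns 1 if the
 * request succeeded, 0 if another process holds a conflicting
 * lock, or -1 with errno set for any other failure. */
int try_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl;

    describe(&fl, type, start, len);
    if (fcntl(fd, F_SETLK, &fl) != -1)
        return 1;
    /* Conflicts are documented as EACCES; EAGAIN is accepted
     * too, as an assumption about other implementations. */
    return (errno == EACCES || errno == EAGAIN) ? 0 : -1;
}

/* Hypothetical helper: sleeping attempt that restarts the wait
 * when a signal interrupts it.  Returns 0 or -1. */
int wait_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl;

    describe(&fl, type, start, len);
    while (fcntl(fd, F_SETLKW, &fl) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}
```

Passing F_UNLCK as the type to either helper removes the lock on
that range.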
F_GETLK64 This cmd is identical to F_GETLK but uses a struct flock64
instead of a struct flock (see F_FREESP64 above).
F_SETLK64 This cmd is identical to F_SETLK but uses a struct flock64
instead of a struct flock.
F_SETLKW64
This cmd is identical to F_SETLKW but uses a struct flock64
instead of a struct flock.
F_RSETLK Used by the network lock daemon, lockd(3N), to communicate
with the NFS server kernel to handle locks on NFS files.
F_RSETLKW Used by the network lock daemon, lockd(3N), to communicate
with the NFS server kernel to handle locks on NFS files.
F_RGETLK Used by the network lock daemon, lockd(3N), to communicate
with the NFS server kernel to handle locks on NFS files.
F_CHKFL This flag is used internally by F_SETFL to check the legality
of file flag changes.
A read lock prevents any process from write locking the protected area.
More than one read lock may exist for a given segment of a file at a
given time. The file descriptor on which a read lock is being placed
must have been opened with read access.
A write lock prevents any process from read locking or write locking the
protected area. Only one write lock and no read locks may exist for a
given segment of a file at a given time. The file descriptor on which a
write lock is being placed must have been opened with write access.
The structure flock describes the type (l_type), starting offset
(l_whence), relative offset (l_start), size (l_len), process id (l_pid),
and system id (l_sysid) of the segment of the file to be affected. The
process id and system id fields are used only with the F_GETLK cmd to
return the values for a blocking lock. Locks may start and extend beyond
the current end of a file, but may not be negative relative to the
beginning of the file. A lock may be set to always extend to the end of
file by setting l_len to zero (0). If such a lock also has l_whence and
l_start set to zero (0), the whole file will be locked. Changing or
unlocking a segment from the middle of a larger locked segment leaves two
smaller segments for either end. Locking a segment that is already
locked by the calling process causes the old lock type to be removed and
the new lock type to take effect. All locks associated with a file for a
given process are removed when a file descriptor for that file is closed
by that process or the process holding that file descriptor terminates.
Locks are not inherited by a child process in a fork(2) system call.
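The whole-file lock and the splitting behavior described above can
be observed with a forked child, which shares the descriptor but
holds none of the parent's locks. This is a sketch under those
assumptions; set_range and blocked_for_child are hypothetical names.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: apply type (F_RDLCK, F_WRLCK or F_UNLCK)
 * to len bytes at start; len == 0 means "through end of file",
 * so start 0 with len 0 covers the whole file.  Returns 0 or -1. */
int set_range(int fd, short type, off_t start, off_t len)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}

/* Hypothetical helper: fork a child and have it test whether a
 * write lock on the range would be blocked by the parent.
 * Returns 1 if blocked, 0 if free, -1 on error. */
int blocked_for_child(int fd, off_t start, off_t len)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        struct flock fl;

        memset(&fl, 0, sizeof fl);
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        fl.l_start = start;
        fl.l_len = len;
        if (fcntl(fd, F_GETLK, &fl) == -1)
            _exit(2);                       /* probe failed */
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}
```

After set_range(fd, F_WRLCK, 0, 0) followed by
set_range(fd, F_UNLCK, 40, 20), the child would be expected to find
bytes 40 through 59 free while the ranges on either side stay locked.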
When file locking is used in conjunction with memory-mapped files over
NFS, the smallest locking granularity which will work properly with
multiple clients is the page size of the system. All clients must use
the same granularity.
When mandatory file and record locking is active on a file [see
chmod(2)], read(2), creat(2), open(2), and write(2) system calls issued
on the file will be affected by the record locks in effect.
The following commands are used for SMB opportunistic locks. An SMB
server application will register oplocks on files and grant them to SMB
clients. When external references are made to oplocked files, the SMB
server is notified to revoke the oplocks granted to clients before
operations from the external references are allowed to continue.
F_OPLKREG The oplock registration command identifies the file to oplock
and, via arg, the write side of the pipe (e.g., p[1] from the
pipe(int *p) call) to use as the signalling mechanism. The
same write side pipe can be used for any number of oplocked
files.
If any external references to the file already exist or the
caller already has an oplock on the file, the F_OPLKREG command
fails with EAGAIN. If successful, the value of OP_EXCLUSIVE is
returned.
F_OPLKSTAT
The oplock state change command is used to get state change
information on any recently externally referenced files
registered with the given write side pipe (e.g., p[1] from a
pipe(int *p) call). The returned oplock_stat_t structure
pointed at by arg contains the current state (os_state) and the
dev/ino information (os_dev/os_ino) to identify the file.
This is only done on the write side of a pipe for which
select() indicates there is a byte of data to read() on the
read side. A byte of data must then be read() from the read
side of the pipe for each successful F_OPLKSTAT run on the
write side for select() to again give proper notification.
External references that cause state change notification will
hang for a while until the SMB server acknowledges the
revocation (typically after revoking the oplock it granted to
the SMB client) or until the systunable oplock_timeout expires.
F_OPLKACK The oplock acknowledgement command is primarily used to respond
to oplock state changes due to external references on the given
file. The value given by arg can be OP_REVOKE to revoke the
oplock either voluntarily or as an acknowledgement of a state
change reported in an F_OPLKSTAT command, or it can be -1 to
request the current state of the given file.
If F_OPLKACK is not used to voluntarily revoke the oplock, the
oplock is automatically revoked on the SMB server's last
close() of the file.
If F_OPLKACK is not used to revoke the oplock in response to a
state change indicated in an F_OPLKSTAT command, the oplock is
automatically revoked when the oplock_timeout expires.
fcntl will fail if one or more of the following are true:
[EACCES] cmd is F_SETLK, the type of lock (l_type) is a read lock
(F_RDLCK) and the segment of a file to be locked is already
write locked by another process, or the type is a write
lock (F_WRLCK) and the segment of a file to be locked
is already read or write locked by another process.
[EBADF] fildes is not a valid open file descriptor.
[EBADF] cmd is F_SETLK or F_SETLKW, the type of lock (l_type) is a
read lock (F_RDLCK), and fildes is not a valid file
descriptor open for reading.
[EBADF] cmd is F_SETLK or F_SETLKW, the type of lock (l_type) is a
write lock (F_WRLCK), and fildes is not a valid file
descriptor open for writing.
[EBADF] cmd is F_FREESP and fildes is not a valid file descriptor
open for writing.
[EBADF] cmd is F_OPLKREG and the file is not a regular file or
the arg is not the write side of a pipe.
[EMFILE] cmd is F_DUPFD and {OPEN_MAX} file descriptors are
currently in use by this process, or no file descriptors
greater than or equal to arg are available.
[EINVAL] cmd is F_DUPFD and arg is either negative, or greater than
or equal to the maximum number of open file descriptors
allowed each user [see getdtablesize(2)].
[EINVAL] cmd is F_GETLK, F_SETLK, or F_SETLKW and arg or the data
it points to is not valid.
[EINVAL] cmd is F_SETFL, arg includes FDIRECT and is being
performed on other than an EFS, XFS or BDS file system
file.
[EINVAL] cmd is F_SETBIOSIZE and arg is invalid.
[EINVAL] cmd is F_OPLKREG and fildes is a file in a filesystem
other than XFS. Kernel level oplocks are only supported
for XFS.
[EINVAL] cmd is F_OPLKACK and the arg is not OP_REVOKE or -1.
[EAGAIN] cmd is F_FREESP, the file exists, mandatory file/record
locking is set, and there are outstanding record locks on
the file. This restriction is not currently enforced.
[EAGAIN] cmd is F_SETLK or F_SETLKW, mandatory file locking bit is
set for the file, and the file is currently being mapped
to virtual memory via mmap [see mmap(2)]. This
restriction is not currently enforced.
[EAGAIN] cmd is F_OPLKREG and there is more than one reference on
the file. Oplocks thus cannot be used to guarantee
exclusive access to the file.
[EAGAIN] cmd is F_OPLKSTAT and there are no state change messages
for the specified write side pipe.
[EPERM] cmd is F_OPLKREG or F_OPLKSTAT or F_OPLKACK and the user
is not superuser.
[ENOLCK] cmd is F_SETLK or F_SETLKW, the type of lock is a read or
write lock, and there are no more record locks available
(too many file segments locked) because the system maximum
{FLOCK_MAX} [see intro(2)], has been exceeded. This can
also occur if the object of the lock resides on a remote
system and the requisite locking daemons are not
configured in both the local and the remote systems. In
particular, if lockd(1M) is running but statd(1M) is not,
this error will be returned. An additional source for
this error is when statd(1M) is running but cannot be
contacted. This can occur when the address for the local
host cannot be determined. [See lockd(1M) and statd(1M).]
[EINTR] cmd is F_SETLKW and a signal interrupted the process while
it was waiting for the lock to be granted.
[EDEADLK] cmd is F_SETLKW, the lock is blocked by some lock from
another process, and putting the calling process to sleep,
waiting for that lock to become free, would cause a
deadlock.
[EDEADLK] cmd is F_FREESP, mandatory record locking is enabled,
O_NDELAY and O_NONBLOCK are both clear, and a deadlock
condition was detected.
[EFAULT] cmd is F_FREESP, and the value pointed to by the third
argument arg resulted in an address outside the process's
allocated address space.
[EFAULT] cmd is F_GETLK, F_SETLK or F_SETLKW, and arg points
outside the program address space.
[ESRCH] cmd is F_SETOWN and no process can be found corresponding
to that specified by arg.
[EIO] An I/O error occurred while reading from or writing to the
file system.
[EOVERFLOW] cmd is F_GETLK and the process ID of the process holding
the requested lock is too large to be stored in the l_pid
field.
[ETIMEDOUT] The object of the fcntl is located on a remote system
which is not available [see intro(2)].
SEE ALSO
lockd(1M), close(2), creat(2), dup(2), exec(2), fork(2),
getdtablesize(2), intro(2), open(2), pipe(2), fcntl(5).
DIAGNOSTICS
Upon successful completion, the value returned depends on cmd as follows:
F_DUPFD A new file descriptor.
F_GETFD Value of flag (only the low-order bit is defined).
F_SETFD Value other than -1.
F_GETFL Value of file flags.
F_SETFL Value other than -1.
F_FREESP Value of 0.
F_ALLOCSP Value of 0.
F_FREESP64
Value of 0.
F_ALLOCSP64
Value of 0.
F_DIOINFO Value of 0.
F_GETOWN pid of socket owner.
F_SETOWN Value other than -1.
F_FSGETXATTR
Value of 0.
F_FSSETXATTR
Value of 0.
F_GETBMAP Value of 0.
F_RESVSP Value of 0.
F_RESVSP64
Value of 0.
F_UNRESVSP
Value of 0.
F_UNRESVSP64
Value of 0.
F_GETLK Value other than -1.
F_SETLK Value other than -1.
F_SETLKW Value other than -1.
F_GETLK64 Value other than -1.
F_SETLK64 Value other than -1.
F_SETLKW64
Value other than -1.
Otherwise, a value of -1 is returned and errno is set to indicate the
error.
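The failure convention can be checked as in this minimal sketch,
which issues a harmless F_GETFD and inspects errno. The helper name
fd_is_open is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical helper: report whether fd is an open descriptor.
 * Returns 1 if open, 0 if the call failed with EBADF, and -1
 * for any other error. */
int fd_is_open(int fd)
{
    if (fcntl(fd, F_GETFD) != -1)
        return 1;
    return errno == EBADF ? 0 : -1;
}
```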