FILSYS(5) — System Interface Manual — File Formats
MINFREE gives the minimum acceptable percentage of file system blocks which may be free. If the freelist drops below this level only the superuser may continue to allocate blocks. This may be set to 0 if no reserve of free blocks is deemed necessary; however, severe performance degradations will be observed if the file system is run at greater than 90% full, thus the default value of fs_minfree is 10%.
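The reserve check can be sketched with the same arithmetic as the freespace() macro in the fs.h listing. This is a minimal illustration, not the kernel's code: the struct space_counts type and both function names are hypothetical, and the data area size is assumed to be counted in fragments.

```c
#include <assert.h>

/* Hypothetical subset of the summary counts kept in the super block. */
struct space_counts {
    long dsize;     /* data area size, assumed to be in fragments */
    long nbfree;    /* free whole blocks */
    long nffree;    /* free loose fragments */
    long frag;      /* fragments per block: 1, 2, 4 or 8 */
};

/* Fragments still available once the reserve is set aside. */
static long free_frags(const struct space_counts *c, long percent_reserved)
{
    assert(percent_reserved >= 0 && percent_reserved <= 100);
    return c->nbfree * c->frag + c->nffree
        - c->dsize * percent_reserved / 100;
}

/* Nonzero when the reserve has been reached (assumed test: result <= 0). */
static int reserve_reached(const struct space_counts *c, long percent_reserved)
{
    return free_frags(c, percent_reserved) <= 0;
}
```

With 100000 fragments of data area, 2000 free blocks of 4 fragments each and 500 loose fragments, 8500 fragments are free in total, but a 10% reserve leaves a deficit of 1500, so only the superuser could continue to allocate.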
Empirically the best trade-off between block fragmentation and overall disk utilization at a loading of 90% comes with a fragmentation of 4, thus the default fragment size is a fourth of the block size.
Under current technology, most 300MB disks have 32 sectors and 16 tracks, thus these are the defaults used for fs_nsect and fs_ntrak respectively.
ROTDELAY gives the minimum number of milliseconds to initiate another disk transfer on the same cylinder. It is used in determining the rotationally optimal layout for disk blocks within a file; the default of fs_rotdelay is 2ms.
Each file system has a number of inodes statically allocated. We allocate one inode slot per NBPI bytes, expecting this to be far more than we will ever need.
NAME
filsys, flblk, ino − format of file system volume
SYNOPSIS
#include <sys/types.h>
#include <sys/fblk.h>
#include <sys/filsys.h>
#include <sys/inode.h>
DESCRIPTION
Every file system storage volume (e.g. RF disk, RK disk, RP disk, DECtape reel) has a common format for certain vital information. Every such volume is divided into a certain number of 1024-byte blocks. Block 0 is unused and is available to contain a bootstrap program, pack label, or other information.
Block 1 is the super block. The layout of the super block as defined by the include file <sys/filsys.h> is:
/*      fs.h    4.8     83/04/08        */
/*
 * Each disk drive contains some number of file systems.
 * A file system consists of a number of cylinder groups.
 * Each cylinder group has inodes and data.
 *
 * A file system is described by its super-block, which in turn
 * describes the cylinder groups.  The super-block is critical
 * data and is replicated in each cylinder group to protect against
 * catastrophic loss.  This is done at mkfs time and the critical
 * super-block data does not change, so the copies need not be
 * referenced further unless disaster strikes.
 *
 * For file system fs, the offsets of the various blocks of interest
 * are given in the super block as:
 *      [fs->fs_sblkno]         Super-block
 *      [fs->fs_cblkno]         Cylinder group block
 *      [fs->fs_iblkno]         Inode blocks
 *      [fs->fs_dblkno]         Data blocks
 * The beginning of cylinder group cg in fs, is given by
 * the "cgbase(fs, cg)" macro.
 *
 * The first boot and super blocks are given in absolute disk addresses.
 */
#define BBSIZE          8192
#define SBSIZE          8192
#define BBLOCK          ((daddr_t)(0))
#define SBLOCK          ((daddr_t)(BBLOCK + BBSIZE / DEV_BSIZE))
/*
 * Addresses stored in inodes are capable of addressing fragments
 * of 'blocks'.  File system blocks of at most size MAXBSIZE can
 * be optionally broken into 2, 4, or 8 pieces, each of which is
 * addressable; these pieces may be DEV_BSIZE, or some multiple of
 * a DEV_BSIZE unit.
 *
 * Large files consist of exclusively large data blocks.  To avoid
 * undue wasted disk space, the last data block of a small file may be
 * allocated as only as many fragments of a large block as are
 * necessary.  The file system format retains only a single pointer
 * to such a fragment, which is a piece of a single large block that
 * has been divided.  The size of such a fragment is determinable from
 * information in the inode, using the "blksize(fs, ip, lbn)" macro.
 *
 * The file system records space availability at the fragment level;
 * to determine block availability, aligned fragments are examined.
 *
 * The root inode is the root of the file system.
 * Inode 0 can't be used for normal purposes and
 * historically bad blocks were linked to inode 1,
 * thus the root inode is 2.  (inode 1 is no longer used for
 * this purpose, however numerous dump tapes make this
 * assumption, so we are stuck with it)
 * The lost+found directory is given the next available
 * inode when it is created by "mkfs".
 */
#define ROOTINO         ((ino_t)2)      /* i number of all roots */
#define LOSTFOUNDINO    (ROOTINO + 1)
/*
 * Cylinder group related limits.
 *
 * For each cylinder we keep track of the availability of blocks at different
 * rotational positions, so that we can lay out the data to be picked
 * up with minimum rotational latency.  NRPOS is the number of rotational
 * positions which we distinguish.  With NRPOS 8 the resolution of our
 * summary information is 2ms for a typical 3600 rpm drive.
 */
#define NRPOS           8       /* number distinct rotational positions */
/*
 * MAXIPG bounds the number of inodes per cylinder group, and
 * is needed only to keep the structure simpler by having
 * only a single variable size element (the free bit map).
 *
 * N.B.: MAXIPG must be a multiple of INOPB(fs).
 */
#define MAXIPG          2048    /* max number inodes/cyl group */
/*
 * MINBSIZE is the smallest allowable block size.
 * In order to ensure that it is possible to create files of size
 * 2^32 with only two levels of indirection, MINBSIZE is set to 4096.
 * MINBSIZE must be big enough to hold a cylinder group block,
 * thus changes to (struct cg) must keep its size within MINBSIZE.
 * MAXCPG is limited only to dimension an array in (struct cg);
 * it can be made larger as long as that structure's size remains
 * within the bounds dictated by MINBSIZE.
 * Note that super blocks are always of size MAXBSIZE,
 * and that MAXBSIZE must be >= MINBSIZE.
 */
#define MINBSIZE        4096
#define MAXCPG          32      /* maximum fs_cpg */
/*
 * The path name on which the file system is mounted is maintained
 * in fs_fsmnt.  MAXMNTLEN defines the amount of space allocated in
 * the super block for this name.
 * The limit on the amount of summary information per file system
 * is defined by MAXCSBUFS.  It is currently parameterized for a
 * maximum of two million cylinders.
 */
#define MAXMNTLEN       512
#define MAXCSBUFS       32
/*
 * Per cylinder group information; summarized in blocks allocated
 * from first cylinder group data blocks.  These blocks have to be
 * read in from fs_csaddr (size fs_cssize) in addition to the
 * super block.
 *
 * N.B. sizeof(struct csum) must be a power of two in order for
 * the "fs_cs" macro to work (see below).
 */
struct csum {
        long    cs_ndir;        /* number of directories */
        long    cs_nbfree;      /* number of free blocks */
        long    cs_nifree;      /* number of free inodes */
        long    cs_nffree;      /* number of free frags */
};
/*
 * Super block for a file system.
 */
#define FS_MAGIC        0x011954
struct  fs
{
        struct  fs *fs_link;            /* linked list of file systems */
        struct  fs *fs_rlink;           /* used for incore super blocks */
        daddr_t fs_sblkno;              /* addr of super-block in filesys */
        daddr_t fs_cblkno;              /* offset of cyl-block in filesys */
        daddr_t fs_iblkno;              /* offset of inode-blocks in filesys */
        daddr_t fs_dblkno;              /* offset of first data after cg */
        long    fs_cgoffset;            /* cylinder group offset in cylinder */
        long    fs_cgmask;              /* used to calc mod fs_ntrak */
        time_t  fs_time;                /* last time written */
        long    fs_size;                /* number of blocks in fs */
        long    fs_dsize;               /* number of data blocks in fs */
        long    fs_ncg;                 /* number of cylinder groups */
        long    fs_bsize;               /* size of basic blocks in fs */
        long    fs_fsize;               /* size of frag blocks in fs */
        long    fs_frag;                /* number of frags in a block in fs */
/* these are configuration parameters */
        long    fs_minfree;             /* minimum percentage of free blocks */
        long    fs_rotdelay;            /* num of ms for optimal next block */
        long    fs_rps;                 /* disk revolutions per second */
/* these fields can be computed from the others */
        long    fs_bmask;               /* "blkoff" calc of blk offsets */
        long    fs_fmask;               /* "fragoff" calc of frag offsets */
        long    fs_bshift;              /* "lblkno" calc of logical blkno */
        long    fs_fshift;              /* "numfrags" calc number of frags */
/* these are configuration parameters */
        long    fs_maxcontig;           /* max number of contiguous blks */
        long    fs_maxbpg;              /* max number of blks per cyl group */
/* these fields can be computed from the others */
        long    fs_fragshift;           /* block to frag shift */
        long    fs_fsbtodb;             /* fsbtodb and dbtofsb shift constant */
        long    fs_sbsize;              /* actual size of super block */
        long    fs_csmask;              /* csum block offset */
        long    fs_csshift;             /* csum block number */
        long    fs_nindir;              /* value of NINDIR */
        long    fs_inopb;               /* value of INOPB */
        long    fs_nspf;                /* value of NSPF */
        long    fs_sparecon[6];         /* reserved for future constants */
/* sizes determined by number of cylinder groups and their sizes */
        daddr_t fs_csaddr;              /* blk addr of cyl grp summary area */
        long    fs_cssize;              /* size of cyl grp summary area */
        long    fs_cgsize;              /* cylinder group size */
/* these fields should be derived from the hardware */
        long    fs_ntrak;               /* tracks per cylinder */
        long    fs_nsect;               /* sectors per track */
        long    fs_spc;                 /* sectors per cylinder */
/* this comes from the disk driver partitioning */
        long    fs_ncyl;                /* cylinders in file system */
/* these fields can be computed from the others */
        long    fs_cpg;                 /* cylinders per group */
        long    fs_ipg;                 /* inodes per group */
        long    fs_fpg;                 /* blocks per group * fs_frag */
/* this data must be re-computed after crashes */
        struct  csum fs_cstotal;        /* cylinder summary information */
/* these fields are cleared at mount time */
        char    fs_fmod;                /* super block modified flag */
        char    fs_clean;               /* file system is clean flag */
        char    fs_ronly;               /* mounted read-only flag */
        char    fs_flags;               /* currently unused flag */
        char    fs_fsmnt[MAXMNTLEN];    /* name mounted on */
/* these fields retain the current block allocation info */
        long    fs_cgrotor;             /* last cg searched */
        struct  csum *fs_csp[MAXCSBUFS];/* list of fs_cs info buffers */
        long    fs_cpc;                 /* cyl per cycle in postbl */
        short   fs_postbl[MAXCPG][NRPOS];/* head of blocks for each rotation */
        long    fs_magic;               /* magic number */
        u_char  fs_rotbl[1];            /* list of blocks for each rotation */
/* actually longer */
};
/*
 * Convert cylinder group to base address of its global summary info.
 *
 * N.B. This macro assumes that sizeof(struct csum) is a power of two.
 */
#define fs_cs(fs, indx) \
        fs_csp[(indx) >> (fs)->fs_csshift][(indx) & ~(fs)->fs_csmask]
/*
 * MAXBPC bounds the size of the rotational layout tables and
 * is limited by the fact that the super block is of size SBSIZE.
 * The size of these tables is INVERSELY proportional to the block
 * size of the file system.  It is aggravated by sector sizes that
 * are not powers of two, as this increases the number of cylinders
 * included before the rotational pattern repeats (fs_cpc).
 * Its size is derived from the number of bytes remaining in (struct fs)
 */
#define MAXBPC          (SBSIZE - sizeof (struct fs))
/*
 * Cylinder group block for a file system.
 */
#define CG_MAGIC        0x090255
struct  cg {
        struct  cg *cg_link;            /* linked list of cyl groups */
        struct  cg *cg_rlink;           /* used for incore cyl groups */
        time_t  cg_time;                /* time last written */
        long    cg_cgx;                 /* we are the cgx'th cylinder group */
        short   cg_ncyl;                /* number of cyl's this cg */
        short   cg_niblk;               /* number of inode blocks this cg */
        long    cg_ndblk;               /* number of data blocks this cg */
        struct  csum cg_cs;             /* cylinder summary information */
        long    cg_rotor;               /* position of last used block */
        long    cg_frotor;              /* position of last used frag */
        long    cg_irotor;              /* position of last used inode */
        long    cg_frsum[MAXFRAG];      /* counts of available frags */
        long    cg_btot[MAXCPG];        /* block totals per cylinder */
        short   cg_b[MAXCPG][NRPOS];    /* positions of free blocks */
        char    cg_iused[MAXIPG/NBBY];  /* used inode map */
        long    cg_magic;               /* magic number */
        u_char  cg_free[1];             /* free block map */
/* actually longer */
};
/*
 * MAXBPG bounds the number of blocks of data per cylinder group,
 * and is limited by the fact that cylinder groups are at most one block.
 * Its size is derived from the size of blocks and the (struct cg) size,
 * by the number of remaining bits.
 */
#define MAXBPG(fs) \
        (fragstoblks((fs), (NBBY * ((fs)->fs_bsize - (sizeof (struct cg))))))
/*
 * Turn file system block numbers into disk block addresses.
 * This maps file system blocks to device size blocks.
 */
#define fsbtodb(fs, b)  ((b) << (fs)->fs_fsbtodb)
#define dbtofsb(fs, b)  ((b) >> (fs)->fs_fsbtodb)
/*
 * Cylinder group macros to locate things in cylinder groups.
 * They calc file system addresses of cylinder group data structures.
 */
#define cgbase(fs, c)   ((daddr_t)((fs)->fs_fpg * (c)))
#define cgstart(fs, c) \
        (cgbase(fs, c) + (fs)->fs_cgoffset * ((c) & ~((fs)->fs_cgmask)))
#define cgsblock(fs, c) (cgstart(fs, c) + (fs)->fs_sblkno)      /* super blk */
#define cgtod(fs, c)    (cgstart(fs, c) + (fs)->fs_cblkno)      /* cg block */
#define cgimin(fs, c)   (cgstart(fs, c) + (fs)->fs_iblkno)      /* inode blk */
#define cgdmin(fs, c)   (cgstart(fs, c) + (fs)->fs_dblkno)      /* 1st data */
/*
 * Macros for handling inode numbers:
 *     inode number to file system block offset.
 *     inode number to cylinder group number.
 *     inode number to file system block address.
 */
#define itoo(fs, x)     ((x) % INOPB(fs))
#define itog(fs, x)     ((x) / (fs)->fs_ipg)
#define itod(fs, x) \
        ((daddr_t)(cgimin(fs, itog(fs, x)) + \
        (blkstofrags((fs), (((x) % (fs)->fs_ipg) / INOPB(fs))))))
/*
 * Give cylinder group number for a file system block.
 * Give cylinder group block number for a file system block.
 */
#define dtog(fs, d)     ((d) / (fs)->fs_fpg)
#define dtogd(fs, d)    ((d) % (fs)->fs_fpg)
/*
 * Extract the bits for a block from a map.
 * Compute the cylinder and rotational position of a cyl block addr.
 */
#define blkmap(fs, map, loc) \
        (((map)[loc / NBBY] >> (loc % NBBY)) & (0xff >> (NBBY - (fs)->fs_frag)))
#define cbtocylno(fs, bno) \
        ((bno) * NSPF(fs) / (fs)->fs_spc)
#define cbtorpos(fs, bno) \
        ((bno) * NSPF(fs) % (fs)->fs_nsect * NRPOS / (fs)->fs_nsect)
/*
 * The following macros optimize certain frequently calculated
 * quantities by using shifts and masks in place of divisions
 * modulos and multiplications.
 */
#define blkoff(fs, loc)         /* calculates (loc % fs->fs_bsize) */ \
        ((loc) & ~(fs)->fs_bmask)
#define fragoff(fs, loc)        /* calculates (loc % fs->fs_fsize) */ \
        ((loc) & ~(fs)->fs_fmask)
#define lblkno(fs, loc)         /* calculates (loc / fs->fs_bsize) */ \
        ((loc) >> (fs)->fs_bshift)
#define numfrags(fs, loc)       /* calculates (loc / fs->fs_fsize) */ \
        ((loc) >> (fs)->fs_fshift)
#define blkroundup(fs, size)    /* calculates roundup(size, fs->fs_bsize) */ \
        (((size) + (fs)->fs_bsize - 1) & (fs)->fs_bmask)
#define fragroundup(fs, size)   /* calculates roundup(size, fs->fs_fsize) */ \
        (((size) + (fs)->fs_fsize - 1) & (fs)->fs_fmask)
#define fragstoblks(fs, frags)  /* calculates (frags / fs->fs_frag) */ \
        ((frags) >> (fs)->fs_fragshift)
#define blkstofrags(fs, blks)   /* calculates (blks * fs->fs_frag) */ \
        ((blks) << (fs)->fs_fragshift)
/*
 * Determine the number of available frags given a
 * percentage to hold in reserve
 */
#define freespace(fs, percentreserved) \
        (blkstofrags((fs), (fs)->fs_cstotal.cs_nbfree) + \
        (fs)->fs_cstotal.cs_nffree - ((fs)->fs_dsize * (percentreserved) / 100))
/*
 * Determining the size of a file block in the file system.
 */
#define blksize(fs, ip, lbn) \
        (((lbn) >= NDADDR || (ip)->i_size >= ((lbn) + 1) << (fs)->fs_bshift) \
            ? (fs)->fs_bsize \
            : (fragroundup(fs, blkoff(fs, (ip)->i_size))))
#define dblksize(fs, dip, lbn) \
        (((lbn) >= NDADDR || (dip)->di_size >= ((lbn) + 1) << (fs)->fs_bshift) \
            ? (fs)->fs_bsize \
            : (fragroundup(fs, blkoff(fs, (dip)->di_size))))
/*
 * Number of disk sectors per block; assumes DEV_BSIZE byte sector size.
 */
#define NSPB(fs)        ((fs)->fs_nspf << (fs)->fs_fragshift)
#define NSPF(fs)        ((fs)->fs_nspf)
/*
 * INOPB is the number of inodes in a secondary storage block.
 */
#define INOPB(fs)       ((fs)->fs_inopb)
#define INOPF(fs)       ((fs)->fs_inopb >> (fs)->fs_fragshift)
/*
 * NINDIR is the number of indirects in a file system block.
 */
#define NINDIR(fs)      ((fs)->fs_nindir)
#ifdef KERNEL
struct  fs *getfs();
struct  fs *mountfs();
#endif
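How the blksize() macro sizes the final, possibly fragmented, block of a small file can be tried out in user space. The sketch below copies blkoff(), fragroundup() and blksize() from the listing; the reduced mini_fs and mini_inode structures and the two helper functions are hypothetical stand-ins that carry only the fields those macros touch, and the masks are assumed to be the complement of size minus one, as the macro bodies imply.

```c
#include <assert.h>

#define NDADDR 12               /* direct addresses in inode, as in inode.h */

/* Hypothetical reduced super block: only the fields the macros use. */
struct mini_fs {
    long fs_bsize;              /* block size in bytes */
    long fs_fsize;              /* fragment size in bytes */
    long fs_bmask;              /* assumed to be ~(fs_bsize - 1) */
    long fs_fmask;              /* assumed to be ~(fs_fsize - 1) */
    long fs_bshift;             /* log2(fs_bsize) */
};

/* Hypothetical reduced inode: only the file size. */
struct mini_inode {
    long i_size;
};

#define blkoff(fs, loc)       ((loc) & ~(fs)->fs_bmask)
#define fragroundup(fs, size) (((size) + (fs)->fs_fsize - 1) & (fs)->fs_fmask)
#define blksize(fs, ip, lbn) \
    (((lbn) >= NDADDR || (ip)->i_size >= ((lbn) + 1) << (fs)->fs_bshift) \
        ? (fs)->fs_bsize \
        : (fragroundup(fs, blkoff(fs, (ip)->i_size))))

/* Fill in the derived fields for power-of-two block and fragment sizes. */
static void mini_fs_init(struct mini_fs *fs, long bsize, long fsize)
{
    assert(bsize > 0 && (bsize & (bsize - 1)) == 0);
    assert(fsize > 0 && (fsize & (fsize - 1)) == 0 && fsize <= bsize);
    fs->fs_bsize = bsize;
    fs->fs_fsize = fsize;
    fs->fs_bmask = ~(bsize - 1);
    fs->fs_fmask = ~(fsize - 1);
    fs->fs_bshift = 0;
    while ((1L << fs->fs_bshift) < bsize)
        fs->fs_bshift++;
}

/* Bytes occupied by logical block lbn of a file that is file_size long. */
static long file_block_bytes(long bsize, long fsize, long file_size, long lbn)
{
    struct mini_fs fs;
    struct mini_inode ip;

    mini_fs_init(&fs, bsize, fsize);
    ip.i_size = file_size;
    return blksize(&fs, &ip, lbn);
}
```

For a 5000-byte file on a 4096/1024 file system, block 0 is a full 4096 bytes, while block 1 holds the remaining 904 bytes and is therefore rounded up to a single 1024-byte fragment.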
S_isize is the address of the first block after the i-list, which starts just after the super-block, in block 2. Thus the i-list is s_isize−2 blocks long. S_fsize is the address of the first block not potentially available for allocation to a file. These numbers are used by the system to check for bad block addresses; if an ‘impossible’ block address is allocated from the free list or is freed, a diagnostic is written on the on-line console. Moreover, the free array is cleared, so as to prevent further allocation from a presumably corrupted free list.
The free list for each volume is maintained as follows. The s_free array contains, in s_free[1], ... , s_free[s_nfree−1], up to NICFREE free block numbers. NICFREE is a configuration constant. S_free[0] is the block address of the head of a chain of blocks constituting the free list. The layout of each block of the free chain as defined in the include file <sys/fblk.h> is:
The fields df_nfree and df_free in a free block are used exactly like s_nfree and s_free in the super block. To allocate a block: decrement s_nfree, and the new block number is s_free[s_nfree]. If the new block address is 0, there are no blocks left, so give an error. If s_nfree became 0, read the new block into s_nfree and s_free. To free a block, check if s_nfree is NICFREE; if so, copy s_nfree and the s_free array into the block being freed, write it out, and set s_nfree to 0. In any event set s_free[s_nfree] to the freed block’s address and increment s_nfree.
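The allocate and free steps just described can be sketched against an in-memory stand-in for the volume. This is an illustration under stated assumptions, not the system's code: the value of NICFREE, the disk array, the reduced filsys structure and the function names balloc, bfree and binit are all chosen here for the example, and block addresses are plain longs rather than daddr_t.

```c
#include <assert.h>
#include <string.h>

#define NICFREE 50          /* assumed value; the text only calls it configurable */
#define NBLOCKS 512         /* size of the simulated volume, in blocks */

/* A block of the free chain: a count and an array of block addresses. */
struct fblk {
    int  df_nfree;
    long df_free[NICFREE];
};

/* Hypothetical reduced super block: just the free-list cache. */
struct filsys {
    int  s_nfree;
    long s_free[NICFREE];
};

static struct fblk disk[NBLOCKS];   /* stand-in for the blocks on the volume */

/* Start with an empty list: a zero in s_free[0] ends the chain. */
static void binit(struct filsys *fp)
{
    fp->s_nfree = 1;
    fp->s_free[0] = 0;
}

/* Return a free block number, or 0 when no blocks are left. */
static long balloc(struct filsys *fp)
{
    long bno;

    if (fp->s_nfree <= 0)
        return 0;
    bno = fp->s_free[--fp->s_nfree];
    if (bno == 0)
        return 0;                    /* end of chain: nothing left */
    assert(bno > 0 && bno < NBLOCKS);
    if (fp->s_nfree == 0) {          /* cache drained: reload from the new block */
        fp->s_nfree = disk[bno].df_nfree;
        memcpy(fp->s_free, disk[bno].df_free, sizeof fp->s_free);
    }
    return bno;
}

/* Put block bno back on the free list. */
static void bfree(struct filsys *fp, long bno)
{
    assert(bno > 0 && bno < NBLOCKS);
    if (fp->s_nfree == NICFREE) {    /* cache full: spill it into the freed block */
        disk[bno].df_nfree = fp->s_nfree;
        memcpy(disk[bno].df_free, fp->s_free, sizeof fp->s_free);
        fp->s_nfree = 0;
    }
    fp->s_free[fp->s_nfree++] = bno;
}
```

Freeing blocks 1 through 100 in order and then allocating hands them back in reverse, with each chain block (100, then 50) returned just before the cache is reloaded from it.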
S_ninode is the number of free i-numbers in the s_inode array. To allocate an i-node: if s_ninode is greater than 0, decrement it and return s_inode[s_ninode]. If it was 0, read the i-list and place the numbers of all free inodes (up to NICINOD) into the s_inode array, then try again. To free an i-node, provided s_ninode is less than NICINOD, place its number into s_inode[s_ninode] and increment s_ninode. If s_ninode is already NICINOD, don’t bother to enter the freed i-node into any table. This list of i-nodes is only to speed up the allocation process; the information as to whether the inode is really free or not is maintained in the inode itself.
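A small model of this cache makes the division of labour visible: the s_inode array is only a hint, and the i-list remains the authority. In the sketch below the i-list is reduced to a hypothetical array of free flags, NICINOD is given an arbitrarily small value, and the names icache, irefill, ialloc and ifree are chosen for the example.

```c
#include <assert.h>

#define NICINOD 8           /* assumed small value, for illustration only */
#define NINODES 64          /* i-nodes on the simulated volume */

/* Hypothetical reduced super block: just the free i-number cache. */
struct icache {
    int      s_ninode;
    unsigned s_inode[NICINOD];
};

/* Stand-in for the i-list: nonzero means free. Index 0 is unused. */
static char inode_free[NINODES + 1];

/* Scan the i-list and load up to NICINOD free i-numbers into the cache. */
static void irefill(struct icache *sp)
{
    unsigned ino;

    for (ino = 1; ino <= NINODES && sp->s_ninode < NICINOD; ino++)
        if (inode_free[ino])
            sp->s_inode[sp->s_ninode++] = ino;
}

/* Allocate an i-number, or return 0 if none is free. */
static unsigned ialloc(struct icache *sp)
{
    unsigned ino;

    for (;;) {
        if (sp->s_ninode == 0) {
            irefill(sp);
            if (sp->s_ninode == 0)
                return 0;            /* the i-list holds no free i-node */
        }
        ino = sp->s_inode[--sp->s_ninode];
        if (inode_free[ino]) {       /* the i-list, not the cache, decides */
            inode_free[ino] = 0;
            return ino;
        }
    }
}

/* Free an i-node; remember it in the cache only if there is room. */
static void ifree(struct icache *sp, unsigned ino)
{
    assert(ino >= 1 && ino <= NINODES);
    inode_free[ino] = 1;
    if (sp->s_ninode < NICINOD)
        sp->s_inode[sp->s_ninode++] = ino;
}
```

Because a freed i-number is dropped when the cache is full, the cache can run dry while free i-nodes still exist, which is why ialloc falls back to rescanning the i-list.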
The fields s_lasti and s_nbehind are used to avoid searching the inode list from the beginning each time the system runs out of inodes. S_lasti gives the base of the block of inodes last searched on the filesystem when inodes ran out, and s_nbehind gives the number of inodes whose numbers were less than s_lasti when they were freed with s_ninode already NICINOD. Thus s_nbehind is the number of free inodes before s_lasti. The system will search forward for free inodes from s_lasti for more inodes unless s_nbehind is sufficiently large, in which case it will search the file system inode list from the beginning. This mechanism serves to avoid n**2 behavior in allocating inodes.
S_flock and s_ilock are flags maintained in the core copy of the file system while it is mounted and their values on disk are immaterial. The value of s_fmod on disk is likewise immaterial; it is used as a flag to indicate that the super-block has changed and should be copied to the disk during the next periodic update of file system information. S_ronly is a write-protection indicator; its disk value is also immaterial.
S_time is the last time the super-block of the file system was changed. During a reboot, s_time of the super-block for the root file system is used to set the system’s idea of the time.
The fields s_tfree, s_tinode, s_fname and s_fpack are not currently maintained.
I-numbers begin at 1, and the storage for i-nodes begins in block 2. I-nodes are 64 bytes long, so 16 of them fit into a block. I-node 2 is reserved for the root directory of the file system, but no other i-number has a built-in meaning. Each i-node represents one file. The format of an i-node as given in the include file <sys/inode.h> is:
/*      inode.h 4.22    83/02/10        */
/*
 * The I node is the focus of all file activity in UNIX.
 * There is a unique inode allocated for each active file,
 * each current directory, each mounted-on file, text file, and the root.
 * An inode is 'named' by its dev/inumber pair. (iget/iget.c)
 * Data in icommon is read in from permanent inode on volume.
 */
#define NDADDR  12              /* direct addresses in inode */
#define NIADDR  3               /* indirect addresses in inode */
struct inode {
        struct  inode *i_chain[2];      /* must be first */
        u_short i_flag;
        u_short i_count;        /* reference count */
        dev_t   i_dev;          /* device where inode resides */
        u_short i_shlockc;      /* count of shared locks on inode */
        u_short i_exlockc;      /* count of exclusive locks on inode */
        ino_t   i_number;       /* i number, 1-to-1 with device address */
        struct  fs *i_fs;       /* file sys associated with this inode */
        struct  dquot *i_dquot; /* quota structure controlling this file */
        union {
                daddr_t if_lastr;       /* last read (read-ahead) */
                struct  socket *is_socket;
                struct  {
                        struct inode  *if_freef;        /* free list forward */
                        struct inode **if_freeb;        /* free list back */
                } i_fr;
        } i_un;
        struct  icommon
        {
                u_short ic_mode;        /*  0: mode and type of file */
                short   ic_nlink;       /*  2: number of links to file */
                short   ic_uid;         /*  4: owner's user id */
                short   ic_gid;         /*  6: owner's group id */
                quad    ic_size;        /*  8: number of bytes in file */
                time_t  ic_atime;       /* 16: time last accessed */
                long    ic_atspare;
                time_t  ic_mtime;       /* 24: time last modified */
                long    ic_mtspare;
                time_t  ic_ctime;       /* 32: last time inode changed */
                long    ic_ctspare;
                daddr_t ic_db[NDADDR];  /* 40: disk block addresses */
                daddr_t ic_ib[NIADDR];  /* 88: indirect blocks */
                long    ic_flags;       /* 100: status, currently unused */
                long    ic_spare[6];    /* 104: reserved, currently unused */
        } i_ic;
};
struct dinode {
        union {
                struct  icommon di_icom;
                char    di_size[128];
        } di_un;
};
#define i_mode          i_ic.ic_mode
#define i_nlink         i_ic.ic_nlink
#define i_uid           i_ic.ic_uid
#define i_gid           i_ic.ic_gid
/* ugh! -- must be fixed */
#ifdef vax
#define i_size          i_ic.ic_size.val[0]
#endif
#ifdef sun
#define i_size          i_ic.ic_size.val[1]
#endif
#define i_db            i_ic.ic_db
#define i_ib            i_ic.ic_ib
#define i_atime         i_ic.ic_atime
#define i_mtime         i_ic.ic_mtime
#define i_ctime         i_ic.ic_ctime
#define i_rdev          i_ic.ic_db[0]
#define i_lastr         i_un.if_lastr
#define i_socket        i_un.is_socket
#define i_forw          i_chain[0]
#define i_back          i_chain[1]
#define i_freef         i_un.i_fr.if_freef
#define i_freeb         i_un.i_fr.if_freeb
#define di_ic           di_un.di_icom
#define di_mode         di_ic.ic_mode
#define di_nlink        di_ic.ic_nlink
#define di_uid          di_ic.ic_uid
#define di_gid          di_ic.ic_gid
#ifdef vax
#define di_size         di_ic.ic_size.val[0]
#endif
#ifdef sun
#define di_size         di_ic.ic_size.val[1]
#endif
#define di_db           di_ic.ic_db
#define di_ib           di_ic.ic_ib
#define di_atime        di_ic.ic_atime
#define di_mtime        di_ic.ic_mtime
#define di_ctime        di_ic.ic_ctime
#define di_rdev         di_ic.ic_db[0]
#ifdef KERNEL
struct  inode *inode;           /* the inode table itself */
struct  inode *inodeNINODE;     /* the end of the inode table */
int     ninode;                 /* number of slots in the table */
struct  inode *rootdir;         /* pointer to inode of root directory */
struct  inode *ialloc();
struct  inode *iget();
#ifdef notdef
struct  inode *ifind();
#endif
struct  inode *owner();
struct  inode *maknode();
struct  inode *namei();
ino_t   dirpref();
#endif
/* flags */
#define ILOCKED         0x1             /* inode is locked */
#define IUPD            0x2             /* file has been modified */
#define IACC            0x4             /* inode access time to be updated */
#define IMOUNT          0x8             /* inode is mounted on */
#define IWANT           0x10            /* some process waiting on lock */
#define ITEXT           0x20            /* inode is pure text prototype */
#define ICHG            0x40            /* inode has been changed */
#define ISHLOCK         0x80            /* file has shared lock */
#define IEXLOCK         0x100           /* file has exclusive lock */
#define ILWAIT          0x200           /* someone waiting on file lock */
/* modes */
#define IFMT            0170000         /* type of file */
#define IFCHR           0020000         /* character special */
#define IFDIR           0040000         /* directory */
#define IFBLK           0060000         /* block special */
#define IFREG           0100000         /* regular */
#define IFLNK           0120000         /* symbolic link */
#define IFSOCK          0140000         /* socket */
#define ISUID           04000           /* set user id on execution */
#define ISGID           02000           /* set group id on execution */
#define ISVTX           01000           /* save swapped text even after use */
#define IREAD           0400            /* read, write, execute permissions */
#define IWRITE          0200
#define IEXEC           0100
#define ILOCK(ip) { \
        while ((ip)->i_flag & ILOCKED) { \
                (ip)->i_flag |= IWANT; \
                sleep((caddr_t)(ip), PINOD); \
        } \
        (ip)->i_flag |= ILOCKED; \
}
#define IUNLOCK(ip) { \
        (ip)->i_flag &= ~ILOCKED; \
        if ((ip)->i_flag&IWANT) { \
                (ip)->i_flag &= ~IWANT; \
                wakeup((caddr_t)(ip)); \
        } \
}
#define IUPDAT(ip, t1, t2, waitfor) { \
        if (ip->i_flag&(IUPD|IACC|ICHG)) \
                iupdat(ip, t1, t2, waitfor); \
}
Di_mode tells the kind of file; it is encoded identically to the st_mode field of stat(2). Di_nlink is the number of directory entries (links) that refer to this i-node. Di_uid and di_gid are the owner’s user and group IDs. Di_size is the number of bytes in the file. Di_atime and di_mtime are the times of last access and modification of the file contents (read, write or create) (see times(2)); di_ctime records the time of last modification to the inode or to the file, and is used to determine whether it should be dumped.
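Decoding the file type from di_mode amounts to masking with IFMT and comparing against the type values. The sketch below copies those constants from the inode.h listing; the functions type_letter and is_special are hypothetical helpers, and the letters follow the familiar ls(1) convention rather than anything this page defines.

```c
#include <assert.h>

/* Type bits, copied from the inode.h listing. */
#define IFMT    0170000         /* type of file */
#define IFCHR   0020000         /* character special */
#define IFDIR   0040000         /* directory */
#define IFBLK   0060000         /* block special */
#define IFREG   0100000         /* regular */
#define IFLNK   0120000         /* symbolic link */
#define IFSOCK  0140000         /* socket */

/* One-letter file type for a mode word; '?' if the type is unknown. */
static char type_letter(unsigned mode)
{
    assert((mode & ~0177777u) == 0);    /* di_mode is a 16-bit quantity */
    switch (mode & IFMT) {
    case IFCHR:  return 'c';
    case IFDIR:  return 'd';
    case IFBLK:  return 'b';
    case IFREG:  return '-';
    case IFLNK:  return 'l';
    case IFSOCK: return 's';
    default:     return '?';
    }
}

/* Nonzero for block and character special files. */
static int is_special(unsigned mode)
{
    return (mode & IFMT) == IFBLK || (mode & IFMT) == IFCHR;
}
```

The permission and set-id bits sit below the type field, so a mode such as 0104755 still decodes as a regular file.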
Special files are recognized by their modes and not by i-number. A block-type special file is one which can potentially be mounted as a file system; a character-type special file cannot, though it is not necessarily character-oriented. For special files, the di_addr field is occupied by the device code (see types(5)). The device codes of block and character special files overlap.
Disk addresses of plain files and directories are kept in the array di_addr packed into 3 bytes each. The first 10 addresses specify device blocks directly. The last 3 addresses are singly, doubly, and triply indirect and point to blocks of 256 block pointers. Pointers in indirect blocks have the type daddr_t (see types(5)).
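The addressing scheme in this paragraph fixes how many levels of indirection a given logical block needs and how large a file can grow. The following sketch uses only the figures stated above (10 direct addresses, 256 pointers per indirect block); the names NDIRECT, NPTR, indir_level and max_file_blocks are chosen for the example.

```c
#include <assert.h>

#define NDIRECT 10      /* direct addresses, per the paragraph above */
#define NPTR    256L    /* block pointers per indirect block, per the paragraph above */

/*
 * Levels of indirection needed to reach logical block lbn:
 * 0 = direct, 1..3 = single, double, triple indirect,
 * -1 = beyond what a triple indirect block can address.
 */
static int indir_level(long lbn)
{
    long span = NPTR;   /* blocks reachable at the current level */
    int level;

    assert(lbn >= 0);
    if (lbn < NDIRECT)
        return 0;
    lbn -= NDIRECT;
    for (level = 1; level <= 3; level++) {
        if (lbn < span)
            return level;
        lbn -= span;
        if (level < 3)
            span *= NPTR;
    }
    return -1;
}

/* Largest number of blocks one file can address. */
static long max_file_blocks(void)
{
    return NDIRECT + NPTR + NPTR * NPTR + NPTR * NPTR * NPTR;
}
```

The limit works out to 10 + 256 + 65536 + 16777216 = 16843018 blocks.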
For block b in a file to exist, it is not necessary that all blocks less than b exist. A zero block number either in the address words of the i-node or in an indirect block indicates that the corresponding block has never been allocated. Such a missing block reads as if it contained all zero words.
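Reading through such a hole can be sketched as a special case in the block read path. This is an illustration only: the volume array, its size and the function read_file_block are hypothetical, and the 1024-byte block size is taken from the DESCRIPTION above.

```c
#include <assert.h>
#include <string.h>

#define BSIZE 1024      /* block size, per the DESCRIPTION */
#define NBLK  16        /* size of the simulated volume, in blocks */

static unsigned char volume[NBLK][BSIZE];   /* stand-in for the disk */

/*
 * Fetch the contents of a file block given the address stored for it.
 * A zero address marks a block that was never allocated, so the caller
 * sees zeros and the volume's block 0 is never touched.
 */
static void read_file_block(long addr, unsigned char *buf)
{
    assert(addr >= 0 && addr < NBLK);
    if (addr == 0)
        memset(buf, 0, BSIZE);
    else
        memcpy(buf, volume[addr], BSIZE);
}
```

Treating address 0 as a hole is safe because block 0 of the volume is never allocated to a file.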
SEE ALSO
fsck(8), icheck(8), dcheck(8), dir(5), mount(8), stat(2), types(5)
Sun System Release 0.3 — Release 1.0 January 1983