========================================================================
M I C R O F O C U S C O B O L
V3.1.39
========================================================================
FILE STRUCTURES
===============
This document replaces the appendix "File Formats" in your COBOL
System Reference.
TABLE OF CONTENTS
=================
INTRODUCTION
FIXED AND VARIABLE FORMAT
BASIC FILE STRUCTURES
Fixed Structure
Variable Structure
STRUCTURE OF EACH FILE ORGANIZATION
Sequential Organization
Line Sequential Organization
Relative Organization
Indexed Organization
INTRODUCTION
============
This COBOL system provides three types of data file organization:
relative, indexed and sequential. Additionally, sequential files
fall into one of three categories: record sequential, printer sequential
and line sequential.
Record sequential, relative and indexed files can contain records that
are either all of fixed length, or records that are of variable length.
These files have fixed or variable format respectively. Printer
sequential and line sequential files contain records that are
implicitly variable length and have separate file formats.
Record sequential, relative and indexed files allow two different
formats: fixed and variable. The file format is specified explicitly
or implicitly as described in the section Fixed and Variable Format
later in this document. Fixed and variable format map indirectly
onto two types of file structure: fixed and variable.
The physical structure of each of these file types as they exist on
disk is explained in this document.
This information is provided for anyone who wants to understand the
nature of data files produced by programs created using this COBOL
system, or to process them outside the COBOL system where appropriate.
It can also be useful for debugging programs. However, you do not
need to understand these file structures to use data files from COBOL
programs.
You are advised not to process the files yourself using byte-stream I/O,
but to use COBOL syntax or the file handler call interface (documented in
an add-on product). This ensures that applications will function properly
if file formats are enhanced or developed in the future.
FIXED AND VARIABLE FORMAT
=========================
The format of record sequential, relative and indexed files can be
explicitly or implicitly fixed or variable.
The file format is always fixed unless one of the following conditions is
specified for the file or file record:
o The RECORDING MODE IS V clause which always creates variable format.
o The RECORD IS VARYING clause which creates variable format provided
no RECORDING MODE IS F clause is present.
o The OCCURS...DEPENDING ON clause which creates variable format
when you set the NOODOSLIDE Compiler directive.
o The RECMODE"V" Compiler directive which creates variable format for
each file where no RECORDING MODE IS F or RECORD
CONTAINS n CHARACTERS clauses are present.
o The RECMODE"OSVS" Compiler directive which creates variable format
for files that contain fixed record definitions of different lengths.
o The data compression feature which creates variable format.
BASIC FILE STRUCTURES
=====================
There are four basic structures used for all files: fixed, variable,
line sequential and printer sequential.
Fixed is the structure used by fixed format record sequential and
relative files, and contains only fixed length records. The size
of each record is equal to the length of the largest record
definition for the file.
Variable is the structure of variable format record sequential and
relative files, and fixed and variable format indexed files. Variable
structure files can contain fixed or variable length records.
Line sequential is the structure of files with the line sequential
organization. Line sequential files are designed to enable you to read
source or text files created with the system editor. As such, the
format is operating system dependent but typically contains variable
length records with trailing spaces removed. See the section Line
Sequential Organization later in this document for further details.
Printer sequential is the structure of files that are destined
for a printer, either directly or by later spooling of a disk file.
They contain vertical and horizontal tab controls. The structure
of these files reflects what is required to drive a printer and
so is independent of the operating system. See the section Printer
Sequential Files later in this document for further details.
The following sections describe these file structures.
Fixed Structure
===============
Fixed structure files contain no record or file header information.
The records are all the same length, that length being determined
by the longest record defined in the File Description (FD) in the
program's File-Section.
Variable Structure
==================
Any files containing variable length records, with the exception of line
sequential files and files destined for the printer, contain a block of
128 bytes of header information at the start of the file. Each record in
the file is preceded by a 2- or 4-byte control field. The top 4 bits of
this field indicate the status of the record. A value of 0100 in these
bits means that this record is a user data record. Any other value means
that this record has either been deleted or is used internally. The
remainder of the control field contains the length of the record.
For all files where the maximum record size is less than 4096 (excluding
the prefix), the prefix is 2 bytes long. For all other files, the prefix
is 4 bytes long. Each record always starts on the next 4-byte boundary in
the file.
You must not alter the header information or the control fields in any
way since these are maintained by this COBOL system.
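As an illustration of the control field described above, the following
Python sketch splits a 2- or 4-byte record header into its status bits
and its length. The function name decode_record_header is hypothetical,
and the sketch assumes that the field is stored most significant byte
first. It is meant only for inspecting files, not for altering them.

```python
def decode_record_header(header: bytes) -> tuple:
    """Return (record_type, record_length) for a 2- or 4-byte header."""
    if len(header) not in (2, 4):
        raise ValueError("record header must be 2 or 4 bytes long")
    # Assumption: most significant byte first.
    value = int.from_bytes(header, "big")
    # The top 4 bits hold the status; the remaining bits hold the length.
    length_bits = len(header) * 8 - 4
    record_type = value >> length_bits
    record_length = value & ((1 << length_bits) - 1)
    return record_type, record_length
```

A status value of 4 (binary 0100) identifies a user data record; any
other value marks a deleted or internally used record.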
Record Header Types
-------------------
First 4 bits Record type
-------------------------------------------------------------------------
1 (0001) A system record (IDXFORMAT"4" files only).
This contains duplicate occurrence details in the
data file.
2 (0010) Deleted record (available for reuse via the Free
Space list).
3 (0011) System record.
4 (0100) User data record.
5 (0101) Reduced user data record (indexed files only).
The 16-bit word immediately following the data,
as indicated by the length in the header,
contains the space between the end of the data
record plus any padding characters and the
start of the next record header.
6 (0110) Pointer record (indexed files only).
The first 4 bytes following the record header
contain the offset in the file to the location of
the user data record.
7 (0111) User data record referenced by a Pointer record.
8 (1000) Reduced user data record referenced by a Pointer
record.
The first record in every variable structure file is a system record
called the File Header record. This is normally 128 bytes long.
The record header for each record starts on a 4-byte boundary.
Consequently, a record may be followed by up to three padding characters,
usually spaces. These padding characters are not included in the record
length.
Variable structure File Header record description:
Offset Size Description of the field
-------------------------------------------------------------------------
0 4 Length of the file header.
The first 4 bits are always set to 3 (0011 in
binary) indicating that this file header record is
a system record.
The remaining bits contain the length of the file
header record. If the maximum record length is less
than 4095 bytes, the length is 126 and is held in
the next 12 bits; otherwise it is 124 and is held
in the next 28 bits. Hence, in a file where the
maximum record length is less than 4095 bytes, this
field contains x"30 7E 00 00". Otherwise, this
field contains x"30 00 00 7C".
4 2 Database sequence number, used by add-on products
supplied with this COBOL system.
6 2 Integrity Flag. Indexed files only.
If this is non-zero when the header is read, it
indicates that the file is corrupt.
8 14 Creation date and time in YYMMDDHHMMSSCC format.
Indexed files only.
22 14 Reserved.
36 2 Reserved. Value 62 decimal; x"00 3E".
38 1 Not used. Set to zeros.
39 1 Organization.
1 = Sequential
2 = Indexed
3 = Relative
40 1 Not used. Set to zeros.
41 1 Data compression routine number.
0 = No compression
1 = CBLDC001
2-127 = Reserved for internal use
128-255 = User-defined compression routine
number
42 1 Not used. Set to zeros.
43 1 File format.
0 = Default
1 = C-ISAM
2 = LEVEL II COBOL
3 = Indexed file format used
by this COBOL system
4 = IDXFORMAT"4"
Note: This offset applies only to the index
(.idx) header of indexed files. It is
not used in .dat indexed files or
any other file type.
44 4 Reserved.
48 1 Recording mode.
0 = Fixed format
1 = Variable format
For indexed files, the Recording Mode field of the
index file takes precedence.
49 5 Not used. Set to zeros.
54 2 Not used. Set to zeros.
56 2 Maximum record length.
Example: with a maximum record of length 80
characters, this field will contain x"00 50".
58 2 Not used. Set to zeros.
60 2 Minimum record length.
Example: with a minimum record of length 2
characters, this field will contain x"00 02".
62 46 Not used. Set to zeros.
108 4 Version and build data for the indexed file handler
creating the file. Indexed files only.
112 16 Not used. Set to zeros.
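The table above can be read with a few fixed-offset lookups. The Python
sketch below is a hedged illustration, not part of this COBOL system:
the function name parse_file_header and the dictionary keys are
hypothetical, and multi-byte fields are assumed to be stored most
significant byte first, as the x"00 50" example suggests.

```python
import struct

ORGANIZATIONS = {1: "sequential", 2: "indexed", 3: "relative"}


def parse_file_header(header: bytes) -> dict:
    """Extract a few documented fields from a 128-byte File Header record."""
    if len(header) < 128:
        raise ValueError("the File Header record is 128 bytes long")
    # The first 4 bits of a file header are always 3 (system record).
    if header[0] >> 4 != 3:
        raise ValueError("first record is not a system record")
    (max_len,) = struct.unpack_from(">H", header, 56)  # offset 56, size 2
    (min_len,) = struct.unpack_from(">H", header, 60)  # offset 60, size 2
    return {
        "organization": ORGANIZATIONS.get(header[39], "unknown"),
        "compression_routine": header[41],
        "variable_format": header[48] == 1,
        "max_record_length": max_len,
        "min_record_length": min_len,
    }
```

The sketch only reads the header; the header must never be modified
outside this COBOL system.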
STRUCTURE OF EACH FILE ORGANIZATION
===================================
The following sections describe the physical structure of the five data
file organizations.
Sequential Organization
=======================
Record Sequential Files
-----------------------
Record sequential files are intended to cater for binary data. These
files consist of a series of either fixed or variable length records.
The order of records in these files is set by the order of WRITE
statements when the file is created. The record order does not change
once it has been set. New records are added to the end of the file.
Each record in a record sequential file (except the first record)
has a unique record which precedes it, while each record (except the
last record) also has a unique record that follows it.
Record sequential files that are fixed length and not destined for the
printer have no record delimiter; the end of one record is immediately
followed by the beginning of the next.
Printer Sequential Files
------------------------
You can define a sequential file as a printer sequential or printer
destined file in either of the following ways:
o specify the LINE ADVANCING clause in the SELECT statement
o specify the ASSIGN TO PRINTER clause
For printer sequential files, specifying the WRITE statement without
the BEFORE or AFTER clause has the same effect as if you had
specified AFTER 1. Specifying the WRITE statement with the
BEFORE or AFTER clauses gives explicit vertical positioning
which you must only use for files destined for the printer. Using
these clauses for any other type of file will generally corrupt the file.
Printer sequential files should not be opened for INPUT or I/O.
Printer sequential file format consists of a sequence of print records
which are terminated by a carriage return (x"0D") with zero or more
vertical positioning characters between the print records.
A print record consists of zero or more printable characters.
The OPEN statement causes a x"0D" to be written to the file to
ensure that the printer is located at the first character position
before printing the first data record.
The WRITE statement causes trailing spaces to be removed from the
record before it is written to the printer with a terminating x"0D".
The BEFORE or AFTER clause specified in the WRITE statement
causes one or more line-feed characters (x"0A"), a form-feed
character (x"0C"), or a vertical tab character (x"0B") to be sent
to the printer after or before writing the data record, respectively.
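The byte sequences described above can be modelled as follows. This
Python sketch is a simplification under stated assumptions: the names
open_printer_file and write_print_record are hypothetical, only
line-feed positioning is modelled (form feeds and vertical tabs are left
out), and a WRITE without BEFORE or AFTER is treated as AFTER 1.

```python
CR = b"\x0d"
LF = b"\x0a"


def open_printer_file() -> bytes:
    """Bytes written by OPEN: a single carriage return."""
    return CR


def write_print_record(record: bytes, advance: int = 1,
                       before: bool = False) -> bytes:
    """Bytes written by WRITE ... BEFORE/AFTER ADVANCING advance LINES."""
    # Trailing spaces are removed, then the record is terminated by x"0D".
    text = record.rstrip(b" ") + CR
    positioning = LF * advance
    # BEFORE: write the record, then advance. AFTER: advance, then write.
    return text + positioning if before else positioning + text
```

For example, opening the file and writing one record with the default
positioning yields x"0D", x"0A", the record text, and x"0D".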
Fixed Format Record Sequential Structure
----------------------------------------
In a fixed format record sequential file, each record immediately
follows the previous record in the file. Each record is the same
length as the maximum length record.
+--------------------------------------------+
| Fixed length record |
+--------------------------------------------+
| Fixed length record |
+--------------------------------------------+
. .
. .
+--------------------------------------------+
| Fixed length record |
+--------------------------------------------+
Variable Format Record Sequential Structure
-------------------------------------------
In a variable format record sequential file each record written
is preceded by a record header containing the length of the
record; the record is written at the length defined in the
program; the file contains a standard variable structure
file header record.
Up to three padding characters can follow a record to ensure that the
next record starts on a four-byte boundary.
+-----------------------------------------------+
| File Header record - 128 bytes |
| |
+--------+-----------------------------+---+----+
| Header | Variable length record | |
+--------+-----------------------------+---+--------+---+
| Header | Variable length record | |
+--------+------------------------------------------+---+
. . . .
. . . .
+--------+--------------------------------------+---+
| Header | Variable length record | |
+--------+--------------------------------------+---+
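The layout above can be walked with a short routine. The following
Python sketch is illustrative only: the function name iter_user_records
is hypothetical, and it assumes that header fields are stored most
significant byte first and that the record header width can be inferred
from the first four bytes of the File Header record. For real
applications, COBOL syntax or the file handler call interface remains
the recommended route.

```python
def iter_user_records(data: bytes):
    """Yield the contents of each user data record (type 4) in the file."""
    first = data[0:4]
    if first == bytes.fromhex("307E0000"):
        header_size = 2          # 12-bit record lengths
    elif first == bytes.fromhex("3000007C"):
        header_size = 4          # 28-bit record lengths
    else:
        raise ValueError("not a variable structure File Header record")
    length_bits = header_size * 8 - 4
    offset = 128                 # the first record follows the file header
    while offset + header_size <= len(data):
        value = int.from_bytes(data[offset:offset + header_size], "big")
        record_type = value >> length_bits
        length = value & ((1 << length_bits) - 1)
        start = offset + header_size
        if record_type == 4:
            yield data[start:start + length]
        # Skip up to three padding characters to the next 4-byte boundary.
        offset = (start + length + 3) & ~3
```

Records whose status is not 4 (for example deleted records) are
skipped rather than returned.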
Printer Sequential Structure
----------------------------
+----+
| 0D |
+----+
| 0A |
+----+------------------------------------------+----+
| Print record | 0D |
+----+----+---+---------------------------------+----+
| 0A | 0A | 0A |
+----+----+---+----------------------------+----+
| Print record | 0D |
+---------------------------------+----+---+----+
| Print record | 0D |
+----+----------------------------+----+
| 0C |
+----+
| 0D |
+----+----+
| 0A | 0A |
+----+----+-----------------------------+----+
| Print record | 0D |
+---------------------------------------+----+
. . .
. . .
Line Sequential Organization
============================
Line sequential files are implemented to be consistent with your system
editor and any other similar utilities that use text files. They are
strictly operating system dependent; however, the scheme used by the
PC-DOS, OS/2 and UNIX operating systems is widely used and is
described here.
Line sequential files hold variable length text records, each containing
zero or more displayable or non-displayable characters. A WRITE
statement removes trailing spaces from the data record then adds
the system record delimiter. A READ statement removes the record
delimiter and if necessary pads the record area with trailing
spaces or returns surplus text as following records.
Each text record is followed by a record delimiter chosen by the
operating system to be consistent with your system editor. The
record delimiter varies depending on your operating system. See
the environment specific sections for line sequential files below
for further information.
A line sequential file must not be described as a printer destined file
and must not use the BEFORE or AFTER clause in the WRITE statement.
System editors expect text to contain only displayable characters.
However, line sequential files allow non-displayable characters with
a value of less than x"20" (space) to be written to and read from
them.
During a WRITE operation, non-displayable characters in the record
area are written to the file, each with a preceding LOW-VALUES or
null character (x"00") to show that they are not text characters.
A READ operation on the file removes the preceding LOW-VALUES
characters added during the WRITE operation. You can prevent null
insertion when writing to the file either by specifying the -N
run-time switch, or by a call to functions 46 or 47 of routine
x"91" to turn the N switch on or off, respectively, for a
particular file.
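The null insertion scheme described above can be sketched in Python as
follows. The function names insert_nulls and remove_nulls are
hypothetical, and the sketch assumes that every byte with a value below
x"20" is treated in the same way.

```python
def insert_nulls(record: bytes) -> bytes:
    """Prefix each non-displayable byte (< x"20") with x"00", as on WRITE."""
    out = bytearray()
    for byte in record:
        if byte < 0x20:
            out.append(0x00)
        out.append(byte)
    return bytes(out)


def remove_nulls(stored: bytes) -> bytes:
    """Drop each x"00" prefix and keep the byte it protects, as on READ."""
    out = bytearray()
    i = 0
    while i < len(stored):
        if stored[i] == 0x00 and i + 1 < len(stored):
            i += 1  # skip the prefix; the next byte is data
        out.append(stored[i])
        i += 1
    return bytes(out)
```

Applying remove_nulls to the output of insert_nulls returns the original
record, which is why the prefixing is invisible to the COBOL program.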
During a READ operation, any tab characters (x"09") in a line
sequential file are expanded to every eighth character position;
that is, the character following a tab will be in one of the
columns 9, 17, 25, 33, and so on. You can compress space
characters to tabs during output using either the +T
run-time switch, or a call to function 48 or 49 of routine
x"91" to turn the T switch on or off, respectively, for a
particular file.
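The tab stop rule above (columns 9, 17, 25, 33, and so on) can be
illustrated with a few lines of Python. The function name expand_tabs is
hypothetical; the sketch shows the arithmetic only and is not the file
handler's own routine.

```python
def expand_tabs(line: bytes, tab_width: int = 8) -> bytes:
    """Replace each x"09" with spaces up to the next multiple of tab_width."""
    out = bytearray()
    for byte in line:
        if byte == 0x09:
            # Number of spaces needed to reach the next tab stop.
            spaces = tab_width - (len(out) % tab_width)
            out.extend(b" " * spaces)
        else:
            out.append(byte)
    return bytes(out)
```

With the default width, the character following a tab always lands at a
zero-based position that is a multiple of eight, that is, in column 9,
17, 25, and so on when columns are counted from one.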
Line Sequential Files on DOS, Windows and OS/2 Systems
------------------------------------------------------
The record delimiter x"0D0A" is used for DOS, Windows and OS/2
systems.
Any single byte x"1A" (user terminate run code) is used as an
unconditional file terminator (except when preceded by a null
character, as described above). If no x"1A" character is
encountered, the physical end of the file serves as the file
terminator.
When the file is closed, a terminating x"1A" character is NOT written.
Instead, the length of the file is used to determine where it ends.
On input, this COBOL system uses just the x"0A" as the record
delimiter. Additional device control characters
(such as x"0D", x"0B", x"0C") are discarded. x"1A" acts as a
record delimiter and also denotes the end of the file.
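The input rules above can be combined into one small reader. This Python
sketch is an approximation under stated assumptions: the function name
read_dos_line_sequential is hypothetical, null insertion is assumed to
be in effect (so a byte preceded by x"00" is taken as data), and the
padding of the record area with trailing spaces is not modelled.

```python
def read_dos_line_sequential(data: bytes) -> list:
    """Split a DOS, Windows or OS/2 line sequential file image into records."""
    records = []
    current = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 0x00 and i + 1 < len(data):
            current.append(data[i + 1])  # null-prefixed byte is plain data
            i += 2
            continue
        if byte == 0x1A:
            break                        # unconditional file terminator
        if byte == 0x0A:
            records.append(bytes(current))  # x"0A" ends the record
            current.clear()
        elif byte not in (0x0D, 0x0B, 0x0C):
            current.append(byte)         # device control bytes are discarded
        i += 1
    if current:
        records.append(bytes(current))   # text pending at end of file
    return records
```

Any text pending when x"1A" or the physical end of the file is reached
is returned as a final record.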
If you turn the N run-time switch off, you must make sure that any
COMP data does not contain bytes with a value of x"1A" (end-of-file
character) or x"0D" (record delimiter).
Line Sequential Files on UNIX Systems
-------------------------------------
The record delimiter on UNIX systems is a single byte x"0A" (the
default). However, for line sequential and relative files only, this
default record delimiter can be changed to that used by DOS,
Windows and OS/2.
If you turn off the N run-time switch (-N), you must make sure that
any COMP data does not contain bytes with a value of x"0A" (record
delimiter).
Line Sequential Structure
-------------------------
+-------------------------------------------+-----------+
| Variable length record | delimiter |
+------------+------------------------------+-----------+
| delimiter |
+------------+----------------+-----------+
| Variable length record | delimiter |
+-----------------------------+-----------+-----------+
| Variable length record | delimiter |
+-----------------------------------------+-----------+
. . .
. . .
. . .
+-----------------------------+-----------+
| Variable length record | delimiter |
+-----------------------------+-----------+
Relative Organization
=====================
Relative file organization enables you to access any record
randomly by specifying its ordinal position within the file. A
relative file can have fixed or variable format, but in either
case each record occupies a slot of fixed length, the length being
that of the longest record defined for the file. This is necessary so
that the COBOL file handling routines can quickly calculate the
physical location of any record given its record number within
the file.
Each record is uniquely identified by a record number. The
first record in the file is record number one, the second
record is number two, and so on.
In a fixed format file, each record is followed by a record
marker, which indicates the current state of the record. In a
variable format file, the marker follows the fixed length slot
that holds the record. The marker varies depending on your
environment. See the environment specific information
sections for relative files below for further information.
When you delete a record from a relative file, the only action
is to change that record's marker. However, the contents of a
deleted record physically remain in the file until a new
record is written. If, for security reasons, you want to make
sure that the data does not exist in the file, then you must
overwrite the record using the REWRITE statement before you
delete it.
A fixed format relative file can be processed as a fixed format
sequential organization file by defining the maximum record
length to be larger than that for the relative file (see the
sections on operating environment specific information for
details). A variable format relative file cannot be processed
as a sequential organization file.
The length of a relative file is determined by the largest
record number used when actually writing a record to the file.
Relative File Organization on DOS, Windows and OS/2 Systems
-----------------------------------------------------------
On DOS, Windows and OS/2 systems, the current state of the record is
indicated by a two-byte marker as follows:
Marker (hex) Description
-------------------------------------------------------------------------
0D0A Record present
0D00 Record deleted or never written.
A fixed format relative file can be processed as a fixed format
sequential file by defining the maximum record length to be two
characters larger than that for the relative file.
The size of a relative file on DOS, Windows and OS/2 systems is
calculated as follows.
Fixed format:
(max-rec-len + 2) * largest-record-number
Variable format:
128 + (max-rec-len + 2 + header) * largest-record-number
where header is 2 if max-rec-len is less than 4096, otherwise header is
4.
Relative File Organization on UNIX Systems
------------------------------------------
On UNIX systems, the current state of a record for fixed length relative
records is indicated by a one-byte marker as follows:
Marker (hex) Description
-------------------------------------------------------------------------
0A Record present
00 Record deleted or never written
The current state of a record for variable length relative records is
indicated by a two-byte marker as follows:
Marker (hex) Description
-------------------------------------------------------------------------
0D0A Record present
0D00 Record deleted or never written
A fixed format relative file can be processed as a fixed format
sequential file by defining the maximum record length to be one character
larger than that for the relative file.
The size of a relative file on UNIX systems is calculated as follows.
Fixed format:
(max-rec-len + 1) * largest-record-number
Variable format:
128 + (max-rec-len + 2 + header) * largest-record-number
where header is 2 if max-rec-len is less than 4096, otherwise header is
4.
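The size formulas given for DOS, Windows and OS/2 systems and for UNIX
systems can be expressed as one function. The Python sketch below
follows the formulas exactly as written; the function name
relative_file_size and its parameter names are hypothetical.

```python
def relative_file_size(max_rec_len: int, largest_record_number: int,
                       variable_format: bool = False,
                       unix: bool = False) -> int:
    """Return the size in bytes of a relative file, per the formulas above."""
    if variable_format:
        # 128-byte file header, then one slot per record number. Each slot
        # holds the record header, the record area and a 2-byte marker.
        header = 2 if max_rec_len < 4096 else 4
        return 128 + (max_rec_len + 2 + header) * largest_record_number
    # Fixed format: the marker is 1 byte on UNIX systems, 2 bytes otherwise.
    marker = 1 if unix else 2
    return (max_rec_len + marker) * largest_record_number
```

For example, a fixed format file with 80-character records and a
largest record number of 9 occupies 738 bytes on DOS, Windows and OS/2
systems and 729 bytes on UNIX systems.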
Fixed Format Relative Structure
-------------------------------
A fixed format relative file is the same as a fixed format sequential
file, except each record is followed by a record marker.
+-------------------------------------------+------+
| Fixed length record - Record 1 |marker|
+-------------------------------------------+------+
| Fixed length record - Record 2 |marker|
+-------------------------------------------+------+
. . .
. . .
+-------------------------------------------+------+
| Fixed length record - Record i deleted |marker|
+-------------------------------------------+------+
. . .
. . .
+-------------------------------------------+------+
| Fixed length record - Record j - unused |marker|
+-------------------------------------------+------+
. . .
. . .
+-------------------------------------------+------+
| Fixed length record - Record n |marker|
+-------------------------------------------+------+
For relative files in random access, writing records 1, 2 and 9 will
occupy the same disk space as creating a file containing records
1 through 9 on UNIX systems.
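Because every slot in a fixed format relative file has the same length,
the position of a record follows directly from its record number. The
Python sketch below illustrates the calculation; the function name
read_relative_record is hypothetical, and marker_len is 2 on DOS,
Windows and OS/2 systems and 1 on UNIX systems.

```python
def read_relative_record(data: bytes, record_number: int,
                         max_rec_len: int, marker_len: int):
    """Return the record in the given slot, or None if it is not present."""
    slot = max_rec_len + marker_len
    start = (record_number - 1) * slot   # record numbers start at one
    if record_number < 1 or start + slot > len(data):
        return None                      # beyond the end of the file
    marker = data[start + max_rec_len:start + slot]
    # The marker ends in x"0A" when the record is present and in x"00"
    # when it has been deleted or was never written.
    if marker[-1] != 0x0A:
        return None
    return data[start:start + max_rec_len]
```

Note that the data of a deleted record is still physically present in
its slot; the sketch simply declines to return it.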
Variable Format Relative Structure
----------------------------------
A variable format relative file follows the basic variable
structure defined earlier in this document. However, each
record is placed into a fixed length slot, the length of the
slot being the length of the longest record defined, together
with the header and terminator characters. The record header
for each record contains the length of the logical record
written, not the length of the physical fixed length slot.
Each slot is followed by a two-byte record marker.
+----------------------------------------------------+
| File Header record - 128 bytes |
| |
+-------+--------------------------------+------+----+
|Header |Variable length record-Record 1 | pad |0D0A|
+-------+--------------------------------+------+----+
|Header |Variable length record-Record 2 |0D0A|
+-------+--------------------------------+------+----+
|Header |Variable length record-Record 3 | pad |0D0A|
+-------+--------------------------------+------+----+
. . . .
. . . .
+-------+---------------------------------------+----+
|Header |Variable length record-Record i delete |0D00|
+-------+---------------------------------------+----+
. . . .
. . . .
+-------+---------------------------------------+----+
|Header |Variable length record-Record j unused |0D00|
+-------+---------------------------------------+----+
. . . .
. . . .
+-------+-------------------------------+-------+----+
|Header |Variable length record-Record n| pad |0D0A|
+-------+-------------------------------+-------+----+
Indexed Organization
====================
Indexed files consist of a series of fixed or variable length
records. An indexed file is implemented as two separate files:
the data file and the key file. Variable length records are
handled by the variable length file handler supplied with this
COBOL system.
For all file formats other than C-ISAM and LEVEL II, the data
files are of the variable structure defined in the section
Variable Format Record Sequential Structure earlier in this document.
See the section Indexed Organization on UNIX Systems for
details of C-ISAM and fixed length record.
When you name the file, the name is given to the data file;
the name of the associated index file is produced by adding
a .idx extension to the data file name.
For example:
Data file Index file
-------------------------------------------------------------------------
myfile myfile.idx
clock.fle clock.fle.idx UNIX only
clock.fle clock.idx DOS, Windows and OS/2 only
You should avoid using the .idx extension in other contexts.
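The naming examples in the table above can be reproduced with the
following Python sketch. The function name index_file_name is
hypothetical, and the rule for DOS, Windows and OS/2 systems (replace
any existing extension) is inferred from the clock.fle example rather
than stated as a general rule.

```python
import os


def index_file_name(data_file: str, unix: bool) -> str:
    """Derive the index file name from the data file name."""
    if unix:
        # UNIX systems: .idx is appended to the full data file name.
        return data_file + ".idx"
    # DOS, Windows and OS/2 systems: .idx replaces any existing extension.
    root, _extension = os.path.splitext(data_file)
    return root + ".idx"
```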
The index is built up as an inverted tree structure that grows
in height as records are added. The number of key file accesses
required to locate a randomly selected record depends primarily
on the number of records in the file and the key-length.
File I/O is faster when reading the file sequentially, but only
if other indexed sequential operations do not intervene.
We strongly recommend that you take regular backups of all
file types. There are, however, situations with indexed files
(for example, media corruption) that can lead to only one of
the two files becoming unusable. If the index file is lost in
this way, you can recover data records from just the data file
(although not in key sequence) and, therefore, reduce the time
lost due to a failure.
You can recover a corrupt indexed file using a utility which
rebuilds the index of the indexed file. The utility you use
is operating environment dependent and is referred to in each
of the sections covering the different operating systems below.
Indexed Organization on DOS, Windows and OS/2 Systems
-----------------------------------------------------
You can recover a corrupt indexed file using the Rebuild utility.
See the chapter Rebuild for details of this utility.
Indexed Organization on UNIX Systems
------------------------------------
If you are using C-ISAM, the C-ISAM file handler handles all
fixed length indexed records. The data files are in the relative
format described earlier in this chapter. If the C-ISAM file
handler is not the default one supplied with this system, or you
have substituted your own file handler for the default as
described in an add-on product, the format of the data file is
dependent on the file handler you are using. See your Release Notes
for details of the default file handlers supplied with this
COBOL system.
It is possible to use an environment variable to specify that
the index and data files should appear in separate directories.
See the section The "&" Character in Environment Variables in
the chapter External File-name Mapping for further
information.
To make it possible to recover an index from the data file when
the indexed file has become corrupt, each data record carries one
extra trailing byte. Unused records, which contain LOW-VALUES, are
marked as deleted with x"00" in this byte. Existing records are
marked with the character x"0A".
The recovery operation can, therefore, be performed with a
simple COBOL program by defining the data file as ORGANIZATION
RELATIVE ACCESS SEQUENTIAL. The records are then read sequentially,
the data moved from the relative file record area into the
indexed record area and written to a new version of the indexed
file.
Those records with LOW-VALUES in the last (extra) byte are
discarded.
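The selection step of this recovery can be sketched outside COBOL as
well. The Python sketch below is illustrative only: the function name
salvage_records is hypothetical, and it assumes that the data file is a
plain sequence of slots, each holding record_length data bytes followed
by the single extra marker byte. Writing the salvaged records to a new
indexed file is still best done from a COBOL program.

```python
def salvage_records(data: bytes, record_length: int) -> list:
    """Return the data of every record whose extra byte is x"0A"."""
    slot = record_length + 1             # data bytes plus the marker byte
    records = []
    for start in range(0, len(data) - slot + 1, slot):
        if data[start + record_length] == 0x0A:
            records.append(data[start:start + record_length])
        # Slots whose extra byte is x"00" (LOW-VALUES) are discarded.
    return records
```

The records are returned in physical order, not in key sequence.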
You can also rebuild a corrupt index file using the fhrebuild
utility. See the chapter File Handler Utilities for details of
how to do this.
Index File Structure
--------------------
On all operating systems, an index file can have several keys.
For each key defined, the index file contains an independent
index, structured as a B-Tree. A leaf node in an index contains
a list of key-values in ascending order, each of which points
to the data record (in the data file) to which it belongs.
A non-leaf node contains a list of key-values in ascending
order, each of which points to the subordinate node in which that
key-value is the largest.
This index structure provides the fastest possible random
access to a data record using any key, as well as efficient
processing of data records in sequential key order.
The records in an index file are always the same length, whether
they are nodes, header records or key information records. The
size of the records is determined at the time the file is
created and cannot subsequently be changed. The size used is
configurable at the time the file is created. See the section
Index Node Record later in this chapter for further details on
how to do this.
The index file starts with the index File Header Record,
which contains information about the file. It points to the
Free Space record, which is used to maintain a list of free
records in the index file. The index File Header record also
points to the Key Information record, which contains
details of every key defined for the file, and, for each
key, points to the root Index Node record of the associated
index. Each of these records is described in the following
sections.
Index File Header Record
------------------------
The File Header record is located at offset 0 within the
index file. The first 128 bytes are the same as a standard
variable structure file header record, except for the
fields below.
Index File Header record description:
Offset Size Description of the field
-------------------------------------------------------------------------
0 4 Length of the file header.
39 1 Organization of the file. Always contains value 2
(for Indexed organization).
62 14 Always contains zeros.
76 1 Reserved. Set to 4.
124 4 Offset of logical end of the index file.
The remainder of the index file header record contains the following
fields:
Offset Size Description of the field
-------------------------------------------------------------------------
132 4 Offset of logical end of data file.
136 1 Value 2.
137 1 Value 2.
138 1 Value 4.
139 1 Value 4.
140 2 Contains the number of keys defined for the file.
142 1 Value 0, or 1 for IDXFORMAT"4" files.
143 1 Value 2 or 4. Number of bytes used for occurrence
numbers in indices where duplicates are permitted.
144 4 Value zeros.
148 4 Offset of first Key Information record.
152 4 Value zeros.
156 4 Offset of the Free Space record for the data file.
For fixed format files, this is a record in the
index file of the same format as the index Free
Space record, but the addresses point to free
records in the data file. For variable format
files, this is the address in the data file of the
data Free Space record. This record has a different
structure to the index Free Space record.
160 4 Value zeros.
164 4 Offset of first Free Space record in index file.
168 4 Value zeros.
172 2 Value zeros.
174 2 Index file record length (node size).
176 8 Value zeros.
184 328 Reserved. Value zeros. For node size 512.
840 For node size 1024.
3912 For node size 4096.
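The fields needed to navigate an index file can be picked out of the
header with fixed-offset reads. The Python sketch below is a hedged
illustration: the function name parse_index_header and the dictionary
keys are hypothetical, and all multi-byte fields are assumed to be
stored most significant byte first.

```python
import struct


def parse_index_header(header: bytes) -> dict:
    """Extract the main navigation fields from an index File Header record."""
    if len(header) < 184:
        raise ValueError("index file header is too short")
    if header[39] != 2:
        raise ValueError("organization field is not 2 (indexed)")

    def u16(offset):
        return struct.unpack_from(">H", header, offset)[0]

    def u32(offset):
        return struct.unpack_from(">I", header, offset)[0]

    return {
        "index_logical_end": u32(124),
        "data_logical_end": u32(132),
        "key_count": u16(140),
        "key_information_offset": u32(148),
        "data_free_space_offset": u32(156),
        "index_free_space_offset": u32(164),
        "node_size": u16(174),
    }
```

The node size read from offset 174 is also the length of every other
record in the index file.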
Free Space Record (Fixed Length Record)
---------------------------------------
The Free Space record is equal in size to the index node size of
your file and contains the locations of free records in the index
or data file. Continuation records of the same size and structure
are created as needed, each pointing to the next continuation
record. The File Header record points to the first Free Space
record.
Free Space record description:
Size Description of the field
-------------------------------------------------------------------------
2 Bit 15 Leading security flag. Value should
match value of trailing security flag.
Bits 14-0 Pointer to end of last free record
address entry, relative to start of
this record.
4 Offset of Free Space continuation record. Zero if
no further continuation records.
4 Offset of a free record in index file
. . . . .
. . . . .
4 Offset of a free record in index file
2 Bit 15 Security Flag. Value should match
value of leading security flag.
Bits 14-0 Reserved. Value x"7F".
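
As an illustration of the layout above, the following Python sketch
decodes one Free Space record. It is a sketch under stated assumptions,
not the file handler's own logic: integers are taken to be big-endian,
the end pointer is taken to be the offset just past the last 4-byte
entry, and the trailing word is taken to occupy the last two bytes of
the record. The function name is hypothetical.

```python
import struct

def parse_free_space_record(rec):
    """Decode one Free Space record (a bytes object of node size).

    Assumptions: big-endian integers; the end pointer is the offset just
    past the last 4-byte entry; the trailing word is the last two bytes.
    """
    lead = struct.unpack_from(">H", rec, 0)[0]
    trail = struct.unpack_from(">H", rec, len(rec) - 2)[0]
    end = lead & 0x7FFF                          # bits 14-0
    continuation = struct.unpack_from(">I", rec, 2)[0]
    # Entries start after the 2-byte leading word and 4-byte continuation.
    offsets = [struct.unpack_from(">I", rec, pos)[0]
               for pos in range(6, end, 4)]
    return {
        "flags_match": (lead >> 15) == (trail >> 15),
        "continuation": continuation,            # zero if none
        "free_offsets": offsets,
    }
```

To collect every free record, you would repeat the call on the record at
the continuation offset until that offset is zero.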
Key Information Record
----------------------
The Key Information record is equal in size to the index node
size for your file. It describes the physical characteristics
of all the keys used in the indexed file, including the length
of each key, where the key is defined within the data record,
and whether duplicates are permitted. The File Header record
points to the Key Information record.
Within the Key Information record structure is a sub-structure,
the Key Block. A Key Block is created for each key defined.
The first Key Block always describes the prime key. Subsequent
Key Blocks define the alternate keys in the order specified
when the file was created.
If the Key Information record is not big enough to hold Key
Blocks for all the keys defined, equal sized continuation
records are created, each pointing to the next, until all
the keys have been defined.
Key Information record description:
Size Description of the field
-------------------------------------------------------------------------
2 Bit 15 Security Flag. Value 0.
Bits 14-0 Pointer to end of last Key Block entry
in this record relative to start of
this record.
4 Address of Key Information continuation record.
Zero if no further continuation records.
n Key Block for prime key
( . . )
( . . ) One for each alternate key in file
(n Key Block )
1 Reserved. Value x"FF".
1 Reserved. Value x"7E".
Key Block description:
Size Description of the field
-------------------------------------------------------------------------
2 Length of this entry in bytes.
4 Address of the root Index Node record for this key.
1 Key compression.
Bit 2 Compression of trailing spaces
Bit 1 Compression of leading characters
Bit 0 Compression of duplicates
5 Key-Component Block
( . . . ) If key is split, one block
( . . . ) per component
( 5 Key-Component Block )
Key-Component Block description:
Size Description of the field
-------------------------------------------------------------------------
2 Bit 15 Duplicates permitted flag. If
set, duplicates are permitted.
Bits 14-0 Length of component in bytes.
2 Offset of component within data record,
starting at 0.
1 Component type. Value zeros.
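
The nesting of Key Blocks and Key-Component Blocks described above can
be sketched in Python as follows. The sketch assumes big-endian integers
and assumes that the length field of a Key Block counts the whole block,
including the 7 bytes that precede the first Key-Component Block. The
function name and dictionary keys are hypothetical.

```python
import struct

def parse_key_block(buf, pos):
    """Decode the Key Block starting at buf[pos]; return (key, next_pos).

    Assumptions: big-endian integers; the length field counts the whole
    Key Block, including the 7 bytes that precede the component blocks.
    """
    length, root, compression = struct.unpack_from(">HIB", buf, pos)
    components = []
    # One 5-byte Key-Component Block per component of a split key.
    for cpos in range(pos + 7, pos + length, 5):
        word, offset, ctype = struct.unpack_from(">HHB", buf, cpos)
        components.append({
            "duplicates": bool(word & 0x8000),   # bit 15
            "length": word & 0x7FFF,             # bits 14-0
            "offset": offset,                    # within data record, from 0
            "type": ctype,
        })
    key = {
        "root_node": root,
        "compress_trailing_spaces": bool(compression & 0x04),  # bit 2
        "compress_leading": bool(compression & 0x02),          # bit 1
        "compress_duplicates": bool(compression & 0x01),       # bit 0
        "components": components,
    }
    return key, pos + length
```

In a Key Information record, the first Key Block starts at offset 6,
after the 2-byte leading word and the 4-byte continuation address. The
returned next_pos gives the start of the following Key Block.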
Index Node Record
-----------------
For each key defined, a complete and independent index is
constructed. It consists of a tree of Index Node records, each
record being the size of your index node and containing actual
key-values associated with data records written to the indexed
file. Every key-value in a node will point either to a
subordinate Index Node record or, if it is a leaf node, to
the data record associated with the key. The top level node
is called the root.
The default node size is 1024 bytes, but can change depending
on the largest key size defined for the file. If the largest
key is greater than 238 bytes, the node size will be 4096 bytes.
It is possible to change the node size by setting the XFHNODE
environment variable on DOS, Windows and OS/2 systems or the
isam_block_size run-time tunable on UNIX systems to one of
512, 1024 (the default) or 4096 bytes.
Note: If the largest Key-value is greater than 120 bytes, the
value of XFHNODE or isam_block_size is overridden and a
node size of 1024 bytes is used. Also, if the largest
Key-value is greater than 248 bytes, the node size is
automatically 4096 bytes.
Index Node record description:
Size Description of the field
-------------------------------------------------------------------------
2 Bit 15 Security Flag. Value should match
value of the trailing security flag.
Bits 14-0 Pointer to end of last Key-Value Block
in this record, relative to the start
of this record.
n Key-Value Block
. .
. .
n Key-Value Block
1 Index number.
The value is the same for all nodes belonging to
the same index tree. Contains zero if prime key.
1 Bit 7 Security flag. Value should match value
of the leading security flag.
Bits 6-0 Level of this node. Leaf nodes are
level 0.
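
The framing fields of an Index Node record can be decoded as in the
following Python sketch. It assumes big-endian integers and assumes that
the index number and level bytes are the last two bytes of the node; the
function name and dictionary keys are hypothetical.

```python
import struct

def parse_node_frame(node):
    """Decode the framing fields of one Index Node record.

    Assumptions: big-endian integers; the index number and level bytes
    are the last two bytes of the node.
    """
    lead = struct.unpack_from(">H", node, 0)[0]
    index_number = node[-2]
    last = node[-1]
    return {
        "end_of_blocks": lead & 0x7FFF,      # bits 14-0 of the leading word
        "index_number": index_number,        # 0 for the prime key
        "level": last & 0x7F,                # bits 6-0; 0 means leaf
        "is_leaf": (last & 0x7F) == 0,
        "flags_match": (lead >> 15) == (last >> 7),
    }
```

A mismatch between the leading and trailing security flags is worth
treating as a sign that the node should not be trusted.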
Key-Value Block description:
Size Description of the field
-------------------------------------------------------------------------
1/2 Optional. Compression character count.
This field is present only if compression is
enabled for this key. It contains a count of the
number of characters (leading and/or trailing) that
have been suppressed. If both leading and trailing
suppression is enabled, this field is two bytes in
length.
n Key-value.
2/4 Optional. Duplicate occurrence number.
This field is present only if duplicates are
allowed for this key. Its size (2 or 4 bytes) is
given at offset 143 of the Index File Header
record. It contains the duplicate occurrence
count. The first key stored that is a duplicate
has this field set to 1, the second duplicate
has it set to 2, and so on.
4 Bit 31 Reserved. Set if the next Key-value
block is a duplicate of this one and
duplicate compression is enabled.
Bits 30-0 Address of the data record in the data
file if this is a leaf node; otherwise,
the address of the subordinate Index
Node record in the index file.
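
The final 4-byte field of a Key-Value Block packs a flag and an address
into one word. The following Python sketch separates the two, assuming
the field is stored as a big-endian 32-bit integer. The function name is
hypothetical.

```python
import struct

def decode_pointer(word_bytes):
    """Split the final 4-byte field of a Key-Value Block.

    Assumption: the field is a big-endian 32-bit integer.
    """
    (word,) = struct.unpack(">I", word_bytes)
    next_is_duplicate = bool(word & 0x80000000)  # bit 31
    address = word & 0x7FFFFFFF                  # bits 30-0
    return next_is_duplicate, address
```

Whether the address refers to the data file or to the index file depends
on the level of the node that contains the block: level 0 (leaf) nodes
point to data records.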
Data File Structure
-------------------
The data file of an indexed file is a variable format sequential
file. The structure of such a file is described earlier in this
document. This file contains all the data records. It can be
processed as a sequential file either by defining the file as
ORGANIZATION SEQUENTIAL and adding a RECORDING MODE IS V clause
to the otherwise unchanged FD, or by specifying the CALLFH
directive. The file can then be opened and read sequentially.
Because the data file is not ordered in any particular way,
do not rely on the records being returned in any particular order.
Information about free records in the data file is maintained
so that space created by deleting records can be re-used,
preventing the file from growing too quickly. In a fixed format
indexed file, this information is held in a Free Space record
in the index file. This record has the same structure as the
index Free Space record, except the addresses point to data
file records. In a variable format file the information is
held in a system record in the data file.
In a variable structure data file, all record slots are a
multiple of 4 bytes. For each slot length in a variable format
file, a chain is maintained for all slots of that length that
are free. The start of the chain for all lengths is maintained
in the Data Free Space record in the data file. This system
record is always the same length as the maximum slot length
or possible maximum compressed length (a record's length
may increase when it is compressed) for the file.
Each free slot in a chain holds the address of the next free
slot of the same length in the four bytes that follow its
record header. The last slot in the chain holds an address
of zero.
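
Following a chain of free slots can be sketched in Python as shown
below. The sketch assumes big-endian integers, assumes that the whole
data file is held in memory, and takes the record header length (2 or 4
bytes) as a parameter. The function name and parameter names are
hypothetical.

```python
import struct

def walk_free_chain(data, first, header_size=2):
    """Return the offsets of the free slots in one chain.

    Assumptions: big-endian integers; `data` holds the whole data file;
    `header_size` is the record header length (2 or 4 bytes).
    """
    slots = []
    seen = set()                                 # guards against a looping chain
    offset = first
    while offset != 0 and offset not in seen:    # zero ends the chain
        seen.add(offset)
        slots.append(offset)
        # The next address follows the record header of the free slot.
        offset = struct.unpack_from(">I", data, offset + header_size)[0]
    return slots
```

The `first` argument is the chain start taken from the Data Free Space
record for the slot length of interest.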
Data Free Space record description:
Offset Size Description of the field
-------------------------------------------------------------------------
0 2/4 Header of the record.
2/4 4 Offset of the first free data slot of length 8
bytes.
6/8 4 Offset of the first free data slot of length 12
bytes.
. . . . .
. . . . .
n 4 Offset of the first free data slot of maximum
length.
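
Because slot lengths are multiples of 4 and the first entry is for
length 8, the position of the entry for a given slot length follows from
simple arithmetic. The following Python sketch shows the calculation. It
assumes that the entries are contiguous, with one 4-byte entry per slot
length; the function name is hypothetical.

```python
def free_chain_entry_offset(slot_length, header_size=2):
    """Offset, within the Data Free Space record, of the chain start for
    free slots of `slot_length` bytes.

    Assumption: entries are contiguous, one 4-byte entry per slot length,
    starting with length 8 immediately after the record header.
    """
    if slot_length < 8 or slot_length % 4 != 0:
        raise ValueError("slot length must be a multiple of 4, at least 8")
    # One 4-byte entry per 4-byte step in slot length, so the step cancels.
    return header_size + (slot_length - 8)
```

With a 2-byte header this gives offset 2 for length 8 and offset 6 for
length 12; with a 4-byte header it gives 4 and 8, matching the table
above.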
=========================================================================
Micro Focus is a registered trademark of Micro Focus Limited.
=========================================================================
@(#)Vrn/file.1/3.1.03/15Jul93/nrV
Copyright (C) 1993 Micro Focus Limited