RDB/VMS Relational Database Operator DEFINE_INDEX — VMS RDB_4.1A

 Creates an index for a relation.  An index allows Rdb/VMS direct
 access to the records in the relation, to avoid sequential searching.
 You can define a sorted index or a hashed index.  A sorted index uses
 the B-tree method of retrieval.  A hashed index uses hash addressing
 for exact match retrievals.

 Example

   RDO>  DEFINE INDEX EMP_EMPLOYEE_ID FOR EMPLOYEES
   cont>   DUPLICATES ARE NOT ALLOWED
   cont>   TYPE IS SORTED.
   cont>     EMPLOYEE_ID.
   cont>  END EMP_EMPLOYEE_ID INDEX.

Additional information available:

MoreFormatExamples

 To define an index for a relation using the DEFINE INDEX statement,
 you need the Rdb/VMS DEFINE privilege for the relation.

 When the DEFINE INDEX statement executes, Rdb/VMS adds the index
 definition to the physical database.  If you have invoked the
 database with the PATHNAME specification, the definition is also
 added to the data dictionary.

 If you define a sorted index for a relation that contain no data, the
 root node for the index is not created until the first record is
 stored.  When an RMU/VERIFY operation encounters a sorted index with
 no root node, it reports the index as empty.

 You must execute this statement in a read/write transaction.  If you
 issue this statement when there is no active transaction, Rdb/VMS
 starts a read/write transaction implicitly.

 You cannot define a hashed index in a single-file database.

 If you define a hashed index, you must specify a storage map for it
 with the index-store-clause.  The storage area that you specify must
 have a MIXED page format.

 An optional keyword that causes Rdb/VMS to create ascending or
 descending index segments.  If you omit the ASCENDING or DESCENDING
 keywords, ASCENDING is the default.  To sort records in a particular
 way in the result of a particular query, specify the sort order in
 the SORTED BY clause.

 Other users are allowed to be attached to the database when you issue
 the DEFINE INDEX statement.

Format

 DEFINEINDEX name ┬───────────────>────────────┬─> Ftypebox (O)typebox (R) relation-name ─┐
                   └> 
D
E
S
C
R
I
P
T
I
O
N typebox (I)typebox (S) typebox (/)typebox (*) text */ ┘                      │
 ┌─────────────────────────────<───────────────────────────────────────┘
 └───┬─────────>──────────┬────┬───────────>────────────┬───────┐
     └> duplicates-clause ┘    └> index-storage-clause ─┘       │
 ┌─────────────────────────────<────────────────────────────────┘
 └─┬─────────────>───────────────┬───────────────────────────┬──┐
   ├─> 
T
Y
P
E typebox (I)typebox (S) ──┬───> 
H
A
S
H
E
D ───┘                           │  │
   │             └───> 
S
O
R
T
E
D ──> sorted-index-param-list ───┤  │
   └──────────────> sorted-index-param-list ─────────────────┘  │
 ┌───────────────────────── . <─────────────────────────────────┘
 └┬─> field-name ──┬─┬──────────────────>────────────────┬─┬──┬─┐
  │                │ ├> 
A
S
C
E
N
D
I
N
G ───────────────────────┤ │  │ │
  │                │ ├> 
D
E
S
C
E
N
D
I
N
G ──────────────────────┤ │  │ │
  │                │ ├> 
S
I
Z
E 
I
S n ───────────────────────┤ │  │ │
  │                │ └> 
M
A
P
P
I
N
G 
V
A
L
U
E
S lo-val TO hi-val ─┘ │  │ │
  │                └────────────────────<──────────────────┘  │ │
  └───────────────────────── . <──────────────────────────────┘ │
 ┌────────────────────────── . <────────────────────────────────┘
 └─> 
E
N
D ┬───>────┬> typebox (I)typebox (N)typebox (D)typebox (E)X ─> .
         └─> name ┘

Additional information available:

TYPE_ISASCENDINGDESCENDINGSIZE_ISMAPPING_VALUES

nametextrelation-nameduplicates-clauseindex-storage-clause
sorted-index-paramfield-name

name

 The name of the index.  You can use this name to refer to the index
 in other statements.  When choosing a name, follow these rules:

  o  Use a name that is unique among all index names in the database.

  o  Use any valid VMS name.  However, the name cannot end in a dollar
     sign ($) or underscore (_).

  o  Do not use any Rdb/VMS reserved words (see The VAX Rdb/VMS
     Reference Manual appendix for reserved words).

text

 A text string that adds a comment to the index definition.

relation-name

 The name of the relation that includes the index.

duplicates-clause

 duplicates-clause =

 ───> 
D
U
P
L
I
C
A
T
E
S typebox (A)typebox (R)typebox (E) ──┬───────>─┬──> 
A
L
L
O
W
E
D ──>
                       └─> 
N
O
T ──┘

 A clause specifying whether each value of the index must be unique.
 If you try to store the same value twice in an indexed field defined
 as DUPLICATES NOT ALLOWED, Rdb/VMS returns an error message and does
 not store or modify the record.

index-storage-clause

 index-storage-clause =

 STORE ─┬> index-within-clause ──────────────────────────┬─┐
        └> 
U
S
I
N
G ─┬> field-name ─┬> index-within-clause ─┘ │
                  └───── , <─────┘                         │
  ┌────────────────────────<───────────────────────────────┘
  └─┬─────────────>──────────────┬──>
    └> default-threshold-clause ─┘

 The index-storage-clause creates a storage map definition for the
 index.  It allows you to choose which storage area files will be used
 to store index entries.  You can store all index entries for one
 relation in a single storage area, or you can partition the entries
 over multiple storage areas.

Additional information available:

index-within-clausefield-namedefault-threshold-clause

index-within-clause

 index-within-clause =

 WITHIN ─┬─> area-name ─┬──────────────>───────────────┬─┐
         │              └> 
W
I
T
H 
L
I
M
I
T typebox (O)F ─┬> literal ─┬┘ │
         │                                └──── , <───┘  │
         │ ┌────────────────────────<────────────────────┘
         │ └─┬──────────>──────────┬──┬─────────>
         │   └─> threshold-clause ─┘  │
         └───────────── typebox (;) <───────────┘

Additional information available:

WITH_LIMIT_OF

area-namethreshold-clause

area-name

 The name of the storage area in which you want the index stored.  You
 must have defined this storage area with either the DEFINE DATABASE
 statement or the DEFINE STORAGE AREA clause of the CHANGE DATABASE
 statement before you refer to it in the store clause.

 If the index is a hashed index, the storage area must have a MIXED
 page format.

WITH_LIMIT_OF

 The maximum value for the index key that will reside in the specified
 storage area.

 The number of literals in this clause must be the less than or equal
 to the number of fields in the USING clause.  Repeat this clause to
 partition the index entries among multiple storage areas.

 When you define a multisegmented index using multiple keys and use
 the STORE USING...WITH LIMITS clauses, if the values for the first
 key are all the same, then set the limit for the first key at that
 value.  By doing this, you ensure that the value of the second key
 determines the storage area in which each record will be stored.

 Note that the last storage area you specify CANNOT have a WITH LIMIT
 OF clause associated with it.

threshold-clause

 threshold-clause=

 ──> 
T
H
R
E
S
H
O
L
D
S typebox (A)typebox (R)typebox (E) ──────────┐
  ┌─────────────<─────────────┘
  └─> ( ──> val1 ─┬───────────>──────────┬─> ) ──>
                  └─> ,val2 ──┬─────>────┤
                              └─> ,val3 ─┘

 Specifies associated threshold values for each storage area with
 uniform format that is specified in the index-within-clause.  By
 setting threshold values, you can make sure that Rdb/VMS does not
 overlook a page with sufficient space to store compressed data.

 The threshold values specify when the page is marked as FULL in the
 SPAM page free space inventory lists.  For example, if you set
 default values of 70, 85, and 95 percent, ranges of guaranteed free
 space on each data page are 30, 15, and 5 percent, respectively.  If
 you do not set default values, the values are (0,0,0).  With values
 of (0,0,0), Rdb/VMS will use the record length when setting the SPAM
 fullness.  Rdb/VMS will never store a record on a page at threshold
 3.  The value you set for the highest threshold can be used to
 reserve space on the page for future record growth.

 If you specify a value of 40 for the "val1" parameter, but do not
 specify values for the "val2" or "val3" parameters, the threshold
 values will be set at (40,100,100).

 If you use data compression, you should use logical area thresholds
 to obtain optimum storage performance.

field-name

 The name of the field that specifies what value will be used as a
 limit for partitioning the index across multiple storage areas.

 If the index key is multisegmented, you can include some or all of
 the fields that are joined to form the index key.  Separate multiple
 field names with commas.

default-threshold-clause

 default-threshold-clause =

 ──> 
D
E
F
A
U
L
T 
T
H
R
E
S
H
O
L
D
S typebox (A)typebox (R)typebox (E) ──┐
  ┌─────────────<─────────────┘
  └─> ( ──> val1 ─┬───────────>──────────┬─> ) ──>
                  └─> ,val2 ──┬─────>────┤
                              └─> ,val3 ─┘

 Specifies associated threshold values for each storage area with
 uniform format that is NOT specified in the index-within-clause.  By
 setting threshold values, you can make sure that Rdb/VMS does not
 overlook a page with sufficient space to store compressed data.

 The threshold values specify when the page is marked as FULL in the
 SPAM page free space inventory lists.  For example, if you set
 default values of 70, 85, and 95 percent, ranges of guaranteed free
 space on each data page are 30, 15, and 5 percent, respectively.  If
 you do not set default values, the values are (0,0,0).  With values
 of (0,0,0), Rdb/VMS will use the record length when setting the SPAM
 fullness.  Rdb/VMS will never store a record on a page at threshold
 3.  The value you set for the highest threshold can be used to
 reserve space on the page for future record growth.

 If you specify a value of 40 for the "val1" parameter, but do not
 specify values for the "val2" or "val3" parameters, the threshold
 values will be set at (40,100,100).

 If you use data compression, you should use logical area thresholds
 to obtain optimum storage performance.

TYPE_IS

 Specifies whether Rdb/VMS creates a B-tree index structure (SORTED),
 or a hashed index structure (HASHED).  If you specify HASHED, you
 cannot choose options from the sorted-index-param-list.  Hashed
 indexes are effective only for exact match retrievals.  Sorted
 indexes are effective for range retrievals.

sorted-index-param

 sorted-index-param-list =

  ────┬────┬──────────────────>──────────────────────┬───┬─────────>
      │    ├─> 
N
O
D
E 
S
I
Z
E ─────> number-bytes ────────┤   │
      │    ├─> 
P
E
R
C
E
N
T 
F
I
L
L ──────> percentage ──────┤   │
      │    └─> 
U
S
A
G
E ─────────┬─────> 
U
P
D
A
T
E ───┬────┘   │
      │                       └─────> 
Q
U
E
R
Y ────┘        │
      │                                                  │
      └────────────────────<─────────────────────────────┘

Additional information available:

NODE-SIZEPERCENT-FILL

USAGE-mode

NODE-SIZE

 The size of each index node.  The number and level of the resulting
 index nodes depend on this value, the number and size of the index
 keys, and the value of the PERCENT FILL clause.

PERCENT-FILL

 Sets the initial percentage of fullness for each node in the index
 structure.  The valid range is 1 to 100 percent.

USAGE-mode

 USAGE UPDATE, the default, sets the percent fullness of each index
 node at 70 percent.  USAGE QUERY sets this value at 100 percent.
 Supplying the PERCENT FILL and USAGE clause is allowed in the syntax;
 however, the USAGE option takes precedence over an explicit PERCENT
 FILL value.

field-name

 The name of the field(s) that make up the index.  You can create a
 multisegment index by naming two or more fields, which are joined to
 form the index.  All of the fields must be part of the same relation.
 Separate multiple field names with periods.

                                 NOTE

         If field-name refers to a field defined as VARYING
         STRING data type, the size of the field must be less
         than or equal to 255.

ASCENDING

 An optional keyword that causes Rdb/VMS to create ascending index
 segments.  ASCENDING is the default.  To sort records in a particular
 way in the result of a particular query, specify the sort order in
 the SORTED BY clause.

DESCENDING

 An optional keyword that causes Rdb/VMS to create descending index
 segments.  To guarantee that records will be sorted in a particular
 way in the result of a particular query, specify the sort order in
 the SORTED BY clause.

SIZE_IS

 A compression clause that specifies that the "first n" characters of
 a certain key are to be used in the index.  These are specified with
 the DUPLICATES ARE ALLOWED clause.  For example, if you wanted to
 place an index on a 100 byte field that is generally unique to the
 first 20 bytes, you could specify the first 20 bytes and save as much
 as 80 bytes per entry.

MAPPING_VALUES

 A compression clause for all-numeric fields that translates the field
 values into a more compactly encoded form.  You can mix mapped and
 unmapped fields, but the most storage space is gained by building
 indexes of multiple fields of data type WORD or LONGWORD.  Rdb/VMS
 attempts to pack all such fields into the smallest possible space.

 The "lo-val" (low value) through "hi-val" (high value) parameters
 specify the range of integers as the value of the index key.

 The valid range of the compressed key:

  o  Cannot be zero

  o  The range "high-val" through "low-val" is limited to (2**31) - 4
     x (10**scale)

     If the value of the key is less than zero or greater than
     (2**31) - 4 x (10**scale), Rdb/VMS signals an exception.


 The following notes refer to compressed indexes:

 All text compressed indexes require the DUPLICATES ARE ALLOWED
 clause, which is the default for the DEFINE INDEX statement.  If the
 SIZE IS clause is specified, the field referred to by the clause must
 be of the TEXT or VARYING TEXT data type.  The field must also be the
 same length or greater in length than the value specified in the SIZE
 IS clause.

 For integer field compressed indexes, the index field must be of data
 type WORD or LONGWORD.  You can mix mapped and unmapped fields, but
 the most storage space is gained by building indexes of multiple
 fields of data type WORD or LONGWORD.  Rdb/VMS attempts to pack all
 such fields into the smallest possible space.

 Compressed key suffixes also enable the user to use fields longer
 than 254 characters as index keys.

 If any data values already stored are less than "lo-val" or greater
 than "hi-val", the DEFINE INDEX statement will fail.

 A subsequent STORE or MODIFY operation that attempts to store a value
 less than "lo-val" or greater than "hi-val" will fail.

Examples

 Example 1

 The following example creates a simple relation index:

 DEFINE INDEX EMP_EMPLOYEE_ID FOR EMPLOYEES
   DUPLICATES ARE NOT ALLOWED.
     EMPLOYEE_ID.
  END EMP_EMPLOYEE_ID INDEX.

 This statement names the index and names the field to serve as the
 index key.

 The clause, DUPLICATES ARE NOT ALLOWED, causes Rdb/VMS to return an
 error message if a user attempts to store an identification number
 that is already assigned.


 Example 2

 The following example defines a hashed index, and uses the store
 clause to partition the index into different storage areas:

 DEFINE INDEX EMPLOYEES_HASH
   DESCRIPTION IS /* Hash index for employees */
   FOR EMPLOYEES
   DUPLICATES ARE NOT ALLOWED
   STORE USING EMPLOYEE_ID
    WITHIN
     EMPIDS_LOW WITH LIMIT OF "00200";
     EMPIDS_MID WITH LIMIT OF "00400";
     EMPIDS_OVER
   TYPE IS HASHED.
   EMPLOYEE_ID.
 END EMPLOYEES_HASH.

 Example 3

 The following example creates a multisegment index:

 DEFINE INDEX EMP_FULL_NAME FOR EMPLOYEES
    DUPLICATES ARE ALLOWED.
      LAST_NAME.
      FIRST_NAME.
      MIDDLE_INITIAL.
 END EMP_FULL_NAME INDEX.

 This statement names three fields.  Rdb/VMS concatenates these three
 fields to make the multisegment index key.


 Example 4

 The following example defines the EMP_EMPLOYEE_ID index and causes
 the LAST_NAME segment to be defined in DESCENDING order.

 DEFINE INDEX EMP_FULL_NAME FOR EMPLOYEES.
 LAST_NAME DESCENDING.
 FIRST_NAME ASCENDING.
 MIDDLE_INITIAL.
 END EMP_FULL_NAME INDEX.

 Having defined such an index on a particular field does not guarantee
 that Rdb/VMS will use that index in a particular query retrieval.  To
 ensure a particular sort order is returned by a particular query, you
 have to specify that order in the RSE:

 FOR E IN EMPLOYEES
     SORTED BY DESCENDING E.STATUS_CODE,
               ASCENDING E.LAST_NAME,
               DESCENDING E.EMPLOYEE_ID
                               .
                    .
                    .



 Example 5

 The following example defines the JH_EMPLOYEE_ID index and sets each
 node size to 350 bytes and the initial fullness of each node to 50
 percent:

 DEFINE INDEX JH_EMPLOYEE_ID FOR JOB_HISTORY
     DUPLICATES ARE ALLOWED
     NODE SIZE 350
     PERCENT FILL 50.
         EMPLOYEE_ID.
 END JH_EMPLOYEE_ID INDEX.
 Example 6

 The following example defines three new uniform storage areas
 (EMPIDS_LOW_UNIFORM, EMPIDS_MID_UNIFORM, and EMPIDS_OVER_UNIFORM) for
 the MF_PERSONNEL database.  It also defines a sorted index for the
 LAST_NAME field of the EMPLOYEES relation.

 RDO> CHANGE DATABASE FILENAME MF_PERSONNEL
 cont>  DEFINE STORAGE AREA EMPIDS_LOW_UNIFORM
 cont>      FILENAME EMPIDS_LOW_UNIFORM.RDA
 cont>      END EMPIDS_LOW_UNIFORM STORAGE AREA
 cont>  DEFINE STORAGE AREA EMPIDS_MID_UNIFORM
 cont>      FILENAME EMPIDS_MID_UNIFORM.RDA
 cont>      END EMPIDS_MID_UNIFORM STORAGE AREA
 cont>  DEFINE STORAGE AREA EMPIDS_OVER_UNIFORM
 cont>      FILENAME EMPIDS_OVER_UNIFORM.RDA
 cont>      END EMPIDS_OVER_UNIFORM STORAGE AREA.
 RDO> !
 RDO> INVOKE DATABASE FILENAME MF_PERSONNEL
 RDO> !
 RDO> DEFINE INDEX EMP_FIRST_NAME_SORTED
 cont> FOR EMPLOYEES
 cont> DUPLICATES ARE ALLOWED
 cont> STORE USING FIRST_NAME
 cont>    WITHIN EMPIDS_LOW_UNIFORM WITH LIMIT OF "HOWARD"
 cont>        THRESHOLDS ARE (70,80,95);
 cont>        EMPIDS_MID_UNIFORM WITH LIMIT OF "PETER";
 cont>        EMPIDS_OVER_UNIFORM
 cont>    DEFAULT THRESHOLDS ARE (60,70,80)
 cont> TYPE IS SORTED.
 cont> FIRST_NAME.
 cont> END EMP_FIRST_NAME_SORTED INDEX.

 The EMP_FIRST_NAME_SORTED index stores index entries into one of the
 three storage areas based on the values stored in the LAST_NAME
 field.

 Note that the EMPIDS_LOW_UNIFORM storage area receives the threshold
 values of (70,80,95) specified with the THRESHOLDS clause, and the
 EMPIDS_MID_UNIFORM and EMPIDS_OVER_UNIFORM storage areas receive the
 default threshold values of (60,70,80) specified with the DEFAULT
 THRESHOLDS clause.

 Example 7

 To create a compressed index for fields that use SIGNED BYTE, SIGNED
 WORD, and SIGNED LONGWORD data types, use the MAPPING VALUES clause
 of the DEFINE INDEX statement for the field or fields being indexed,
 as shown in the examples.  You can use the DUPLICATES ARE NOT ALLOWED
 clause with an integer compressed index.  In the examples,
 PRODUCT_ID, YEAR_NUMBER, and PRODUCT_DESCR are the three fields that
 are defined with the DUPLICATES ARE NOT ALLOWED clause.

 DEFINE INDEX PS_DATE_2 FOR PRODUCT_SCHEDULE
 DUPLICATES ARE NOT ALLOWED.
   PRODUCT_ID.
   YEAR_NUMBER         MAPPING VALUES 1970 to 2070.
   PRODUCT_DESCR       SIZE IS 20.
 END.

 Note that you can mix mapped and unmapped fields, but the most
 storage space is gained by building indexes of multiple fields of
 data type of SIGNED WORD or SIGNED LONGWORD.  Rdb/VMS attempts to
 pack all such fields into the smallest possible space.

 Example 8

 The following examples define several text and integer compressed
 indexes.

 !
 ! Define the integer compressed index PORT_NUM_CINDEX on the PORT_NUM
 ! field of the PORT relation:
 !
 DEFINE INDEX PORT_NUM_CINDEX FOR PORT DUPLICATES NOT ALLOWED.
 PORT_NUM MAPPING VALUES 90 TO 1000.
 END.
 !
 ! Define the text compressed index PORT_COUNTRY_CITY_INDEX on the
 ! COUNTRY and CITY fields of the PORT relation:
 !
 DEFINE INDEX PORT_COUNTRY_CITY_CINDEX FOR PORT.
 COUNTRY SIZE 10.
 CITY SIZE 5.
 END.
 !
 ! Produce a list of ports sorted by COUNTRY and CITY name from
 ! the PORT relation:
 !
 FOR P IN PORT SORTED BY P.COUNTRY, P.CITY PRINT P.* END_FOR
 !
 ! Define the integer compressed index CHANNEL_DEPTH_CINDEX on the
 ! CHANNEL_DEPTH field of the PORT relation:
 !
 DEFINE INDEX CHANNEL_DEPTH_CINDEX FOR PORT.
 CHANNEL_DEPTH MAPPING VALUES 20 TO 100.
 END.
 !
 ! Define the integer compressed index DISTANCE_CINDEX on the
 ! DISTANCE field of the PORT relation:
 !
 DEFINE INDEX DISTANCE_CINDEX FOR PORT.
 DISTANCE MAPPING VALUES 0 TO 30000.
 END.
 !
 ! Define the integer compressed index TRANS_COST_CINDEX on the
 ! TRANS_COST field of the PORT relation:
 !
 DEFINE INDEX TRANS_COST_CINDEX FOR PORT.
 TRANS_COST MAPPING VALUES 0 TO 10000.
 END.
 !
 ! Define the integer compressed index PNUM_CDEP_DIST_TCOST_CINDEX
 ! on the PORT_NUM, CHANNEL_DEPTH, DISTANCE, and TRANS_COST fields
 ! of the PORT relation:
 !
 DEFINE INDEX PNUM_CDEP_DIST_TCOST_CINDEX FOR PORT
 DUPLICATES NOT ALLOWED.
 PORT_NUM MAPPING VALUES 100 TO 1000.
 CHANNEL_DEPTH MAPPING VALUES 20 TO 100.
 DISTANCE MAPPING VALUES 0 TO 30000.
 TRANS_COST MAPPING VALUES 0 TO 10000.
 END.
 !
 ! Define the CHANNEL_DEPTH_DESC_CINDEX index on the CHANNEL_DEPTH
 ! field of the PORT relation. The DESCENDING keyword causes
 descending
 ! index segments to be created for the CHANNEL_DEPTH field.
 !
 DEFINE INDEX CHANNEL_DEPTH_DESC_CINDEX FOR PORT.
 CHANNEL_DEPTH DESCENDING MAPPING VALUES 20 TO 100.
 END.

Museum

Related Articles

RDB/VMS Relational Database Operator DEFINE_INDEX — VMS RDB_4.1A

More

Format

name

text

relation-name

duplicates-clause

index-storage-clause

index-within-clause

area-name

WITH_LIMIT_OF

threshold-clause

field-name

default-threshold-clause

TYPE_IS

sorted-index-param

NODE-SIZE

PERCENT-FILL

USAGE-mode

field-name

ASCENDING

DESCENDING

SIZE_IS

MAPPING_VALUES

Examples