chrtbl(1M)                     DG/UX 5.4R3.00                     chrtbl(1M)


NAME
       chrtbl - generate character classification and conversion tables

SYNOPSIS
       chrtbl [file]
       chrtbl -d [ ctypefile [ numfile ] ]

DESCRIPTION
       The chrtbl command can be used in two ways:  without the -d option,
       to create tables of character classification information; and with
       the -d option, to dump a text version of such tables.

       The chrtbl -d ctypefile numfile command dumps to its standard
       output a text version of the LC_CTYPE character table in file
       ctypefile and the LC_NUMERIC numeric information table in file
       numfile.  If numfile, or both ctypefile and numfile, are not
       specified, the corresponding table from the current locale is dumped.
       You can modify the resulting text file, and use it as input to
       chrtbl, to produce modified LC_CTYPE and LC_NUMERIC files.  These
       files may be used to either replace the existing LC_CTYPE and
       LC_NUMERIC files in an existing locale, or to create a new locale.
       However, you must never modify any of the files (including LC_CTYPE
       and LC_NUMERIC) in /usr/lib/locale/C, the C locale.

       The chrtbl command without the -d option creates two tables
       containing information on character classification, upper/lower-case
       conversion, character-set width, and numeric formatting.  One table
       is an array of (257*2) + 7 bytes that is encoded so a table lookup
       can be used to determine the character classification of a character,
       convert a character [see ctype(3C)], and find the byte and screen
       width of a character in one of the supplementary code sets.  The
       other table contains information about the format of non-monetary
       numeric quantities: the first byte specifies the decimal delimiter;
       the second byte specifies the thousands delimiter; and the remaining
       bytes comprise a null-terminated string indicating the grouping (each
       element of the string is taken as an integer that indicates the
       number of digits that comprise the current group in a formatted
       non-monetary numeric quantity).
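
       The byte layout of the numeric table described above can be modeled
       with a short Python sketch.  The function names pack_numeric and
       unpack_numeric are hypothetical, and the sketch covers only the
       layout stated here (two delimiter bytes followed by a
       null-terminated grouping string); the file chrtbl actually writes
       may differ in details not documented on this page.

```python
def pack_numeric(decimal_point, thousands_sep, grouping):
    """Pack two delimiter bytes followed by a NUL-terminated grouping string."""
    if len(decimal_point) != 1 or len(thousands_sep) != 1:
        raise ValueError("each delimiter must be a single character")
    if 0 in grouping:
        raise ValueError("a grouping element of 0 would end the string early")
    return (decimal_point.encode("latin-1")
            + thousands_sep.encode("latin-1")
            + bytes(grouping)
            + b"\x00")


def unpack_numeric(table):
    """Split a packed table into (decimal_point, thousands_sep, group_sizes)."""
    end = table.index(0, 2)  # the grouping string stops at the first NUL byte
    return chr(table[0]), chr(table[1]), list(table[2:end])


table = pack_numeric(".", ",", b"\x03")
print(table)                  # b'.,\x03\x00'
print(unpack_numeric(table))  # ('.', ',', [3])
```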

       chrtbl reads the user-defined character classification and conversion
       information from file and creates three output files in the current
       directory.  To construct file, use the file supplied in
       /usr/lib/locale/C/chrtbl_C, or the output of chrtbl -d, as a starting
       point.  You may add entries, but do not change the original values
       supplied with the system.  For example, for other locales you may
       wish to add eight-bit entries to the ASCII definitions provided in
       this file.

       One output file, ctype.c (a C-language source file), contains a
       (257*2)+7-byte array generated from processing the information from
       file.  You should review the content of ctype.c to verify that the
       array is set up as you had planned.  (In addition, an application
       program could use ctype.c.)  The first 257 bytes of the array in
       ctype.c are used for character classification.  The characters used
       for initializing these bytes of the array represent character
       classifications that are defined in /usr/include/ctype.h; for
       example, L means a character is lower case and S|B means the
       character is both a spacing character and a blank.  The second 257
       bytes of the array are used for character conversion.  These bytes of
       the array are initialized so that characters for which you do not
       provide conversion information will be converted to themselves.  When
       you do provide conversion information, the first value of the pair is
       stored where the second one would be stored normally, and vice versa;
       for example, if you provide <0x41 0x61>, then 0x61 is stored where
       0x41 would be stored normally, and 0x41 is stored where 0x61 would be
       stored normally.  The last 7 bytes are used for character width
       information for up to three supplementary code sets.
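
       The conversion half of the array can be illustrated with a minimal
       Python sketch.  The name build_conversion_table is hypothetical, and
       the sketch indexes a 256-entry list directly by character code; how
       the real 257-entry sections are offset is not described on this page
       and is not modeled.

```python
TABLE_BYTES = 257 * 2 + 7  # total array size stated above: 521 bytes


def build_conversion_table(ul_pairs, size=256):
    """Return a list in which table[code] is the converted character code."""
    table = list(range(size))   # default: every character converts to itself
    for upper, lower in ul_pairs:
        table[upper] = lower    # the upper-case slot holds the lower-case code
        table[lower] = upper    # the lower-case slot holds the upper-case code
    return table


conv = build_conversion_table([(0x41, 0x61)])
print(TABLE_BYTES)                                        # 521
print(hex(conv[0x41]), hex(conv[0x61]), hex(conv[0x42]))  # 0x61 0x41 0x42
```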

       The second output file (a data file) contains the same information,
       but is structured for efficient use by the character classification
       and conversion routines (see ctype(3C)).  The name of this output
       file is the value you assign to the keyword LC_CTYPE read in from
       file.  Before this file can be used by the character classification
       and conversion routines, it must be installed in the
       /usr/lib/locale/locale directory with the name LC_CTYPE by someone
       who is super-user or a member of group bin.  This file must be
       readable by user, group, and other; no other permissions should be
       set.  To use the character classification and conversion tables in
       this file, set the LC_CTYPE environment variable appropriately (see
       environ(5) or setlocale(3C)).
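
       The installation step can be sketched in Python as follows.  The
       helper install_table is hypothetical, and the demonstration copies a
       placeholder file into a temporary directory rather than
       /usr/lib/locale, so it needs no special privileges.  It assumes a
       POSIX file system, where "readable by user, group, and other; no
       other permissions" corresponds to mode 0444.

```python
import os
import shutil
import stat
import tempfile


def install_table(src, locale_root, locale, name="LC_CTYPE"):
    """Copy src to <locale_root>/<locale>/<name> and make it read-only for all."""
    dest_dir = os.path.join(locale_root, locale)
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, name)
    shutil.copyfile(src, dest)
    # r--r--r--: readable by user, group, and other; nothing else is set
    os.chmod(dest, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return dest


# Demonstration in a scratch directory; "usa" and "french" are sample names.
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "usa")
    with open(src, "wb") as fh:
        fh.write(b"placeholder table data")
    dest = install_table(src, os.path.join(root, "locale"), "french")
    installed_name = os.path.basename(dest)
    mode = stat.S_IMODE(os.stat(dest).st_mode)
    print(installed_name, oct(mode))
```

       On a real system the destination would be /usr/lib/locale/locale,
       and the copy must be made by the super-user or a member of group bin.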

       The third output file (a data file) is created only if numeric
       formatting information is specified in the input file.  The name of
       this output file is the value you assign to the keyword LC_NUMERIC
       read in from file.  Before this file can be used, it must be
       installed in the /usr/lib/locale/locale directory with the name
       LC_NUMERIC by someone who is super-user or a member of group bin.
       This file must be readable by user, group, and other; no other
       permissions should be set.  To use the numeric formatting information
       in this file, set the LC_NUMERIC environment variable appropriately
       (see environ(5) or setlocale(3C)).

       The name of the locale where you install the files LC_CTYPE and
       LC_NUMERIC should correspond to the conventions defined in file.  For
       example, if French conventions were defined, and the name for the
       French locale on your system is french, then you should install the
       files in /usr/lib/locale/french.

       If no input file is given, or if the argument "-" is encountered,
       chrtbl reads from standard input.

       The syntax of file allows the user to define the names of the data
       files created by chrtbl, the assignment of characters to character
       classifications, the relationship between upper and lower-case
       letters, byte and screen widths for up to three supplementary code
       sets, and three items of numeric formatting information: the decimal
       delimiter, the thousands delimiter and the grouping.  The keywords
       recognized by chrtbl are:

       LC_CTYPE         name of the data file created by chrtbl to contain
                        character classification, conversion, and width
                        information

       isupper          character codes to be classified as upper-case
                        letters

       islower          character codes to be classified as lower-case
                        letters

       isdigit          character codes to be classified as numeric

       isspace          character codes to be classified as spacing
                        (delimiter) characters

       ispunct          character codes to be classified as punctuation
                        characters

       iscntrl          character codes to be classified as control
                        characters

       isblank          character code for the blank (space) character

       isxdigit         character codes to be classified as hexadecimal
                        digits

       ul               relationship between upper- and lower-case
                        characters

       cswidth          byte and screen width information (by default, each
                        is one character wide)

       LC_NUMERIC       name of the data file created by chrtbl to contain
                        numeric formatting information

       decimal_point    decimal delimiter

       thousands_sep    thousands delimiter

       grouping         string in which each element is taken as an integer
                        that indicates the number of digits that comprise
                        the current group in a formatted non-monetary
                        numeric quantity.

       Any lines with the number sign (#) in the first column are treated as
       comments and are ignored.  Blank lines are also ignored.

       Characters for isupper, islower, isdigit, isspace, ispunct, iscntrl,
       isblank, isxdigit, and ul can be represented as a hexadecimal or
       octal constant (for example, the letter a can be represented as 0x61
       in hexadecimal or 0141 in octal).  Hexadecimal and octal constants
       may be separated by one or more space and/or tab characters.

       The dash character (-) may be used to indicate a range of consecutive
       numbers.  Zero or more space characters may be used for separating
       the dash character from the numbers.
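
       The constant and range syntax can be illustrated with a small Python
       parser.  The names parse_constant and parse_codes are hypothetical;
       this is a sketch of the rules as stated here, not the parser chrtbl
       itself uses.

```python
def parse_constant(token):
    """Convert one hexadecimal (0x..) or octal (0..) constant to an integer."""
    if token.lower().startswith("0x"):
        return int(token, 16)
    if token.startswith("0"):
        return int(token, 8)
    raise ValueError("expected a hexadecimal or octal constant: %r" % token)


def parse_codes(spec):
    """Expand a spec such as '0x20 0x9 - 0xd' into a sorted list of codes."""
    # Put spaces around each dash so 'a-b' and 'a - b' tokenize identically.
    tokens = spec.replace("-", " - ").split()
    codes = set()
    i = 0
    while i < len(tokens):
        first = parse_constant(tokens[i])
        if i + 1 < len(tokens) and tokens[i + 1] == "-":
            if i + 2 >= len(tokens):
                raise ValueError("range is missing its upper bound")
            last = parse_constant(tokens[i + 2])
            codes.update(range(first, last + 1))  # ranges are inclusive
            i += 3
        else:
            codes.add(first)
            i += 1
    return sorted(codes)


print(parse_codes("0x20 0x9 - 0xd"))  # [9, 10, 11, 12, 13, 32]
print(parse_codes("0141"))            # [97]
```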

       The backslash character (\) is used for line continuation.  Only a
       carriage return is permitted after the backslash character.
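
       Comment, blank-line, and continuation handling can be sketched
       together in Python.  The function logical_lines is hypothetical.  It
       assumes that comments are recognized on physical lines before
       continuations are joined, and it collapses runs of white space in
       the joined result; neither detail is specified on this page.

```python
def logical_lines(text):
    """Return logical lines: comments and blanks dropped, continuations joined."""
    lines = []
    pending = ""
    for raw in text.splitlines():
        if raw.startswith("#"):     # '#' in the first column marks a comment
            continue
        if raw.endswith("\\"):      # backslash directly before the line end
            pending += raw[:-1] + " "
            continue
        line = " ".join((pending + raw).split())
        pending = ""
        if line:                    # blank lines are ignored
            lines.append(line)
    if pending.strip():             # input ended in the middle of a continuation
        lines.append(" ".join(pending.split()))
    return lines


sample = (
    "# punctuation\n"
    "\n"
    "ispunct   0x21 - 0x2f    0x3a - 0x40    \\\n"
    "          0x5b - 0x60    0x7b - 0x7e\n"
    "isblank   0x20\n"
)
for line in logical_lines(sample):
    print(line)
```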

       The relationship between upper- and lower-case letters (ul) is
       expressed as ordered pairs of octal or hexadecimal constants:
       <upper-case_character lower-case_character>.  These two constants may
       be separated by one or more space characters.  Zero or more space
       characters may be used for separating the angle brackets (< >) from
       the numbers.
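
       A minimal Python sketch of reading such pairs follows.  The names
       PAIR, to_int, and parse_ul are hypothetical, and the regular
       expression reflects only the spacing rules given in this paragraph.

```python
import re

# One <upper lower> pair: optional spaces inside the brackets, at least one
# space between the two hexadecimal (0x..) or octal (0..) constants.
CONSTANT = r"(0[xX][0-9a-fA-F]+|0[0-7]*)"
PAIR = re.compile(r"<\s*" + CONSTANT + r"\s+" + CONSTANT + r"\s*>")


def to_int(token):
    """Convert a hexadecimal or octal constant to an integer."""
    if token.lower().startswith("0x"):
        return int(token, 16)
    return int(token, 8)


def parse_ul(spec):
    """Return a list of (upper, lower) character-code pairs found in spec."""
    return [(to_int(up), to_int(low)) for up, low in PAIR.findall(spec)]


pairs = parse_ul("<0x41 0x61> < 0102 0142 >")
print(pairs)  # [(65, 97), (66, 98)]
```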

       The following is the format of an input specification for cswidth:
       n1:s1,n2:s2,n3:s3
       where,
            n1   byte width for supplementary code set 1, required
            s1   screen width for supplementary code set 1
            n2   byte width for supplementary code set 2
            s2   screen width for supplementary code set 2
            n3   byte width for supplementary code set 3
            s3   screen width for supplementary code set 3
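
       The cswidth format can be read with a few lines of Python.  The
       function parse_cswidth is hypothetical; it requires both widths in
       every field that is present, because this page does not state what
       value an omitted screen width would take.

```python
def parse_cswidth(spec):
    """Parse 'n1:s1,n2:s2,n3:s3' into a list of (byte_width, screen_width)."""
    fields = spec.split(",")
    if not 1 <= len(fields) <= 3:
        raise ValueError("between one and three code sets may be described")
    widths = []
    for field in fields:
        byte_width, colon, screen_width = field.strip().partition(":")
        if not colon:
            raise ValueError("expected n:s, got %r" % field)
        widths.append((int(byte_width), int(screen_width)))
    return widths


print(parse_cswidth("1:1,0:0,0:0"))  # [(1, 1), (0, 0), (0, 0)]
```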

       decimal_point and thousands_sep are specified by a single character
       that gives the delimiter.  grouping is specified by a quoted string
       in which each member may be in octal or hex representation.  For
       example, \3 or \x3 could be used to set the value of a member of the
       string to 3.
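
       Decoding a grouping string can be sketched in Python as shown below.
       The names ESCAPE and decode_grouping are hypothetical, and the sketch
       accepts only members written as octal or hex escapes, which are the
       two forms this page describes.

```python
import re

# A backslash followed by either x and hex digits, or by octal digits.
ESCAPE = re.compile(r"\\(x[0-9a-fA-F]+|[0-7]+)")


def decode_grouping(quoted):
    """Turn a quoted string such as '"\\3"' into a list of group sizes."""
    text = quoted.strip()
    if len(text) < 2 or text[0] != '"' or text[-1] != '"':
        raise ValueError("grouping must be a quoted string")
    body = text[1:-1]
    sizes = []
    pos = 0
    while pos < len(body):
        match = ESCAPE.match(body, pos)
        if match is None:
            raise ValueError("expected an octal or hex escape at offset %d" % pos)
        token = match.group(1)
        if token[0] == "x":
            sizes.append(int(token[1:], 16))
        else:
            sizes.append(int(token, 8))
        pos = match.end()
    return sizes


print(decode_grouping(r'"\3"'))    # [3]
print(decode_grouping(r'"\x3"'))   # [3]
```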

EXAMPLE
       The following is an example of an input file used to create the
       USA-ENGLISH code set definition table in a file named usa and the
       non-monetary numeric formatting information in a file named num_usa.
              LC_CTYPE  usa
              isupper   0x41 - 0x5a
              islower   0x61 - 0x7a
              isdigit   0x30 - 0x39
              isspace   0x20 0x9 - 0xd
              ispunct   0x21 - 0x2f    0x3a - 0x40    \
                        0x5b - 0x60    0x7b - 0x7e
              iscntrl   0x0 - 0x1f     0x7f
              isblank   0x20
              isxdigit  0x30 - 0x39    0x61 - 0x66    \
                        0x41 - 0x46
              ul       <0x41 0x61> <0x42 0x62> <0x43 0x63>  \
                       <0x44 0x64> <0x45 0x65> <0x46 0x66>  \
                       <0x47 0x67> <0x48 0x68> <0x49 0x69>  \
                       <0x4a 0x6a> <0x4b 0x6b> <0x4c 0x6c>  \
                       <0x4d 0x6d> <0x4e 0x6e> <0x4f 0x6f>  \
                       <0x50 0x70> <0x51 0x71> <0x52 0x72>  \
                       <0x53 0x73> <0x54 0x74> <0x55 0x75>  \
                       <0x56 0x76> <0x57 0x77> <0x58 0x78>  \
                       <0x59 0x79> <0x5a 0x7a>
              cswidth   1:1,0:0,0:0
              LC_NUMERIC     num_usa
              decimal_point  .
              thousands_sep  ,
              grouping       "\3"
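
       The effect of the numeric settings in this example can be shown with
       a Python sketch.  The functions group_digits and format_number are
       hypothetical.  The sketch assumes that the last grouping element is
       reused for all remaining digits, which this page does not state
       explicitly.

```python
def group_digits(digits, thousands_sep, grouping):
    """Insert thousands_sep into a digit string, working from the right."""
    sizes = list(grouping)
    size = sizes.pop(0) if sizes else 0
    groups = []
    while size > 0 and len(digits) > size:
        groups.append(digits[-size:])
        digits = digits[:-size]
        if sizes:                 # assumption: the last size keeps repeating
            size = sizes.pop(0)
    groups.append(digits)
    return thousands_sep.join(reversed(groups))


def format_number(number, decimal_point=".", thousands_sep=",", grouping=(3,)):
    """Format a plain decimal string such as '1234567.89'."""
    whole, _, fraction = number.partition(".")
    sign = "-" if whole.startswith("-") else ""
    out = sign + group_digits(whole.lstrip("-"), thousands_sep, grouping)
    return out + decimal_point + fraction if fraction else out


# Settings from the example: decimal point '.', separator ',', grouping "\3".
print(format_number("1234567.89"))                 # 1,234,567.89
print(format_number("1234567.89", ",", ".", [3]))  # 1.234.567,89
```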

FILES
       /usr/lib/locale/locale/LC_CTYPE
                       data files containing character classification,
                       conversion, and character-set width information
                       created by chrtbl
       /usr/lib/locale/locale/LC_NUMERIC
                       data files containing numeric formatting information
                       created by chrtbl
       /usr/include/ctype.h
                       header file containing information used by character
                       classification and conversion routines
       /usr/lib/locale/C/chrtbl_C
                       input file used to construct LC_CTYPE and LC_NUMERIC
                       in the default locale.

DIAGNOSTICS
       The error messages produced by chrtbl are intended to be self-
       explanatory.  They indicate errors in the command line or syntactic
       errors encountered within the input file.

SEE ALSO
       ctype(3C), setlocale(3C), environ(5).

CAUTION
       Changing the files in /usr/lib/locale/C will cause the system to
       behave unpredictably.

Licensed material--property of copyright holder(s)