       chrtbl(1M)                                                chrtbl(1M)


       NAME
             chrtbl - generate character classification and conversion
             tables

       SYNOPSIS
             chrtbl [file]

       DESCRIPTION
             The chrtbl command creates two tables containing information
             on character classification, upper/lowercase conversion,
             character-set width, and numeric formatting.  One table is an
             array of (257*2) + 7 bytes that is encoded so a table lookup
             can be used to determine the character classification of a
             character, convert a character [see ctype(3C)], and find the
             byte and screen width of a character in one of the
             supplementary code sets.  The other table contains information
             about the format of non-monetary numeric quantities: the first
             byte specifies the decimal delimiter; the second byte
             specifies the thousands delimiter; and the remaining bytes
             comprise a null-terminated string indicating the grouping
             (each element of the string is taken as an integer that
             indicates the number of digits that comprise the current group
             in a formatted non-monetary numeric quantity).
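
             As an illustration only, the following Python sketch decodes a
             byte string laid out as described above for the numeric
             table.  The function name decode_numeric_table and the sample
             bytes are hypothetical; the sketch is not part of chrtbl and
             assumes the data holds nothing but the three fields named in
             this paragraph.

```python
def decode_numeric_table(data: bytes):
    """Split a numeric-format table into its three documented fields.

    Assumed layout (taken from the description above):
      byte 0   decimal delimiter
      byte 1   thousands delimiter
      byte 2.. grouping values, terminated by a null byte
    """
    if len(data) < 3:
        raise ValueError("table is too short to hold all three fields")
    decimal_delimiter = chr(data[0])
    thousands_delimiter = chr(data[1])
    # bytes.index raises ValueError if no terminating null byte is present.
    end = data.index(0, 2)
    grouping = list(data[2:end])
    return decimal_delimiter, thousands_delimiter, grouping


if __name__ == "__main__":
    # Hypothetical sample: period, comma, groups of three digits.
    print(decode_numeric_table(b".,\x03\x00"))
```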

             chrtbl reads the user-defined character classification and
             conversion information from file and creates three output
             files in the current directory.  To construct file, use the
             file supplied in /usr/lib/locale/C/chrtbl_C as a starting
             point.  You may add entries, but do not change the original
             values supplied with the system.  For example, for other
             locales you may wish to add eight-bit entries to the ASCII
             definitions provided in this file.

             One output file, ctype.c (a C language source file), contains
             a (257*2)+7-byte array generated from processing the
             information from file.  You should review the content of
             ctype.c to verify that the array is set up as you had planned.
             (In addition, an application program could use ctype.c.)  The
             first 257 bytes of the array in ctype.c are used for character
             classification.  The characters used for initializing these
             bytes of the array represent character classifications that
             are defined in ctype.h; for example, _L means a character is
             lowercase and _S|_B means the character is both a spacing
             character and a blank.  The second 257 bytes of the array are
             used for character conversion.  These bytes of the array are
             initialized so that characters for which you do not provide
            conversion information will be converted to themselves.  When
            you do provide conversion information, the first value of the
            pair is stored where the second one would be stored normally,
            and vice versa; for example, if you provide <0x41 0x61>, then
            0x61 is stored where 0x41 would be stored normally, and 0x41
            is stored where 0x61 would be stored normally.  The last 7
            bytes are used for character width information for up to three
            supplementary code sets.
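
            The layout and the pair swapping described above can be
            sketched in Python as follows.  This is a simplified model,
            not the code chrtbl uses: the constant and function names are
            hypothetical, and the conversion table is indexed directly by
            character code 0 through 255, which is an assumption (this
            page does not say how the 257 entries of each section are
            indexed).

```python
# Section sizes as given in the text: (257*2) + 7 bytes in total.
CLASS_BYTES = 257       # character classification
CONVERSION_BYTES = 257  # upper/lowercase conversion
WIDTH_BYTES = 7         # width data for up to three supplementary code sets
TOTAL_BYTES = CLASS_BYTES + CONVERSION_BYTES + WIDTH_BYTES


def build_conversion(pairs):
    """Return a simplified 256-entry conversion table.

    Every code converts to itself unless it appears in an
    (uppercase, lowercase) pair, in which case the two values are
    stored in each other's positions.
    """
    table = list(range(256))
    for upper, lower in pairs:
        table[upper] = lower
        table[lower] = upper
    return table


if __name__ == "__main__":
    table = build_conversion([(0x41, 0x61)])
    print(TOTAL_BYTES, hex(table[0x41]), hex(table[0x61]), hex(table[0x30]))
```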

            The second output file (a data file) contains the same
            information, but is structured for efficient use by the
            character classification and conversion routines [see
            ctype(3C)].  The name of this output file is the value you
            assign to the keyword LC_CTYPE read in from file.  Before this
            file can be used by the character classification and
            conversion routines, it must be installed in the
            /usr/lib/locale/locale directory with the name LC_CTYPE by
            someone who is super-user or a member of group bin.  This file
            must be readable by user, group, and other; no other
            permissions should be set.  To use the character
            classification and conversion tables in this file, set the
            LC_CTYPE environment variable appropriately [see environ(5)
            or setlocale(3C)].

            The third output file (a data file) is created only if numeric
            formatting information is specified in the input file.  The
            name of this output file is the value you assign to the
            keyword LC_NUMERIC read in from file.  Before this file can be
            used, it must be installed in the /usr/lib/locale/locale
            directory with the name LC_NUMERIC by someone who is
            super-user or a member of group bin.  This file must be
            readable by user, group, and other; no other permissions
            should be set.  To use the numeric formatting information in
            this file, set the LC_NUMERIC environment variable
            appropriately [see environ(5) or setlocale(3C)].

            The name of the locale where you install the files LC_CTYPE
            and LC_NUMERIC should correspond to the conventions defined in
            file.  For example, if French conventions were defined, and
            the name for the French locale on your system is french, then
            you should install the files in /usr/lib/locale/french.
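
            The installation step can be sketched in Python as shown
            below.  The helper name install_locale_file is hypothetical
            and the sketch is only one way to do the copy; on a real
            system the destination is under /usr/lib/locale and the
            caller needs the privileges described above.  The mode 0444
            is how the sketch reads "readable by user, group, and other;
            no other permissions".

```python
import os
import shutil
import stat

# Read permission for user, group, and other; nothing else (mode 0444).
READ_ONLY_ALL = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH


def install_locale_file(src, locale_dir, name):
    """Copy a chrtbl data file into locale_dir under its required name."""
    if name not in ("LC_CTYPE", "LC_NUMERIC"):
        raise ValueError("name must be LC_CTYPE or LC_NUMERIC")
    os.makedirs(locale_dir, exist_ok=True)
    dest = os.path.join(locale_dir, name)
    shutil.copyfile(src, dest)
    os.chmod(dest, READ_ONLY_ALL)
    return dest


if __name__ == "__main__":
    import tempfile

    # Demonstrate in a scratch directory instead of /usr/lib/locale.
    with tempfile.TemporaryDirectory() as scratch:
        data_file = os.path.join(scratch, "usa")
        with open(data_file, "wb") as handle:
            handle.write(b"placeholder")
        installed = install_locale_file(
            data_file, os.path.join(scratch, "french"), "LC_CTYPE")
        print(installed, oct(stat.S_IMODE(os.stat(installed).st_mode)))
```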

            If no input file is given, or if the argument ``-'' is
            encountered, chrtbl reads from standard input.

             The syntax of file allows the user to define the names of the
             data files created by chrtbl, the assignment of characters to
             character classifications, the relationship between upper and
             lowercase letters, byte and screen widths for up to three
             supplementary code sets, and three items of numeric formatting
             information: the decimal delimiter, the thousands delimiter,
             and the grouping.  The keywords recognized by chrtbl are:

             LC_CTYPE         name of the data file created by chrtbl to
                              contain character classification, conversion,
                              and width information

             isupper          character codes to be classified as uppercase
                              letters

             islower          character codes to be classified as lowercase
                              letters

             isdigit          character codes to be classified as numeric

             isspace          character codes to be classified as spacing
                              (delimiter) characters

             ispunct          character codes to be classified as
                              punctuation characters

             iscntrl          character codes to be classified as control
                              characters

             isblank          character code for the blank (space)
                              character

             isxdigit         character codes to be classified as
                              hexadecimal digits

             ul               relationship between upper- and lowercase
                              characters

             cswidth          byte and screen width information (by
                              default, each is one character wide)

             LC_NUMERIC       name of the data file created by chrtbl to
                              contain numeric formatting information

            decimal_point    decimal delimiter

            thousands_sep    thousands delimiter

            grouping         string in which each element is taken as an
                             integer that indicates the number of digits
                             that comprise the current group in a
                             formatted non-monetary numeric quantity.

            Any lines with the number sign (#) in the first column are
            treated as comments and are ignored.  Blank lines are also
            ignored.

            Characters for isupper, islower, isdigit, isspace, ispunct,
            iscntrl, isblank, isxdigit, and ul can be represented as a
            hexadecimal or octal constant (for example, the letter a can
            be represented as 0x61 in hexadecimal or 0141 in octal).
            Hexadecimal and octal constants may be separated by one or
            more space and/or tab characters.

            The dash character (-) may be used to indicate a range of
            consecutive numbers.  Zero or more space characters may be
            used for separating the dash character from the numbers.
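
            A minimal Python sketch of how such a list of constants and
            ranges might be expanded is given below.  The function names
            are hypothetical and the parser in chrtbl may behave
            differently; the sketch assumes that a leading 0x marks a
            hexadecimal constant, that any other constant must begin with
            0 and is octal, and that ranges are inclusive.

```python
def parse_constant(token):
    """Convert one hexadecimal (0x...) or octal (0...) constant."""
    if token[:2].lower() == "0x":
        return int(token, 16)
    if token.startswith("0"):
        return int(token, 8)
    raise ValueError("not a hexadecimal or octal constant: %r" % token)


def parse_codes(spec):
    """Expand constants and dash ranges into a list of character codes."""
    # Surround each dash with spaces so "0x41-0x5a" and "0x41 - 0x5a"
    # split into the same three tokens.
    tokens = spec.replace("-", " - ").split()
    codes = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "-":
            raise ValueError("range has no starting value")
        start = parse_constant(tokens[i])
        if i + 1 < len(tokens) and tokens[i + 1] == "-":
            if i + 2 >= len(tokens) or tokens[i + 2] == "-":
                raise ValueError("range has no ending value")
            end = parse_constant(tokens[i + 2])
            codes.extend(range(start, end + 1))  # both ends included
            i += 3
        else:
            codes.append(start)
            i += 1
    return codes


if __name__ == "__main__":
    print(parse_codes("0x20 0x9 - 0xd"))
```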

            The backslash character (\) is used for line continuation.
            Only a carriage return is permitted after the backslash
            character.
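
            The comment, blank-line, and continuation rules can be
            combined into a small preprocessing step, sketched here in
            Python.  The function name logical_lines is hypothetical.
            The sketch assumes that a comment or blank line is only
            skipped when it does not follow a backslash, and it collapses
            runs of white space for readability; neither detail is
            stated on this page.

```python
def logical_lines(text):
    """Return the input as a list of logical lines.

    Lines with '#' in the first column and blank lines are dropped.
    A line ending in a backslash is joined with the line after it.
    """
    result = []
    pending = ""
    for raw in text.splitlines():
        if not pending and (raw.startswith("#") or not raw.strip()):
            continue
        if raw.endswith("\\"):
            # Drop the backslash and keep collecting.
            pending += raw[:-1] + " "
            continue
        result.append(" ".join((pending + raw).split()))
        pending = ""
    if pending:
        # Input ended immediately after a continuation backslash.
        result.append(" ".join(pending.split()))
    return result


if __name__ == "__main__":
    sample = (
        "# punctuation and blank\n"
        "\n"
        "ispunct   0x21 - 0x2f \\\n"
        "          0x5b - 0x60\n"
        "isblank   0x20\n"
    )
    for line in logical_lines(sample):
        print(line)
```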

            The relationship between upper- and lowercase letters (ul) is
            expressed as ordered pairs of octal or hexadecimal constants:
            <uppercase_character lowercase_character>.  These two
            constants may be separated by one or more space characters.
            Zero or more space characters may be used for separating the
            angle brackets (< >) from the numbers.
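
            One possible way to read ul pairs is sketched below in
            Python.  The names PAIR, parse_constant, and parse_ul are
            hypothetical, and the regular expression is only an
            approximation of the syntax described above.

```python
import re

# "<", optional space, two constants separated by space, optional space, ">"
PAIR = re.compile(r"<\s*(\S+)\s+(\S+?)\s*>")


def parse_constant(token):
    """Convert one hexadecimal (0x...) or octal (0...) constant."""
    if token[:2].lower() == "0x":
        return int(token, 16)
    if token.startswith("0"):
        return int(token, 8)
    raise ValueError("not a hexadecimal or octal constant: %r" % token)


def parse_ul(spec):
    """Return a list of (uppercase, lowercase) code pairs."""
    return [
        (parse_constant(match.group(1)), parse_constant(match.group(2)))
        for match in PAIR.finditer(spec)
    ]


if __name__ == "__main__":
    print(parse_ul("<0x41 0x61> < 0102 0142 >"))
```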

            The following is the format of an input specification for
            cswidth:

                  cswidth n1[[:s1][,n2[:s2][,n3[:s3]]]]

            where,

                  n1    byte width for supplementary code set 1, required
                  s1    screen width for supplementary code set 1
                  n2    byte width for supplementary code set 2
                  s2    screen width for supplementary code set 2
                   n3    byte width for supplementary code set 3
                   s3    screen width for supplementary code set 3
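
             The cswidth format above can be read with a few lines of
             Python, sketched here under stated assumptions.  The
             function name parse_cswidth is hypothetical.  Because this
             page does not say what value chrtbl uses when a screen
             width is omitted, the sketch reports None in that case
             instead of guessing.

```python
def parse_cswidth(spec):
    """Parse 'n1[:s1][,n2[:s2][,n3[:s3]]]' into (byte, screen) tuples."""
    fields = spec.strip().split(",")
    if not 1 <= len(fields) <= 3:
        raise ValueError("between one and three code sets are allowed")
    widths = []
    for field in fields:
        byte_width, colon, screen_width = field.partition(":")
        # int() raises ValueError on an empty or non-numeric width,
        # which also covers a missing n1.
        widths.append(
            (int(byte_width), int(screen_width) if colon else None))
    return widths


if __name__ == "__main__":
    print(parse_cswidth("1:1,0:0,0:0"))
```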

             decimal_point and thousands_sep are specified by a single
             character that gives the delimiter.  grouping is specified by
             a quoted string in which each member may be in octal or hex
             representation.  For example, \3 or \x3 could be used to set
             the value of a member of the string to 3.
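
             A small Python sketch of reading a grouping string follows.
             The names ESCAPE and parse_grouping are hypothetical.  The
             sketch recognizes only the two escape forms mentioned above
             (\ followed by octal digits, and \x followed by hexadecimal
             digits) and ignores anything else inside the quotes, which
             is an assumption about input this page does not describe.

```python
import re

# \x followed by hex digits, or \ followed by one to three octal digits.
ESCAPE = re.compile(r"\\x([0-9a-fA-F]+)|\\([0-7]{1,3})")


def parse_grouping(quoted):
    """Return the integer group sizes held in a quoted grouping string."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("grouping must be a quoted string")
    sizes = []
    for match in ESCAPE.finditer(quoted[1:-1]):
        hex_digits, octal_digits = match.groups()
        if hex_digits is not None:
            sizes.append(int(hex_digits, 16))
        else:
            sizes.append(int(octal_digits, 8))
    return sizes


if __name__ == "__main__":
    print(parse_grouping(r'"\3"'), parse_grouping(r'"\x3"'))
```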

             In a C locale, or in a locale where the decimal point
             character is not defined, the decimal point character defaults
             to a period (.).

       EXAMPLES
             The following is an example of an input file used to create
             the USA-ENGLISH code set definition table in a file named usa
             and the non-monetary numeric formatting information in a file
             named num_usa.

                   LC_CTYPE  usa
                   isupper   0x41 - 0x5a
                   islower   0x61 - 0x7a
                   isdigit   0x30 - 0x39
                   isspace   0x20 0x9 - 0xd
                   ispunct   0x21 - 0x2f   0x3a - 0x40 \
                             0x5b - 0x60   0x7b - 0x7e
                   iscntrl   0x0 - 0x1f    0x7f
                   isblank   0x20
                   isxdigit  0x30 - 0x39   0x61 - 0x66 \
                             0x41 - 0x46
                   ul       <0x41 0x61> <0x42 0x62> <0x43 0x63>  \
                            <0x44 0x64> <0x45 0x65> <0x46 0x66>  \
                            <0x47 0x67> <0x48 0x68> <0x49 0x69>  \
                            <0x4a 0x6a> <0x4b 0x6b> <0x4c 0x6c>  \
                            <0x4d 0x6d> <0x4e 0x6e> <0x4f 0x6f>  \
                            <0x50 0x70> <0x51 0x71> <0x52 0x72>  \
                            <0x53 0x73> <0x54 0x74> <0x55 0x75>  \
                            <0x56 0x76> <0x57 0x77> <0x58 0x78>  \
                            <0x59 0x79> <0x5a 0x7a>
                   cswidth           1:1,0:0,0:0
                   LC_NUMERIC  num_usa
                   decimal_point           .
                   thousands_sep           ,
                   grouping                "\3"
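
             The 26 ul pairs in this example follow a regular pattern:
             each lowercase code is the uppercase code plus 0x20.  When
             preparing a similar input file, the pairs could be generated
             instead of typed by hand.  The Python sketch below is one
             way to do so; the function name ul_pairs and its parameters
             are hypothetical.

```python
def ul_pairs(first_upper=0x41, first_lower=0x61, count=26):
    """Return ul pairs as strings such as '<0x41 0x61>'."""
    return [
        "<0x%x 0x%x>" % (first_upper + i, first_lower + i)
        for i in range(count)
    ]


if __name__ == "__main__":
    pairs = ul_pairs()
    # Three pairs per line, continued with a backslash as in the example.
    rows = [" ".join(pairs[i:i + 3]) for i in range(0, len(pairs), 3)]
    print("ul       " + "  \\\n         ".join(rows))
```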

      FILES
            /usr/lib/locale/locale/LC_CTYPE
                  data files containing character classification,
                  conversion, and character-set width information created
                  by chrtbl

            /usr/lib/locale/locale/LC_NUMERIC
                  data files containing numeric formatting information
                  created by chrtbl

            /usr/include/ctype.h
                  header file containing information used by character
                  classification and conversion routines

            /usr/lib/locale/C/chrtbl_C
                  input file used to construct LC_CTYPE and LC_NUMERIC in
                  the default locale.

      REFERENCES
            ctype(3C), environ(5), setlocale(3C)

      DIAGNOSTICS
            The error messages produced by chrtbl are intended to be
            self-explanatory.  They indicate errors in the command line or
            syntactic errors encountered within the input file.

      NOTICES
            Changing the files in /usr/lib/locale/C will cause the system
            to behave unpredictably.

                          Copyright 1994 Novell, Inc.               Page 6