chrtbl(1m) — Atari System V ue12





   chrtbl(1M)                                                       chrtbl(1M)


   NAME
         chrtbl - generate character classification and conversion tables

   SYNOPSIS
         chrtbl [file]

   DESCRIPTION
         The chrtbl command creates two tables containing information on
         character classification, upper/lower-case conversion, character-set
         width, and numeric formatting.  One table is an array of (257*2) + 7
         bytes that is encoded so a table lookup can be used to determine the
         character classification of a character, convert a character [see
         ctype(3C)], and find the byte and screen width of a character in one
         of the supplementary code sets.  The other table contains information
         about the format of non-monetary numeric quantities: the first byte
         specifies the decimal delimiter; the second byte specifies the
         thousands delimiter; and the remaining bytes comprise a null
         terminated string indicating the grouping (each element of the string
         is taken as an integer that indicates the number of digits that
         comprise the current group in a formatted non-monetary numeric
         quantity).
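
         As a rough illustration of the numeric table layout described
         above, the following Python sketch splits such a byte sequence into
         its three parts.  The function name and the sample bytes are
         hypothetical; the real table is written by chrtbl.

```python
def decode_numeric_table(data: bytes) -> dict:
    """Split a numeric-format table into its three parts (illustrative only)."""
    if len(data) < 3:
        raise ValueError("need two delimiter bytes and a terminated string")
    decimal = chr(data[0])          # first byte: decimal delimiter
    thousands = chr(data[1])        # second byte: thousands delimiter
    end = data.index(0, 2)          # grouping string ends at the first NUL
    grouping = list(data[2:end])    # each element is a digit count
    return {"decimal": decimal, "thousands": thousands, "grouping": grouping}


# Hypothetical table: '.' and ',' delimiters, groups of three digits.
sample = b".,\x03\x00"
print(decode_numeric_table(sample))
```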

         chrtbl reads the user-defined character classification and conversion
         information from file and creates three output files in the current
         directory.  To construct file, use the file supplied in
         /usr/lib/locale/C/chrtbl_C as a starting point.  You may add entries,
         but do not change the original values supplied with the system.  For
         example, for other locales you may wish to add eight-bit entries to
         the ASCII definitions provided in this file.

         One output file, ctype.c (a C-language source file), contains a
         (257*2)+7-byte array generated from processing the information from
         file.  You should review the content of ctype.c to verify that the
         array is set up as you had planned.  (In addition, an application
         program could use ctype.c.)  The first 257 bytes of the array in
         ctype.c are used for character classification.  The characters used
         for initializing these bytes of the array represent character
         classifications that are defined in /usr/include/ctype.h; for
         example, _L means a character is lower case and _S|_B means the
         character is both a spacing character and a blank.  The second 257
         bytes of the array are used for character conversion.  These bytes of
         the array are initialized so that characters for which you do not
         provide conversion information will be converted to themselves.  When
         you do provide conversion information, the first value of the pair is
         stored where the second one would be stored normally, and vice versa;
         for example, if you provide <0x41 0x61>, then 0x61 is stored where
         0x41 would be stored normally, and 0x41 is stored where 0x61 would be
         stored normally.  The last 7 bytes are used for character width
         information for up to three supplementary code sets.
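
         The conversion half of the array can be pictured with the short
         Python sketch below.  It is not the chrtbl implementation; the
         function name is hypothetical, and the assumption that slot 0 is
         reserved (so that character code i lives in slot i + 1) is a guess
         based on the 257-entry size.

```python
TABLE_SIZE = 257          # per the text: 257 bytes for each half
WIDTH_BYTES = 7           # trailing bytes for code-set width information


def build_conversion(pairs):
    """Return a 257-entry conversion table for the given (upper, lower) pairs."""
    # Assumption: slot 0 is reserved and character code i lives in slot i + 1.
    conv = [0] + list(range(256))       # default: every code maps to itself
    for upper, lower in pairs:
        conv[upper + 1] = lower         # lower stored where upper would be
        conv[lower + 1] = upper         # upper stored where lower would be
    return conv


conv = build_conversion([(0x41, 0x61)])
print(hex(conv[0x41 + 1]), hex(conv[0x61 + 1]))
print(2 * TABLE_SIZE + WIDTH_BYTES)     # total size of the array in bytes
```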




   7/91                                                                 Page 1





         The second output file (a data file) contains the same information,
         but is structured for efficient use by the character classification
         and conversion routines (see ctype(3C)).  The name of this output
         file is the value you assign to the keyword LC_CTYPE read in from
         file.  Before this file can be used by the character classification
         and conversion routines, it must be installed in the
         /usr/lib/locale/locale directory with the name LC_CTYPE by someone
         who is super-user or a member of group bin.  This file must be
         readable by user, group, and other; no other permissions should be
         set.  To use the character classification and conversion tables in
         this file, set the LC_CTYPE environment variable appropriately (see
         environ(5) or setlocale(3C)).

         The third output file (a data file) is created only if numeric
         formatting information is specified in the input file.  The name of
         this output file is the value you assign to the keyword LC_NUMERIC
         read in from file.  Before this file can be used, it must be
         installed in the /usr/lib/locale/locale directory with the name
         LC_NUMERIC by someone who is super-user or a member of group bin.
         This file must be readable by user, group, and other; no other
         permissions should be set.  To use the numeric formatting information
         in this file, set the LC_NUMERIC environment variable appropriately
         (see environ(5) or setlocale(3C)).
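
         The installation step for either data file amounts to a copy
         followed by setting mode r--r--r--.  The Python sketch below shows
         this in a scratch directory standing in for /usr/lib/locale; the
         function name, the locale name french, and the placeholder file
         contents are hypothetical, and on a real system the copy would be
         done by the super-user or a member of group bin.

```python
import os
import shutil
import stat
import tempfile


def install_locale_file(src, locale_dir, name):
    """Copy src to locale_dir/name and make it read-only for everyone."""
    os.makedirs(locale_dir, exist_ok=True)
    dest = os.path.join(locale_dir, name)
    shutil.copyfile(src, dest)
    # r--r--r--: readable by user, group, and other; nothing else set.
    os.chmod(dest, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return dest


# Demonstration in a scratch directory (not the real /usr/lib/locale).
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "usa")          # hypothetical chrtbl output file
    with open(src, "wb") as f:
        f.write(b"\0")                       # placeholder contents
    dest = install_locale_file(src, os.path.join(root, "french"), "LC_CTYPE")
    installed_mode = stat.S_IMODE(os.stat(dest).st_mode)

print(oct(installed_mode))
```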

         The name of the locale where you install the files LC_CTYPE and
         LC_NUMERIC should correspond to the conventions defined in file.  For
         example, if French conventions were defined, and the name for the
         French locale on your system is french, then you should install the
         files in /usr/lib/locale/french.

         If no input file is given, or if the argument "-" is encountered,
         chrtbl reads from standard input.

         The syntax of file allows the user to define the names of the data
         files created by chrtbl, the assignment of characters to character
         classifications, the relationship between upper and lower-case
         letters, byte and screen widths for up to three supplementary code
         sets, and three items of numeric formatting information: the decimal
         delimiter, the thousands delimiter and the grouping.  The keywords
         recognized by chrtbl are:

               LC_CTYPE         name of the data file created by chrtbl to
                                contain character classification, conversion,
                                and width information

               isupper          character codes to be classified as upper-case
                                letters

               islower          character codes to be classified as lower-case
                                letters



               isdigit          character codes to be classified as numeric

               isspace          character codes to be classified as spacing
                                (delimiter) characters

               ispunct          character codes to be classified as
                                punctuation characters

               iscntrl          character codes to be classified as control
                                characters

               isblank          character code for the blank (space) character

               isxdigit         character codes to be classified as
                                hexadecimal digits

               ul               relationship between upper- and lower-case
                                characters

               cswidth          byte and screen width information (by default,
                                each is one character wide)

               LC_NUMERIC       name of the data file created by chrtbl to
                                contain numeric formatting information

               decimal_point    decimal delimiter

               thousands_sep    thousands delimiter

               grouping         string in which each element is taken as an
                                integer that indicates the number of digits
                                that comprise the current group in a formatted
                                non-monetary numeric quantity.

         Any lines with the number sign (#) in the first column are treated as
         comments and are ignored.  Blank lines are also ignored.

         Characters for isupper, islower, isdigit, isspace, ispunct, iscntrl,
         isblank, isxdigit, and ul can be represented as a hexadecimal or
         octal constant (for example, the letter a can be represented as 0x61
         in hexadecimal or 0141 in octal).  Hexadecimal and octal constants
         may be separated by one or more space and/or tab characters.

         The dash character (-) may be used to indicate a range of consecutive
         numbers. Zero or more space characters may be used for separating the
         dash character from the numbers.

         The backslash character (\) is used for line continuation.  Only a
         carriage return is permitted after the backslash character.
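
         The lexical rules above (comments, blank lines, hexadecimal and
         octal constants, ranges, and line continuation) can be sketched in
         Python as follows.  This is an illustrative reading of the rules,
         not the parser used by chrtbl; all function names are hypothetical.

```python
import re


def logical_lines(text):
    """Yield logical lines: comments and blanks dropped, continuations joined."""
    pending = ""
    for raw in text.splitlines():
        if not pending and (raw.startswith("#") or not raw.strip()):
            continue                     # '#' in column one, or a blank line
        if raw.endswith("\\"):
            pending += raw[:-1] + " "    # backslash must end the line
            continue
        yield pending + raw
        pending = ""
    if pending:
        yield pending


def parse_constant(token):
    """Convert one hexadecimal (0x...) or octal (0...) constant to an int."""
    if token.lower().startswith("0x"):
        return int(token, 16)
    if token.startswith("0"):
        return int(token, 8)
    raise ValueError(f"not a hexadecimal or octal constant: {token!r}")


def parse_codes(spec):
    """Expand constants and ranges such as '0x20 0x9 - 0xd' into a sorted list."""
    codes = set()
    # Glue each dash to its neighbours so that a range becomes one token.
    for item in re.sub(r"\s*-\s*", "-", spec).split():
        lo, _, hi = item.partition("-")
        first = parse_constant(lo)
        last = parse_constant(hi) if hi else first
        codes.update(range(first, last + 1))
    return sorted(codes)


sample = "# punctuation\n\nispunct 0x21 - 0x2f \\\n        0x3a - 0x40\n"
for line in logical_lines(sample):
    keyword, spec = line.split(None, 1)
    print(keyword, parse_codes(spec))
```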




         The relationship between upper- and lower-case letters (ul) is
         expressed as ordered pairs of octal or hexadecimal constants:
         <upper-case_character lower-case_character>.  These two constants may
         be separated by one or more space characters.  Zero or more space
         characters may be used for separating the angle brackets (< >) from
         the numbers.
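
         A minimal Python sketch of reading such pairs is given below.  The
         names PAIR, to_int, and parse_ul are hypothetical, and the regular
         expression is only one possible reading of the spacing rules above.

```python
import re

# '<', optional spaces, two constants separated by spaces, optional spaces, '>'.
PAIR = re.compile(r"<\s*([^\s<>]+)\s+([^\s<>]+)\s*>")


def to_int(token):
    """Convert a hexadecimal (0x...) or octal (0...) constant to an int."""
    return int(token, 16) if token.lower().startswith("0x") else int(token, 8)


def parse_ul(spec):
    """Return (upper, lower) code pairs from '<upper lower>' items."""
    return [(to_int(u), to_int(l)) for u, l in PAIR.findall(spec)]


print(parse_ul("<0x41 0x61> < 0102  0142 >"))
```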

         The following is the format of an input specification for cswidth:
         n1:s1,n2:s2,n3:s3
         where,
              n1    byte width for supplementary code set 1, required
              s1    screen width for supplementary code set 1
              n2    byte width for supplementary code set 2
              s2    screen width for supplementary code set 2
              n3    byte width for supplementary code set 3
              s3    screen width for supplementary code set 3
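
         The cswidth format can be read with a few lines of Python, sketched
         below.  The function name is hypothetical, and the sketch reports a
         missing screen width as None rather than guessing what value chrtbl
         would substitute.

```python
def parse_cswidth(spec):
    """Parse 'n1:s1,n2:s2,n3:s3' into up to three (byte, screen) width pairs."""
    fields = [f.strip() for f in spec.split(",")]
    if not 1 <= len(fields) <= 3 or not fields[0]:
        raise ValueError("expected one to three n:s fields, with n1 present")
    widths = []
    for field in fields:
        n, _, s = field.partition(":")
        # A missing screen width is reported as None (no default is assumed).
        widths.append((int(n), int(s) if s else None))
    return widths


print(parse_cswidth("1:1,0:0,0:0"))
```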

         decimal_point and thousands_sep are specified by a single character
         that gives the delimiter. grouping is specified by a quoted string in
         which each member may be in octal or hex representation. For example,
         \3 or \x3 could be used to set the value of a member of the string to
         3.
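
         To show how a grouping string might be applied, the Python sketch
         below inserts a thousands delimiter into a string of digits.  It
         assumes that groups are counted from the right and that the last
         element repeats for any remaining digits, as with the grouping
         member described in localeconv(3C); neither detail is stated on
         this page.  The function name is hypothetical.

```python
def group_digits(digits, grouping, sep=","):
    """Insert sep into a digit string, taking group sizes from right to left."""
    sizes = list(grouping)
    if not sizes:
        return digits
    groups = []
    size = sizes[0]
    i = 0
    while size > 0 and len(digits) > size:
        groups.append(digits[-size:])    # peel one group off the right end
        digits = digits[:-size]
        i += 1
        if i < len(sizes):
            size = sizes[i]              # else keep the last size (assumption)
    groups.append(digits)                # whatever is left is the leading group
    return sep.join(reversed(groups))


# grouping "\3" corresponds to a single element with the value 3.
print(group_digits("1234567", b"\x03"))
```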

   EXAMPLE
         The following is an example of an input file used to create the
         USA-ENGLISH code set definition table in a file named usa and the
         non-monetary numeric formatting information in a file named num_usa.
               LC_CTYPE  usa
               isupper   0x41 - 0x5a
               islower   0x61 - 0x7a
               isdigit   0x30 - 0x39
               isspace   0x20 0x9 - 0xd
               ispunct   0x21 - 0x2f   0x3a - 0x40 \
                         0x5b - 0x60   0x7b - 0x7e
               iscntrl   0x0 - 0x1f    0x7f
               isblank   0x20
               isxdigit  0x30 - 0x39   0x61 - 0x66 \
                         0x41 - 0x46
               ul       <0x41 0x61> <0x42 0x62> <0x43 0x63>  \
                        <0x44 0x64> <0x45 0x65> <0x46 0x66>  \
                        <0x47 0x67> <0x48 0x68> <0x49 0x69>  \
                        <0x4a 0x6a> <0x4b 0x6b> <0x4c 0x6c>  \
                        <0x4d 0x6d> <0x4e 0x6e> <0x4f 0x6f>  \
                        <0x50 0x70> <0x51 0x71> <0x52 0x72>  \
                        <0x53 0x73> <0x54 0x74> <0x55 0x75>  \
                        <0x56 0x76> <0x57 0x77> <0x58 0x78>  \
                        <0x59 0x79> <0x5a 0x7a>
               cswidth           1:1,0:0,0:0
               LC_NUMERIC  num_usa
               decimal_point           .
               thousands_sep           ,
               grouping                "\3"

   FILES
         /usr/lib/locale/locale/LC_CTYPE
                         data files containing character classification,
                         conversion, and character-set width information
                         created by chrtbl
         /usr/lib/locale/locale/LC_NUMERIC
                         data files containing numeric formatting information
                         created by chrtbl
         /usr/include/ctype.h
                         header file containing information used by character
                         classification and conversion routines
         /usr/lib/locale/C/chrtbl_C
                         input file used to construct LC_CTYPE and LC_NUMERIC
                         in the default locale.

   SEE ALSO
         environ(5).
         ctype(3C), setlocale(3C) in the Programmer's Reference Manual.

   DIAGNOSTICS
         The error messages produced by chrtbl are intended to be self-
         explanatory. They indicate errors in the command line or syntactic
         errors encountered within the input file.

   WARNING
         Changing the files in /usr/lib/locale/C will cause the system to
         behave unpredictably.























