⇒ chrtbl(1a) — NEWS-os 5.0.1


chrtbl(1M)       SYSTEM ADMINISTRATION COMMANDS        chrtbl(1M)



NAME
     chrtbl - generate character  classification  and  conversion
     tables

SYNOPSIS
     chrtbl [file]

DESCRIPTION
     The chrtbl command creates two tables containing information
     on  character  classification,  upper/lower-case conversion,
     character-set width, and numeric editing.  One table  is  an
     array of (257*2) + 7 bytes that is encoded so a table lookup
     can be used to determine the character classification  of  a
     character, convert a character (see ctype(3C)), and find the
     byte and screen width of a character in one of  the  supple-
     mentary  code  sets.   The  other table is 2 bytes long: the
     first byte specifies the decimal delimiter; the second  byte
     specifies the thousands delimiter.

     chrtbl reads the user-defined character  classification  and
     conversion  information  from  file and creates three output
     files in the current directory.  To construct file, use  the
     file  supplied  in /usr/lib/locale/C/chrtbl_C as a starting
     point.  You may add entries, but do not change the  original
     values  supplied  with  the  system.  For example, for other
     locales you may wish to add eight-bit entries to  the  ASCII
     definitions provided in this file.  One output file, ctype.c
     (a C-language source file), contains a (257*2)+7-byte  array
     generated  from  processing  the information from file.  You
     should review the content of  ctype.c  to  verify  that  the
     array is set up as you had planned.  (In addition, an appli-
     cation program could use ctype.c.)  The first 257  bytes  of
     the  array in ctype.c are used for character classification.
     The characters used for  initializing  these  bytes  of  the
     array  represent  character classifications that are defined
     in /usr/include/ctype.h; for example, _L means  a  character
     is lower case and _S|_B means the character is both a  spac-
     ing character and a blank.  The  second  257  bytes  of  the
     array are used for character conversion.  These bytes of the
     array are initialized so that characters for  which  you  do
     not  provide  conversion  information  will  be converted to
     themselves.  When you do provide conversion information, the
     first value of the pair is stored where the second one would
     be stored normally, and vice versa; for example, if you pro-
     vide  <0x41  0x61>,  then 0x61 is stored where 0x41 would be
     stored normally, and 0x41 is  stored  where  0x61  would  be
     stored  normally.   The  last 7 bytes are used for character
     width information for up to three supplementary code sets.
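
     The layout described above can be modelled with a short Python
     sketch.  It illustrates the (257*2)+7-byte array only and is
     not the code chrtbl uses.  The name build_table is hypothetical,
     and the sketch assumes that slot 0 of each 257-byte half is
     reserved for EOF, so that character code c sits at offset c + 1;
     this manual page does not state that detail.

```python
CLASS_SIZE = 257          # first 257 bytes: character classification
CONV_SIZE = 257           # second 257 bytes: character conversion
WIDTH_SIZE = 7            # last 7 bytes: character width information
TABLE_SIZE = CLASS_SIZE + CONV_SIZE + WIDTH_SIZE   # (257*2)+7 = 521


def build_table(ul_pairs):
    """Return a 521-byte table with identity conversion plus swapped pairs.

    Assumption (not stated in the manual page): slot 0 of each half is
    reserved for EOF (-1), so character code c lives at offset c + 1.
    """
    table = bytearray(TABLE_SIZE)
    # Characters without conversion information convert to themselves.
    for code in range(256):
        table[CLASS_SIZE + 1 + code] = code
    # For each <upper lower> pair, store each value in the other's slot.
    for upper, lower in ul_pairs:
        table[CLASS_SIZE + 1 + upper] = lower
        table[CLASS_SIZE + 1 + lower] = upper
    return table


table = build_table([(0x41, 0x61)])
print(len(table))                          # 521
print(hex(table[CLASS_SIZE + 1 + 0x41]))   # 0x61
print(hex(table[CLASS_SIZE + 1 + 0x61]))   # 0x41
```

     With the pair <0x41 0x61>, the slot for 0x41 holds 0x61 and the
     slot for 0x61 holds 0x41; every other slot keeps its own code.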

     The second output file  (a  data  file)  contains  the  same
     information,  but  is  structured  for  efficient use by the
     character  classification  and  conversion   routines   (see
     ctype(3C)).   The  name of this output file is the value you
     assign to the keyword LC_CTYPE read in  from  file.   Before
     this  file  can  be used by the character classification and
     conversion  routines,  it   must   be   installed   in   the
     /usr/lib/locale/locale  directory with the name LC_CTYPE by
     someone who is super-user or a member of  group  bin.   This
     file  must  be  readable by user, group, and other; no other
     permissions should be set.  To use the character classifica-
     tion  and  conversion tables in this file, set the LC_CTYPE
     environment  variable  appropriately  (see   environ(5)   or
     setlocale(3C)).

     The third output file (a  data  file)  is  created  only  if
     numeric  editing information is specified in the input file.
     The name of this output file is the value you assign to  the
     keyword LC_NUMERIC read in from file.  Before this file can
     be used, it must be installed in the  /usr/lib/locale/locale
     directory with the name LC_NUMERIC by someone who is super-
     user or a member of group bin.  This file must  be  readable
     by  user,  group,  and other; no other permissions should be
     set.  To use the numeric editing information in  this  file,
     set  the LC_NUMERIC environment variable appropriately (see
     environ(5) or setlocale(3C)).

     The name of the locale where you install the files LC_CTYPE
     and LC_NUMERIC should correspond to the conventions defined
     in file.  For example, if French conventions  were  defined,
     and the name for the French locale on your system is french,
     then you should install the files in /usr/lib/locale/french.
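
     The installation step can be sketched in Python as follows.
     This is an illustration under stated assumptions, not a
     supported procedure: install_locale_file is a hypothetical
     helper, a scratch directory stands in for /usr/lib/locale, and
     the file contents are a placeholder rather than real chrtbl
     output.  The mode 0444 expresses "readable by user, group, and
     other; no other permissions".

```python
import os
import shutil
import stat
import tempfile

# Read permission for user, group, and other; nothing else (octal 0444).
READ_ONLY = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH


def install_locale_file(src, locale_dir, name):
    """Copy src to locale_dir/name and make it read-only for everyone."""
    os.makedirs(locale_dir, exist_ok=True)
    dest = os.path.join(locale_dir, name)
    shutil.copyfile(src, dest)
    os.chmod(dest, READ_ONLY)
    return dest


# Demonstration in a scratch directory standing in for /usr/lib/locale.
root = tempfile.mkdtemp()
src = os.path.join(root, "ascii")
with open(src, "wb") as handle:
    handle.write(b"\0" * 521)      # placeholder contents, not a real table
dest = install_locale_file(src, os.path.join(root, "french"), "LC_CTYPE")
print(oct(stat.S_IMODE(os.stat(dest).st_mode)))
```

     On a real system the destination would be the directory for
     the locale in question, and the copy would have to be made by
     the super-user or a member of group bin.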

     If no input file is given, or if the argument "-" is encoun-
     tered, chrtbl reads from standard input.

     The syntax of file allows the user to define  the  names  of
     the  data files created by chrtbl, the assignment of charac-
     ters to character classifications, the relationship  between
     upper  and lower-case letters, byte and screen widths for up
     to three supplementary code sets, and two items  of  numeric
     editing information: the decimal delimiter and the thousands
     delimiter.  The keywords recognized by chrtbl are:

          LC_CTYPE         name  of  the  data  file  created  by
                           chrtbl  to contain character classifi-
                           cation, conversion, and width informa-
                           tion

          isupper          character codes to  be  classified  as
                           upper-case letters

          islower          character codes to  be  classified  as
                           lower-case letters

          isdigit          character codes to  be  classified  as
                           numeric

          isspace          character codes to  be  classified  as
                           spacing (delimiter) characters

          ispunct          character codes to  be  classified  as
                           punctuation characters

          iscntrl          character codes to  be  classified  as
                           control characters

          isblank          character code for the  blank  (space)
                           character

          isxdigit         character codes to  be  classified  as
                           hexadecimal digits

          ul               relationship   between   upper-    and
                           lower-case characters

          cswidth          byte and screen width information  (by
                           default, each is one character wide)

          LC_NUMERIC       name  of  the  data  file  created  by
                           chrtbl   to  contain  numeric  editing
                           information

          decimal_point    decimal delimiter

          thousands_sep    thousands delimiter

     Any lines with the number sign (#) in the first  column  are
     treated  as  comments and are ignored.  Blank lines are also
     ignored.

     Characters for isupper, islower, isdigit, isspace,  ispunct,
     iscntrl,  isblank,  isxdigit, and ul can be represented as a
     hexadecimal or octal constant (for example, the letter a can
     be  represented  as  0x61  in hexadecimal or 0141 in octal).
     Hexadecimal and octal constants may be separated by  one  or
     more space and/or tab characters.

     The dash character (-) may be used to indicate  a  range  of
     consecutive  numbers.  Zero  or more space characters may be
     used for separating the dash character from the numbers.
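
     The constant and range rules above can be sketched in Python.
     The function names parse_code and parse_codes are hypothetical;
     the sketch accepts only the hexadecimal and octal forms that
     this page describes and is not a description of how chrtbl
     itself scans its input.

```python
import re


def parse_code(token):
    """Convert a hexadecimal (0x61) or octal (0141) constant to an integer."""
    if token.lower().startswith("0x"):
        return int(token, 16)
    return int(token, 8)


def parse_codes(spec):
    """Expand a list such as '0x20 0x9 - 0xd' into a sorted list of codes."""
    # Remove any spaces or tabs around a dash so each range is one token.
    spec = re.sub(r"\s*-\s*", "-", spec)
    codes = set()
    for token in spec.split():
        if "-" in token:
            first, last = token.split("-")
            codes.update(range(parse_code(first), parse_code(last) + 1))
        else:
            codes.add(parse_code(token))
    return sorted(codes)


print(parse_code("0x61") == parse_code("0141"))   # True
print(parse_codes("0x20 0x9 - 0xd"))              # [9, 10, 11, 12, 13, 32]
```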

     The backslash character (\) is used for  line  continuation.
     Only  a  carriage  return  is  permitted after the backslash
     character.
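
     The comment, blank-line, and continuation rules can be combined
     into one pre-processing pass.  The Python sketch below is an
     assumption about how such a pass might look, with a hypothetical
     name logical_lines; it treats a backslash as a continuation only
     when it is the last character on the line.

```python
def logical_lines(text):
    """Yield complete entries: comments and blank lines are dropped,
    and backslash-continued lines are joined into a single entry."""
    pending = ""
    for raw in text.splitlines():
        if not pending and (raw.startswith("#") or not raw.strip()):
            continue                       # comment or blank line
        if raw.endswith("\\"):
            pending += raw[:-1] + " "      # continuation: keep collecting
            continue
        yield " ".join((pending + raw).split())
        pending = ""
    if pending.strip():                    # input ended on a continuation
        yield " ".join(pending.split())


sample = "\n".join([
    "# character classes",
    "",
    "isxdigit  0x30 - 0x39    \\",
    "          0x41 - 0x46",
    "isblank   0x20",
])
for entry in logical_lines(sample):
    print(entry)
```

     Runs of spaces and tabs are collapsed to single spaces so that
     each entry can be split on whitespace afterwards.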

     The relationship between upper- and lower-case letters  (ul)
     is  expressed  as ordered pairs of octal or hexadecimal con-
     stants:  <upper-case_character lower-case_character>.  These
     two  constants may be separated by one or more space charac-
     ters.  Zero  or  more  space  characters  may  be  used  for
     separating the angle brackets (< >) from the numbers.
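
     A ul specification can be read with a regular expression.  The
     Python sketch below is illustrative only; parse_ul and
     parse_code are hypothetical names, and the pattern accepts just
     the octal and hexadecimal constants described above.

```python
import re

# One ordered pair: optional spaces inside the brackets, at least one
# space or tab between the two constants.
PAIR = re.compile(r"<\s*([0-9A-Fa-fxX]+)\s+([0-9A-Fa-fxX]+)\s*>")


def parse_code(token):
    """Convert a hexadecimal (0x61) or octal (0141) constant to an integer."""
    if token.lower().startswith("0x"):
        return int(token, 16)
    return int(token, 8)


def parse_ul(spec):
    """Return a list of (upper, lower) code pairs found in spec."""
    return [(parse_code(upper), parse_code(lower))
            for upper, lower in PAIR.findall(spec)]


print(parse_ul("<0x41 0x61> < 0102 0142 >"))   # [(65, 97), (66, 98)]
```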

     The following is the format of an  input  specification  for
     cswidth:
     n1:s1,n2:s2,n3:s3
     where,
          n1   byte width for supplementary code set 1, required
          s1   screen width for supplementary code set 1
          n2   byte width for supplementary code set 2
          s2   screen width for supplementary code set 2
          n3   byte width for supplementary code set 3
          s3   screen width for supplementary code set 3
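
     The cswidth format can be split into its fields as shown in the
     Python sketch below.  The name parse_cswidth is hypothetical,
     and the sketch assumes every field is written in full as n:s;
     it does not model how chrtbl stores the widths in the last 7
     bytes of the array.

```python
def parse_cswidth(spec):
    """Parse 'n1:s1,n2:s2,n3:s3' into (byte_width, screen_width) pairs,
    one pair per supplementary code set."""
    widths = []
    for field in spec.strip().split(","):
        byte_width, screen_width = field.split(":")
        widths.append((int(byte_width), int(screen_width)))
    if not 1 <= len(widths) <= 3:
        raise ValueError("cswidth describes one to three code sets")
    return widths


print(parse_cswidth("1:1,0:0,0:0"))   # [(1, 1), (0, 0), (0, 0)]
```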

EXAMPLE
     The following is an example of an input file used to  create
     the ASCII code set definition table in a file named ascii.
          LC_CTYPE  ascii
          isupper   0x41 - 0x5a
          islower   0x61 - 0x7a
          isdigit   0x30 - 0x39
          isspace   0x20 0x9 - 0xd
          ispunct   0x21 - 0x2f    0x3a - 0x40    \
                    0x5b - 0x60    0x7b - 0x7e
          iscntrl   0x0 - 0x1f     0x7f
          isblank   0x20
          isxdigit  0x30 - 0x39    0x61 - 0x66    \
                    0x41 - 0x46
          ul       <0x41 0x61> <0x42 0x62> <0x43 0x63>  \
                   <0x44 0x64> <0x45 0x65> <0x46 0x66>  \
                   <0x47 0x67> <0x48 0x68> <0x49 0x69>  \
                   <0x4a 0x6a> <0x4b 0x6b> <0x4c 0x6c>  \
                   <0x4d 0x6d> <0x4e 0x6e> <0x4f 0x6f>  \
                   <0x50 0x70> <0x51 0x71> <0x52 0x72>  \
                   <0x53 0x73> <0x54 0x74> <0x55 0x75>  \
                   <0x56 0x76> <0x57 0x77> <0x58 0x78>  \
                   <0x59 0x79> <0x5a 0x7a>
          cswidth        1:1,0:0,0:0
          LC_NUMERIC     num_ascii
          decimal_point       .
          thousands_sep       ,
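
     As a cross-check, the ranges in the example input file above
     can be compared with the ASCII character groups in Python's
     standard string module.  The string module is used here only as
     an independent reference and has no connection to chrtbl; the
     names EXAMPLE and expand are hypothetical.

```python
import string

# Ranges copied from the example input file above (inclusive bounds).
EXAMPLE = {
    "isupper":  [(0x41, 0x5a)],
    "islower":  [(0x61, 0x7a)],
    "isdigit":  [(0x30, 0x39)],
    "isspace":  [(0x20, 0x20), (0x09, 0x0d)],
    "ispunct":  [(0x21, 0x2f), (0x3a, 0x40), (0x5b, 0x60), (0x7b, 0x7e)],
    "iscntrl":  [(0x00, 0x1f), (0x7f, 0x7f)],
    "isxdigit": [(0x30, 0x39), (0x61, 0x66), (0x41, 0x46)],
}


def expand(ranges):
    """Return the set of characters covered by a list of code ranges."""
    return {chr(code) for low, high in ranges for code in range(low, high + 1)}


classes = {name: expand(ranges) for name, ranges in EXAMPLE.items()}

checks = {
    "isupper":  classes["isupper"] == set(string.ascii_uppercase),
    "islower":  classes["islower"] == set(string.ascii_lowercase),
    "isdigit":  classes["isdigit"] == set(string.digits),
    "isspace":  classes["isspace"] == set(string.whitespace),
    "ispunct":  classes["ispunct"] == set(string.punctuation),
    "isxdigit": classes["isxdigit"] == set(string.hexdigits),
}
print(all(checks.values()))        # True
print(len(classes["iscntrl"]))     # 33
```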

FILES
     /usr/lib/locale/locale/LC_CTYPE
                     data files containing character  classifica-
                     tion,  conversion,  and  character-set width
                     information created by chrtbl
     /usr/lib/locale/locale/LC_NUMERIC
                     data files containing numeric editing infor-
                     mation created by chrtbl
     /usr/include/ctype.h
                     header file containing information  used  by
                     character classification and conversion rou-
                     tines
     /usr/lib/locale/C/chrtbl_C
                     input file used to construct  LC_CTYPE  and
                     LC_NUMERIC in the default locale.

SEE ALSO
     environ(5).
     ctype(3C),  setlocale(3C)  in  the  Programmer's   Reference
     Manual.

DIAGNOSTICS
     The error messages produced by chrtbl  are  intended  to  be
     self-explanatory.  They  indicate errors in the command line
     or syntactic errors encountered within the input file.

WARNING
     Changing the files in /usr/lib/locale/C will cause the  sys-
     tem to behave unpredictably.