⇒ charmap(4) — HP-UX 9.05

charmap(4)

NAME

charmap − symbolic translation file for localedef scripts

SYNOPSIS

localedef -f charmap locale_name

DESCRIPTION

Invoking the localedef command with the -f option causes symbolic names in the localedef script to be translated into the encodings given in the charmap file (see localedef(1M)).  A localedef script can be written partly or completely in terms of these symbolic names.

The charmap file has two sections: a declarations section and a character definition section.

Declarations Section

The following declarations can precede the character definitions.  Each consists of the symbol shown in the following list, including the surrounding angle brackets, followed by one or more blanks (tab or space characters), followed by the value of the symbol.  All declarations are optional.

<code_set_name>
The name of the coded character set for which the charmap file is defined.

<mb_cur_max>
The maximum number of bytes in a multibyte character. Defaults to 1 if not given.

<mb_cur_min>
The minimum number of bytes in a character for the encoded character set. The value must be less than or equal to <mb_cur_max>.  If not given, the default is equal to <mb_cur_max>. 

<escape_char>
The character used to escape characters that otherwise would have special meaning. If not given, the default is backslash (\). 

<comment_char>
The character used to begin comments when placed in column one of the charmap file.  If not given, the default is the # character. 
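
The defaults described above can be illustrated with a short Python sketch.  It is not part of localedef; the function name parse_declarations and the returned dictionary layout are hypothetical, and the sketch does not verify that a blank actually separates the symbol from its value.

```python
# Hypothetical helper, not part of localedef: reads the declarations that
# precede the CHARMAP line and fills in the defaults described in the text.
KNOWN = ("code_set_name", "mb_cur_max", "mb_cur_min",
         "escape_char", "comment_char")

def parse_declarations(lines):
    """Return a dict of declaration values, with defaults applied."""
    decls = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if line.strip() == "CHARMAP":      # the definition section starts here
            break
        if not line.startswith("<"):       # comments, empty lines, anything else
            continue
        name, sep, value = line[1:].partition(">")
        if sep and name in KNOWN:
            decls[name] = value.strip(" \t")
    mb_max = int(decls.get("mb_cur_max", 1))        # defaults to 1
    mb_min = int(decls.get("mb_cur_min", mb_max))   # defaults to <mb_cur_max>
    if mb_min > mb_max:
        raise ValueError("<mb_cur_min> must not exceed <mb_cur_max>")
    return {
        "code_set_name": decls.get("code_set_name"),
        "mb_cur_max": mb_max,
        "mb_cur_min": mb_min,
        "escape_char": decls.get("escape_char", "\\"),   # default: backslash
        "comment_char": decls.get("comment_char", "#"),  # default: #
    }
```

For a file that declares only <mb_cur_max> as 2, this sketch reports <mb_cur_min> as 2 as well, with backslash and # as the escape and comment characters.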

Character Definition Section

The character-set mapping definitions are the lines immediately following an identifier line containing the string CHARMAP and preceding a trailer line consisting of the string END CHARMAP.  Empty lines and lines beginning with the comment character are ignored.  The character definition lines are of two forms. 

<symbolic_name> encoding [comment_text]
<symbolic_name> ...  <symbolic_name> encoding [comment_text]

The first form defines a single character and its encoding.  A symbolic name consists of one or more visible characters, drawn from the character set shown in the EXAMPLES section below, enclosed in angle brackets.  Metacharacters such as angle brackets, escape characters, or comment characters must be escaped if they are used in the name.  Two or more symbolic names can be given for the same encoding.  The encoding is a character constant in one of four forms.

character
A single character has the value of that character’s encoding in the current character set (that is, the character set in the executing environment).

decimal
An escape character followed by the letter d, followed by one to three decimal digits.

octal
An escape character followed by one to three octal digits.

hexadecimal
An escape character followed by an x, followed by two hexadecimal digits.

Multibyte characters are represented by the concatenation of character constants.  All constants used in the encoding of a multibyte character must be of the same form. 
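
As a rough illustration of the four constant forms and the same-form rule for multibyte characters, the following Python sketch converts an encoding field into bytes.  The function name decode_encoding is hypothetical, and two choices are assumptions rather than documented localedef behavior: Latin-1 stands in for "the current character set", and a constant larger than 255 is rejected.

```python
import re

# (pattern that follows the escape character, numeric base) for each escaped form
_FORMS = (
    (r"d([0-9]{1,3})", 10),       # decimal: escape, d, one to three digits
    (r"x([0-9A-Fa-f]{2})", 16),   # hexadecimal: escape, x, two hex digits
    (r"([0-7]{1,3})", 8),         # octal: escape, one to three octal digits
)

def decode_encoding(text, escape="\\"):
    """Convert one encoding field of a definition line into bytes."""
    if len(text) == 1:
        # Character form.  Assumption: Latin-1 stands in for the
        # character set of the executing environment.
        return text.encode("latin-1")
    esc = re.escape(escape)
    for unit, base in _FORMS:
        # The whole field must be a concatenation of constants of ONE form.
        if re.fullmatch("(?:" + esc + unit + ")+", text):
            values = [int(d, base) for d in re.findall(esc + unit, text)]
            if max(values) > 255:   # assumption: each constant is one byte
                raise ValueError("constant does not fit in one byte: " + text)
            return bytes(values)
    raise ValueError("not a valid encoding: " + text)
```

Under these assumptions, decode_encoding("\\d8"), decode_encoding("\\x08") and decode_encoding("\\010") all yield the same single byte, while a field that mixes forms, such as a hexadecimal constant followed by a decimal one, raises ValueError.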

The second form of character definition line defines a range of characters consisting of all characters from the first symbolic name to the second, inclusive.  Each of the two symbolic names must consist of one or more nonnumeric characters followed by an integer formed of one or more decimal digits.  The integer part of the second symbolic name must be larger than that of the first.  The range is then interpreted as a list of symbolic names consisting of the same character portion and successive integer values from the first through the last.  These names are assigned successive encodings starting with the one given.

For example, the character definition line

<C4>...<C6>   \d129

is equivalent to:

<C4>        \d129
<C5>        \d130
<C6>        \d131
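
The expansion shown above can be sketched in Python as follows.  The function name expand_range is hypothetical, the starting encoding is passed as a plain integer for simplicity, and three details are assumptions: both names must share the same nonnumeric portion, leading zeros in the integer part are preserved, and escaped metacharacters inside names are not handled.

```python
import re

# One or more nonnumeric characters followed by one or more decimal digits.
_RANGE_NAME = re.compile(r"<([^0-9>]+)([0-9]+)>")

def expand_range(first, last, start):
    """Expand <nameN>...<nameM> into (symbolic name, encoding value) pairs."""
    m1, m2 = _RANGE_NAME.fullmatch(first), _RANGE_NAME.fullmatch(last)
    if not (m1 and m2):
        raise ValueError("range names must end in decimal digits")
    if m1.group(1) != m2.group(1):       # assumption: same character portion
        raise ValueError("range names must share the same prefix")
    low, high = int(m1.group(2)), int(m2.group(2))
    if high <= low:
        raise ValueError("second integer must be larger than the first")
    width = len(m1.group(2))             # assumption: keep leading zeros
    prefix = m1.group(1)
    return [("<%s%s>" % (prefix, str(n).zfill(width)), start + (n - low))
            for n in range(low, high + 1)]

# expand_range("<C4>", "<C6>", 129)
# returns [("<C4>", 129), ("<C5>", 130), ("<C6>", 131)]
```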

EXAMPLES

The following is the charmap file for the POSIX (same as C) locale.  Any charmap file is required to contain these symbolic names, but the mappings can be different for different encoded character sets. 

<code_set_name>         ROMAN8
<mb_cur_max>            1
<mb_cur_min>            1
<escape_char>           \
<comment_char>          #
CHARMAP
<NUL>                   \000       # demonstrates octal form
<alert>                 \x07       # demonstrates hex form
<backspace>             \d8        # demonstrates decimal form
<tab>                   \011
<newline>               \d10
<vertical-tab>          \x0b
<form-feed>             \014
<carriage-return>       \d13
<space>                 \x20
<exclamation-mark>      !
<quotation-mark>        "
<number-sign>           #
<dollar-sign>           $
<percent-sign>          %
<ampersand>             &
<apostrophe>            '
<left-parenthesis>      (
<right-parenthesis>     )
<asterisk>              *
<plus-sign>             +
<comma>                 ,
<hyphen>                -
<hyphen-minus>          -
<period>                .
<full-stop>             .
<slash>                 /
<solidus>               /       # note duplicate definition
<zero>                  0
<one>                   1
<two>                   2
<three>                 3
<four>                  4
<five>                  5
<six>                   6
<seven>                 7
<eight>                 8
<nine>                  9
<colon>                 :
<semicolon>             ;
<less-than-sign>        <
<equals-sign>           =
<greater-than-sign>     >
<question-mark>         ?
<commercial-at>         @
<A>                     A
<B>                     B
<C>                     C
<D>                     D
<E>                     E
<F>                     F
<G>                     G
<H>                     H
<I>                     I
<J>                     J
<K>                     K
<L>                     L
<M>                     M
<N>                     N
<O>                     O
<P>                     P
<Q>                     Q
<R>                     R
<S>                     S
<T>                     T
<U>                     U
<V>                     V
<W>                     W
<X>                     X
<Y>                     Y
<Z>                     Z
<left-square-bracket>   [
<backslash>             \
<reverse-solidus>       \       # note duplicate definition
<right-square-bracket>  ]
<circumflex>            ^
<circumflex-accent>     ^       # note duplicate definition
<underscore>            _
<low-line>              _       # note duplicate definition
<grave-accent>          `
<a>                     a
<b>                     b
<c>                     c
<d>                     d
<e>                     e
<f>                     f
<g>                     g
<h>                     h
<i>                     i
<j>                     j
<k>                     k
<l>                     l
<m>                     m
<n>                     n
<o>                     o
<p>                     p
<q>                     q
<r>                     r
<s>                     s
<t>                     t
<u>                     u
<v>                     v
<w>                     w
<x>                     x
<y>                     y
<z>                     z
<left-brace>            {
<left-curly-bracket>    {       # note duplicate definition
<vertical-line>         |
<right-brace>           }
<right-curly-bracket>   }       # note duplicate definition
<tilde>                 ~
END CHARMAP
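
As a rough sketch of how the CHARMAP and END CHARMAP lines delimit the definitions in a file such as the one above, the following Python function returns only the definition lines, skipping empty lines and lines that begin with the comment character.  The function name definition_lines is hypothetical, and raising an error for a missing delimiter is an assumption, not documented localedef behavior.

```python
def definition_lines(lines, comment_char="#"):
    """Return the lines between CHARMAP and END CHARMAP, minus comments."""
    inside = False
    result = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not inside:
            # Everything before the identifier line belongs to the declarations.
            inside = line.strip() == "CHARMAP"
            continue
        if line.strip() == "END CHARMAP":
            return result
        # Only a comment character in column one starts a comment, so a line
        # such as "<number-sign>  #" is still treated as a definition.
        if not line.strip() or line.startswith(comment_char):
            continue
        result.append(line)
    # Assumption: a missing identifier or trailer line is treated as an error.
    raise ValueError("missing CHARMAP or END CHARMAP line")
```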

SEE ALSO

localedef(1M), localedef(4)

STANDARDS CONFORMANCE

localedef: POSIX.2, XPG4

Hewlett-Packard Company  —  HP-UX Release 9.0: August 1992
