localedef(4)

NAME

localedef − format and semantics of input script

DESCRIPTION

This entry describes the syntax and meaning of the script provided as input to the localedef command to create a locale (see localedef(1M)).

The following is a list of category tags, keywords and subsequent expressions which are recognized by localedef. The order of keywords within a category is irrelevant with the exception of the modifier and copy keywords and other exceptions noted under the LC_COLLATE description. (Note that, by convention, the category tags are composed of uppercase characters, while the keywords are composed of lowercase characters.)

Category Tags and Keywords

The following keywords do not belong to any category.

langname String identifying the name of the language. It follows the naming convention of the LANG environment variable:

language[_territory][.codeset]

(see environ(5)). This keyword is required by localedef if the command line invoking localedef does not contain the locale_name (see localedef(1M)).

langid Decimal number identifying the language ID. This keyword is required by localedef if the command line invoking localedef does not contain the locale_name (see localedef(1M)). The language ID specified should be in the range of 1 to 999, and any user-defined language should assign its language ID in the range of 901 to 999.

revision String identifying the revision number of the locale.inf file. The string is restricted to contain at most 6 characters, all digits and one optional decimal point (.) character.

comment_char Single character indicating the character to be interpreted as starting a comment line within the script. The default comment_char is #. All lines beginning with a comment_char are ignored.

escape_char A single character indicating the character to be interpreted as an escape character within the script. The default escape_char is \. The escape_char is used to escape localedef metacharacters, removing their special meaning, and to introduce character constants in decimal, octal, and hexadecimal formats.

The following keywords can be used in any category.

modifier String identifying the name of the modifier (see environ(5)). A modifier is used when a category has more than one definition. A modifier string is associated with each definition. Since this keyword is used to associate a modifier with a set of specifications, it must come before any keyword in that set of specifications.

copy A string naming another valid locale available on the system. This causes the category in the locale being created to be a copy of the same category in the named locale. Since the copy keyword defines the entire category, if used, it must be the only keyword in the category.

LC_CTYPE:
The following keywords belong to the LC_CTYPE category and should come between the category tag LC_CTYPE and END LC_CTYPE:

upper Character codes classified as uppercase letters.

lower Character codes classified as lowercase letters.

digit Character codes classified as numeric.

space Character codes classified as spacing (delimiter) characters.

punct Character codes classified as punctuation characters.

cntrl Character codes classified as control characters.

blank Character codes for printable space characters. These also must be defined in space.

xdigit Character codes classified as hexadecimal digits.

alpha Character codes classified as alphabetic characters. If omitted, this class is the concatenation of the upper and lower classes.

print Character codes classified as printable characters. If omitted, this class is the concatenation of the upper, lower, alpha, digit, xdigit, and punct classes and the <space> character.

graph Character codes classified as graphic characters. If omitted, this class is all characters included in print except the <space> character.

first Character codes classified as the first bytes of two-byte characters.

second Character codes classified as the second bytes of two-byte characters.

toupper Lowercase to uppercase character relationships.

tolower Uppercase to lowercase character relationships.

bytes_char String containing the maximum number of bytes per character for the character set used for a specified language (a langinfo(5) item).

alt_punct String mapped into the ASCII equivalent string “b!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~”, where b is a blank (a langinfo(5) item).

code_scheme Specifies the multi-byte character encoding scheme used. The operand should be a string. Currently, “HP15” and “EUC” strings are recognized. If this keyword is not specified, or the operand is a null string (""), the encoding scheme is single-byte, or HP15 if bytes_char is 2. See Native Language Support User’s Guide.

cswidth Defines the number of bytes contained in a character, and the number of columns per character displayed on output devices. This keyword should be specified if the encoding scheme is “EUC”. EUC can be divided into 4 Supplementary Code Sets. The first SCS, Supplementary Code Set 0, contains ASCII characters and is assumed to contain 1 byte per character and require 1 column on the output devices. The operand is a string containing three ordered pairs of digits delimited by colons and commas. The format is:

X:x,Y:y,Z:z

Field	Interpretation
X	SCS 1, number of bytes
x	SCS 1, output width
Y	SCS 2, number of bytes, after SS2
y	SCS 2, output width
Z	SCS 3, number of bytes, after SS3
z	SCS 3, output width
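
As an informal illustration of the operand layout described above, the following Python sketch splits a cswidth operand into its three byte-count and output-width pairs. The function name parse_cswidth and the sample operand "2:2,1:1,2:2" are assumptions made for this example only; they are not part of localedef.

```python
def parse_cswidth(operand):
    """Split an 'X:x,Y:y,Z:z' operand into three (bytes, width) pairs."""
    pairs = []
    for field in operand.split(","):
        nbytes, width = field.split(":")
        pairs.append((int(nbytes), int(width)))
    if len(pairs) != 3:
        raise ValueError("cswidth needs exactly three bytes:width pairs")
    return pairs


# Hypothetical operand: SCS 1 uses 2 bytes and 2 columns, SCS 2 uses
# 1 byte (after SS2) and 1 column, SCS 3 uses 2 bytes (after SS3) and
# 2 columns.
for scs, (nbytes, width) in enumerate(parse_cswidth("2:2,1:1,2:2"), start=1):
    print(f"SCS {scs}: {nbytes} byte(s), {width} column(s)")
```

SCS 0 does not appear in the operand because it is always assumed to be 1 byte and 1 column.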

LC_COLLATE:
The following keywords belong to the LC_COLLATE category and should come between the category tag LC_COLLATE and END LC_COLLATE. The first three keywords can be in any order, but must come before the order_start keyword. Any number of these three keywords can be specified.

collating-element <symbol> from string
Defines a multi-character collating element, symbol, composed of the characters in string. String is limited to two characters.

collating-symbol <symbol>
Makes symbol a collating symbol which can be used to define a place in the collating sequence. Symbol does not represent any actual character.

order_start Denotes the start of the collation sequence. The directives have an effect on string collation.

The lines following the order_start keyword and before the order_end keyword contain collating element entries, one per line.

order_end Marks the end of the list of collating element entries.

LC_MONETARY:
The following keywords belong to the LC_MONETARY category and should come between the category tag LC_MONETARY and END LC_MONETARY. These keywords, except crncystr and mon_grouping, are identical to the members of struct lconv defined in <locale.h> (see localeconv(3C)):

int_curr_symbol
currency_symbol
mon_decimal_point
mon_thousands_sep
positive_sign
negative_sign
int_frac_digits
frac_digits
p_cs_precedes
p_sep_by_space
n_cs_precedes
n_sep_by_space
p_sign_posn
n_sign_posn

crncystr String for specifying the currency (a langinfo(5) item).

mon_grouping A semicolon-separated list of integers. The initial integer defines the size of the group immediately preceding the decimal delimiter, and the following integers define the preceding groups (an lconv item).
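
The way a grouping list is read can be sketched in Python as follows. This is only an illustration, not the localedef implementation. The function name group_digits is hypothetical, and two behaviors are assumptions borrowed from the struct lconv convention rather than stated above: the last size in the list repeats for all remaining digits, and a non-positive size (such as -1) stops further grouping.

```python
def group_digits(digits, grouping, sep=","):
    """Insert sep into a digit string according to a grouping list.

    grouping[0] is the size of the group immediately preceding the
    decimal delimiter; later entries describe groups further left.
    Assumptions: the last size repeats, and a non-positive size stops
    any further grouping.
    """
    groups = []
    rest = digits
    index = 0
    size = None
    while rest:
        if index < len(grouping):
            size = grouping[index]
            index += 1
        if size is None or size <= 0 or size >= len(rest):
            groups.append(rest)  # remaining digits form the leftmost group
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
    return sep.join(reversed(groups))


print(group_digits("1234567", [3]))
print(group_digits("1234567", [3, 2]))
```

With the list 3, the digits are split into groups of three from the right. With the list 3;2, the rightmost group has three digits and every group to its left has two.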

LC_NUMERIC:
The following keywords belong to the LC_NUMERIC category and should come between the category tag LC_NUMERIC and END LC_NUMERIC:

grouping Semicolon-separated list of integers. The initial integer defines the size of the group immediately preceding the decimal delimiter, and the following integers define the preceding groups (see struct lconv defined in <locale.h> and localeconv(3C)).

decimal_point
Same as RADIXCHAR, a langinfo(5) item.

thousands_sep
Same as THOUSEP, a langinfo(5) item.

alt_digit String mapped into the ASCII equivalent string “0123456789b+-.,eE”, where b is a blank (a langinfo(5) item). The alt_digit keyword is an HP extension to the POSIX localedef standard, and its meaning differs from that of the alt_digits keyword defined in the POSIX standards.

LC_TIME:
The following keywords belong to the LC_TIME category and should come between the category tag LC_TIME and END LC_TIME. These keywords define information described in langinfo(5).

d_t_fmt
d_fmt
t_fmt
t_fmt_ampm

day Seven semicolon-separated strings giving names for the days of the week beginning with Sunday. Correspond to langinfo items day_1 through day_7.

abday Seven semicolon-separated strings giving abbreviated names for the days of the week beginning with Sunday. Correspond to langinfo items abday_1 through abday_7.

mon Twelve semicolon-separated strings giving names for the months, beginning with January. Correspond to langinfo items mon_1 through mon_12.

abmon Twelve semicolon-separated strings giving abbreviated names for the months, beginning with January. Correspond to langinfo items abmon_1 through abmon_12.

am_pm Two semicolon-separated strings giving the representations for AM and PM.

year_unit
mon_unit
day_unit
hour_unit
min_unit
sec_unit
era_d_fmt

era Names and dates of eras or emperors.

LC_MESSAGES:
The following keywords belong to the LC_MESSAGES category and should come between the category tag LC_MESSAGES and END LC_MESSAGES:

yesexpr An Extended Regular Expression matching acceptable affirmative responses to yes/no queries.

noexpr An Extended Regular Expression matching acceptable negative responses to yes/no queries.

yesstr String identifying the affirmative response for yes/no questions (a langinfo(5) item). This keyword is now obsolete and yesexpr should be used instead.

nostr String identifying the negative response for yes/no questions (a langinfo(5) item). This keyword is now obsolete and noexpr should be used instead.

LC_ALL:
The following keywords belong to the LC_ALL category and should come between the category tag LC_ALL and END LC_ALL:

direction
String indicating text direction (a langinfo(5) item).

context
String indicating character context analysis. String “null” or “0” indicates no context analysis is required. String “1” indicates Arabic context analysis required.

Keyword Operands

Keyword operands consist of character-code constants, strings, and metacharacters. The types of legal expressions are: character lists, string lists, integer lists, shift, collating element entries, regular expression, and string:

character lists
Character list operands follow the keywords upper, lower, digit, space, punct, cntrl, blank, xdigit, alpha, print, graph, first, and second and consist of single character-code constants or symbolic names separated by semicolons or a character-code range consisting of a constant or symbolic name followed by an ellipsis followed by another constant or symbolic name. The constant preceding the ellipsis must have a smaller code value than the constant following the ellipsis. A range represents a set of consecutive character codes. If the list is longer than a single line, the escape character must be used at the end of each line as a continuation character. It is an error to use any symbolic name that is not defined in an accompanying charmap file (see charmap(4)).

string lists
String list operands follow the keywords day, abday, mon, abmon and am_pm, and consist of strings separated by semicolons. If longer than one line, the escape character must be used for continuation.

integer lists
Integer list operands follow the keywords mon_grouping, int_frac_digits, frac_digits, p_cs_precedes, p_sep_by_space, n_cs_precedes, n_sep_by_space, p_sign_posn, n_sign_posn, and grouping. An integer list consists of one or more decimal digits separated by semicolons.

shift Shift operands follow keywords toupper and tolower, and must consist of two character-code constants enclosed by left and right parentheses and separated by a comma. Each such character pair is separated from the next by a semicolon. For tolower, the first constant represents an uppercase character and the second the corresponding lowercase character. For toupper, the first constant represents a lowercase character and the second the corresponding uppercase character.

collating element entry
The order_start keyword is followed by collating element entries, one per line, in ascending order by collating position. The collating element entries have the form:

collation_element [ weight [; weight ]]

collation_element can be a character, a collating symbol enclosed in angle brackets representing a character or collating element, the special symbol UNDEFINED or an ellipsis (...).

A character stands for itself; a collating symbol can be a symbolic name for a character that is interpreted by the charmap file, a multi-character collating element defined by a collating-element keyword, or a collating symbol defined by the collating-symbol keyword.

The special symbol UNDEFINED specifies the collating position of any characters not explicitly defined by collating element entries. For example, if some group of characters is to be omitted from the collation sequence and just collate after all defined characters, a collating symbol might be defined before the order_start keyword:

collating-symbol <HIGH>

Then somewhere in the list of collating element entries:

UNDEFINED <HIGH>

Notice that there is no second weight. This means that on a second pass all characters collate by their encoded value.

An ellipsis is interpreted as a list of characters with an encoded value higher than that of the character on the preceding line and lower than that on the following line. Because it is tied to the encoded value of characters, the ellipsis is inherently non-portable. If it is used, a warning is issued and no output is generated unless the -c option was given.

The weight operands provide information about how the collating element is to be collated on first and subsequent passes. A weight can be a two-character string, the special symbol IGNORE, or a collating element of any of the forms specified for collation_element except UNDEFINED. If there are no weights, the character collates strictly by its position in the list. If only one weight is given, the character sorts by its relative position in the list on the second collation pass.

An equivalence class is defined by a series of collating element entries all having the same character or symbol in the first weight position. For example, in many locales all forms of the character ’A’ collate equal on the first pass. This is represented in the collating element entries as:

’A’ ’A’;’A’ # first element of equivalence class
’a’ ’A’;’a’ # next element of class

Two-to-one collating elements are specified by collating-elements defined before the order_start keyword. For example, the two-to-one collating element CH in Spanish, would be defined before the order_start keyword as

collating-element <CH> from CH

It would then be used in a collating element entry as <CH>.

A one-to-two collating element is defined by having a two-character string in one of the weight positions. For example, if the character ’X’ collates equal to the pair "AE", the collating element entry would be:

’X’ AE ;’X’

A don’t-care character is defined by the special symbol IGNORE. For example, the dash character, ’-’ may be a don’t care on the first collation pass. The collating element entry is:

’-’ IGNORE;’-’

Symbols defined by the collating-symbol keyword can be used to indicate that a given character collates higher or lower than some position in the sequence. For example, if all characters with an encoded value less than that of ’0’ are to collate lower than all other characters on the first pass, and in relative order on the second pass, define a collating symbol before the order_start keyword:

collating-symbol <LOW>

The first two collating element entries are then:

... <LOW>;...
’0’ ’0’;’0’

This also illustrates the use of the ellipsis to indicate a range. The first ellipsis is interpreted as "all characters in the encoded character set with a value lower than ’0’"; the second ellipsis means that all characters in the range defined by the first collate in relative order.

regular expression
Regular expression operands follow the keywords yesexpr and noexpr. They are Extended Regular Expressions as described in regexp(5).

string String operands follow all langinfo-type, lconv-type and era keywords except mon_grouping and grouping. Each expression is a string (see Strings section below).

The expressions following the langinfo-type keywords define the strings associated with items in langinfo(5). Each expression consists of a string to be associated with the item identified by the keyword.

The expressions following the lconv-type keywords define the strings associated with members of lconv struct in localeconv(3C). Each expression consists of a string to be associated with the member identified by the keyword.

Each expression following the keyword era defines how the years are counted and displayed for one era (or emperor’s reign). The expressions must be in the following format:

direction:offset:start_date:end_date:name:format

where:

direction Either a + or - character. The + character indicates the time axis should be such that the years count in the positive direction when moving from the starting date towards the ending date. The - character indicates the time axis should be such that the years count in the negative direction when moving from the starting date towards the ending date.

offset A number in the range [SHRT_MIN,SHRT_MAX] indicating the number of the first year of the era.

start_date A date in the form yyyy/mm/dd where yyyy, mm, and dd are the year, month and day numbers, respectively, of the start of the era. Years prior to the year 0 A.D. are represented as negative numbers. For example, an era beginning March 5th in the year 100 B.C. would be represented as -100/3/5. Years in the range [SHRT_MIN+1,SHRT_MAX-1] are supported.

end_date The ending date of the era in the same form as the start_date above or one of the two special values -* or +*. A value of -* indicates the ending date of the era extends to the beginning of time while +* indicates it extends to the end of time. The ending date can be chronologically either before or after the starting date of an era. For example, the expressions for the Christian eras A.D. and B.C. would be:

+:0:0000/01/01:+*:A.D.:%o %N
+:1:-0001/12/31:-*:B.C.:%o %N

name A string representing the name of the era which is substituted for the %N directive of date and strftime() (see date(1) and strftime(3C)).

format A string for formatting the %E directive of date(1) and strftime(3C). This string is usually a function of the %o and %N directives. If format is not specified, the string specified for the LC_TIME category keyword era_d_fmt (see above) is used as a default.
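
The era expression layout can be illustrated with a short Python sketch that splits an expression into its six fields. The names Era, parse_era and parse_date are hypothetical helpers invented for this example. The sketch assumes that the name field contains no colon, and it treats a missing format field as an empty string.

```python
from typing import NamedTuple


class Era(NamedTuple):
    direction: str
    offset: int
    start_date: str
    end_date: str
    name: str
    format: str


def parse_date(text):
    """Return (year, month, day), or the special value -* or +* unchanged."""
    if text in ("+*", "-*"):
        return text
    year, month, day = text.split("/")
    return int(year), int(month), int(day)


def parse_era(expr):
    """Split direction:offset:start_date:end_date:name:format."""
    # Split on the first five colons only, so that a colon inside the
    # trailing format field is preserved.
    parts = expr.split(":", 5)
    if len(parts) == 5:
        parts.append("")  # format omitted; era_d_fmt would be the default
    if len(parts) != 6:
        raise ValueError("expected six colon-separated fields")
    direction, offset, start, end, name, fmt = parts
    if direction not in ("+", "-"):
        raise ValueError("direction must be + or -")
    return Era(direction, int(offset), start, end, name, fmt)


era = parse_era("+:1:-0001/12/31:-*:B.C.:%o %N")
print(era.name, parse_date(era.start_date), parse_date(era.end_date))
```

Range checks against SHRT_MIN and SHRT_MAX are left out to keep the sketch short.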

Constants

Constants represent character codes in the operands. They can be used in the following forms:

decimal constants
An escape character followed by a ’d’ followed by up to three decimal digits.

octal constants An escape character followed by up to three octal digits.

hexadecimal constants
An escape character followed by a ’x’ followed by two hexadecimal digits.

character constants
A single character enclosed in single quotes or separated from any other single character by a semicolon, comma or <blank> having the numerical value of the character in the machine’s character set.

symbolic names A string enclosed between < and > is a symbolic name. localedef scripts can be written entirely in symbolic names and have them interpreted according to a charmap file. This aids portability of localedef scripts between different encoded character sets (see charmap(4)).

Symbolic names can be defined within a script by the collating-element and collating-symbol keywords. These are not character constants. It is an error if such an internally defined symbolic name collides with one defined in a charmap file.
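
The numeric constant forms above can be illustrated with a small Python sketch that converts one constant to its character code. The function name char_code is hypothetical. The sketch assumes the quoting character is the ASCII apostrophe and does not handle symbolic names, which need a charmap file to resolve.

```python
import re


def char_code(token, escape="\\"):
    """Return the character code for one constant token."""
    # Character constant: a single character in single quotes.
    if len(token) == 3 and token[0] == token[2] == "'":
        return ord(token[1])
    if token.startswith(escape):
        body = token[len(escape):]
        # Decimal constant: 'd' followed by up to three decimal digits.
        if re.fullmatch(r"d[0-9]{1,3}", body):
            return int(body[1:], 10)
        # Hexadecimal constant: 'x' followed by two hexadecimal digits.
        if re.fullmatch(r"x[0-9A-Fa-f]{2}", body):
            return int(body[1:], 16)
        # Octal constant: up to three octal digits.
        if re.fullmatch(r"[0-7]{1,3}", body):
            return int(body, 8)
    raise ValueError(f"not a character-code constant: {token!r}")


# Four spellings taken from the forms described above.
for token in ("'K'", r"\d75", r"\113", r"\x4b"):
    print(token, char_code(token))
```

All four tokens in the loop name the same character code, which is why a script can mix the forms freely.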

Strings

Strings are used in string and string list operands. A string is a sequence of zero or more characters either surrounded by double quotes (") or delimited by semicolons or <blank>s. Within a string, the double-quote character must be preceded by an escape character. The following escape sequences also can be used:

\n newline

\t horizontal tab

\b backspace

\r carriage return

\f form feed

\\ backslash

\’ single quote

\ddd bit pattern

The escape \ddd consists of the escape character followed by 1, 2, or 3 octal digits specifying the value of the desired character. Also, an escape character (\) and an immediately-following newline are ignored.

Although the backslash (\) has been used for illustration, another escape character can be substituted by the escape_char keyword.
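
The escape sequences listed above can be sketched in Python as follows. The function name unescape is hypothetical. Passing an unrecognized escaped character through unchanged is an assumption of this sketch, not documented behavior.

```python
def unescape(text, escape="\\"):
    """Expand the escape sequences of a localedef string body."""
    simple = {"n": "\n", "t": "\t", "b": "\b", "r": "\r", "f": "\f",
              "'": "'", '"': '"'}
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != escape or i + 1 == len(text):
            out.append(ch)
            i += 1
            continue
        nxt = text[i + 1]
        if nxt == "\n":
            i += 2                      # escape plus newline is ignored
        elif nxt == escape:
            out.append(escape)          # escaped escape character
            i += 2
        elif nxt in simple:
            out.append(simple[nxt])
            i += 2
        elif nxt in "01234567":
            j = i + 1                   # collect 1 to 3 octal digits
            while j < len(text) and j < i + 4 and text[j] in "01234567":
                j += 1
            out.append(chr(int(text[i + 1:j], 8)))
            i = j
        else:
            out.append(nxt)             # assumption: pass through as is
            i += 2
    return "".join(out)


print(unescape(r"\101\102 say \"hi\""))
```

The escape parameter mirrors the escape_char keyword: a script that redefines the escape character would pass that character instead of the backslash.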

Metacharacters

Metacharacters are characters having a special meaning to localedef in operands. To escape the special meaning of these characters, surround them with single quotes or precede them by an escape character. localedef meta-characters include:

< Indicates the beginning of a symbolic name.

> Indicates the end of a symbolic name.

( Indicates the beginning of a character shift pair following the toupper and tolower keywords.

) Indicates the end of a character shift pair.

, Used to separate the characters of a character shift pair.

" Used to quote strings.

; Used as a separator in list operands.

escape character
Used to escape special meaning from other metacharacters and itself. It is backslash (\) by default, but can be redefined by the escape_char keyword.

Comments

Comments are lines beginning with a comment character. The comment character is pound sign (#) by default, but can be redefined by the comment_char keyword. Comments and blank lines are ignored.

Separators

Separator characters include blanks and tabs. Any number of separators can be used to delimit the keywords, metacharacters, constants and strings that comprise a localedef script, except that all characters between < and > are considered to be part of the symbolic name even if they are <blank>s.

GRAMMAR

The following is a yacc-style grammar for a localedef script as specified by POSIX.2. It omits some elements that are used in localedef in HP-UX but are not required by POSIX.2 such as langname, langid, etc.

The following tokens are processed (in addition to those string constants shown in the grammar):

LOC_NAME String of characters representing the name of a locale.

CHAR Any single character.

NUMBER Decimal number represented by one or more decimal digits.

COLLSYMBOL String of characters in the set of visible glyphs defined in table 2-3, enclosed between angle brackets. The string must not duplicate any charmap symbol defined in the current charmap (if it exists).

CHARSYMBOL Symbolic name, enclosed between angle brackets, from the current charmap (if it exists).

OCTAL_CHAR One or more octal representations of the encoding of each byte in a single character. The octal representation consists of an escape_char (normally a backslash) followed by two or three octal digits.

HEX_CHAR One or more hexadecimal representations of the encoding of each byte in a single character. The hexadecimal representation consists of an escape_char followed by the constant x and two hexadecimal digits.

DECIMAL_CHAR One or more decimal representations of the encoding of each byte in a single character. The decimal representation consists of an escape_char followed by a d and two, three or four decimal digits.

ELLIPSIS The string “...”.

EXTENDED_REG_EXP An extended regular expression (see regexp(5)).

EOL The line termination character (new-line character).

This subclause presents the grammar for the locale definition.

%token LOC_NAME
%token CHAR
%token NUMBER
%token COLLSYMBOL COLLELEMENT
%token CHARSYMBOL OCTAL_CHAR HEX_CHAR DECIMAL_CHAR
%token ELLIPSIS
%token EXTENDED_REG_EXP
%token EOL
%start locale_definition
%%
locale_definition       : global_statements locale_categories
        | locale_categories
        ;
global_statements       : global_statements symbol_redefine
        | symbol_redefine
        ;
symbol_redefine : ’escape_char’ CHAR EOL
        | ’comment_char’ CHAR EOL
        ;
locale_categories       : locale_categories locale_category
        | locale_category
        ;
locale_category : lc_ctype
        | lc_collate
        | lc_messages
        | lc_monetary
        | lc_numeric
        | lc_time
        ;
/*      The following grammar rules are common to all categories */
char_list       : char_list char_symbol
        | char_symbol
        ;
char_symbol     : CHAR
        | CHARSYMBOL
        | OCTAL_CHAR
        | HEX_CHAR
        | DECIMAL_CHAR
        ;
locale_name     : LOC_NAME
        | ’"’ LOC_NAME ’"’
        ;
/*      The following is the LC_CTYPE category grammar */
lc_ctype        : ctype_hdr ctype_keywords      ctype_tlr
        | ctype_hdr ’copy’ locale_name EOL ctype_tlr
        ;
ctype_hdr       : ’LC_CTYPE’ EOL
        ;
ctype_body      : ’copy’ locale_name EOL
        | ctype_keywords
        ;
ctype_keywords : ctype_keywords ctype_keyword
        | ctype_keyword
        ;
ctype_keyword   : charclass_keyword charclass_list EOL
        | charconv_keyword charconv_list EOL
        ;
charclass_keyword       : ’upper’
        | ’lower’
        | ’alpha’
        | ’digit’
        | ’alnum’
        | ’xdigit’
        | ’space’
        | ’print’
        | ’graph’
        | ’blank’
        | ’cntrl’
        ;
charclass_list : charclass_list ’;’ char_symbol
        | charclass_list ’;’ ELLIPSIS ’;’ char_symbol
        | char_symbol
        ;
charconv_keyword        : ’toupper’
        | ’tolower’
        ;
charconv_list   : charconv_list ’;’ charconv_entry
        | charconv_entry
        ;
charconv_entry : ’(’ char_symbol ’,’ char_symbol ’)’
        ;
ctype_tlr       : ’END’ ’LC_CTYPE’ EOL
        ;
/*      The following is the LC_COLLATE category grammar */
lc_collate      : collate_hdr collate_keywords collate_tlr
        | collate_hdr ’copy’ locale_name EOL collate_tlr
        ;
collate_hdr     : ’LC_COLLATE’ EOL
        ;
collate_keywords        : order_statements
        | opt_statements order_statements
        ;
opt_statements : opt_statements collating_symbols
        | opt_statements collating_elements
        | collating_symbols
        | collating_elements
        ;
collating_symbols       : ’collating-symbol’ COLLSYMBOL EOL
        ;
collating_elements      : ’collating-element’ COLLELEMENT
         ’from’ ’"’ char_list ’"’ EOL
        ;
order_statements        : order_start collation_order order_end
        ;
order_start     : ’order_start’ EOL
        ;
collation_order : collation_order collation_entry
        | collation_entry
        ;
collation_entry : COLLSYMBOL EOL
        | collation_element weight_list EOL
        | collation_element             EOL
        ;
collation_element       : char_symbol
        | COLLELEMENT
        | ELLIPSIS
        | ’UNDEFINED’
        ;
weight_list     : weight_list ’;’ weight_symbol
        | weight_list ’;’
        | weight_symbol
        ;
weight_symbol   : char_symbol
        | COLLSYMBOL
        | ’"’ char_list ’"’
        | ELLIPSIS
        | ’IGNORE’
        ;
order_end       : ’order_end’ EOL
        ;
collate_tlr     : ’END’ ’LC_COLLATE’ EOL
        ;
/*      The following is the LC_MESSAGES category grammar */
lc_messages     : messages_hdr messages_keywords        messages_tlr
        | messages_hdr ’copy’ locale_name EOL messages_tlr
        ;
messages_hdr    : ’LC_MESSAGES’ EOL
        ;
messages_keywords       : messages_keywords messages_keyword
        | messages_keyword
        ;
messages_keyword        : ’yesexpr’ ’"’ EXTENDED_REG_EXP ’"’ EOL
        | ’noexpr’ ’"’ EXTENDED_REG_EXP ’"’ EOL
        ;
messages_tlr    : ’END’ ’LC_MESSAGES’ EOL
        ;
/*      The following is the LC_MONETARY category grammar */
lc_monetary     : monetary_hdr monetary_keywords        monetary_tlr
        | monetary_hdr ’copy’ locale_name EOL   monetary_tlr
        ;
monetary_hdr    : ’LC_MONETARY’ EOL
        ;
monetary_keywords       : monetary_keywords monetary_keyword
        | monetary_keyword
        ;
monetary_keyword        : mon_keyword_string mon_string EOL
        | mon_keyword_char NUMBER EOL
        | mon_keyword_char ’-1’ EOL
        | mon_keyword_grouping mon_group_list EOL
        ;
mon_keyword_string      : ’int_curr_symbol’
        | ’currency_symbol’
        | ’mon_decimal_point’
        | ’mon_thousands_sep’
        | ’positive_sign’
        | ’negative_sign’
        ;
mon_string      :’"’ char_list ’"’
        | ’""’
        ;
mon_keyword_char        : ’int_frac_digits’
        | ’frac_digits’
        | ’p_cs_precedes’
        | ’p_sep_by_space’
        | ’n_cs_precedes’
        | ’n_sep_by_space’
        | ’p_sign_posn’
        | ’n_sign_posn’
        ;
mon_keyword_grouping    : ’mon_grouping’
        ;
mon_group_list : NUMBER
        | mon_group_list ’;’ NUMBER
        ;
monetary_tlr    : ’END’ ’LC_MONETARY’ EOL
        ;
/*      The following is the LC_NUMERIC category grammar */
lc_numeric      : numeric_hdr numeric_keywords numeric_tlr
        | numeric_hdr ’copy’ locale_name EOL    numeric_tlr
        ;
numeric_hdr     : ’LC_NUMERIC’ EOL
        ;
numeric_keywords        : numeric_keywords numeric_keyword
        | numeric_keyword
        ;
numeric_keyword : num_keyword_string num_string EOL
        | num_keyword_grouping num_group_list EOL
        ;
num_keyword_string      : ’decimal_point’
        | ’thousands_sep’
        ;
num_string      : ’"’ char_list ’"’
        | ’""’
        ;
num_keyword_grouping    : ’grouping’
        ;
num_group_list : NUMBER
        | num_group_list ’;’ NUMBER
        ;
numeric_tlr     : ’END’ ’LC_NUMERIC’ EOL
        ;
/*      The following is the LC_TIME category grammar */
lc_time : time_hdr time_keywords        time_tlr
        | time_hdr ’copy’ locale_name EOL time_tlr
        ;
time_hdr        : ’LC_TIME’ EOL
        ;
time_keywords   : time_keywords time_keyword
        | time_keyword
        ;
time_keyword    : time_keyword_name time_list EOL
        | time_keyword_fmt time_string EOL
        | time_keyword_opt time_list EOL
        ;
time_keyword_name       : ’abday’
        | ’day’
        | ’abmon’
        | ’mon’
        ;
time_keyword_fmt        : ’d_t_fmt’
        | ’d_fmt’
        | ’t_fmt’
        | ’am_pm’
        | ’t_fmt_ampm’
        ;
time_keyword_opt        : ’era’
        | ’era_year’
        | ’era_d_fmt’
        | ’alt_digits’
        ;
time_list       : time_list ’;’ time_string
        | time_string
        ;
time_string     : ’"’ char_list ’"’
        ;
time_tlr        : ’END’ ’LC_TIME’ EOL
        ;

EXAMPLES

The following localedef script creates the locale.inf file for the american language using the ROMAN8 code set:

# language: american
# code set: ROMAN8
langname        "american"
langid          1
revision        "1.1"
escape_char     \
comment_char    ’#’
##################################################
# Set up the LC_CTYPE category of the table
LC_CTYPE
upper   ’A’...’Z’; \
        \xa1...\xa7;\xad;\xae;\xb1;\xb4;\xb6; \
        \xd0;\xd2;\xd3;\xd8;\xda...\xdc; \
        \xdf...\xe1;\xe3;\xe5...\xe9; \
        \xeb;\xed;\xee;\xf0
lower   ’a’...’z’; \
        \xb2;\xb5;\xb7;\xc0...\xcf;\xd1; \
        \xd4...\xd7;\xd9;\xdd;\xde;\xe2; \
        \xe4;\xea;\xec;\xef;\xf1
digit   ’0’...’9’
space   ’ ’;\x09...\x0d
punct   ’!’...’/’;’:’...’@’; \
        ’[’...’‘’;’{’...’~’; \
        \d168...\d172;\d175;\d176;\d179; \
        \d184...\d191;\d242...\d254
cntrl   \x00...\x1f;\x7f; \
        \200...\240;\377
blank   ’ ’;\t
xdigit  ’0’...’9’;’a’...’f’; \
        ’A’...’F’
# first and second are irrelevant here
# alpha, graph and print get default values
tolower ( ’A’,’a’ );( ’B’,’b’ );( ’C’,’c’ ); \
        ( ’D’,’d’ );( ’E’,’e’ );( ’F’,’f’ ); \
        ( ’G’,’g’ );( ’H’,’h’ ); \
        (\x49,\x69); (\x4a,\x6a); \
        (\113,\153); (\114,\154); \
        (\d77,\d109); (\d78,\d110); \
        ( ’O’,’o’ ); ( ’P’,’p’ ); \
        ( ’Q’,’q’ );( ’R’,’r’ ); ( ’S’,’s’ ); \
        ( ’T’,’t’ );( ’U’,’u’ );( ’V’,’v’ ); \
        ( ’W’,’w’ );( ’X’,’x’ );( ’Y’,’y’ ); \
        ( ’Z’,’z’ );(\xa1,\xc8);(\xa2,\xc0); \
        (\xa3,\xc9);(\xa4,\xc1);(\xa5,\xcd); \
        (\xa6,\xd1);(\xa7,\xdd);(\xad,\xcb); \
        (\xae,\xc3);(\xb1,\xb2);(\xb4,\xb5); \
        (\xb6,\xb7);(\xd0,\xd4);(\xd2,\xd6); \
        (\xd3,\xd7);(\xd8,\xcc);(\xda,\xce); \
        (\xdb,\xcf);(\xdc,\xc5);(\xdf,\xc2); \
        (\xe0,\xc4);(\xe1,\xe2);(\xe3,\xe4); \
        (\xe5,\xd5);(\xe6,\xd9);(\xe7,\xc6); \
        (\xe8,\xca);(\xe9,\xea);(\xeb,\xec); \
        (\xed,\xc7);(\xee,\xef);(\xf0,\xf1)
# toupper is the reverse of tolower
bytes_char       "1"
alt_punct        ""
code_scheme      ""
END LC_CTYPE
##################################################
# Set up the LC_COLLATE category of the table
# dictionary collating sequence:
# spaces, decimal digits,
# alphabetic characters, punctuation,
# control characters
LC_COLLATE
modifier      "nofold"
order_start
’ ’        ’ ’;’ ’
\xa0       \xa0; \xa0
’0’        ’0’;’0’
# ’1’ through ’8’ in numerical order
’9’        ’9’;’9’
# Equivalence class of ’A’ starts here
’A’        ’A’;’A’
# One-to-two, AE ligature
\xd3      ’A’;"AE"
\xe0      ’A’;\xe0
\xa1      ’A’;\xa1
\xa2      ’A’;\xa2
\xd8      ’A’;\xd8
\xd0      ’A’;\xd0
\xe1      ’A’;\xe1
# Equivalence class of ’A’ ends here
’B’        ’B’;’B’
’C’        ’C’;’C’
\xb4      ’C’;\xb4
’D’        ’D’;’D’
\xe3      ’D’;\xe3
’E’        ’E’;’E’
\xdc      ’E’;\xdc
\xa3      ’E’;\xa3
\xa4      ’E’;\xa4
\xa5      ’E’;\xa5
’F’        ’F’;’F’
’G’        ’G’;’G’
’H’        ’H’;’H’
’I’        ’I’;’I’
\xe5      ’I’;\xe5
\xe6      ’I’;\xe6
\xa6      ’I’;\xa6
\xa7      ’I’;\xa7
’J’        ’J’;’J’
’K’        ’K’;’K’
’L’        ’L’;’L’
’M’        ’M’;’M’
’N’        ’N’;’N’
\xb6      ’N’;\xb6
’O’        ’O’;’O’
\xe7      ’O’;\xe7
\xe8      ’O’;\xe8
\xdf      ’O’;\xdf
\xda      ’O’;\xda
\xe9      ’O’;\xe9
\xd2      ’O’;\xd2
’P’        ’P’;’P’
’Q’        ’Q’;’Q’
’R’        ’R’;’R’
# Remainder of LC_COLLATE omitted for space considerations
order_end
END LC_COLLATE
##################################################
# Set up the LC_MONETARY category of the table
LC_MONETARY
int_curr_symbol      "USD "
currency_symbol      "$"
mon_decimal_point    "."
mon_thousands_sep    ","
mon_grouping         3;0
positive_sign        ""
negative_sign        "-"
int_frac_digits      "2"
frac_digits          "2"
p_cs_precedes        "1"
p_sep_by_space       "0"
n_cs_precedes        "1"
n_sep_by_space       "0"
p_sign_posn          "1"
n_sign_posn          "1"
crncystr             "-US$"
END LC_MONETARY
##################################################
# Set up the LC_NUMERIC category of the table
LC_NUMERIC
grouping        3;0
thousands_sep   ","
decimal_point   "."
alt_digit      ""
END LC_NUMERIC
##################################################
# Set up the LC_TIME category of the table
LC_TIME
# date & time format string
d_t_fmt    "%a, %b %.1d, %Y %I:%M:%S %p"
# date format string
d_fmt      "%a, %b %.1d, %Y"
# time format string
t_fmt      "%I:%M:%S"
# 12-hr time format
t_fmt_ampm "%I:%M:%S %p"
# Days of week
day     "Sunday";"Monday";"Tuesday";"Wednesday";"Thursday"; \
"Friday";"Saturday"
# Weekday abbreviations
abday   "Sun";"Mon";"Tue";"Wed";"Thu";"Fri";"Sat"
# Month names
mon     "January";"February";"March";"April";"May";"June"; \
"July";"August";"September";"October";"November";"December"
# month abbreviations
abmon "Jan";"Feb";"Mar";"Apr";"May";"Jun";"Jul";"Aug"; \
"Sep";"Oct";"Nov";"Dec"
# AM, PM strings
am_pm "AM";"PM"
year_unit      ""
mon_unit       ""
day_unit       ""
hour_unit      ""
min_unit       ""
sec_unit       ""
# There is no era or emperor year for the "american" language,
# but here is an example of the "japanese" era_d_fmt and era specification:
# normal era format string
era_d_fmt "%N%onen"
era "+:2:1990/01/01:+*:Heisei"
# special fmt for 1st year
     "+:1:1989/01/08:1989/12/31:Heisei:%Ngannen"
     "+:2:1927/01/01:1989/01/07:Shouwa"
# special fmt for 1st year
     "+:1:1926/12/25:1926/12/31:Shouwa:%Ngannen"
     "+:2:1913/01/01:1926/12/24:Taishou"
# special fmt for 1st year
     "+:1:1912/07/30:1912/12/31:Taishou:%Ngannen"
     "+:2:1869/01/01:1912/07/29:Meiji"
# special fmt for 1st year
     "+:1:1868/09/08:1868/12/31:Meiji:%Ngannen"
# revert to regular year numbering
# for years prior to the supported eras
     "-:1868:1868/09/07:-*::%o"
END LC_TIME
##################################################
# Set up the LC_MESSAGES category of the table
LC_MESSAGES
# could be "[yY][eE][sS]"
yesexpr "yes"
noexpr "no"
END LC_MESSAGES
##################################################
# Set up the LC_ALL category of the table
LC_ALL
# left-to-right orientation
direction      ""
context        ""
END LC_ALL
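
The era strings in the LC_TIME category above are colon-separated records. The following Python sketch splits one such string into named fields. The field names (direction, offset, start date, end date, era name, era format) are an assumption based on the shape of the example strings and the usual POSIX era layout; parse_era and the Era type are hypothetical and not part of localedef. The open-ended dates "+*" and "-*" are kept as opaque strings.

```python
from typing import NamedTuple

class Era(NamedTuple):
    direction: str  # "+" or "-"
    offset: int     # year number assigned to the start of the era
    start: str      # start date, yyyy/mm/dd
    end: str        # end date, or "+*" / "-*" for an open-ended era
    name: str       # era name, may be empty
    fmt: str        # optional format string, empty when omitted

def parse_era(spec):
    """Split one era string into fields (assumed layout, see lead-in)."""
    # maxsplit=5 keeps any ':' inside the trailing format field intact.
    parts = spec.split(":", 5)
    if len(parts) < 5:
        raise ValueError(f"too few fields in era string: {spec!r}")
    if len(parts) == 5:
        parts.append("")  # the format field was omitted
    direction, offset, start, end, name, fmt = parts
    if direction not in ("+", "-"):
        raise ValueError(f"direction must be '+' or '-': {direction!r}")
    return Era(direction, int(offset), start, end, name, fmt)

for spec in ("+:2:1990/01/01:+*:Heisei",
             "+:1:1989/01/08:1989/12/31:Heisei:%Ngannen",
             "-:1868:1868/09/07:-*::%o"):
    print(parse_era(spec))
```

Read this way, each era in the example contributes two records: one for its first year, with a special format, and one for the years that follow.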

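
The script above omits toupper and notes only that it is the reverse of tolower. As an illustration of what that reversal means, the following Python sketch swaps the members of each pair. It is not a description of how localedef computes the default internally; reverse_case_pairs is a hypothetical helper, and the sample pairs are copied from the tolower list in the example.

```python
def reverse_case_pairs(tolower_pairs):
    """Swap each (upper, lower) pair to obtain (lower, upper) pairs."""
    toupper_pairs = []
    seen_lower = set()
    for upper, lower in tolower_pairs:
        # A lowercase code that appears twice would make the reverse ambiguous.
        if lower in seen_lower:
            raise ValueError(f"code {lower:#x} appears twice as a lowercase target")
        seen_lower.add(lower)
        toupper_pairs.append((lower, upper))
    return toupper_pairs

# A few pairs copied from the tolower list in the example (ROMAN8 codes).
tolower_pairs = [(0x41, 0x61), (0x49, 0x69), (0xA1, 0xC8), (0xF0, 0xF1)]
for lower, upper in reverse_case_pairs(tolower_pairs):
    # Print each pair in the script's hexadecimal constant notation.
    print(f"(\\x{lower:02x},\\x{upper:02x})")
```
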
Hewlett-Packard Company — HP-UX Release 9.0: August 1992
