⇒ kbdcomp(1M) — mips UMIPS RISC/os 5.01


KBDCOMP(1M)         RISC/os Reference Manual          KBDCOMP(1M)



NAME
     kbdcomp - compile kbd tables

SYNOPSIS
     kbdcomp [-vrR] [-o outfile ] [ infile ]

DESCRIPTION
     kbdcomp compiles tables for use with the kbdstrm(7) STREAMS
     module, a programmable string-translation module. The module
     has two separate abilities, each of which may be used alone
     or in combination.

     The first ability, lookup, is that of performing simple sub-
     stitution of bytes in an input Stream.  This ability is
     based on a simple 256-entry lookup table (as there are 256
     possible bit combinations for a byte). As input is received,
     each byte is looked up in the translation table, and the
     table value for that byte is substituted in place of the
     original byte.  The process is quick, and can be performed
     on each STREAMS message with no message copying or duplica-
     tion.
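
     As a rough illustration of the lookup pass, the Python
     sketch below builds a 256-entry table and applies it to a
     buffer. The names build_lookup and lookup_pass are
     hypothetical and are not part of kbdstrm(7); the real module
     operates on STREAMS messages inside the kernel.

```python
def build_lookup(src: bytes, dst: bytes) -> bytes:
    """Return a 256-entry table in which src[i] maps to dst[i].

    Every byte value not named in src maps to itself.
    """
    if len(src) != len(dst):
        raise ValueError("src and dst must contain the same number of bytes")
    table = bytearray(range(256))
    for s, d in zip(src, dst):
        table[s] = d
    return bytes(table)


def lookup_pass(data: bytes, table: bytes) -> bytes:
    """Substitute every byte of data through the table."""
    return data.translate(table)


table = build_lookup(b"zyZY", b"yzYZ")   # swap z/y in both cases
print(lookup_pass(b"lazy Yak", table))   # b'layz Zak'
```

     Because the output has the same length as the input, the
     substitution can be done in place, which is why no message
     copying is needed.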

     The second ability, mapping, allows searching for
     occurrences of specified strings of bytes (or individual
     bytes) in an input Stream, and substituting other strings
     (or bytes) for them as they are recognized. There are three
     kinds of mapping that are differentiated by the relationship
     between the number of bytes in the input and the number of
     bytes in the output. One-many mapping means that for a given
     byte in the input, many bytes are substituted. Many-one
     mapping means that for many bytes in the input, one byte is
     substituted. Many-many mapping includes the other two types
     as special cases, but also covers substitution of many
     bytes in the input with many bytes of output. kbdstrm can
     perform all three types of mapping. The lookup ability
     described in the previous paragraph (i.e., what amounts to
     one-one mapping) is a common special case useful enough to
     be included separately. By using combinations of both lookup
     and mapping, a larger class of input translation and conver-
     sion problems can be solved than can be solved by the use of
     either alone.

     During operation, processing occurs in two major passes: the
     lookup table pass always precedes string mapping.  The
     string mapping procedure is nonrecursive for a given table
     and there is no feedback mechanism (that is, input is
     scanned in order as received and output is not re-scanned
     for occurrences of recognizable input strings). As an exam-
     ple of mapping, suppose one wishes to translate all
     occurrences of the string this in an input Stream into the
     string there. The module recognizes and buffers occurrences
     of the string th (as each byte is received); if the



                        Printed 11/19/92                   Page 1





     following character is i, it will also be buffered, but if x
     is then received, a mismatch is recognized and no transla-
     tion occurs. Assuming thi has been buffered, if the next
     character seen is s, a match is recognized, the buffer con-
     taining this is discarded, and the string there replaces it.
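
     The buffering described above can be sketched in Python for
     a single mapping. The function map_stream is hypothetical.
     Two details are assumptions of this sketch: on a mismatch it
     releases the first buffered byte and rescans the rest (the
     behavior this page describes further on), and at end of
     input it releases any partial match unchanged.

```python
def map_stream(data: bytes, src: bytes, dst: bytes) -> bytes:
    """Replace occurrences of src with dst, scanning byte by byte."""
    out = bytearray()
    buf = bytearray()               # bytes of a partial match
    for byte in data:
        buf.append(byte)
        # Drain the buffer until it is empty or is a prefix of src again.
        while buf:
            if bytes(buf) == src:
                out += dst          # full match: discard buffer, emit dst
                buf.clear()
            elif src.startswith(bytes(buf)):
                break               # partial match: keep buffering
            else:
                out.append(buf.pop(0))   # mismatch: release one byte, rescan
    out += buf                      # assumption: flush leftovers at end of input
    return bytes(out)


print(map_stream(b"this is thin", b"this", b"there"))   # b'there is thin'
print(map_stream(b"thix", b"this", b"there"))           # b'thix'
```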

     Both input and output strings can be of any non-zero length
     (see, however, the LIMITATIONS section below). Each string
     to be recognized and translated
     must be unique, and no complete input string may constitute
     the leading substring of any other (e.g., one may not define
     abc and ab simultaneously, but may so define abc, abd, and
     abxy).
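
     The uniqueness and leading-substring rules can be checked
     mechanically. The following Python sketch uses hypothetical
     helper names and is not how kbdcomp itself reports such
     errors; it only shows the rule applied to the examples
     above.

```python
from itertools import combinations


def find_prefix_conflicts(inputs):
    """Return (shorter, longer) pairs where one input string begins another."""
    unique = sorted(set(inputs))
    # In sorted order a string always precedes the strings it is a prefix of.
    return [(a, b) for a, b in combinations(unique, 2) if b.startswith(a)]


def find_duplicates(inputs):
    """Return input strings that are defined more than once."""
    return sorted({s for s in inputs if inputs.count(s) > 1})


print(find_prefix_conflicts(["abc", "ab"]))            # [('ab', 'abc')]
print(find_prefix_conflicts(["abc", "abd", "abxy"]))   # []
```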

     Given a filename (or stdin if no name is supplied), kbdcomp
     will compile tables into the output file specified by the -o
     option. If the -o option is not supplied, output is to the
     file kbd.out.

     The -v option causes parsing and verification - no output
     file is produced; if no error messages are printed, then the
     input file is syntactically correct. The -r option causes
     the compiler to check for and report on byte values that
     cannot be generated in a table (see the description below).
     The option -R is equivalent to -r but it tries to print
     printable characters as themselves rather than in octal for-
     mat.

   Input Language
     Source files for kbdcomp are a series of table declarations.
     Within each table declaration are a number of definitions
     and functions. A table declaration is one of the forms map
     or link:

          map type ( name ) { expressions }

          link ( string )

     The link form is described below. The name of a
     map must be a simple token not containing any colons, com-
     mas, quotes, or spaces. (For our purposes, a simple token is
     a sequence of alphabetic and/or numeric characters with no
     embedded punctuation, whitespace, or special symbols.)  The
     type field is an optional field that may be either of the
     keywords full or sparse.  If omitted, the type defaults to
     sparse.  The effect of this field is described in more
     detail below. The expressions contained in the map declara-
     tion are one of the following forms. Reserved keywords are
     printed in constant-width font, variables in italics:


          keylist ( string string )
          define ( word value  )
          word ( extension result )
          string ( word word )
          strlist ( string string )
          error ( string )
          timed

     The keylist form is for defining lookup table entries while
     the remaining forms are the separate string functions.

     The definition form (define) allows a mnemonic word (the
     first argument) to be associated with a string (the second
     argument). It is useful for replacing complicated sequences
     (e.g., those containing special symbols or control charac-
     ters) with mnemonic words to facilitate the design and rea-
     dability of tables.

     Using the word form (where word must be a previously defined
     sequence) in a manner similar to a C function call results
     in the value of word being concatenated with extension; when
     the combination is recognized, it is mapped to result. The
     value may be a string of characters or a single byte. The
     following is an illustration (not intended to be complete):


          map (some_accents) {
               define(acute '\047')
               define(grave '`' )
               acute(a '\341')     # same as string("\047a" "\341")
               grave(a '\340')
               # ...et cetera ...
               keylist("zyZY" "yzYZ")
          }

     This map defines the single quote and reverse quote keys as
     dead-keys which, when followed by a, produce a character
     from the ISO 8859-1 codeset. It is not necessary for the
     definition, extension, or result to be a single byte; they
     may be arbitrary strings.

     Strings in definitions and arguments may generally be
     entered either without quotation or between double quotes.
     Byte constants may likewise be entered unquoted or between
     single quotes. The only time quotation is strictly required
     is when the string contains parentheses, spaces, tab charac-
     ters, or other special symbols. The language makes no real
     distinction between byte constants and string constants:
     both are treated as null-terminated strings; the choice of
     whether to use a one-character string or a byte constant is
     thus a matter of taste. Most quoting conventions of C are
     recognized, except that octal constants must be exactly
     three digits long. Octal constants may be used in strings as
     well. In the example above, the arguments to keylist need
     not be quoted, as they contain no special symbols. The fol-
     lowing example illustrates some situations where strings
     must be quoted:


          string(abc "two words")     # literal space
          keylist("[{}]" "(())")      # brackets/parentheses
          define(esc_seq "\033\t(")   # tab and parenthesis
          define(space ' ')           # literal space
          string(abc "keylist")       # keyword used as argument

     Comments in files (inside or outside of map declarations)
     may be entered in the same manner as for sh(1); that is,
     after a # at the end of a line, or on a line beginning with
     #, as shown in the above examples.  The keylist form allows
     single bytes to be mapped to other single bytes; it defines
     actions that are treated in the lookup table (i.e., are per-
     formed before mapping). Any byte value that is not expli-
     citly changed by being included in a keylist form will, of
     course, be left unchanged; if no keylist forms appear in a
     map definition, then kbdcomp does not generate a lookup
     table for the map, and the lookup phase is skipped during
     module operation. Each byte in the first string argument to
     keylist is mapped to the byte at the same position in the
     second string argument. That is, given two strings X and Y
     as arguments: X1 maps to Y1; Xj maps to Yj and so forth. The
     two arguments must evaluate to strings containing the same
     number of bytes.

     The string form has a function similar to mnemonic forms
     defined with define and may be used for any type of
     many-many mapping. The first argument to string is mapped to
     the second argument (see the comment in the sample map
     above).

     Mappings using both keylist and string or any define forms
     may be combined: if i is mapped to a with a keylist form,
     and a is used in the sequence `a, then when the user types
     `i, the sequence `a is seen by the string mapping process
     (because lookup is done first) and translated accordingly.

     The keylist form is intended mainly for use in simple
     keyboard rearrangement and case-conversion applications;
     string is for one-many mapping or for isolated instances of
     many-many mapping; the define form and words defined with it
     are intended for more general use in groups of related
     sequences.

     In some situations while a one-one mapping with keylist may
     be an obvious choice, the same effect may be achieved with
     string forms to avoid having a contradictory mapping. For
     example, suppose one desires, simultaneously, to translate x
     into y and y into abc. If x is mapped to y via a keylist
     form and y is mapped to abc via a string form, then it may
     be impossible to obtain y itself (unless defined
     in another sequence), even though that was not the intention
     - the intention was to obtain y whenever the user enters x.
     This is a contradictory mapping:


          keylist(x y)
          string(y abc)  # "y" itself cannot be generated

     There are cases where the intention is that y not be gen-
     erated, but most often the intention is to generate it.
     This problem (a relatively common one in codeset mapping)
     can be "solved" by using a string form to map x to y ini-
     tially rather than using a keylist form. This allows both y
     and abc to be generated:


          string(x y)
          string(y abc)
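
     The difference between the two versions can be seen by
     simulating both passes in Python. This is only a sketch with
     hypothetical function names; string_pass is deliberately
     simplified and is valid only because every input string in
     this example is a single byte.

```python
def lookup_pass(data: bytes, src: bytes, dst: bytes) -> bytes:
    """keylist-style pass: byte-for-byte substitution, done first."""
    return data.translate(bytes.maketrans(src, dst))


def string_pass(data: bytes, rules: dict) -> bytes:
    """string-style pass for single-byte inputs; output is not rescanned."""
    return b"".join(rules.get(bytes([b]), bytes([b])) for b in data)


# keylist(x y) followed by string(y abc): every y becomes abc.
contradictory = string_pass(lookup_pass(b"xy", b"x", b"y"), {b"y": b"abc"})
print(contradictory)   # b'abcabc'

# string(x y) and string(y abc): x yields y, y yields abc.
fixed = string_pass(b"xy", {b"x": b"y", b"y": b"abc"})
print(fixed)           # b'yabc'
```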

     Entering a large number of one-one mappings with string can
     be somewhat tedious. To make things easier, the strlist form
     is provided. The two string arguments to strlist are inter-
     preted in the same manner as arguments to keylist (i.e.,
     they are one-one mappings), except that they are not done by
     the lookup table, but are processed as string mappings. In
     the following example, the first three string definitions
     can be reduced to the strlist form which follows:


          string(a b)
          string(c d)
          string(e f)

          strlist(ace bdf)

     It is important to recognize the difference between string
     and strlist:  with string, the two arguments are a single
     mapping definition (which may be of any type) whereas with
     strlist, one or more one-one string mappings are defined
     simultaneously. A set of mappings defined with a combination
     of string and strlist does not exhibit the same type of
     incompatibility described above between keylist and string.

     Some further aspects of module processing can now be
     presented. When a partial match in an input sequence is
     detected during string processing, it is buffered; if at
     some point the match no longer succeeds, the first byte of
     the matched buffer is normally sent to the neighboring
     module.  The rest of the input is left in the buffer and
     scanned again to see if it matches the beginning of another
     sequence. The error entry allows one to send a string (or
     byte) constant (called a fallback character) instead of the
     byte that began the previous sequence; this is particularly
     useful in codeset mapping and conversion applications where
     the character which failed to be translated might be one
     which does not occur or has some other meaning in the target
     codeset. The following (somewhat contrived) example illus-
     trates use of the error form:


          # turn arrow keys into vi commands
          map (vi_map) {
               string("\033[A" k) # up
               string("\033[B" j) # down
               error("!")
          }

     Given input of the escape character followed by [A or [B, a
     single character (k or j, respectively) is generated. If
     presented with
     the sequence escape-[Q, the module will produce the sequence
     ![Q. The error string ! replaces escape because the sequence
     failed to match when Q was received. The remaining charac-
     ters are re-scanned, and neither [ nor Q is found to begin a
     recognized sequence.
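
     The mismatch handling and the effect of the error form can
     be sketched in Python as follows. The function string_pass
     and its fallback parameter are hypothetical. As an
     assumption of this sketch, a partial match still buffered at
     end of input is released unchanged.

```python
from collections import deque


def string_pass(data: bytes, rules: dict, fallback=None) -> bytes:
    """Apply string mappings; fallback plays the role of the error string."""
    out = bytearray()
    buf = b""                       # partial match in progress
    pending = deque(data)           # byte values still to be scanned
    while pending:
        cand = buf + bytes([pending.popleft()])
        if cand in rules:
            out += rules[cand]      # complete match
            buf = b""
        elif any(src.startswith(cand) for src in rules):
            buf = cand              # still a partial match
        elif not buf:
            out += cand             # byte does not begin any sequence
        else:
            # A started sequence failed: send the fallback (or the first
            # buffered byte), then rescan everything after that byte.
            out += fallback if fallback is not None else buf[:1]
            pending.extendleft(reversed(cand[1:]))
            buf = b""
    out += buf
    return bytes(out)


vi_map = {b"\x1b[A": b"k", b"\x1b[B": b"j"}
print(string_pass(b"\x1b[A\x1b[B", vi_map, b"!"))   # b'kj'
print(string_pass(b"\x1b[Q", vi_map, b"!"))         # b'![Q'
```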

     One-one mapping with strings or other defined forms (rather
     than via a keylist lookup table) is generally performed with
     a linear search operation when looking for bytes which begin
     sequences. However, if the table is specified as a full
     table, it is initially indexed rather than searched
     linearly, and thus processed much more quickly when there
     are a large number of entries. This should be kept in mind
     in codeset mapping applications where nearly all characters
     are mapped, and many (or most) are one-one mappings. If only
     a very few characters are mapped with string functions, one
     must decide on whether to trade a small gain in processing
     speed for the space needed to store the index if a table is
     made full.
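
     The trade-off can be pictured with the Python sketch below.
     It is illustrative only: the internal layout of compiled
     tables is not documented on this page, and the names
     find_linear and build_index are hypothetical. The sparse
     style scans every rule for each incoming byte, while the
     full style spends 256 index slots to go straight to the
     candidates.

```python
def find_linear(rules, first_byte):
    """Sparse style: scan all rules for inputs starting with first_byte."""
    return [src for src, _ in rules if src[0] == first_byte]


def build_index(rules):
    """Full style: one bucket per possible first byte (256 in total)."""
    index = [[] for _ in range(256)]
    for src, _ in rules:
        index[src[0]].append(src)
    return index


rules = [(b"\x1b[A", b"k"), (b"\x1b[B", b"j"), (b"x", b"y")]
index = build_index(rules)

# Both approaches find the same candidates; only the cost differs.
print(find_linear(rules, 0x1B) == index[0x1B])   # True
```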

     The link form is used to produce a composite table.  A com-
     posite table is really a form of linkage that allows several
     tables to be used together in sequence as if the sequence
     were a single table. The string argument to link is of the
     following form:

          composite:component1,component2,componentn

     The target composite name is followed by a colon, and the
     ordered component list is comma-separated. If the string
     argument contains spaces or special characters, it must be
     quoted. (This string is not interpreted by kbdcomp, but is
     left intact in the output file; it is interpreted by the
     module at run time.)  When a composite table is used, the
     effect is similar to pushing more than one instance of the
     kbdstrm(7) module in the sense that the component tables
     function sequentially but it is accomplished within a single
     instance of the module. As output is produced by processing
     with one table in the composite, the data is subsequently
     processed by the next component and so forth until the final
     result emerges at the end of the sequence. (There is no res-
     triction on the use of any combination of full and sparse
     tables in a composite.)
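
     A small Python sketch of the idea: split a link string into
     its composite name and component list, then feed the output
     of each component to the next. The names parse_link and
     run_composite are hypothetical, and the two toy tables stand
     in for real compiled maps.

```python
def parse_link(spec: str):
    """Split 'composite:comp1,comp2,...' into (name, [components])."""
    name, colon, rest = spec.partition(":")
    components = rest.split(",") if rest else []
    if not colon or not name or not components or not all(components):
        raise ValueError("expected composite:component1,component2,...")
    return name, components


def run_composite(data: bytes, components, tables) -> bytes:
    """Apply each component table in order, left to right."""
    for component in components:
        data = tables[component](data)   # KeyError if a table is not loaded
    return data


# Toy stand-ins for loaded tables.
tables = {
    "a2b": lambda data: data.replace(b"a", b"b"),
    "b2c": lambda data: data.replace(b"b", b"c"),
}
name, components = parse_link("demo:a2b,b2c")
print(name, run_composite(b"a", components, tables))   # demo b'c'
```

     Reversing the component order gives b'b' instead, which is
     why the order of the list matters.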

     Composite tables are useful for simplifying complex mapping
     situations by modularizing the processing and for increasing
     the re-usability of tables or different mapping applica-
     tions. Tables primarily implementing codeset mappings may be
     linked to other tables primarily implementing compose- or
     dead-key sequences. With a single table implementing a com-
     mon codeset mapping, several different tables implementing
     combinations of codeset mapping and compose-key layouts may
     be built. A typical configuration might use one table for
     mapping from an external to internal codeset, then use one or
     more separate tables working in the internal codeset to pro-
     vide compose- or dead-key functionality, as in the following
     example. One table, 646Sp-8859, maps from an ISO 646 variant
     (Spanish) external codeset to ISO 8859-1; this is combined
     with two other tables respectively implementing 8859-1 by
     compose sequences, and by dead-key sequences:


          link("composed:646Sp-8859,8859-1-cmp")
          link("deadkey:646Sp-8859,8859-1-dk")

     Composite tables can also be built while the module is run-
     ning from the kbdload(1M) command line; details are in the
     kbdload(1M) manual page. The component tables are linked and
     processed in the given order (left-to-right).  Because the
     link argument is actually parsed at runtime (by kbdstrm(7)),
     it is not an error to refer to tables that are not contained
     in the file currently being compiled.  An error will be gen-
     erated when the file is loaded if any component of a link is
     not present in memory at that time.

     The directive timed may appear any place within a map
     declaration. If used, it causes the table within which it is
     defined to be interpreted in timeout mode. In this mode,
     string mappings are considered to not match if more than a
     certain amount of time elapses after receipt of the first
     byte of a sequence without its being fully received
     and mapped. Given a timed map in which abc is to be mapped
     to xyz and the timeout value is 30, if the user types ab,
     then waits for longer than 30 time units before typing c,
     the entire sequence will not be translated. In this case the
     sequence is treated as any other mismatch would be: a is
     passed to the neighboring module, and b is checked to see if
     it begins a sequence. The timer is reset when a mismatch
     occurs, so that if bc is defined in this situation and c has
     just been received, it will be mapped as expected. The
     default timeout is typically 1/5 to 1/3 of a second (see
     kbdstrm(7) for details).
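
     The abc example can be simulated in Python with explicit
     arrival times. This is a sketch under stated assumptions:
     timed_pass is a hypothetical name, times are abstract units,
     and expiry is only checked when the next byte arrives,
     whereas a real module would presumably act when its timer
     fires.

```python
from collections import deque


def timed_pass(events, rules, timeout):
    """events is a list of (byte_value, arrival_time) pairs."""
    out = bytearray()
    buf = b""            # partial match in progress
    start = None         # arrival time of the first byte in buf
    pending = deque(events)

    def fail(byte, now):
        # Mismatch: pass the first buffered byte on, then rescan the rest
        # (and the current byte) with the timer reset to `now`.
        nonlocal buf, start
        out.append(buf[0])
        pending.appendleft((byte, now))
        for value in reversed(buf[1:]):
            pending.appendleft((value, now))
        buf, start = b"", None

    while pending:
        byte, now = pending.popleft()
        if buf and now - start > timeout:
            fail(byte, now)              # sequence took too long
            continue
        cand = buf + bytes([byte])
        if cand in rules:
            out.extend(rules[cand])
            buf, start = b"", None
        elif any(src.startswith(cand) for src in rules):
            if not buf:
                start = now              # first byte of a new sequence
            buf = cand
        elif buf:
            fail(byte, now)
        else:
            out.append(byte)
    out.extend(buf)
    return bytes(out)


rules = {b"abc": b"xyz", b"bc": b"Q"}
fast = [(ord("a"), 0), (ord("b"), 10), (ord("c"), 20)]
slow = [(ord("a"), 0), (ord("b"), 10), (ord("c"), 50)]
print(timed_pass(fast, rules, 30))   # b'xyz'
print(timed_pass(slow, rules, 30))   # b'aQ'
```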

     Timeout mode is generally useful in situations where termi-
     nal function keys are being interpreted, to distinguish
     between a string typed by the user and a function key string
     sent by the terminal; it is not intended for use with
     "batch" applications such as kbdpipe(1M).  In a composite
     table, some components may be timed and some not, making the
     mode useful for combinations of codeset mapping and function
     key mapping.

     Timing depends on several factors, including terminal baud-
     rate, system load, and the user's typing speed. If the
     timeout value is too long, then typed sequences that happen
     to be the same as function keys will be erroneously mapped;
     if the value is too short, then function keys may be missed
     under a heavy system load or with low speed devices. See
     kbdset(1) for information on how to change the timeout
     value, and kbdstrm(7) for information on how an administra-
     tor may change the default timeout value. This directive
     should never be used in tables that implement codeset map-
     ping, as it makes the results quite unpredictable. Long
     timeouts, on the order of seconds, may be useful in some
     contexts.

   Building and Debugging
     Users who intend to build their own tables may study the
     source tables supplied with the distribution in the files
     /usr/lib/kbd/*.map.

     If characters other than alpha-numerics are to be used,
     quoted strings are preferred to unquoted strings; quotation
     is required for some characters, as mentioned above. Map
     names and the first arguments of define should be alpha-
     numeric tokens.

     The report generated by the -r option may be useful for
     debugging complex tables. The report (produced on stderr)
     consists of two octal lists. One list contains byte values
     that cannot be generated from the lookup table (if keylist
     forms are used). The other list contains byte values that
     cannot be generated in any way; in other words, values that
     are neither parts of ``result text'' (i.e., products of
     string mappings) nor generated by the lookup table (if there
     is one), but that are used in other sequences. The report
     does not exhaustively list unreachable paths, but may indi-
     cate whether they exist and help pinpoint them.
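
     The first of the two lists can be approximated in Python as
     the set of byte values that never appear as an entry of the
     lookup table. This is a sketch only: the function names are
     hypothetical, and the exact layout of the kbdcomp report is
     not shown on this page.

```python
def missing_from_lookup(src: bytes, dst: bytes):
    """Byte values that the lookup table built from keylist(src dst)
    can never produce."""
    table = bytearray(range(256))
    for s, d in zip(src, dst):
        table[s] = d
    produced = set(table)
    return [value for value in range(256) if value not in produced]


def as_octal(values):
    """Format byte values as three-digit octal numbers."""
    return " ".join(f"{value:03o}" for value in values)


# keylist(x y): x is replaced by y, so x itself can no longer be produced.
print(as_octal(missing_from_lookup(b"x", b"y")))   # 170
```

     A table that only swaps values, such as the zyZY example
     earlier, yields an empty list because every byte value is
     still reachable.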




   Output Files
     The files produced by kbdcomp begin with a header. The magic
     string is kbd!map, with a version number. This header is
     immediately followed by the tables themselves. (A file can
     contain more than one table.) The lines below can be added
     to the /etc/magic file for the file(1) command to recognize
     kbdstrm files.


          0    string         kbd!map   kbd map file
          >8   byte >0        Ver %d:
          >10  short          >0        with %d table(s)

LIMITATIONS
     A maximum length of 128 bytes for input strings and 256
     bytes for output strings is imposed. The total amount of
     space consumed by a single table is limited to around 65,000
     bytes. Versions are strictly incompatible; "object" tables
     are machine-dependent in their byte order and structure
     size. Thus, while source files are portable, the output of
     kbdcomp is not. This implies that when using remote devices
     across a network between heterogeneous machines, tables must
     be loaded on the machine where the module is actually pushed
     (i.e., the remote site).
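
     The byte-order dependence shows up as early as the file
     header. The Python sketch below reads the fields named by
     the magic entries under Output Files (magic string at offset
     0, version byte at offset 8, table count as a short at
     offset 10). The function parse_kbd_header is hypothetical,
     the sample header is synthetic, and the contents of bytes 7
     and 9 are assumed, since this page does not describe them.

```python
import struct

MAGIC = b"kbd!map"


def parse_kbd_header(header: bytes, byteorder: str = ">"):
    """Return (version, table_count) from the first 12 bytes of a file.

    byteorder is a struct prefix: ">" for big-endian, "<" for
    little-endian. The default is an assumption, not a documented fact.
    """
    if len(header) < 12:
        raise ValueError("header too short")
    if not header.startswith(MAGIC):
        raise ValueError("not a kbd map file")
    version = header[8]
    (tables,) = struct.unpack_from(byteorder + "h", header, 10)
    return version, tables


# Synthetic header: version 1, two tables, count stored big-endian.
sample = MAGIC + b"\0" + bytes([1, 0]) + struct.pack(">h", 2)
print(parse_kbd_header(sample, ">"))   # (1, 2)
print(parse_kbd_header(sample, "<"))   # (1, 512)
```

     Read with the wrong byte order, the same header reports 512
     tables, which illustrates why compiled tables cannot be
     moved between dissimilar machines.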

FILES
     /usr/lib/kbd             directory containing system stan-
                              dard map files.
     /usr/lib/kbd/*.map       sources for kbd files.

SEE ALSO
     kbdload(1M), kbdset(1), kbdstrm(7)