⇒ euc(4) — mips UMIPS RISC/os 5.01




EUC(4)              RISC/os Reference Manual               EUC(4)



NAME
     EUC - Extended UNIX Code

DESCRIPTION
     Up to four code sets can be used concurrently at both the
     file and process level, enabling the use of languages which
     require characters outside the US ASCII range.  An external
     code set defines the set of characters that can be used, and
     many different external code sets are in use around the
     world.  Each external code set is mapped to an internal code
     set representation, which the RISC/os system then uses
     during processing.  The internal code set scheme is called
     the Extended UNIX Code, or EUC.

     EUC comprises a primary code set (code set 0), which is
     always assigned to the 7-bit US ASCII character set, and
     three supplementary code sets (code sets 1 through 3), which
     can be assigned to other character sets.

     The EUC code sets are distinguished by the value of the most
     significant bit (MSB) of each byte of the EUC representation
     and by single-shift characters. This combination defines the
     internal coding template for each of the four code sets. The
     MSB of a byte is the left-most bit in the standard
     representation of a byte (bit 7, mask 0x80).

     The representation of the single-byte primary code set has
     the MSB set to zero. The three supplementary code sets have
     the MSB of each byte set to one. Code sets 2 and 3 are
     further distinguished by single-shift character 2 (SS2) and
     single-shift character 3 (SS3), respectively. This coding
     scheme conforms to the International Standard ISO 2022.


          Code Set    EUC Representation
          Code Set 0  0xxxxxxx
          Code Set 1  1xxxxxxx [ 1xxxxxxx [...] ]
          Code Set 2  SS2 1xxxxxxx [ 1xxxxxxx [...] ]
          Code Set 3  SS3 1xxxxxxx [ 1xxxxxxx [...] ]
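
     The template above can be read as a rule for the first byte
     of a character.  The following is a minimal sketch in
     Python, not part of RISC/os; the name code_set_of is
     hypothetical, and the SS2 and SS3 values are those given in
     the text that follows the table.  It follows the template
     only and does not separate out control codes.

```python
SS2 = 0x8E  # single-shift character 2
SS3 = 0x8F  # single-shift character 3


def code_set_of(lead: int) -> int:
    """Return the EUC code set (0-3) selected by the first byte of a character."""
    if not 0 <= lead <= 0xFF:
        raise ValueError("not a byte value")
    if lead & 0x80 == 0:
        return 0          # MSB clear: primary code set (7-bit US ASCII)
    if lead == SS2:
        return 2          # SS2 prefix: code set 2
    if lead == SS3:
        return 3          # SS3 prefix: code set 3
    return 1              # MSB set, no single shift: code set 1
```

     For example, code_set_of(0x41) selects code set 0, and
     code_set_of(0x8E) selects code set 2.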

     A single-shift character is a single byte which indicates a
     temporary shift to code set 2 or 3 for the next character
     only.  SS2 is represented in hexadecimal by 8E, and SS3 by
     8F. The usage and definition of these shift codes conform to
     the International Standards ISO 2022 and ISO 6937/3.
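
     Because a single shift applies to one character only, a
     byte stream can be split into characters in a single pass.
     The sketch below is illustrative only.  The function name
     split_euc and the WIDTHS table are hypothetical: this page
     does not state how many bytes each supplementary code set
     uses, so the widths are assumed values chosen for the
     example.

```python
SS2, SS3 = 0x8E, 0x8F

# ASSUMPTION: bytes per character (not counting the single-shift byte)
# for code sets 1-3. Hypothetical values; the manual page does not fix them.
WIDTHS = {1: 2, 2: 1, 3: 2}


def split_euc(data: bytes, widths=WIDTHS):
    """Split an EUC byte string into (code_set, payload_bytes) pairs."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                       # MSB clear: code set 0, one byte
            cs, start, n = 0, i, 1
        elif b == SS2:                     # shift next character to code set 2
            cs, start, n = 2, i + 1, widths[2]
        elif b == SS3:                     # shift next character to code set 3
            cs, start, n = 3, i + 1, widths[3]
        else:                              # MSB set, no shift: code set 1
            cs, start, n = 1, i, widths[1]
        payload = data[start:start + n]
        # Supplementary characters must be complete and have the MSB set
        # on every byte.
        if len(payload) != n or (cs != 0 and any(c < 0x80 for c in payload)):
            raise ValueError(f"malformed code set {cs} character at offset {i}")
        out.append((cs, payload))
        i = start + n                      # the shift ends with this character
    return out
```

     The single-shift byte is consumed but not returned as part
     of the character, and a truncated or malformed character
     raises ValueError rather than being silently skipped.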

     In addition to the primary and supplementary code sets, the
     internal EUC representations also include the space and
     delete characters, two control character sets, and
     unassigned codes, shown as follows.

          Code                          EUC Representation
          Space                         00100000
          Delete                        01111111
          Control Character Set 0 (C0)  000xxxxx
          Control Character Set 1 (C1)  100xxxxx
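
     The bit patterns in this table translate directly into mask
     tests.  The following Python sketch is an illustration
     under that reading; the function name classify_byte and the
     label "other" (for any byte the table does not list) are
     hypothetical.

```python
def classify_byte(b: int) -> str:
    """Classify one byte against the space/delete/C0/C1 patterns."""
    if not 0 <= b <= 0xFF:
        raise ValueError("not a byte value")
    if b == 0x20:            # 00100000
        return "space"
    if b == 0x7F:            # 01111111
        return "delete"
    if b & 0xE0 == 0x00:     # 000xxxxx
        return "C0"
    if b & 0xE0 == 0x80:     # 100xxxxx
        return "C1"
    return "other"           # not covered by the table
```

     Under these patterns the single-shift characters 8E and 8F
     both match 100xxxxx, so classify_byte reports them as C1.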

SEE ALSO
     Internationalized RISC/os Guide.



                        Printed 11/19/92


