keuctoibmj(3K)                   3/11/92                     keuctoibmj(3K)

NAME
     keuctoibmj - EUC to IBM host code kanji converter

SYNOPSIS
     cc [flag...] file... -lkanji [library...]

     #include <mlx-j/kanji.h>

     int keuctoibmj(kjbuf *bp, int exp);

DESCRIPTION
     keuctoibmj  converts  EUC  encoded  kanji  text  into   IBM   Japanese
     (Katakana) Kanji Mixed Host Code (IBM CCSIDs 13218 and 09122).

     The exp argument must be set to KJEXP or KJNOEXP. KJEXP  indicates
     that   half-size  katakana  should  be  expanded  to  their  full-size
     equivalents; KJNOEXP causes half-size katakana to  be  preserved  in
     the output.

     bp  is  a  pointer  to  a  structure  of  type  kjbuf,   defined   in
     <mlx-j/kanji.h> as follows:

          typedef struct {
                  unsigned char   *kjin;
                  size_t          kjisz;
                  size_t          kjicnt;
                  unsigned char   *kjout;
                  size_t          kjosz;
                  size_t          kjocnt;
                  int             kjshift;
                  int             kjeof;
          } kjbuf;

     kjin and kjout are pointers to the input and output buffers,
     respectively. The input buffer must be at least KJMINIBUF (4) bytes
     and the output buffer at least KJMINOBUF (9) bytes. For efficiency,
     however, both buffers should be considerably larger, for example
     BUFSIZ bytes. The input and output buffers must not overlap in
     memory.

     kjisz indicates the number of bytes of input present  in  kjin,  and
     kjosz must be set to the size of the output buffer.

     When keuctoibmj returns,  kjicnt  is  set  to  the  number  of  bytes
     consumed  from kjin, and kjocnt is set to the number of bytes placed
     into kjout.

     kjshift is used to keep track of the shift state. It must be set to
     0 for the initial call and must not be changed between invocations
     of keuctoibmj.
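
     As an illustration of the setup rules above, the following sketch
     prepares a kjbuf before the first call. The helper name kjbuf_setup
     and the macros MIN_IN and MIN_OUT are hypothetical and not part of
     libkanji. The structure is declared locally only so that the
     fragment compiles on its own; real code would include
     <mlx-j/kanji.h> and use KJMINIBUF and KJMINOBUF.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the structure documented above (assumption: real code
 * gets it from <mlx-j/kanji.h>). */
typedef struct {
    unsigned char *kjin;
    size_t         kjisz;
    size_t         kjicnt;
    unsigned char *kjout;
    size_t         kjosz;
    size_t         kjocnt;
    int            kjshift;
    int            kjeof;
} kjbuf;

#define MIN_IN  4   /* stands in for KJMINIBUF, per the text above */
#define MIN_OUT 9   /* stands in for KJMINOBUF, per the text above */

/* Hypothetical helper: returns 0 on success, -1 if a buffer is too
 * small or the two buffers overlap. */
static int kjbuf_setup(kjbuf *bp, unsigned char *in, size_t insz,
                       unsigned char *out, size_t outsz)
{
    uintptr_t i = (uintptr_t)in, o = (uintptr_t)out;

    if (insz < MIN_IN || outsz < MIN_OUT)
        return -1;
    if (i < o + outsz && o < i + insz)
        return -1;                  /* buffers overlap */

    memset(bp, 0, sizeof *bp);      /* kjshift = 0 for the initial call */
    bp->kjin  = in;
    bp->kjout = out;
    bp->kjosz = outsz;              /* capacity of the output buffer */
    return 0;
}
```

     Before each call the caller still has to set kjisz to the number of
     input bytes actually placed in the input buffer.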

     The kjeof field is used to handle partial characters at  the  end  of
     the  input  buffer.  For  example, if the first byte of a 2-byte kanji
     character is the last byte in the input buffer,  keuctoibmj  does  not
     immediately  convert  that  byte, since it cannot yet decide how to do
     the conversion. If kjeof is 0, the call returns with kjicnt one less
     than kjisz. The caller is responsible for moving the unconverted byte
     to the front of the input buffer, refilling the remaining  space  with
     more  input,  and  calling  keuctoibmj  again.  The  first byte of the
     character is now at the front of the input buffer and the character is
     converted correctly by the second call.

     However, if a partial character is present at EOF, this approach  does
     not  work,  since no more bytes are available to do the conversion. In
     this case, one more call must be made to keuctoibmj, with  kjeof  set
     to  a non-zero value. This forces keuctoibmj to convert whatever input
     is left. Note that  this  is  particularly  important  when  expanding
     half-size  katakana.  Two  adjacent  half-size katakana can combine to
     form a single full-size character, but the same single characters  can
     translate to two separate full-size ones, depending on the context. If
     such a half-size katakana is found at EOF, the final call with  kjeof
     set to a non-zero value ensures the correct conversion.
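
     The refill-and-retry procedure and the final call at EOF described
     above might be arranged as in the following sketch. It is not taken
     from libkanji: convert_all, kjconv_fn and fake_conv are hypothetical
     names, the 64-byte buffers are arbitrary, and the structure is
     declared locally so that the fragment compiles on its own.
     fake_conv is a stand-in with the same signature as keuctoibmj; it
     performs no code conversion and exists only so that the loop can be
     exercised without the library. In real code keuctoibmj itself would
     be passed, with exp set to KJEXP or KJNOEXP.

```c
#include <stddef.h>
#include <string.h>

/* Local copy of the documented structure (assumption: normally from
 * <mlx-j/kanji.h>). */
typedef struct {
    unsigned char *kjin;
    size_t         kjisz;
    size_t         kjicnt;
    unsigned char *kjout;
    size_t         kjosz;
    size_t         kjocnt;
    int            kjshift;
    int            kjeof;
} kjbuf;

typedef int (*kjconv_fn)(kjbuf *bp, int exp);  /* shape of keuctoibmj */

/* Stand-in converter: copies bytes through unchanged, but leaves a
 * trailing lone lead byte (>= 0x80) unconsumed unless kjeof is set. */
static int fake_conv(kjbuf *bp, int exp)
{
    size_t i = 0, o = 0;

    (void)exp;
    while (i < bp->kjisz) {
        size_t len = (bp->kjin[i] >= 0x80) ? 2 : 1;
        if (i + len > bp->kjisz) {
            if (!bp->kjeof)
                break;              /* wait for the second byte */
            len = bp->kjisz - i;    /* EOF: force out what is left */
        }
        if (o + len > bp->kjosz)
            break;                  /* output buffer full */
        memcpy(bp->kjout + o, bp->kjin + i, len);
        i += len;
        o += len;
    }
    bp->kjicnt = i;
    bp->kjocnt = o;
    return 0;
}

/* Feed src through conv in chunks and append the result to dst.
 * Returns the number of bytes written, or -1 if dst is too small or
 * the converter makes no progress. */
static long convert_all(kjconv_fn conv, int exp,
                        const unsigned char *src, size_t srclen,
                        unsigned char *dst, size_t dstcap)
{
    unsigned char ibuf[64], obuf[64];
    kjbuf b;
    size_t held = 0, total = 0;

    memset(&b, 0, sizeof b);        /* kjshift = 0 for the first call */
    b.kjin = ibuf;
    b.kjout = obuf;
    b.kjosz = sizeof obuf;

    for (;;) {
        size_t room = sizeof ibuf - held;
        size_t n = srclen < room ? srclen : room;
        int eof = (srclen == 0);    /* nothing left to read */

        memcpy(ibuf + held, src, n);    /* refill behind carried bytes */
        src += n;
        srclen -= n;

        b.kjisz = held + n;
        b.kjeof = eof;
        (void)conv(&b, exp);        /* non-zero codes are non-fatal */

        if (b.kjicnt > b.kjisz || b.kjocnt > dstcap - total)
            return -1;
        memcpy(dst + total, obuf, b.kjocnt);
        total += b.kjocnt;

        held = b.kjisz - b.kjicnt;  /* unconverted tail, if any */
        memmove(ibuf, ibuf + b.kjicnt, held);

        if (eof && b.kjocnt < b.kjosz)
            break;                  /* final call done, output not full */
        if (!eof && n == 0 && b.kjicnt == 0 && b.kjocnt == 0)
            return -1;              /* full buffer but no progress */
    }
    return (long)total;
}
```

     The loop keeps calling with kjeof set while the output buffer comes
     back full, which is one reading of the requirement in NOTES.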

RETURN VALUE
     Successful conversion returns a value of  0.  Otherwise,  one  of  the
     error  codes  below  is  returned. Note that all error values indicate
     non-fatal conditions, that is, conversion does not stop when an  error
     condition is detected.

     KJERROR
          An invalid byte or byte  sequence  was  detected  in  the  input.
          Typically,  this  happens  when  a  byte which introduces a kanji
          character is not followed by a byte sequence to validly  complete
          it, or when a partial character is found at EOF.

     KJGAIJI
          An EUC codeset 3 (gaiji) character was found in the input.  Gaiji
          characters are not part of IBM host code and are skipped.

     KJESC
          When converting from EUC to IBM host code, it is possible that  a
          shift-in or shift-out byte is present in the ASCII portion of the
          input. These bytes cannot be passed through to the  output  since
          they  would  change  the  shift state. Instead, such bytes in the
          input are skipped and the return value is set to KJESC.

     KJNOMAP
          An EUC kanji character which does not exist in IBM host code  was
          present in the input. Such characters are skipped.
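
     Because every error code is non-fatal and output is still produced,
     a caller may prefer to keep writing the converted data and merely
     record what was reported. The following bookkeeping sketch shows
     one way to do that; kj_tally and kj_tally_note are hypothetical
     names, not part of libkanji, and the sketch makes no assumption
     about the numeric values of the error codes.

```c
#include <stddef.h>

/* Hypothetical bookkeeping for the return values of successive calls. */
typedef struct {
    int    first_code;  /* first non-zero return value seen, 0 if none */
    size_t flagged;     /* number of calls that returned non-zero */
} kj_tally;

/* Record the return value of one conversion call. */
static void kj_tally_note(kj_tally *t, int rc)
{
    if (rc == 0)
        return;                 /* successful conversion */
    if (t->first_code == 0)
        t->first_code = rc;     /* remember only the first problem */
    t->flagged++;
}
```

     A caller would pass each return value of keuctoibmj to
     kj_tally_note and inspect the tally once the whole input has been
     converted.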

NOTES
     The function forces a shift-out byte at the end of the output if the
     input ended with a kanji character. Consequently, if keuctoibmj was
     called with kjeof set and kjocnt indicates that the output buffer
     was filled to capacity, the function must be called one more time,
     with kjeof set and a kjisz value of 0, to get the final shift-out
     byte.
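
     The extra call described above could be isolated in a small helper,
     as sketched below. fetch_final_shift, kjconv_fn and fake_conv are
     hypothetical names and not part of libkanji. The test for a full
     buffer (kjocnt equal to kjosz) is an interpretation of the text
     above, and fake_conv is only a stand-in that emits a placeholder
     byte so that the helper can be tried without the library.

```c
#include <stddef.h>

/* Local copy of the documented structure (assumption: normally from
 * <mlx-j/kanji.h>). */
typedef struct {
    unsigned char *kjin;
    size_t         kjisz;
    size_t         kjicnt;
    unsigned char *kjout;
    size_t         kjosz;
    size_t         kjocnt;
    int            kjshift;
    int            kjeof;
} kjbuf;

typedef int (*kjconv_fn)(kjbuf *bp, int exp);  /* shape of keuctoibmj */

/* Call after a conversion made with kjeof set, once the caller has
 * written out the contents of kjout. Returns the number of additional
 * bytes now in kjout (0 if no extra call was needed). */
static size_t fetch_final_shift(kjconv_fn conv, kjbuf *bp, int exp)
{
    if (!bp->kjeof || bp->kjocnt < bp->kjosz)
        return 0;           /* output was not full: nothing pending */
    bp->kjisz = 0;          /* no further input, kjeof stays set */
    (void)conv(bp, exp);
    return bp->kjocnt;
}

/* Stand-in converter: emits one placeholder byte when called with
 * empty input at EOF, and nothing otherwise. */
static int fake_conv(kjbuf *bp, int exp)
{
    (void)exp;
    bp->kjicnt = 0;
    bp->kjocnt = 0;
    if (bp->kjeof && bp->kjisz == 0 && bp->kjosz > 0)
        bp->kjout[bp->kjocnt++] = '#';
    return 0;
}
```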

     Invalid byte sequences in the input are copied through to  the  output
     unchanged (in shifted-out mode).

BUGS
     If more than one error condition is detected during  conversion  of  a
     single  input  buffer,  the  return  code  always  indicates the first
     problem that was found.

     Invalid byte sequences in the  input  may  cause  keuctoibmj  to  lose
     synchronization  with kanji character boundaries. If this happens, all
     converted output following the error is likely to be garbage.

FILES
     /usr/lib/libkanji.a

SEE ALSO
     fkeuctoibmj(3K), seuctoibmj(3K), ceuctoibmj(3K), kibmjtoeuc(3K),
     kconv(3K).

Page 3                       Reliant UNIX 5.44                      3, 1911
