⇒ fkcode(3K) — Reliant UNIX 5.44c4

fkcode(3K)                        2/4/92                         fkcode(3K)

NAME
     fkcode - kanji code detector

SYNOPSIS
     cc [flag...] file... -lkanji [library...]

     #include <mlx-j/kanji.h>

     int fkcode(FILE *stream, FILE *save, unsigned limit);

DESCRIPTION
     fkcode examines an input stream for kanji  characters  and  determines
     the  type of encoding used. The routine recognizes kanji characters in
     EUC, Shift-JIS, JIS, Old JIS and NEC-JIS codes.

     stream is the input stream to be examined. It must be open for reading
     and must be aligned on a character boundary.

     save is a stream open for  writing.  Bytes  consumed  from  stream  to
     determine  the input encoding are written to this file. This is useful
     if the input is connected to a  pipe  or  a  device  which  cannot  be
     rewound. save may be set to NULL if consumed input does not need to be
     saved.

     limit caps the number of bytes read from stream in order to determine
     the input encoding. If limit is zero, lookahead is limited only by
     EOF.
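
     The save argument exists so that lookahead bytes are not lost when
     stream is a pipe. A minimal sketch of that pattern follows. The
     detector is passed as a function pointer with the same signature as
     fkcode so that the sketch compiles without <mlx-j/kanji.h>. The names
     detect_and_replay, copy_all and demo_detect are hypothetical, and
     demo_detect is only a stand-in that consumes and saves bytes the way
     the text above describes; it performs no detection.

```c
#include <stdio.h>

/* Same shape as fkcode(stream, save, limit). */
typedef int (*detect_fn)(FILE *stream, FILE *save, unsigned limit);

/* Copy everything from src (starting at its current position) to dst.
 * Returns 0 on success, -1 on a read or write error. */
static int copy_all(FILE *src, FILE *dst)
{
    int c;

    while ((c = getc(src)) != EOF)
        if (putc(c, dst) == EOF)
            return -1;
    return ferror(src) ? -1 : 0;
}

/* Run the detector on in, keeping the consumed bytes in a temporary
 * file, then write those bytes followed by the unread remainder of in
 * to out, so that out receives the complete input.
 * Returns the detector's result, or -1 on error. */
int detect_and_replay(detect_fn detect, FILE *in, FILE *out, unsigned limit)
{
    FILE *save = tmpfile();
    int code;

    if (save == NULL)
        return -1;
    code = detect(in, save, limit);
    if (code != -1) {
        rewind(save);   /* the detector does not rewind save for us */
        if (copy_all(save, out) == -1 || copy_all(in, out) == -1)
            code = -1;
    }
    fclose(save);
    return code;
}

/* Hypothetical stand-in for fkcode, for demonstration only: consumes up
 * to limit bytes (all input if limit is 0), saves each one if save is
 * not NULL, and returns 0. */
int demo_detect(FILE *stream, FILE *save, unsigned limit)
{
    unsigned n = 0;
    int c;

    while ((limit == 0 || n < limit) && (c = getc(stream)) != EOF) {
        if (save != NULL && putc(c, save) == EOF)
            return -1;
        n++;
    }
    return ferror(stream) ? -1 : 0;
}
```

     On a system that provides the library, a call such as
     detect_and_replay(fkcode, stdin, stdout, 4096) would presumably
     work the same way; the value 4096 is an arbitrary illustration, not
     a documented recommendation.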

RETURN VALUE
     fkcode returns -1 if an I/O error occurs on stream or save. Otherwise,
     the return value is one of the following constants, defined in
     <mlx-j/kanji.h>, indicating the input encoding:

          KJJIS           - (New) JIS
          KJOJIS          - Old JIS
          KJNECJIS        - NEC-JIS
          KJSJIS          - Shift-JIS
          KJEUC           - Extended UNIX Code
          KJEUCORSJIS     - ambiguous input, either EUC or Shift-JIS
          KJASCII         - input only contains 7-bit ASCII bytes
          KJUNKNOWN       - unknown encoding

NOTES
     Neither stream nor save is rewound when fkcode returns. Both file
     pointers are left on a character boundary. If the input code is one of
     the 7-bit codes (JIS, Old JIS or NEC-JIS), stream is positioned at the
     byte following the first shift-in sequence; that is, stream is left in
     shifted-in mode.
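
     To illustrate what "the byte following the first shift-in sequence"
     means, the sketch below scans a buffer for the commonly documented
     kanji-in escape sequences: ESC $ B (New JIS), ESC $ @ (Old JIS) and
     ESC K (NEC-JIS). That fkcode looks for exactly these sequences is an
     assumption; the enum labels and the function find_kanji_in are
     hypothetical and are not part of the library.

```c
#include <stddef.h>

/* Hypothetical labels for this sketch only; the real constants are
 * defined in <mlx-j/kanji.h>. */
enum seven_bit_code { SB_NONE, SB_NEW_JIS, SB_OLD_JIS, SB_NEC_JIS };

/* Scan buf for the first kanji-in escape sequence. On a match, store in
 * *after the offset of the byte that follows the sequence and return
 * the code it introduces. Return SB_NONE (leaving *after untouched) if
 * no sequence is found. */
enum seven_bit_code find_kanji_in(const unsigned char *buf, size_t len,
                                  size_t *after)
{
    size_t i;

    for (i = 0; i + 1 < len; i++) {
        if (buf[i] != 0x1B)             /* not ESC */
            continue;
        if (buf[i + 1] == 'K') {        /* ESC K: NEC-JIS kanji-in */
            *after = i + 2;
            return SB_NEC_JIS;
        }
        if (buf[i + 1] == '$' && i + 2 < len) {
            if (buf[i + 2] == 'B') {    /* ESC $ B: New JIS */
                *after = i + 3;
                return SB_NEW_JIS;
            }
            if (buf[i + 2] == '@') {    /* ESC $ @: Old JIS */
                *after = i + 3;
                return SB_OLD_JIS;
            }
        }
    }
    return SB_NONE;
}
```

     The offset stored in *after corresponds to the position at which the
     text above says stream is left: the first byte of two-byte kanji
     data, already in shifted-in mode.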

     A return value of KJEUCORSJIS is most commonly caused by an input
     file that contains only half-width (half-size) katakana. For these
     characters the EUC and Shift-JIS byte values overlap completely.
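
     The overlap can be checked with two small validators. This is a
     sketch under stated assumptions: it uses the commonly documented
     byte ranges for Shift-JIS (JIS X 0208 lead bytes only, no vendor
     extensions) and EUC-JP, and it says nothing about how fkcode itself
     decides. The function names are hypothetical.

```c
#include <stddef.h>

static int in_range(unsigned char b, unsigned lo, unsigned hi)
{
    return b >= lo && b <= hi;
}

/* Nonzero if buf parses completely as Shift-JIS. */
int valid_sjis(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b = buf[i];

        if (b < 0x80 || in_range(b, 0xA1, 0xDF)) {
            i += 1;                     /* ASCII or half-width katakana */
        } else if ((in_range(b, 0x81, 0x9F) || in_range(b, 0xE0, 0xEF))
                   && i + 1 < len
                   && (in_range(buf[i + 1], 0x40, 0x7E)
                       || in_range(buf[i + 1], 0x80, 0xFC))) {
            i += 2;                     /* two-byte character */
        } else {
            return 0;
        }
    }
    return 1;
}

/* Nonzero if buf parses completely as EUC-JP. */
int valid_euc(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b = buf[i];

        if (b < 0x80) {
            i += 1;                     /* ASCII */
        } else if (b == 0x8E && i + 1 < len
                   && in_range(buf[i + 1], 0xA1, 0xDF)) {
            i += 2;                     /* SS2 + half-width katakana */
        } else if (b == 0x8F && i + 2 < len
                   && in_range(buf[i + 1], 0xA1, 0xFE)
                   && in_range(buf[i + 2], 0xA1, 0xFE)) {
            i += 3;                     /* SS3 + supplementary character */
        } else if (in_range(b, 0xA1, 0xFE) && i + 1 < len
                   && in_range(buf[i + 1], 0xA1, 0xFE)) {
            i += 2;                     /* two-byte character */
        } else {
            return 0;
        }
    }
    return 1;
}
```

     With these ranges, the EUC half-width katakana bytes 8E B1 8E B2 also
     parse as two Shift-JIS two-byte characters, and the Shift-JIS
     half-width katakana bytes B1 B2 also parse as one EUC two-byte
     character, so neither input identifies its encoding. By contrast,
     the Shift-JIS bytes 82 A0 are rejected by valid_euc because 82 is not
     a valid EUC lead byte.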



     A return value of KJUNKNOWN indicates that the input was not  in  any
     recognized encoding.

BUGS
     fkcode determines the input encoding by looking  for  the  first  byte
     sequence  in  the  input  stream  which  can be uniquely assigned to a
     codeset. If the input  file  contains  more  than  one  encoding,  the
     encoding used first is reported.

     EUC and Shift-JIS cannot always be distinguished. This means  that  in
     the  worst case, with a limit value of 0, fkcode consumes all input to
     EOF (and possibly writes all of the input to  save)  before  reporting
     failure.  The  same  is  true  if the input does not contain any kanji
     characters.

FILES
     /usr/lib/libkanji.a

SEE ALSO
     fkconv(3K), fkverify(3K).

Typewritten Software • bear@typewritten.org • Edmonds, WA 98026