SJ2LIB(1)  —  NEWS-OS Programmer’s Manual

NAME

sj_open, sj_close, sj_getkan, sj_douoncnt, sj_getdouon, sj_gakusyuu, sj_touroku, sj_syoukyo, sj_getalpha, sj_getcode, sj_getromaji, sj_set_romajitbl, sj_clear_romajitbl − Kana Kanji conversion library

OUTLINE

#include <sj2lib.h>
 sj_open(usrdic)
char *usrdic;           /* input  */
 sj_close()
 sj_getkan(yomi, bun)
unsigned char *yomi;    /* input  */
struct bunsetu *bun;    /* output */
 sj_douoncnt(yomi)
unsigned char *yomi;    /* input  */
 sj_getdouon(yomi, dou)
unsigned char *yomi;    /* input  */
struct douon *dou;      /* output */
 sj_gakusyuu(id)
struct studyrec *id;    /* input  */
 sj_touroku(yomi, kanji, code)
unsigned char *yomi;    /* input  */
unsigned char *kanji;   /* input  */
unsigned char code;     /* input  */
 sj_syoukyo(yomi, kanji)
unsigned char *yomi;    /* input  */
unsigned char *kanji;   /* input  */
 sj_getalpha(alpha, out)
unsigned char *alpha;   /* input  */
unsigned char *out;     /* output */
 sj_getcode(kana, out, flag)
unsigned char *kana;    /* input  */
unsigned char *out;     /* output */
int flag;               /* input  */
 sj_getromaji(roma, kana)
unsigned char roma;     /* input  */
unsigned char *kana;    /* output */
 sj_set_romajitbl(tbl, num)
struct romaji **tbl;    /* input  */
int num;                /* input  */
 sj_clear_romajitbl()

DESCRIPTION

Opening the dictionary
 sj_open(usrdic)
char  *usrdic;  /* pointer to the user dictionary file name */

 
sj_open opens the main dictionary, "/usr/sony/dict/sj2main.dic", read-only. It also opens the user dictionary named by usrdic for reading and writing. When usrdic is NULL, the default user dictionary "/usr/sony/dict/sj2usr.dic" is opened. If both the main dictionary and the user dictionary are opened successfully, sj_open returns 0; if either one fails to open, it returns a non-zero value as an error status code.
 
This function must always be called before starting the Kana Kanji conversion routine.
 
Details of the error status codes are given below:
 

0x01    The main dictionary is already open.
0x02    The function has failed to open the main dictionary.
0x04    Illegal main dictionary. The main dictionary is not open.
0x10    The user dictionary is already open.
0x20    The function has failed to open the user dictionary.
0x40    Illegal user dictionary. The user dictionary is not open.

 
When an error occurs, at least one of these bits is set in the error status code.
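
Because the status values above are single bits, a caller can test each one separately. The following is a minimal sketch of such a decoder. The macro names, the table, and report_open_status are hypothetical names chosen for illustration; this page does not say whether sj2lib.h defines symbolic names for these bits.

```c
#include <stdio.h>

/* Hypothetical names for the sj_open status bits listed above. */
#define SJ_MAIN_ALREADY_OPEN 0x01
#define SJ_MAIN_OPEN_FAILED  0x02
#define SJ_MAIN_ILLEGAL      0x04
#define SJ_USER_ALREADY_OPEN 0x10
#define SJ_USER_OPEN_FAILED  0x20
#define SJ_USER_ILLEGAL      0x40

struct open_status_text {
    int         bit;
    const char *text;
};

static const struct open_status_text open_status_table[] = {
    { SJ_MAIN_ALREADY_OPEN, "main dictionary is already open" },
    { SJ_MAIN_OPEN_FAILED,  "failed to open main dictionary" },
    { SJ_MAIN_ILLEGAL,      "illegal main dictionary" },
    { SJ_USER_ALREADY_OPEN, "user dictionary is already open" },
    { SJ_USER_OPEN_FAILED,  "failed to open user dictionary" },
    { SJ_USER_ILLEGAL,      "illegal user dictionary" },
};

/* Print one line per set bit to fp; return the number of bits reported. */
static int report_open_status(int status, FILE *fp)
{
    int reported = 0;
    size_t i;

    for (i = 0; i < sizeof open_status_table / sizeof open_status_table[0]; i++) {
        if (status & open_status_table[i].bit) {
            fprintf(fp, "sj_open: %s\n", open_status_table[i].text);
            reported++;
        }
    }
    return reported;
}
```

A caller might store the return value of sj_open(NULL) in an int and, if it is non-zero, pass it to report_open_status together with stderr.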
 

Closing the dictionary
 sj_close()

 
sj_close closes the main and user dictionaries that were opened using sj_open. If both the main dictionary and the user dictionary are closed successfully, sj_close returns 0; if either one fails to close, it returns a non-zero value as an error status code.
 
This function must always be called before terminating the Kana Kanji conversion routine.
 
Details of the error status codes are given below:
 

0x01    The main dictionary is already closed.
0x10    The user dictionary is already closed.

 
When an error occurs, at least one of these bits is set in the error status code.
 

Text conversion
 sj_getkan(yomi, bun)
unsigned char   *yomi;  /* pointer to the kana character string */
struct bunsetu  *bun;   /* pointer to the clause structure */

 
sj_getkan converts the specified text in a single call and returns the result. The data for each clause is stored in struct bunsetu bun[], and the return value is the number of clauses. struct bunsetu is declared in the sj2lib.h file and has the structure given below:
 

struct bunsetu {
        int             srclen;         /* Kana length */
        int             destlen;        /* Kanji length */
        unsigned char   *srcstr;        /* pointer to Kana */
        unsigned char   *deststr;       /* pointer to Kanji */
};

 
As a rule, the Kana input string is composed of half-size codes. It may also contain Shift JIS characters, but these are not converted into Kanji. The Kana string must be terminated with a null byte and must not exceed 511 bytes; if it does, the text is not converted and 0 is returned as the number of clauses. Storage for the output array struct bunsetu bun[] must be provided by the user program. The Kana and Kanji output strings are stored in the following areas on the library side:
 

unsigned char yomiout[512];
unsigned char kanjiout[1024];
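
Since the converted strings live in library-side areas, a program will presumably want to copy them out before making another conversion call. The following is a minimal sketch of collecting the Kanji output of all clauses into one caller-owned buffer. It assumes that destlen is a byte count and that the clause strings need not be individually null-terminated; neither point is stated on this page. join_clauses is a hypothetical helper, and the struct is repeated here only so the sketch stands alone.

```c
#include <stddef.h>
#include <string.h>

/* Mirrors the declaration shown above; a real program takes it from sj2lib.h. */
struct bunsetu {
    int            srclen;
    int            destlen;
    unsigned char *srcstr;
    unsigned char *deststr;
};

/* Copy the converted text of n clauses into out (capacity outsz bytes) and
   null-terminate it. Returns the number of bytes written, not counting the
   terminator, or -1 if out is too small or a length is negative. */
static int join_clauses(const struct bunsetu *bun, int n,
                        unsigned char *out, size_t outsz)
{
    size_t used = 0;
    int i;

    if (outsz == 0)
        return -1;
    for (i = 0; i < n; i++) {
        size_t len;

        if (bun[i].destlen < 0)
            return -1;
        len = (size_t)bun[i].destlen;
        if (used + len + 1 > outsz)
            return -1;
        memcpy(out + used, bun[i].deststr, len);
        used += len;
    }
    out[used] = '\0';
    return (int)used;
}
```

Here n would be the value returned by sj_getkan, and a 1024-byte output buffer matches the size of the library-side Kanji area.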

 

Obtaining the number of homonyms
 sj_douoncnt(yomi)
unsigned char  *yomi;   /* pointer to the Kana character string */

 
sj_douoncnt converts the Kana input into a clause and returns the number of homonyms. As a rule, the Kana input string is composed of half-size codes. It may also contain Shift JIS characters, but these are not converted into Kanji. The Kana string must be terminated with a null byte and must not exceed 32 bytes. If the Kana string does not form a single clause after conversion, 0 is returned as the number of homonyms.
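
A return value of 0 does not tell the caller whether the input was too long or simply produced no result, so it can help to check the length first. The following is a minimal sketch of such a check. The macro names and reading_fits are hypothetical, and the sketch assumes the documented limits do not count the terminating null byte.

```c
#include <stddef.h>
#include <string.h>

/* Limits taken from the text of this page; the names are hypothetical. */
#define SJ_GETKAN_MAX_BYTES 511   /* sj_getkan input limit */
#define SJ_DOUON_MAX_BYTES   32   /* sj_douoncnt and sj_getdouon input limit */

/* Return 1 if the null-terminated reading is at most limit bytes long
   (terminator not counted), 0 if it is longer or yomi is NULL. */
static int reading_fits(const unsigned char *yomi, size_t limit)
{
    return yomi != NULL && strlen((const char *)yomi) <= limit;
}
```

A program would call reading_fits(yomi, SJ_DOUON_MAX_BYTES) before sj_douoncnt and report an over-long reading itself.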
 

Obtaining the homonym
 sj_getdouon(yomi, dou)
unsigned char   *yomi;  /* pointer to the Kana character string */
struct douon    *dou;   /* pointer to the homonym structure */

 
sj_getdouon converts the Kana input string into a clause, stores the homonyms in the specified structure array, and returns the number of homonyms. The homonym data is stored in struct douon dou[]. struct douon is declared in the sj2lib.h file and has the structure given below:
 

struct douon {
        unsigned char    ddata[128];  /* homonym data */
        int              dlen;        /* homonym length */
        struct studyrec  dcid;        /* study data */
};

 
As a rule, the Kana input string is composed of half-size codes. It may also contain Shift JIS characters, but these are not converted into Kanji. The Kana string must be terminated with a null byte and must not exceed 32 bytes. If the Kana string does not form a single clause after conversion, 0 is returned as the number of homonyms. Storage for the output array struct douon dou[] must be provided by the user program. ddata is a null-terminated character string.
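
One plausible way to size dou[] is to ask sj_douoncnt for the count first and allocate that many elements before calling sj_getdouon. The following is a minimal sketch of the allocation and of a bounds-checked selection. alloc_homonyms and pick_homonym are hypothetical helpers, and the struct studyrec shown is only a placeholder, since its layout does not appear on this page.

```c
#include <stddef.h>
#include <stdlib.h>

/* Placeholder only: the real layout of struct studyrec is not shown on this
   page. A real program takes both declarations from sj2lib.h. */
struct studyrec {
    int placeholder;
};

/* Mirrors the declaration shown above. */
struct douon {
    unsigned char   ddata[128];
    int             dlen;
    struct studyrec dcid;
};

/* Allocate zeroed storage for count homonyms; NULL if count <= 0 or on
   allocation failure. The caller releases it with free(). */
static struct douon *alloc_homonyms(int count)
{
    if (count <= 0)
        return NULL;
    return (struct douon *)calloc((size_t)count, sizeof(struct douon));
}

/* Return the candidate at index choice, or NULL when choice is outside
   the range 0 .. count-1. */
static struct douon *pick_homonym(struct douon *dou, int count, int choice)
{
    if (dou == NULL || choice < 0 || choice >= count)
        return NULL;
    return &dou[choice];
}
```

The ddata member of the selected element holds the candidate text, and the address of its dcid member is the value that sj_gakusyuu expects.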
 

Studying the clause
 sj_gakusyuu(id)
struct studyrec  *id;   /* pointer to the study data */

 
sj_gakusyuu receives the dcid given by sj_getdouon and studies the clause. When the study completes normally, 0 is returned; if not, a non-zero value is returned.
 

Saving in the dictionary
 sj_touroku(yomi, kanji, code)
unsigned char  *yomi;   /* pointer to Kana character string */
unsigned char  *kanji;  /* pointer to Kanji character string */
unsigned char  code;    /* part of speech code */

 
sj_touroku saves the specified Kanji in the user dictionary, given the Kana character string, the Kanji character string, and a part of speech code. When the Kanji is saved in the dictionary, 0 is returned; if not, a non-zero value is returned as an error status code. Details of the error status codes are given below:
 

0x80    Dictionary locked by another process.
0x20    Homonyms overflow.
0x10    Invalid grammar data.
0x08    Invalid Kanji character string.
0x04    Invalid Kana character string.
0x02    The Kanji to be saved already exists in the dictionary.
0x01    No more space for save.

 
When an error occurs, at least one of these bits is set in the error status code.
 
The Kana character string is composed using half-size codes, and must be terminated with a null byte. The length must not exceed 32 characters. As a rule, the Kanji character string is composed using the shift JIS codes and must be terminated with a null byte. The length must not exceed 64 characters. The Kanji character string may contain half-size codes. Even then, it must not exceed 64 characters. The part of speech codes are given below:
 

0x01    common noun
0x02    name of person (family name)
0x03    place name
0x04    pronoun
0x05    name of person (first name)
0x06    name of prefecture/ward
0x07    numeral
0x08    prefix
0x09    suffix
0x0A    ordinal number
0x0B    adverb
0x0C    conjunction
0x0D    participial adjective
0x0E    adjective
0x0F    adjective verb
0x10    conjugation for Kana beginning with Sa
0x11    conjugation for Kana beginning with Za
0x12    the 1st conjugations of verbs
0x13    the 5th conjugations of verbs for Kana beginning with Ka
0x14    the 5th conjugations of verbs for Kana beginning with Ga
0x15    the 5th conjugations of verbs for Kana beginning with Sa
0x16    the 5th conjugations of verbs for Kana beginning with Ta
0x17    the 5th conjugations of verbs for Kana beginning with Na
0x18    the 5th conjugations of verbs for Kana beginning with Ba
0x19    the 5th conjugations of verbs for Kana beginning with Ma
0x1A    the 5th conjugations of verbs for Kana beginning with Ra
0x1B    the 5th conjugations of verbs for Kana beginning with Wa
0x1C    Single Kanji
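
The length limits and the code range above can be checked before the dictionary is touched. The following is a minimal sketch of such a check. It assumes the limits count characters rather than bytes, with each two-byte Shift JIS code counted once; the page does not spell this out. All names in the sketch are hypothetical.

```c
#include <stddef.h>

/* Limits and code range taken from the text above; names are hypothetical. */
#define SJ_YOMI_MAX_CHARS  32
#define SJ_KANJI_MAX_CHARS 64
#define SJ_POS_FIRST       0x01   /* common noun */
#define SJ_POS_LAST        0x1C   /* single Kanji */

/* Count the characters in a null-terminated Shift JIS string. Bytes in
   0x81-0x9F and 0xE0-0xFC are treated as the first byte of a two-byte
   character; every other byte (ASCII, half-size Katakana) counts as one
   character. A lead byte at the very end counts as one character. */
static size_t sjis_char_count(const unsigned char *s)
{
    size_t n = 0;

    while (*s != '\0') {
        int lead = (*s >= 0x81 && *s <= 0x9F) || (*s >= 0xE0 && *s <= 0xFC);

        s += (lead && s[1] != '\0') ? 2 : 1;
        n++;
    }
    return n;
}

/* Return 1 if the arguments satisfy the documented limits, 0 otherwise. */
static int touroku_args_ok(const unsigned char *yomi,
                           const unsigned char *kanji, unsigned char code)
{
    if (yomi == NULL || kanji == NULL)
        return 0;
    if (sjis_char_count(yomi) > SJ_YOMI_MAX_CHARS)
        return 0;
    if (sjis_char_count(kanji) > SJ_KANJI_MAX_CHARS)
        return 0;
    return code >= SJ_POS_FIRST && code <= SJ_POS_LAST;
}
```

A passing check does not guarantee that sj_touroku succeeds; the error status codes listed earlier still have to be examined.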

 

Deleting from the dictionary
 sj_syoukyo(yomi, kanji)
unsigned char  *yomi;   /* pointer to Kana character string */
unsigned char  *kanji;  /* pointer to Kanji character string */

 
sj_syoukyo deletes the Kanji that was saved in the user dictionary using sj_touroku. When the Kanji is deleted, 0 is returned; if not, a non-zero value is returned as an error status code. Details of the error status codes are given below:
 

0x80    Dictionary locked by another process.
0x08    Invalid Kanji character string.
0x04    Invalid Kana character string.
0x01    The Kanji to be deleted does not exist in the dictionary.

 
The Kana character string is composed using half-size codes, and must be terminated with a null byte. The length must not exceed 32 characters. As a rule, the Kanji character string is composed using the shift JIS codes and must be terminated with a null byte. The length must not exceed 64 characters. The Kanji character string may contain half-size codes. Even then, it must not exceed 64 characters.
 

Converting half-size alphabet into full-size alphabet
 sj_getalpha(alpha, out)
unsigned char *alpha;   /* pointer to the half-size alphabet */
unsigned char *out;     /* pointer to the full-size alphabet */

sj_getalpha converts the half-size alphabet input string into a full-size alphabet string. Storage for out[] must be provided by the user program. The output in out[] is terminated with a null byte.
 

Converting half-size Katakana into full-size Katakana/Hiragana
 sj_getcode(kana, out, flag)
unsigned char *kana;    /* pointer to half-size Katakana */
unsigned char *out;     /* pointer to full-size character */
int flag;               /* Katakana/Hiragana selection flag */

sj_getcode converts the half-size Katakana input string into a full-size Katakana or Hiragana string. Storage for out[] must be provided by the user program. The output in out[] is terminated with a null byte. Select one of the following flags, defined in the sj2lib.h file:
 

#define HIRAGANA 0
#define KATAKANA 1
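
Both sj_getalpha and sj_getcode write into out[], so its size matters. Shift JIS encodes each full-size character in two bytes, which suggests a simple upper bound for the output buffer. The following is a minimal sketch that assumes the library emits at most one full-size character per input byte; fullsize_out_bytes is a hypothetical helper, not part of the library.

```c
#include <stddef.h>
#include <string.h>

/* Upper bound on the bytes needed for out[]: two bytes for every input
   byte, plus one for the terminating null byte. */
static size_t fullsize_out_bytes(const unsigned char *in)
{
    return 2 * strlen((const char *)in) + 1;
}
```

A caller could allocate out with malloc(fullsize_out_bytes(kana)) and then call sj_getcode(kana, out, KATAKANA).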

 

Converting half-size alphabet into half-size Katakana
 sj_getromaji(roma, kana)
unsigned char  roma;    /* half-size alphabet character */
unsigned char  *kana;   /* pointer to half-size Katakana */

 
sj_getromaji reads the half-size alphabet input, one character at a time, into an internal buffer and converts the buffered characters into half-size Katakana codes where possible. The internal buffer holds 15 characters. When a conversion takes place, the number of converted characters in the internal buffer is returned; when no conversion takes place, 0 is returned. After a conversion, the converted characters are removed from the internal buffer, and any characters that were not converted remain in it. To clear the internal buffer deliberately, pass a null byte as the alphabet code; the contents of the internal buffer are then output to kana. Storage for kana[] must be provided by the user program. The maximum number of output characters that sj_getromaji can handle is 255. The internal alphabet character lookup table holds full-size Katakana; sj_getromaji converts it into half-size Katakana before output. The full-size Katakana is stored in the area indicated below:
 

unsigned char z_kana_buf[512];

 

Setting the user defined alphabet table
 sj_set_romajitbl(tbl, num)
struct romaji  **tbl;   /* pointer to the alphabet structure */
int num;                /* number of definitions */

 
sj_set_romajitbl installs the user defined alphabet table passed in tbl. sj_getromaji, described above, searches this table first; when it cannot find the specified characters there, it searches the default table. If no user table is set, sj_getromaji searches only the default table. For details of the default table, see the SJ2 operating instruction manual. struct romaji is declared in the sj2lib.h file and has the structure given below:
 

struct romaji {
        unsigned char  *tuduri; /* pointer to alphabet character */
        unsigned char  *kana;   /* pointer to full-size Katakana */
};

 
tuduri points to the half-size ASCII string that spells the alphabet characters. kana points to the full-size Katakana string that corresponds to that spelling. sj_getromaji converts the full-size Katakana it finds into half-size Katakana before output. tbl is an array of pointers to this structure, and num is the number of definitions.
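
Because tbl is an array of pointers rather than an array of structures, building a table takes two steps. The following is a minimal sketch with two made-up entries. The spellings, the variable names, and USER_TABLE_COUNT are illustrative assumptions, and the struct is repeated only so the sketch stands alone.

```c
#include <stddef.h>

/* Mirrors the declaration shown above; a real program takes it from sj2lib.h. */
struct romaji {
    unsigned char *tuduri;
    unsigned char *kana;
};

/* Hypothetical user entries. The kana strings are full-size Katakana in
   Shift JIS, written as byte escapes so the source file stays ASCII. */
static unsigned char she_tuduri[] = "she";
static unsigned char she_kana[]   = "\x83\x56\x83\x46";  /* SHI + small E */
static unsigned char je_tuduri[]  = "je";
static unsigned char je_kana[]    = "\x83\x57\x83\x46";  /* JI + small E */

/* Step 1: the entries themselves. */
static struct romaji user_entries[] = {
    { she_tuduri, she_kana },
    { je_tuduri,  je_kana  },
};

/* Step 2: the array of pointers that is passed as tbl. */
static struct romaji *user_table[] = {
    &user_entries[0],
    &user_entries[1],
};

#define USER_TABLE_COUNT ((int)(sizeof user_table / sizeof user_table[0]))
```

The table would then be installed with sj_set_romajitbl(user_table, USER_TABLE_COUNT) and removed again with sj_clear_romajitbl().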
 

Removing the user defined alphabet characters
 sj_clear_romajitbl()

 
sj_clear_romajitbl removes the alphabet character table that was set using sj_set_romajitbl.

FILES

/usr/sony/bin/sj2
Japanese input front end processor

/usr/sony/dict/sj2main.dic
Kana Kanji conversion main dictionary

/usr/sony/dict/sj2usr.dic.org
Kana Kanji conversion original user dictionary

/usr/sony/dict/sj2usr.dic
Default Kana Kanji conversion user dictionary

/usr/sony/include/sj2lib.h
Include files for Kana Kanji conversion library

/usr/sony/lib/libsj2lib.a
Kana Kanji conversion library

/usr/sony/demo/sj2/*
Kana Kanji conversion library application samples

NEWS-OS Release 4.1C
