SJ3LIB(3) — NEWS-OS Programmer’s Manual
NAME
sj3_open, sj3_close, sj3_getkan, sj3_douoncnt, sj3_getdouon, sj3_gakusyuu, sj3_gakusyuu2, sj3_touroku, sj3_syoukyo, sj3_lockserv, sj3_unlockserv, sj3_rkinit, sj3_rkconv, sj3_hantozen, sj3_zentohan − Kana-Kanji conversion library
SYNOPSIS
#include <sj3lib.h>
sj3_open(hname, uname)
char ∗hname, ∗uname;
sj3_close()
sj3_getkan(reading, phrase, knjbuf, knjsiz)
unsigned char ∗reading;
struct bunsetu ∗phrase;
unsigned char ∗knjbuf;
int knjsiz;
sj3_douoncnt(reading)
unsigned char ∗reading;
sj3_getdouon(reading, homonym)
unsigned char ∗reading;
struct douon ∗homonym;
sj3_gakusyuu(id)
struct studyrec ∗id;
sj3_gakusyuu2(reading1, reading2, id)
unsigned char ∗reading1, ∗reading2;
struct studyrec ∗id;
sj3_touroku(reading, kanji, code)
unsigned char ∗reading, ∗kanji;
int code;
sj3_syoukyo(reading, kanji, code)
unsigned char ∗reading, ∗kanji;
int code;
sj3_lockserv()
sj3_unlockserv()
sj3_rkinit(file)
char ∗file;
sj3_rkconv(roman, kana)
unsigned char ∗roman, ∗kana;
sj3_hantozen(two_byte, one_byte)
unsigned char ∗two_byte, ∗one_byte;
sj3_zentohan(one_byte, two_byte)
unsigned char ∗one_byte, ∗two_byte;
DESCRIPTION
Connecting to the Kana-Kanji Conversion Server sj3serv
sj3_open(hname, uname)
char ∗hname; /∗ pointer to hostname of host to connect ∗/
char ∗uname; /∗ pointer to name of user using server ∗/
This function attempts to connect to the server (sj3serv) on the host specified by hname, on behalf of the user whose name is specified by uname. If hname is NULL or points to the string "unix", the connection is made over an AF_UNIX socket. In all other cases it is made over an AF_INET socket. If the connection succeeds, the files sj3main.dic, user/uname/private.dic, and user/uname/study.dat (see FILES) are opened. If the user dictionary or the study file is not present, it is created. The function returns 0 when the connection succeeds and a non-zero error status when it fails. It must be called before any Kana-Kanji conversion is attempted. The error status flags are listed below.
SJ3_SERVER_DEAD Server died while connecting
SJ3_CONNECT_ERROR No server, or cannot connect to server
SJ3_ALREADY_CONNECTED Server already connected
SJ3_CANNOT_OPEN_MDICT Cannot open main dictionary
SJ3_CANNOT_OPEN_UDICT Cannot open user dictionary
SJ3_CANNOT_OPEN_STUDY Cannot open study file
SJ3_CANNOT_MAKE_UDIR Cannot make user directory
SJ3_CANNOT_MAKE_UDICT Cannot make user dictionary
SJ3_CANNOT_MAKE_STUDY Cannot make study file
If an error occurs, at least one of these flags is set in the error status.
For every error other than SJ3_SERVER_DEAD and SJ3_CONNECT_ERROR, the connection to the server has been established.
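The two rules above (how hname selects the socket family, and which errors leave a connection in place) can be sketched as small helpers. This is only an illustration: the helper names are hypothetical, and the numeric flag values are placeholders chosen so that the fragment compiles without sj3lib.h. A real client should take the flag definitions from the header.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder bit values so this sketch compiles on its own.
 * In a real client these names come from <sj3lib.h>; the values
 * below are assumptions and must not be relied on. */
#ifndef SJ3_SERVER_DEAD
#define SJ3_SERVER_DEAD   (1 << 0)
#endif
#ifndef SJ3_CONNECT_ERROR
#define SJ3_CONNECT_ERROR (1 << 1)
#endif

/* Returns 1 if sj3_open would use AF_UNIX for this hname
 * (NULL or the string "unix"), 0 if it would use AF_INET. */
int hname_selects_af_unix(const char *hname)
{
    return hname == NULL || strcmp(hname, "unix") == 0;
}

/* Returns 1 if a connection exists after sj3_open returned `status`,
 * that is, if neither SJ3_SERVER_DEAD nor SJ3_CONNECT_ERROR is set. */
int open_left_connection(int status)
{
    return (status & (SJ3_SERVER_DEAD | SJ3_CONNECT_ERROR)) == 0;
}
```

A caller might use open_left_connection() to decide whether sj3_close still has to be called after a partially failed sj3_open.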
Disconnecting from the Server
sj3_close()
This function closes the connection to the server established by sj3_open. It returns 0 when the disconnection succeeds and a non-zero error status when it fails. It must be called once the Kana-Kanji conversion routines are no longer needed. The error status flags are listed below.
SJ3_SERVER_DEAD Server died while disconnecting
SJ3_DISCONNECT_ERROR Internal error occurred at server
SJ3_NOT_CONNECTED Server not connected
SJ3_NOT_OPEN_MDICT Main dictionary not open
SJ3_NOT_OPEN_UDICT User dictionary not open
SJ3_NOT_OPEN_STUDY Study file not open
SJ3_CLOSE_MDICT_ERROR Cannot close main dictionary
SJ3_CLOSE_UDICT_ERROR Cannot close user dictionary
SJ3_CLOSE_STUDY_ERROR Cannot close study file
If an error occurs, at least one of these flags is set in the error status.
The server is always disconnected regardless of the error.
Global Conversion of Kana String Candidates
sj3_getkan(reading, phrase, knjbuf, knjsiz)
unsigned char ∗reading; /∗ pointer to reading ∗/
struct bunsetu ∗phrase; /∗ pointer to phrase structure ∗/
unsigned char ∗knjbuf; /∗ pointer to kanji buffer ∗/
int knjsiz; /∗ size of kanji buffer ∗/
This function performs a global conversion of the specified reading and returns the result. The resulting kanji strings are placed in knjbuf, and information about each phrase is placed in the array struct bunsetu phrase[]. knjsiz is the size of knjbuf.
The return value of the function is the number of phrases produced. The data structure type struct bunsetu is declared in the header sj3lib.h as follows.
struct bunsetu {
int srclen; /∗ length of reading ∗/
int destlen; /∗ length of kanji ∗/
unsigned char ∗srcstr; /∗ pointer to reading ∗/
unsigned char ∗deststr; /∗ pointer to kanji ∗/
struct studyrec dicid; /∗ data for study ∗/
};
If a phrase cannot be converted to a kanji string, srcstr points to the reading and srclen gives its length. For such a phrase, deststr is NULL and destlen is 0. Readings received by this function are usually made up of 2-byte Shift JIS code, but may include 1-byte codes as well. However, 1-byte codes can never be converted to kanji. The reading string must be NULL-terminated and must not exceed 256 characters in size. If this limit is exceeded, the function does nothing and returns 0 as the number of phrases. If the server dies, the function returns −1.
The caller must allocate the array struct bunsetu phrase[]; the library does not allocate it.
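As an illustration of how the phrase array might be consumed, the sketch below joins the results of n phrases into one string and falls back to the reading when a phrase has no kanji (deststr is NULL, destlen is 0). The struct bunsetu declaration is copied from this page. The struct studyrec body is a stand-in, since its layout is not described here, and join_phrases is a hypothetical helper, not part of the library. The lengths are assumed to be byte counts.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in only: the real struct studyrec is defined in <sj3lib.h>. */
struct studyrec { int placeholder; };

/* Copied from the declaration shown in this manual page. */
struct bunsetu {
    int srclen;               /* length of reading */
    int destlen;              /* length of kanji */
    unsigned char *srcstr;    /* pointer to reading */
    unsigned char *deststr;   /* pointer to kanji */
    struct studyrec dicid;    /* data for study */
};

/* Concatenate the converted text of n phrases into out (outsiz bytes).
 * Phrases without a kanji string contribute their reading instead.
 * Returns the number of bytes written, not counting the terminator,
 * or -1 if out is too small. Lengths are assumed to be in bytes. */
int join_phrases(const struct bunsetu *phrase, int n,
                 unsigned char *out, size_t outsiz)
{
    size_t used = 0;
    int i;

    if (outsiz == 0)
        return -1;
    for (i = 0; i < n; i++) {
        const unsigned char *src;
        size_t len;

        if (phrase[i].deststr != NULL && phrase[i].destlen > 0) {
            src = phrase[i].deststr;
            len = (size_t)phrase[i].destlen;
        } else {
            src = phrase[i].srcstr;
            len = (size_t)phrase[i].srclen;
        }
        if (used + len + 1 > outsiz)
            return -1;
        memcpy(out + used, src, len);
        used += len;
    }
    out[used] = '\0';
    return (int)used;
}
```

The value passed as n would be the return value of sj3_getkan, after the caller has checked it for 0 and −1.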
Counting Homonyms
sj3_douoncnt(reading)
unsigned char ∗reading; /∗ pointer to reading ∗/
This function converts the reading string into a phrase and returns the number of corresponding homonyms. Readings received by this function are usually made up of 2-byte Shift JIS code, but may include 1-byte codes as well. However, 1-byte codes can never be converted to kanji. The reading string must be NULL-terminated and must not exceed 64 characters in size. The function returns 0 if no phrase could be formed by conversion.
If the server dies, the function returns −1.
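A client may want to check a reading against the length limits (256 characters for sj3_getkan, 64 for the homonym functions) before calling the library. The sketch below counts characters in a Shift JIS string, treating a 2-byte code as one character. It assumes that the limits are expressed in characters rather than bytes, which this page does not state explicitly. The function names are hypothetical.

```c
#include <stddef.h>

/* Returns 1 if c is the first byte of a 2-byte Shift JIS character. */
static int sjis_is_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9f) || (c >= 0xe0 && c <= 0xfc);
}

/* Count the characters in a NUL-terminated Shift JIS string.
 * A 2-byte code counts as one character; a lead byte that is
 * cut off by the terminator is counted as one character. */
size_t sjis_char_count(const unsigned char *s)
{
    size_t n = 0;

    while (*s != '\0') {
        if (sjis_is_lead(*s) && s[1] != '\0')
            s += 2;
        else
            s += 1;
        n++;
    }
    return n;
}

/* Returns 1 if reading is non-NULL and has at most max_chars characters. */
int reading_fits(const unsigned char *reading, size_t max_chars)
{
    return reading != NULL && sjis_char_count(reading) <= max_chars;
}
```

For example, reading_fits(reading, 64) could guard a call to sj3_douoncnt.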
Getting Homonyms
sj3_getdouon(reading, homonym)
unsigned char ∗reading; /∗ pointer to reading ∗/
struct douon ∗homonym; /∗ pointer to homonym structure ∗/
This function converts the reading string into a phrase, sets the homonym structure received accordingly, and returns the number of corresponding homonyms. Homonym-related information is stored in the structure struct douon homonym[]. The data structure type struct douon is declared in the header sj3lib.h as follows.
struct douon {
unsigned char ddata[256]; /∗ homonym data ∗/
int dlen; /∗ homonym length ∗/
struct studyrec dcid; /∗ data for study ∗/
};
Readings received by this function are usually made up of 2-byte Shift JIS code, but may include 1-byte codes as well. However, 1-byte codes can never be converted to kanji.
The reading string must be NULL-terminated and must not exceed 64 characters in size. The function returns 0 if no phrase could be formed by conversion, and −1 if the server dies. The caller must allocate the array struct douon homonym[]; the library does not allocate it. ddata is a NULL-terminated character string.
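Once the homonym array has been filled in, a client typically needs to locate the candidate the user picked. The following sketch searches the array by its ddata string. The struct douon declaration is copied from this page, the struct studyrec body is a stand-in because its layout is not described here, and find_homonym is a hypothetical helper.

```c
#include <string.h>

/* Stand-in only: the real struct studyrec is defined in <sj3lib.h>. */
struct studyrec { int placeholder; };

/* Copied from the declaration shown in this manual page. */
struct douon {
    unsigned char ddata[256];   /* homonym data */
    int dlen;                   /* homonym length */
    struct studyrec dcid;       /* data for study */
};

/* Return the index of the first homonym whose ddata equals wanted,
 * or -1 if there is none. count is the value returned by
 * sj3_getdouon, so 0 and -1 both yield -1 here. */
int find_homonym(const struct douon *homonym, int count,
                 const unsigned char *wanted)
{
    int i;

    for (i = 0; i < count; i++) {
        if (strcmp((const char *)homonym[i].ddata,
                   (const char *)wanted) == 0)
            return i;
    }
    return -1;
}
```

The dcid member of the entry found this way is the study data that sj3_gakusyuu expects.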
Learning Phrases
sj3_gakusyuu(id)
struct studyrec ∗id; /∗ pointer to data used for study ∗/
This function can be used to learn phrases corresponding to data obtained using the sj3_getkan and sj3_getdouon functions.
The function returns −1 if the server dies, 0 if the phrase is learned successfully, and other values if it fails.
Learning Phrase Lengths
sj3_gakusyuu2(reading1, reading2, id)
unsigned char ∗reading1; /∗ pointer to reading of phrase 1 ∗/
unsigned char ∗reading2; /∗ pointer to reading of phrase 2 ∗/
struct studyrec ∗id; /∗ pointer to data for study ∗/
This function studies the lengths of phrase 1, represented by reading1, and phrase 2, represented by reading2 and id. The function returns −1 if the server dies, 0 if the phrases are learned successfully, and other values if it fails.
Registering Entries
sj3_touroku(reading, kanji, code)
unsigned char ∗reading; /∗ pointer to reading ∗/
unsigned char ∗kanji; /∗ pointer to kanji string ∗/
unsigned char code; /∗ part-of-speech code ∗/
This function registers an entry, specified by a reading, a kanji string, and a part-of-speech code, in the user dictionary. The function returns −1 if the server dies, 0 if the entry is registered successfully, or a non-zero error status if it fails.
The error status flags are listed below.
SJ3_DICT_ERROR No dictionary or it is read-only
SJ3_DICT_LOCKED Dictionary locked (currently being read)
SJ3_BAD_YOMI_STR Bad reading string
SJ3_BAD_KANJI_STR Bad kanji string
SJ3_BAD_HINSI_CODE Bad part-of-speech code
SJ3_WORD_EXIST Entry already registered
SJ3_DOUON_FULL Cannot register any more homonyms
SJ3_DICT_FULL Cannot register any more entries
SJ3_INDEX_FULL Cannot register any more indices
SJ3_TOUROKU_FAILED Failed to register entry
If an error occurs, at least one of these flags is set in the error status. A reading may consist only of the Shift JIS codes shown below, must be NULL-terminated, and must not exceed 32 characters in length.
0x815b 0x8194 0x8145 0x81a7
0x824f -> 0x8258
0x8160 -> 0x8279
0x8281 -> 0x829a
0x82a0 -> 0x82f1
0x8394 0x8395 0x8396
The following characters cannot come at the beginning of a reading.
0x815b 0x82f0 0x82f1 0x829f 0x82a1 0x82a3 0x82a5 0x82a7 0x82c1 0x82e1 0x82e3 0x82e5 0x82ec 0x8395 0x8396
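The reading rules above can be checked on the client side before calling sj3_touroku. The sketch below is a hypothetical helper, not part of the library. The code ranges and the list of codes that may not start a reading are transcribed from the lists on this page as printed. Treating an empty reading as invalid is an assumption.

```c
#include <stddef.h>

#define READING_MAX_CHARS 32   /* limit stated for sj3_touroku readings */

/* Inclusive ranges of permitted 2-byte codes, as printed in this page. */
static const unsigned short allowed[][2] = {
    { 0x815b, 0x815b }, { 0x8194, 0x8194 }, { 0x8145, 0x8145 },
    { 0x81a7, 0x81a7 }, { 0x824f, 0x8258 }, { 0x8160, 0x8279 },
    { 0x8281, 0x829a }, { 0x82a0, 0x82f1 }, { 0x8394, 0x8396 }
};

/* Codes that may not appear at the beginning of a reading. */
static const unsigned short bad_first[] = {
    0x815b, 0x82f0, 0x82f1, 0x829f, 0x82a1, 0x82a3, 0x82a5, 0x82a7,
    0x82c1, 0x82e1, 0x82e3, 0x82e5, 0x82ec, 0x8395, 0x8396
};

static int code_allowed(unsigned code)
{
    size_t i;

    for (i = 0; i < sizeof allowed / sizeof allowed[0]; i++)
        if (code >= allowed[i][0] && code <= allowed[i][1])
            return 1;
    return 0;
}

static int code_bad_first(unsigned code)
{
    size_t i;

    for (i = 0; i < sizeof bad_first / sizeof bad_first[0]; i++)
        if (code == bad_first[i])
            return 1;
    return 0;
}

/* Returns 1 if reading satisfies the rules listed above, 0 otherwise.
 * An empty or NULL reading is rejected (assumption). */
int reading_is_registrable(const unsigned char *reading)
{
    const unsigned char *p = reading;
    size_t nchars = 0;

    if (p == NULL || *p == '\0')
        return 0;
    while (*p != '\0') {
        unsigned code;

        if (p[1] == '\0')
            return 0;              /* a lone byte cannot be a 2-byte code */
        code = ((unsigned)p[0] << 8) | p[1];
        if (!code_allowed(code))
            return 0;
        if (nchars == 0 && code_bad_first(code))
            return 0;
        if (++nchars > READING_MAX_CHARS)
            return 0;
        p += 2;
    }
    return 1;
}
```

The server performs its own validation and reports SJ3_BAD_YOMI_STR, so a check like this only saves a round trip. It does not replace the error handling.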
Kanji strings are generally represented in Shift JIS code and must be NULL-terminated. They must not exceed 32 characters in length; this limit also applies to kanji strings that contain 1-byte codes. The part-of-speech codes are as follows.
SJ3_H_NRMNOUN regular noun
SJ3_H_PRONOUN pronoun
SJ3_H_LNAME last name
SJ3_H_FNAME first name
SJ3_H_LOCNAME place name
SJ3_H_PREFIC district name
SJ3_H_RENTAI rentai form
SJ3_H_CONJUNC conjunction
SJ3_H_SUBNUM counter
SJ3_H_NUMERAL numeral
SJ3_H_PREFIX prefix
SJ3_H_POSTFIX suffix
SJ3_H_ADVERB adverb
SJ3_H_ADJECT adjective
SJ3_H_ADJVERB adjectival verb
SJ3_H_SILVERB suru-type verb
SJ3_H_ZILVERB zuru-type verb
SJ3_H_ONEVERB ichidan verb
SJ3_H_KAVERB ka-godan verb
SJ3_H_GAVERB ga-godan verb
SJ3_H_SAVERB sa-godan verb
SJ3_H_TAVERB ta-godan verb
SJ3_H_NAVERB na-godan verb
SJ3_H_BAVERB ba-godan verb
SJ3_H_MAVERB ma-godan verb
SJ3_H_RAVERB ra-godan verb
SJ3_H_WAVERB wa-godan verb
SJ3_H_SINGLE single kanji
Deleting Entries from Dictionaries
sj3_syoukyo(reading, kanji, code)
unsigned char ∗reading; /∗ pointer to reading ∗/
unsigned char ∗kanji; /∗ pointer to kanji string ∗/
unsigned char code; /∗ part-of-speech code ∗/
This function deletes entries that were registered in the user dictionary using sj3_touroku. The function returns −1 if the server dies, 0 if the entry is deleted successfully, or a non-zero error status if it fails. The error status flags are listed below.
SJ3_DICT_ERROR No dictionary or it is read-only
SJ3_DICT_LOCKED Dictionary locked (currently being read)
SJ3_BAD_YOMI_STR Bad reading string
SJ3_BAD_KANJI_STR Bad kanji string
SJ3_BAD_HINSI_CODE Bad part-of-speech code
SJ3_WORD_NOT_EXIST No such entry
SJ3_SYOUKYO_FAILED Failed to delete entry
If an error occurs, at least one of these flags is set in the error status. The specifications for reading strings, kanji strings, and part-of-speech codes conform to those given above for sj3_touroku.
Locking the Dictionary
sj3_lockserv()
This function locks the dictionaries that are currently open so that no entries can be registered or deleted. Locking is necessary because a study ID may become invalid if an entry is registered or deleted after a client has obtained the study ID but before the client has finished using it.
Unlocking Dictionaries
sj3_unlockserv()
This function unlocks dictionaries locked using the function sj3_lockserv. Since no entries may be registered or deleted while a dictionary is locked, clients which no longer need to preserve study IDs should unlock the dictionary as soon as possible.
Initialization for Roman-Kana Conversion
sj3_rkinit(file)
char ∗file; /∗ pointer to filename ∗/
This function reads the definition file specified by file and creates a corresponding Roman-Kana conversion table. Each line of the definition file is made up of fields in the following format.
[ROMAN LETTER IN] [KANA OUT] [[ROMAN LETTER OUT]]
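To make the line format concrete, the sketch below splits one definition line into its two required fields and the optional third field. It assumes that fields are separated by white space, which this page does not state. The structure, the field size limit, and the function name are hypothetical and are not taken from the library.

```c
#include <stdio.h>

#define RK_FIELD_MAX 32   /* arbitrary limit chosen for this sketch */

/* One parsed rule; roman_out is empty when the third field is omitted. */
struct rk_rule {
    char roman_in[RK_FIELD_MAX];
    char kana_out[RK_FIELD_MAX];
    char roman_out[RK_FIELD_MAX];
};

/* Parse one definition line, assuming white-space separated fields.
 * Returns 1 if at least the first two fields were found, else 0. */
int rk_parse_line(const char *line, struct rk_rule *rule)
{
    int n;

    rule->roman_out[0] = '\0';
    n = sscanf(line, "%31s %31s %31s",
               rule->roman_in, rule->kana_out, rule->roman_out);
    return n >= 2;
}
```

In a real definition file the second field would hold Shift JIS kana rather than ASCII text. Because no Shift JIS byte is a white-space character, the same parsing applies.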
Converting Roman Letters to Kana
sj3_rkconv(roman, kana)
unsigned char ∗roman; /∗ pointer to Roman letters ∗/
unsigned char ∗kana; /∗ pointer to kana string ∗/
This function converts a character string supplied as Roman letters and returns the result as kana in Shift JIS character codes. The rules for converting Roman letters into kana are those loaded by sj3_rkinit( ). If it cannot yet be determined whether the characters in the string can be converted, 0 is returned; if conversion is not possible, −1 is returned. If conversion was possible, the converted characters are deleted from the Roman string, and only the unconverted characters remain.
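The three outcomes described above (converted, not yet determined, impossible) can be modeled with a toy converter. The sketch below is not the library's implementation: toy_rkconv and its rule table are hypothetical, upper-case ASCII stands in for Shift JIS kana, and returning the number of Roman characters consumed on success is an assumption.

```c
#include <string.h>

/* Toy rule table; upper-case ASCII stands in for Shift JIS kana. */
static const char *const toy_rules[][2] = {
    { "a", "A" }, { "ka", "KA" }, { "ki", "KI" }, { "n'", "N" }
};
#define TOY_NRULES (sizeof toy_rules / sizeof toy_rules[0])

/* Model of the calling convention described above.
 * Returns the number of Roman characters converted (> 0) and removes
 * them from roman, 0 if roman could still grow into a rule, or -1 if
 * no rule can ever match. */
int toy_rkconv(char *roman, char *kana)
{
    size_t i, len = strlen(roman);
    int could_extend = 0;

    for (i = 0; i < TOY_NRULES; i++) {
        size_t klen = strlen(toy_rules[i][0]);

        if (klen <= len && strncmp(roman, toy_rules[i][0], klen) == 0) {
            strcpy(kana, toy_rules[i][1]);
            /* drop the converted prefix, keep the rest and the NUL */
            memmove(roman, roman + klen, len - klen + 1);
            return (int)klen;
        }
        if (klen > len && strncmp(roman, toy_rules[i][0], len) == 0)
            could_extend = 1;
    }
    return could_extend ? 0 : -1;
}
```

With this model, an input front end would append each keystroke to roman, call the converter, wait for more input on 0, and discard or pass through the input on −1.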
Converting 1-Byte Codes to Corresponding 2-Byte Codes
sj3_hantozen(two_byte, one_byte)
unsigned char ∗two_byte; /∗ pointer to two-byte characters ∗/
unsigned char ∗one_byte; /∗ pointer to one-byte characters ∗/
This function converts the 1-byte alphabetic characters and 1-byte katakana in the string one_byte to their 2-byte counterparts. Note that katakana is always converted to hiragana. The result of the conversion is returned in two_byte. The caller must allocate the array two_byte[]; the library does not allocate it.
Converting 2-Byte Codes to Corresponding 1-Byte Codes
sj3_zentohan(one_byte, two_byte)
unsigned char ∗one_byte; /∗ pointer to one-byte characters ∗/
unsigned char ∗two_byte; /∗ pointer to two-byte characters ∗/
This function converts the 2-byte alphabetic characters and 2-byte hiragana and katakana in the string two_byte to their 1-byte counterparts. Note that both hiragana and katakana are always converted to 1-byte katakana. The result of the conversion is returned in one_byte. The caller must allocate the array one_byte[]; the library does not allocate it.
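Since the caller supplies the output arrays for sj3_hantozen and sj3_zentohan, it has to choose their sizes. The sketch below computes conservative sizes under two assumptions that this page does not confirm: each 1-byte code expands to at most one 2-byte code, and each 2-byte code shrinks to at most two 1-byte codes (for example a voiced kana becoming a kana plus a voicing mark). The function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Bytes to reserve for the two_byte argument of sj3_hantozen:
 * up to two bytes per input byte, plus the terminator. */
size_t hantozen_bufsize(const unsigned char *one_byte)
{
    return 2 * strlen((const char *)one_byte) + 1;
}

/* Bytes to reserve for the one_byte argument of sj3_zentohan:
 * the output is assumed never to be longer than the input in
 * bytes, so the input length plus the terminator is enough. */
size_t zentohan_bufsize(const unsigned char *two_byte)
{
    return strlen((const char *)two_byte) + 1;
}
```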
FILES
/usr/sony/bin/sj3serv Kana-Kanji conversion server
/usr/sony/lib/sj3/serverrc Setup file for the Kana-Kanji conversion server
/usr/sony/bin/sj3 Front-end processor for Japanese language input
/usr/sony/dict/sj3/ Default dictionary directory for the Kana-Kanji conversion server
sj3main.dic Main dictionary for Kana-Kanji conversion
user/uname/private.dic User dictionary for Kana-Kanji conversion
user/uname/study.dat Study file for Kana-Kanji conversion
/usr/sony/include/sj3lib.h Include file for Kana-Kanji conversion library
/usr/sony/lib/libsj3lib.a Kana-Kanji conversion library
/usr/sony/demo/sj3/∗ Samples showing how to use the Kana-Kanji conversion library
SEE ALSO
NEWS-OS Release 4.1C