webster(3) — UNIX Programmer’s Manual
NAME
webster - access to the Merriam-Webster database
SYNOPSIS
#include <text/webster.h>
DESCRIPTION
These functions provide programmatic access to the Merriam-Webster database used by /NextApps/Webster. They constitute the Merriam-Webster database access package of the text library, libtext. The link editor searches this library under the "-ltext" option. Declarations for these functions may be obtained either from the include file <text/webster.h>, or from the master include file <text/text.h>.
The following program fragment prints a simple, unformatted definition for a word on the standard output:
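#include <stdio.h>
#include <text/webster.h>

int
main()
{
    ReferenceBook *book;
    Definition *d;
    int exact = 1;

    book = referenceOpen("Webster-Dictionary");
    if (book == 0)
        return 1; /* the book could not be opened */
    dictionaryFoldSenses = 80; /* fold output lines to 80 chars */
    if ((d = getDefinition("amphora", book, exact)) != 0)
    {
        putDefinition(d, 0); /* no output function: use standard output */
        printf("\n");
        freeDefinition(d);
    }
    referenceClose(book);
    return 0;
}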
The Webster databases include the Ninth Collegiate Dictionary (tm) and the Collegiate Thesaurus (tm). They are found in /NextLibrary/References/Webster-{Dictionary,Thesaurus}. Each book consists of a source file and an index file; the dictionary may also have a full-index file, artwork in pictures/..., and some ancillary material in info/... (the “front matter” from the paper book).
The conventions documented here are subject to change as multilingual dictionary formats emerge. The source file is currently a sequential list of tags followed by data. A tag is a one-byte code describing an element of an entry (e.g., an entry, a function, or a pronunciation). The formatting of the associated data is type-specific. A series of one- or two-byte codes marking special glyphs and font changes is also used. The index files are simple, alphabetically sorted files, each line containing a word and its byte offset in the source file.
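As an illustration of the index layout described above, the following sketch splits one index line into a word and a byte offset. It is not part of libtext: parseIndexLine is a hypothetical helper, and the separator (a tab or space before a decimal offset) is an assumption, since the exact line format is not specified here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libtext. Splits a line of the ASSUMED
 * form "<word><tab or space><decimal offset>" into its two parts.
 * Returns 1 on success, 0 if the line does not fit that shape. */
int parseIndexLine(const char *line, char *word, size_t wordSize, long *offset)
{
    const char *sep;
    char *end;
    size_t len;

    assert(line != NULL && word != NULL && offset != NULL);

    /* Assumption: the offset is the last field, so the word itself
     * may contain spaces. Prefer a tab separator if one is present. */
    sep = strrchr(line, '\t');
    if (sep == NULL)
        sep = strrchr(line, ' ');
    if (sep == NULL || sep == line)
        return 0;               /* no separator, or empty word */

    len = (size_t)(sep - line);
    if (len >= wordSize)
        return 0;               /* word does not fit in the caller's buffer */

    *offset = strtol(sep + 1, &end, 10);
    if (end == sep + 1)
        return 0;               /* no digits after the separator */
    if (*end != '\0' && *end != '\n')
        return 0;               /* trailing junk after the offset */

    memcpy(word, line, len);
    word[len] = '\0';
    return 1;
}
```

Under the same assumption, the resulting offset could be handed to fseek() on the source stream to position at the start of the entry.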
The three principal data structures are SenseList, which is a linked list of the elements in an entry; Definition, which is a full entry; and ReferenceBook, which is a descriptor holding open file descriptors and other material describing a reference book:
typedef struct SenseList
/* A SenseList is a tagged list of fields: s[0] is the tag,
 * s[1] is the data. */
{
    char *s;
    int n;
    struct SenseList *next;
} SenseList;
typedef struct
/* A Definition entry contains any of a number of fields, all
 * optional. The field pointers are taken from the SenseList,
 * and point to the first of 0 or more fields of the appropriate
 * type which may be present in the entry. */
{
    char *entry;                /* the main entry word */
    char *dotted;               /* the dotted form of the word */
    char *pronunciation;        /* the pronunciation field */
    char *function;             /* the grammatical function */
    char *date;                 /* date word was first used */
    char *etymology;            /* etymological description */
    char *inflection;           /* inflected forms (-ed, -ies, -ing, etc.) */
    char *variant;              /* variant forms and inflections */
    char *picture;              /* pathname of file containing picture, if any */
    char *subject;              /* internal subject codes, per-sense */
    SenseList *l;               /* linked list of senses comprising the entry */
    int section;                /* one of: dMain, dAbbr, dForeign, dBio, dGeo, etc. */
    long offset;                /* where in the book it came from */
    struct ReferenceBook *book; /* what book it came from */
} Definition;
typedef struct ReferenceBook
/* A reference book includes a tagged source file, an index,
 * and possibly a full-text index. This data structure also
 * contains pointers to book-specific i/o routines. */
{
    FILE *source, *index, *fullIndex;
    Definition *(*getDef)();
    int (*putDef)(), (*freeDef)();
} ReferenceBook;
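The tagged-list layout of SenseList can be walked as in the sketch below. The struct is copied from the listing above so that the fragment stands alone; collectFields is a hypothetical helper, the tag value is left to the caller because the actual tag codes are not listed here, and the n member is not used because its meaning is not documented here.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the SenseList layout shown above. */
typedef struct SenseList {
    char *s;                    /* s[0] is the tag, data starts at s[1] */
    int n;
    struct SenseList *next;
} SenseList;

/* Hypothetical helper, not part of libtext. Stores pointers to the data
 * of up to max fields whose tag byte equals tag, in list order, and
 * returns the number of pointers stored. */
int collectFields(const SenseList *list, char tag, const char **out, int max)
{
    int count = 0;

    assert(out != NULL && max >= 0);

    for (; list != NULL && count < max; list = list->next) {
        if (list->s != NULL && list->s[0] == tag)
            out[count++] = list->s + 1;   /* skip the tag byte */
    }
    return count;
}
```

Given a Definition d, a call such as collectFields(d->l, tag, out, max) would gather every field of one type in the entry, assuming the tag occupies exactly the first byte of s as the comment in the listing states.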
SUMMARY
ReferenceBook *referenceOpen(char *name);
Opens reference book name for searching and returns a pointer to the open descriptor, or zero in the event of failure. Currently name may be one of Webster-Dictionary or Webster-Thesaurus.
int referenceClose(ReferenceBook *r);
Closes r, and frees the associated storage.
Definition *nextDefinition(ReferenceBook *book);
Reads and returns the next definition from book. Successive calls step through the definitions in the book in order.
Definition *getDefinition(char *word, ReferenceBook *book, int match);
Gets the first definition for word in book and returns a pointer to it, or zero if no definition was found. If match is true, the word must match exactly; otherwise prefix matches are accepted.
Definition *getNextDefinition(char *word, ReferenceBook *book, int match);
Returns the next definition matching word, or zero if there are no more. This routine may be called after an initial call to getDefinition().
int freeDefinition(Definition *d);
Frees the storage used by d.
int putDefinition(Definition *d, int (*output)());
Generates the definition d, calling the function output for each character processed. The information is preconverted at lookup time to some output format (dictionaryOutputFormat, one of W_ASCII, W_PS, etc.). If no output function is given, the information is written to the standard output.
The W_ASCII output format is the default, as it is likely to be the most useful for computational linguistic applications. When W_ASCII is used, the global variable dictionaryFoldSenses should be set to the desired line width in characters.
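The per-character output hook of putDefinition() could be used to capture a definition in memory instead of printing it. The sketch below assumes that output is called with a single int argument holding the character, and that its return value is not significant; neither detail is stated here. captureChar, captureReset, and the buffer names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define CAPTURE_MAX 4096

/* Hypothetical capture buffer; always kept NUL-terminated. */
static char captureBuf[CAPTURE_MAX];
static size_t captureLen = 0;

/* Empties the capture buffer before a new definition is generated. */
void captureReset(void)
{
    captureLen = 0;
    captureBuf[0] = '\0';
}

/* Candidate output function. ASSUMPTION: it receives one character per
 * call as an int. Characters that no longer fit are silently dropped. */
int captureChar(int c)
{
    assert(captureLen < CAPTURE_MAX);     /* invariant: room for the NUL */
    if (captureLen + 1 < CAPTURE_MAX) {
        captureBuf[captureLen++] = (char)c;
        captureBuf[captureLen] = '\0';
    }
    return c;
}
```

Under that assumption, a call such as putDefinition(d, captureChar) preceded by captureReset() would leave the formatted text in captureBuf rather than on the standard output.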
SEE ALSO
4th Berkeley Distribution — July 7, 1989