Museum

Home

Lab Overview

Retrotechnology Articles

Online Manuals

⇒ spell(3) — NEXTSTEP 1.0

Media Vault

Software Library

Restoration Projects

Artifacts Sought

Related Articles

text(3)

pathutil(3)

spell(3)  —  UNIX Programmer’s Manual

NAME

spell - spelling correction functions

SYNOPSIS

#include <text/spell.h>

DESCRIPTION

These functions perform approximate string matching and spelling correction.  They constitute the spelling package of the text library, libtext.  The link editor searches this library under the "-ltext" option.  Declarations for these functions may be obtained either from the include file <text/spell.h>, or from the master include file <text/text.h>. 

SUMMARY

char ∗ SPbestpath(char ∗path);

Attempts to correct misspelled pathnames; returns the valid pathname most closely matching path, or zero if no reasonably matching name is found. 
 
To find the best match for path, fBSPbestpath follows path from the top down, using spelling metrics to find the best matches for unknown pieces of the path.  Ambiguities are resolved by arbitrarily picking a match.  Two spelling metrics are used: the fast Morgan metric is applied to match a pathname component in a large directory.  The slower, but more powerful Wagner/Fischer metric is applied in smaller directories.  Cannot handle missing of path segments, or missing /. 

char ∗∗ SPbestmatch(char ∗s, char ∗table[], int tabsize, int (∗metric) ());

Finds the best matches for s in a table of strings and returns them in a static, zero-terminated array.  table is assumed to be zero-terminated, but only the first tabsize entries are considered if tabsize is given.  metric(char ∗a, char ∗b) is a pointer to a function which returns 0 if a and b are identical, and a larger integer if they are different.  If metric is not supplied, the Wagner/Fischer minimum editing distance metric is used.  This method breaks down when tabsize is larger than 500. 

int SPmorgan(char ∗s, char ∗t);

Morgan’s spelling metric; covers 80-85% of all English spelling and typing errors.  Returns:

 0 if fIs and t are identical
1 if one character is incorrect, added or deleted
(e.g., “zbra” vs “zebra”)
2 if two characters are interchanged (e.g., “zebra” vs “zerba”)
3 otherwise
 

int SPwagfisch(char ∗s, char ∗t);

Wagner/Fischer’s metric, a generalization of Morgan’s (JACM 21, 1, Jan 1974, pp. 168-178). 
 
Returns the distance between s and t; this is the minimum cost editing s to make t.  The set of supported editing operations includes:

 changing one symbol to another symbol     rit -> rat
deleting a symbol                         bit -> bt
inserting a symbol                        bt  -> bat
 

The algorithm recognizes two meta characters: ∗ and ? . 

 ∗ matches any string of zero or more characters.
? matches any one character.
 

Accordingly, distance(a∗, abcde) is 0.  Note that meta character expansion affects the commutativity and triangularity of the metric.  This implementation disallows consecutive deletions to improve speed. 
 
Speed may also be improved by making SPmaxdistance small; this limits the depth of recursion of the algorithm. Every error costs roughly 3; so setting SPmaxdistance to be n∗3 will limit recursion to cases with n or less errors.  For n equal to one, behavior is similar to SPmorgan, but a little slower. 

SEE ALSO

text(3), pathutil(3)

NeXT, Inc.  —  July 7, 1989

Typewritten Software • bear@typewritten.org • Edmonds, WA 98026