ISPELL(1) ISPELL(1)
NAME
ispell - Correct spelling for a file
SYNOPSIS
ispell [ -t | -x | -S | -d file | -p file | -w chars ]
file .....
ispell [ -t | -d file | -p file | -w chars ] -l
ispell [ -t | -d file | -p file ] { -a | -A }
ispell [ -w chars ] -c
ispell -v
DESCRIPTION
Ispell is fashioned after the spell program from ITS
(called ispell on Twenex systems). The most common usage
is "ispell filename". In this case, ispell will display
each word which does not appear in the dictionary, and
allow you to change it. If there are "near misses" in the
dictionary (words which differ by only a single letter, a
missing or extra letter, or a pair of transposed letters),
then they are also displayed. If you think the word is
correct as it stands, you can type either "Space" to
accept it this one time, or "I" to accept it and put it in
your private dictionary. If one of the near misses is the
word you want, type the corresponding number. (If there
are more than 10 choices, you may have to type a carriage
return to complete a single-digit number). Finally, if
none of these choices is right, you can type "R" and you
will be prompted for a replacement word. If you want to
see a list of words that might be close using wildcard
characters, type "L" to look up a word in the system
dictionary.
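The near-miss rule described above can be illustrated with a short Python sketch. It is not ispell's own code: the function names are hypothetical, and it assumes that "a pair of transposed letters" means two adjacent letters.

```python
def is_near_miss(word: str, candidate: str) -> bool:
    """True if candidate differs from word by one changed letter, one
    missing or extra letter, or one swap of two adjacent letters
    (adjacency is an assumption of this sketch)."""
    if word == candidate:
        return False
    lw, lc = len(word), len(candidate)
    if lw == lc:
        diffs = [i for i in range(lw) if word[i] != candidate[i]]
        if len(diffs) == 1:
            return True  # a single differing letter
        if len(diffs) == 2 and diffs[1] == diffs[0] + 1:
            i, j = diffs
            # the two differing positions hold each other's letters
            return word[i] == candidate[j] and word[j] == candidate[i]
        return False
    if abs(lw - lc) == 1:
        shorter, longer = (word, candidate) if lw < lc else (candidate, word)
        # removing exactly one letter from the longer word gives the shorter
        return any(longer[:i] + longer[i + 1:] == shorter
                   for i in range(len(longer)))
    return False


def near_misses(word, dictionary):
    """Return the near misses of word found in dictionary, sorted."""
    return sorted(w for w in dictionary if is_near_miss(word, w))


print(near_misses("teh", {"the", "ten", "tea", "dog", "teeth"}))
```

Scanning the whole dictionary for each word is only practical for small word lists; the sketch is meant to show the rule, not an efficient lookup.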
When a misspelled word is found, it is printed at the top
of the screen. Any near misses will be printed on the
following lines, and finally, two lines containing the
word are printed at the bottom of the screen. If your
terminal can display in reverse video, the word itself is
highlighted.
The -v option causes ispell to print its current version
identification on the standard output and exit.
The -l or "list" option to ispell is used to produce a
list of misspelled words from the standard input.
The -a option is intended to be used from other programs
through a pipe. In this mode, ispell expects the standard
input to consist of lines containing single words. Each
word is read, and a single line is written to the standard
output. If the word was found in the main dictionary, or
your personal dictionary, then the line contains only a
'*'. If the word was found through suffix removal, then
the line contains a '+', a space, and the root word. If
the word is not in the dictionary, but there are near
misses, then the line contains an '&', a space, and a list
of the near misses separated by spaces. Also, each near
miss is capitalized the same as the input word unless
such capitalization is illegal; in the latter case each
near miss is capitalized correctly according to the
dictionary. Finally, if the word does not appear in the
dictionary and there are no near misses, then the line
contains only a '#'. This mode is also suitable for
interactive use when you want to figure out the spelling
of a single word. (These characters are the same as the
codes that the real spell program uses.)
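A program reading the -a output through a pipe has to decode the four kinds of reply line described above. The following Python sketch shows one way to do so. The names Reply and parse_reply, and the status labels, are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Reply(NamedTuple):
    status: str       # "ok", "root", "near", or "unknown" (labels invented here)
    words: List[str]  # root word or near misses; empty otherwise


def parse_reply(line: str) -> Reply:
    """Decode one output line of the -a mode as described in this page."""
    line = line.rstrip("\n")
    if line == "*":
        return Reply("ok", [])              # found in a dictionary
    if line == "#":
        return Reply("unknown", [])         # not found, no near misses
    if line.startswith("+ "):
        return Reply("root", [line[2:]])    # found through suffix removal
    if line.startswith("& "):
        return Reply("near", line[2:].split())  # near misses
    raise ValueError(f"unrecognized reply line: {line!r}")


print(parse_reply("& chapter chaster"))
```

Raising an error on any other line is a choice made for this sketch; a caller may prefer to ignore such lines.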
The -A option works just like -a, except that if a line
begins with the string "&Include_File&", the rest of the
line is taken as the name of a file to read for further
words. Input returns to the original file when the
include file is exhausted. Inclusion may be nested up to
five deep. The key string may be changed with the
environment variable INCLUDE_STRING (the ampersands, if
any, must be included).
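The include mechanism of -A can be pictured with the Python sketch below. It models the behavior as described here rather than ispell's implementation. The read_file callback is hypothetical, stripping blanks around the file name is an assumption, and raising an error on a sixth nesting level is also an assumption, since this page does not say what ispell does in that case.

```python
def expand_includes(lines, read_file, key="&Include_File&",
                    max_depth=5, _depth=0):
    """Yield input lines, replacing each include line with the lines
    of the file it names. read_file(name) must return an iterable of
    lines (a stand-in for opening the file)."""
    for line in lines:
        if line.startswith(key):
            if _depth >= max_depth:
                # behavior past the limit is an assumption of this sketch
                raise RecursionError(
                    f"includes nested more than {max_depth} deep")
            name = line[len(key):].strip()
            yield from expand_includes(read_file(name), read_file,
                                       key, max_depth, _depth + 1)
        else:
            yield line


# In-memory stand-ins for files on disk.
files = {
    "a.txt": ["alpha", "&Include_File&b.txt", "omega"],
    "b.txt": ["beta"],
}
words = ["start", "&Include_File& a.txt", "end"]
print(list(expand_includes(words, files.__getitem__)))
```

Input resumes with "omega" and then "end" once each included file is exhausted, matching the description above.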
When in the -a mode, ispell will also accept lines of sin-
gle words prefixed with either a '*' or a '@'. A line
starting with '*' tells ispell to insert the word into the
user's dictionary (similar to the I command). A line
starting with '@' causes ispell to accept this word in the
future (similar to the A command).
The -x option causes ispell to remove the .bak file that
it normally leaves. The .bak file contains the pre-
corrected text. If there are file opening / writing
errors, the .bak file may be left for recovery purposes
even with the -x option.
The -S option suppresses ispell's normal behavior of sort-
ing the list of possible replacement words. Some people
may prefer this, since it somewhat enhances the probabil-
ity that the correct word will be low-numbered.
The -t option selects TeX/LaTeX input mode. TeX/LaTeX
mode is also automatically selected if an input file has
the extension ".tex". In this mode, whenever a backslash
("\") is found, ispell will skip to the next whitespace.
Thus, for example, given
\chapter {This is a Ckapter} \cite{SCH86}
ispell will find "Ckapter" but will not look for SCH. The -t
option does not recognize the TeX comment character "%".
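The skipping rule of TeX/LaTeX mode can be approximated in a few lines of Python. This is a sketch of the rule as stated above, not ispell's parser. The function name is hypothetical, and treating a word as a run of ASCII letters is an assumption.

```python
import re


def tex_mode_words(line: str):
    """Return the words that would be checked on one input line,
    skipping from each backslash to the next whitespace."""
    # Replace a backslash and everything up to the next whitespace.
    stripped = re.sub(r"\\\S*", " ", line)
    # Assumption: a word is a run of ASCII letters.
    return re.findall(r"[A-Za-z]+", stripped)


print(tex_mode_words(r"\chapter {This is a Ckapter} \cite{SCH86}"))
```

Because "%" gets no special treatment, text after a TeX comment character is still returned as words, which agrees with the limitation noted above.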
The -d option is used to specify an alternate hashed dic-
tionary file, other than the default. If the filename
does not begin with a "/", the library directory for the
default dictionary file is prefixed. This is useful to
allow dictionaries which prefer alternate British
spellings ("centre", "tyre", etc), or add lists of
special-purpose jargon and acronyms for subclasses of doc-
uments. There are some shortcomings in attempting to pro-
vide foreign-language dictionaries, but something like "-d
french" could be made to work somewhat. The -d option may
specify /dev/null, in which case the dictionary is limited
to the personal one. This may be useful for certain pri-
vate dictionaries.
The -p option is used to specify an alternate personal
dictionary file. If the file name does not begin with
"/", $HOME is prefixed. Also, the environment variable
WORDLIST may be set, which renames the personal dictionary
in the same manner. The command line overrides any WORDLIST
setting. If neither is present, "~/.ispell_words" is used.
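The order of precedence for the personal dictionary name can be summarized in Python as follows. This is a minimal sketch of the rules stated above. The function name, its parameters, and the sample home directory are hypothetical.

```python
import posixpath


def personal_dictionary(p_option=None, environ=None, home="/home/user"):
    """Resolve the personal dictionary path: the -p argument wins,
    then WORDLIST, then the default .ispell_words under $HOME."""
    environ = environ or {}
    name = p_option if p_option is not None else environ.get("WORDLIST")
    if name is None:
        return posixpath.join(home, ".ispell_words")
    if name.startswith("/"):
        return name                      # absolute names are used as given
    return posixpath.join(home, name)    # otherwise $HOME is prefixed


print(personal_dictionary("jargon", {"WORDLIST": "/tmp/words"}))
```

The same prefixing rule applies whether the name comes from -p or from WORDLIST.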
The -w option may be used to specify characters other than
alphabetics which may also appear in words. For instance,
-w "&" will allow "AT&T" to be picked up. Underscores are
useful in many technical documents. There is an admit-
tedly crude provision in this option for 8-bit interna-
tional characters. Non-printing characters may be speci-
fied in the usual way by inserting a backslash followed by
the octal character code; e.g., "\014" for a form feed.
Alternatively, if "n" appears in the character string, the
(up to) three characters following are a DECIMAL code 0 -
255, for the character. For example, to include bells and
form feeds in your words (an admittedly silly thing to do,
but aren't most pedagogical examples?):
n007n012
Numeric digits other than the three following "n" are sim-
ply numeric characters. Use of "n" does not conflict with
anything because actual alphabetics have no meaning -
alphabetics are already accepted. Ispell will typically
be used with input from a file, meaning that preserving
parity for possible 8 bit characters from the input text
is OK. If you specify the -l option, and actually type
text from the terminal, this may create problems if your
stty settings preserve parity.
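The notation accepted by -w can be decoded as in the following Python sketch. It follows the description above and is not taken from ispell's source. The function name is hypothetical. Reading at most three octal digits after a backslash, keeping a backslash or "n" with no digits as a literal character, and rejecting codes above 255 are all assumptions.

```python
def parse_word_chars(spec: str) -> set:
    """Return the set of extra word characters named by a -w argument."""
    chars = set()
    i = 0
    while i < len(spec):
        c = spec[i]
        if c in ("\\", "n"):
            # backslash introduces octal digits, "n" introduces decimal digits
            digits, base = ("01234567", 8) if c == "\\" else ("0123456789", 10)
            j = i + 1
            while j < len(spec) and j - i <= 3 and spec[j] in digits:
                j += 1                  # consume up to three digits
            if j > i + 1:
                code = int(spec[i + 1:j], base)
                if code > 255:
                    raise ValueError(f"character code out of range: {code}")
                chars.add(chr(code))
                i = j
                continue
        chars.add(c)                    # any other character stands for itself
        i += 1
    return chars


print(sorted(map(ord, parse_word_chars("n007n012"))))
```

For the example string given above, the result is the bell (code 7) and the form feed (code 12).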
The -c option is primarily intended for use by the munch-
list shell script. In this mode, a list of words is read
from the standard input. For each word, a list of possi-
ble root words and suffixes will be written to the stan-
dard output. Some of the root words will be illegal and
must be filtered from the output by other means; the
munchlist script does this. As an example, the command
"echo BOTHER | ispell -c" produces:
BOTH
BOTHE/R
BOTH/R
Unless it has been installed without the feature by your
system administrator, ispell is aware of the correct capi-
talizations of words in the dictionary and in your per-
sonal dictionary. As well as recognizing words that must
be capitalized (e.g., George) and words that must be all-
capitals (e.g., NASA), it can also handle words with
"unusual" capitalization (e.g., "ITCorp" or "TeX"). If a
word is capitalized incorrectly, the list of possibilities
will include all acceptable capitalizations. (More than
one capitalization may be acceptable; for example, my dic-
tionary lists both "ITCorp" and "ITcorp".) Normally, this
feature will not cause you surprises, but there is one
circumstance you need to be aware of. If you add a word
to your dictionary that is at the beginning of a sentence
(e.g., the first word of this paragraph if "unless" were
not in the dictionary), it will be marked as
"capitalization required". On a subsequent usage of this
word without capitalization (e.g., the quoted word in the
previous sentence), ispell will object and suggest the
capitalized version. You must then compare the actual spellings by
eye, and then type "I" to add the un-capitalized variant
to your personal dictionary.
The rules for capitalization are as follows:
(1) Any word may appear in all capitals, as in head-
ings.
(2) Any word that is in the dictionary in all-lowercase
form may appear either in lowercase or capitalized
(as at the beginning of a sentence).
(3) Any word that has "funny" capitalization (i.e., it
contains both cases and there is an uppercase char-
acter besides the first) must appear exactly as in
the dictionary, except as permitted by rule (1).
If the word is acceptable in all-lowercase, it must
appear thus in a dictionary entry.
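The three rules above can be expressed as a small Python check. This is an illustrative sketch, not ispell's algorithm, and the function name is hypothetical. Entries such as "George" that are capitalized but not "funny" are treated here as requiring an exact match (apart from all capitals); that reading is an assumption based on the earlier remark about words that must be capitalized.

```python
def capitalization_ok(word: str, entry: str) -> bool:
    """Check the capitalization of word against one dictionary entry
    that spells the same word."""
    if word.lower() != entry.lower():
        return False                     # not the same word at all
    if word.isupper():
        return True                      # rule (1): all capitals, as in headings
    if entry.islower():
        # rule (2): lowercase entries may also appear capitalized
        return word == entry or word == entry.capitalize()
    # rule (3), plus the assumption about entries like "George"
    return word == entry


for w in ("TeX", "TEX", "Tex", "tex"):
    print(w, capitalization_ok(w, "TeX"))
```

When a dictionary holds several entries for one word, such as "ITCorp" and "ITcorp", a word would be acceptable if it passes the check against any one of them.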
When the munchlist script is run, its -w option is passed
on to ispell. The -e ("efficient") option causes the
script to use a slower algorithm that uses somewhat less
space in TMPDIR (normally /usr/tmp).
It is possible to install ispell in such a way as to only
support ASCII range text if desired.
COMPATIBILITY
Invoking the ispell executable as spell is the same as
ispell with the -l option.
ENVIRONMENT
WORDLIST        Personal dictionary file name
INCLUDE_STRING  Code for file inclusion under the -A option
TMPDIR          Directory used for some of munchlist's
                temporary files
FILES
$HOME/.ispell_words       user's private dictionary
/usr/share/dict/web2      list of words for the Lookup function
/usr/libdata/ispell.hash  hashed dictionary for ispell
SEE ALSO
spell(1), egrep(1), look(1), ispell(4)
BUGS
It takes about five seconds for ispell to read in the hash
table.
The hash table is stored as a quarter-megabyte (or larger)
array, so a PDP-11 version does not seem likely.
Ispell should understand more troff syntax, and deal more
intelligently with contractions.
While alternate dictionaries for foreign languages could
be defined, and the international characters included in
words, rules concerning word endings / pluralization
accommodate English only.
When the -x flag is specified, ispell will unlink any
existing .bak file.
Munchlist requires tremendous amounts of temporary file
space for large dictionaries. It does respect the TMPDIR
environment variable, so this space can be redirected.
However, a lot of the temporary space it needs is for
sorting, so TMPDIR is only a partial help on systems with
an uncooperative sort(1). As a benchmark, the 15000-word
dict.191 takes about 1200 blocks in TMPDIR, and 2000 in
sort's temporary directories. Munching dict.191 with
/usr/share/dict/web2 (28000 words output) took another
1500 blocks or so, and ran for the better part of an hour.
AUTHOR
Pace Willisson (pace@mit-vax)
Collected, revised, and enhanced for the Usenet by Walt
Buehring.
Further enhanced and debugged by Isaac Balbin, Stewart
Clamen, Mark Davies, Steve Dum, Gary Johnson, Don Kark,
Steve Kelem, Jim Knutson, Geoff Kuenning, Evan Marcus,
Dave Mason, Rob McMahon, Bob McQueer, David Neves, Joe
Orost, Israel Pinkas, Gary Puckering, Bill Randle, Marc
Ries, Rich Salz, Greg Schaffer, Joel Shprentz, George
Sipe, Perry Smith, Stefan Taxhet, Andrew Vignaux, Johan
Widen, James Woods, and Ken Yap.