This directory contains the files for a comprehensive spelling checker
which can correct files containing text and embedded commands for a
number of different word processors. All necessary files are provided,
including dictionaries. This file contains a full description of the
program and installation instructions.

This version uses a completely different user interface from the one
provided on previous DECUS tapes. It operates correctly under VMS 4.0
and later. If you are running a version before 4.2, you will need to
update your PASRTL in SYS$LIBRARY. Distribution of this VMS software
component has been authorized by Digital.

Files:
*****

AAAREADME.TXT  - this file
BUILD.COM      - installation procedure
COMMONWRD.DAT  - sequential file of most common words
LEXIC08.DAT    - main dictionary for words of <8 letters - indexed file
LEXIC16.DAT    - main dictionary for words of <16 letters - indexed file
LEXIC32.DAT    - main dictionary for words of <32 letters - indexed file
NEWGOOD.TXT    - new word file - indexed
PASRTL.EXE     - 3.0 Pascal runtime library for VMS 4.0 and 4.1
SMGDEFS.PEN    - Pascal environment file for SMG$ routines
SPELL.EXE      - SPELL program compiled under VMS v4.1
SPELL.HLP      - HELP library module for SPELL
SPELL.OBJ      - object file of SPELL
SPELL.PAS      - VMS Pascal source text of SPELL
*.SPELL_HELP   - internal help files
*.FDL          - FDL files used to optimize dictionaries

Functionality:
*************

SPELL reads standard VMS text files, such as those created by EDT or
TPU, and recognizes the syntax of embedded commands in RUNOFF, TeX and
SCRIBE word processor input files, which prevents many spurious
"errors" from being registered. The program makes extensive use of SMG$
screen management routines, and may be used on any video terminal
supported in TERMTABLE.EXE. The entire program is written in Pascal,
and should be readily modifiable if additional functionality is
required.

The checker includes guessing algorithms which may optionally be
invoked if a misspelled word is encountered, a facility for creating
personal dictionaries, and provision for the automatic submission of
words not found in the dictionaries to a maintenance person so that the
main dictionaries may be periodically updated. The checker creates a
corrected file as output, and optionally also produces a log file
showing all changes made.

Installation:
************

There is an installation procedure in this submission which should make
installation relatively painless. Unlike SPELL, it is not extensively
error-trapped, so be careful when using it - hopefully, you will only
need to use it once! To use this procedure, copy all the files in this
directory to a scratch directory on your system (not necessarily on
your system disk), set your default to that directory, and invoke the
procedure with an @BUILD command.

Internals:
*********

SPELL operates by constructing an ordered tree of all new words as they
are encountered in a document. Any word in the document is first
checked against this tree, and if it is already in the tree, it is
copied intact. If the word under test is not in the tree, and it is not
one of a small number of very common words which are held in an array
in memory, one of the three indexed files which constitute the main
dictionary is searched, depending on the length of the word. If the
word is still not found, an optional personal dictionary in the user's
default directory is searched, and failing this, the list of words
which users have marked as "right" but which have not yet been checked
by the dictionary person. If the word appears in this last list, the
user is warned that it may not be correct, and given the opportunity to
reject it.
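
The search order described above can be illustrated with a short
sketch. This is not the Pascal code in SPELL.PAS, only a minimal Python
illustration under stated assumptions: plain sets stand in for the
in-memory tree, the common-word array and the indexed files, the
function and parameter names are hypothetical, and folding words to
lower case before lookup is an assumption rather than something this
file states.

```python
def classify(word, seen, common, main_by_length, personal, unverified):
    """Return the name of the first place in which `word` is found.

    Hypothetical stand-ins (all plain Python sets):
      seen           - words already encountered in this document
      common         - the small list of very common words
      main_by_length - {8: set, 16: set, 32: set}; a word is looked up
                       only in the first dictionary whose limit exceeds
                       its length, mirroring LEXIC08/16/32
      personal       - the user's personal dictionary
      unverified     - words marked "right" but not yet reviewed
    """
    w = word.lower()  # assumption: lookups ignore case
    if w in seen:
        return "seen"
    if w in common:
        return "common"
    for limit in sorted(main_by_length):
        if len(w) < limit:
            # Only one main dictionary is searched, chosen by length.
            if w in main_by_length[limit]:
                return "main"
            break
    if w in personal:
        return "personal"
    if w in unverified:
        return "unverified"  # the caller should warn the user here
    return "unknown"         # the caller should prompt the user here
```

A caller would treat "unverified" as a cue to warn the user and
"unknown" as a cue to show the word in context and offer the choices
listed in this file.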

If a word is not found in any of these searches, the user is presented
with two lines of context, with the word highlighted, and given a
number of choices:

C - you want to check alternative spellings in the dictionary. I will
    prompt you for words to check until you find one that is correct or
    you decide to give up checking.
G - you want me to offer you some suggestions about possible spellings.
I - you want to ignore the word - for instance an uncommon name.
P - you want me to add the word to your Personal Dictionary.
Q - you want to quit SPELL without checking any more of the file.
R - the word is right and should be added to the main dictionary.
W - you agree that the spelling of the word is wrong. I will ask you
    for the correct spelling or give you the option of asking me to
    guess.

If the result of a "wrong", "guess" or "check" operation is to change
the word being questioned by SPELL, the program will attempt to
automatically match upper and lower case to the case of the word being
replaced. If this turns out to be impossible, the user is prompted to
supply the case.

Wrong words are linked into a double tree together with replacements.
If they occur again in the file, the user is prompted to decide whether
they wish to make the same change again.

Right words are added to an indexed file in the system dictionary
directory, which is automatically accessed when the dictionary person
invokes the SPELL/UPDATE command.

SPELL writes a workfile called SPELLWRK.TMP, and if the session
terminates successfully, and any changes have been made to the file
being checked, then the workfile is renamed to the next higher version
of the file being checked.

The "front-end" of SPELL is a small DFA lexical scanner. This can
easily be modified to include support for other word processors. All
pre-processing of text is handled by this scanner, including
recognition of words, constructs such as apostrophes and quotations,
and recognition of embedded word-processor commands.
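
To give a feel for what a DFA scanner of this kind does, here is a
minimal Python sketch. It is not the scanner in SPELL.PAS: the state
names and the scan function are hypothetical, only TeX-style control
words (a backslash followed by letters) are handled, and treating an
apostrophe as part of a word is a simplifying assumption. RUNOFF and
SCRIBE commands would need further states.

```python
from enum import Enum, auto


class State(Enum):
    TEXT = auto()     # between tokens: characters are copied through
    WORD = auto()     # inside a candidate word to be spell-checked
    COMMAND = auto()  # inside a TeX-style control word such as \section


def scan(text):
    """Split text into ("word", s) and ("copy", s) tokens, in order."""
    tokens = []
    state = State.TEXT
    buf = ""

    def flush(kind):
        # Emit the buffered characters as one token, then clear the buffer.
        nonlocal buf
        if buf:
            tokens.append((kind, buf))
        buf = ""

    for ch in text:
        if state is State.COMMAND:
            if ch.isalpha():
                buf += ch
                continue
            flush("copy")          # the command is copied, never checked
            state = State.TEXT
        elif state is State.WORD:
            if ch.isalpha() or ch == "'":
                buf += ch
                continue
            flush("word")
            state = State.TEXT
        # The state is TEXT here: decide what this character starts.
        if ch == "\\":
            flush("copy")
            buf = ch
            state = State.COMMAND
        elif ch.isalpha():
            flush("copy")
            buf = ch
            state = State.WORD
        else:
            buf += ch
    flush("word" if state is State.WORD else "copy")
    return tokens
```

Joining the text of every token reproduces the input exactly, so
anything not passed on as a "word" reaches the output unchanged.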

Any word or character which is not recognized as a possible
"legitimate" word is simply copied to the output file, and the
remaining candidates for testing are made into a simple linked-list
which contains some attributes of the words as well as their text, and
allows simple manipulation of the text if errors are encountered.

Origins:
*******

The SPELL program is based on a program called PROOFREAD which was
written by Matthew Temple of Smith College, Northampton MA. His ideas
and permission to use them are gratefully acknowledged.

Maintenance:
***********

The /UPDATE option should be invoked periodically, and any new words
filtered either into the main dictionaries, or out of the system. If a
lot of new words have been added to the dictionaries, some performance
improvement may be obtained if they are run through the CONVERT
utility, using the supplied FDL files.

Submitted by:
********* **

Mark Resmer, Vassar College, Poughkeepsie NY 12601
(914) 452-7000 Ext 2437
resmer@vassar on CCnet and BITnet
resmer%vassar.bitnet@wiscvm.arpa from ARPAnet