CATDOC, Utilities, Convert Microsoft Word Documents into Text or TeX

CATDOC ver. 0.34
Copyright(c) by Softweyr 96,97

Free converter from MS-Word to TeX or plain text.

OVERVIEW

This program extracts text from MS-Word files, trying to preserve as many special printable characters as possible. It doesn't even try to preserve fancy Word formatting, because Word users usually don't care about document structure, and structure is exactly what matters to LaTeX users.

Catdoc was designed to work with Cyrillic MS-Word files, so it can convert Cyrillic from the ANSI 1251 code page to KOI-8 (for UNIX) or CP 866 (for DOS). This feature can be disabled at compile time.

SUPPORTED PLATFORMS

Catdoc was tested by the author on Linux and Solaris (gcc 2.7.x) and DOS (Watcom C 10, 16-bit). There were also reports of successful use of catdoc on Digital UNIX, HP-UX and other flavors of UNIX. I don't expect any trouble with catdoc on any platform which has an ANSI C compiler and supports the notion of stdout and stderr.

SUPPORTED VERSIONS OF WORD

Catdoc was tested on files produced by MS-Word 6.0 for Windows and MS-Word 7.0 for Windows 95. It definitely doesn't work with Word-97 files, which are stored in UNICODE. It should also work with earlier versions of Word, or even MS-Write, but this was never tested.

LIMITATIONS

- Doesn't handle any embedded OLE objects, including MS-Equation.
- Doesn't handle fastsaves and footnotes properly (modifications and footnotes are printed at the end of the text as separate paragraphs).
- Can output garbage in place of OLE objects (it is better to have garbage than to lose part of the text).
- Doesn't output proper table headers in LaTeX mode.

LICENSE

This program is distributed under the GNU General Public License. The full text of this license can be obtained from:
http://www.gnu.ai.mit.edu/copyleft/gpl.html

INSTALLATION

All platforms: Check if you need Cyrillic translation.
If not, uncomment the line #define LATIN1 in catdoc.c.

Unix: Compile catdoc.c with

    cc catdoc.c -o catdoc

and put the resulting binary in any directory on your path. Put the manual page (catdoc.1) into the appropriate directory.

If you have Tcl/Tk version 7.6 or higher, you can also install the accompanying wordview script, which is an X11 interface to catdoc. It should be installed into the same directory as catdoc itself. Edit the first line of the script to point to your copy of wish.

DOS: Compile catdoc.c with your favorite C compiler and put catdoc.exe in any directory on your path. If you use a 16-bit compiler, use the small memory model.

AUTHOR

Victor Wagner
Ported to VMS by Hunter Goatley
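
EXAMPLE: CYRILLIC TRANSLATION

The Cyrillic translation mentioned in the overview (ANSI 1251 to KOI-8) can be illustrated with a small table lookup. This is only a sketch and is not taken from catdoc.c: the names koi8r_lower, cp1251_to_koi8r and cp1251_to_koi8r_buf are hypothetical, the target is assumed to be the KOI8-R variant of KOI-8, and the use of LATIN1 as an #ifndef guard is an assumption about how the compile-time switch might work. Bytes outside the Russian alphabet (other than the letter Ё/ё) are passed through unchanged, which a real converter may handle differently.

```c
#include <assert.h>
#include <stddef.h>

/* KOI8-R codes for the lowercase letters а..я in alphabetical order.
   In KOI8-R each uppercase letter sits exactly 0x20 above its lowercase form. */
static const unsigned char koi8r_lower[32] = {
    0xC1, 0xC2, 0xD7, 0xC7, 0xC4, 0xC5, 0xD6, 0xDA,  /* а б в г д е ж з */
    0xC9, 0xCA, 0xCB, 0xCC, 0xCD, 0xCE, 0xCF, 0xD0,  /* и й к л м н о п */
    0xD2, 0xD3, 0xD4, 0xD5, 0xC6, 0xC8, 0xC3, 0xDE,  /* р с т у ф х ц ч */
    0xDB, 0xDD, 0xDF, 0xD9, 0xD8, 0xDC, 0xC0, 0xD1   /* ш щ ъ ы ь э ю я */
};

/* Translate one CP1251 byte to KOI8-R (hypothetical helper).
   Assumption: defining LATIN1 turns the translation off entirely. */
unsigned char cp1251_to_koi8r(unsigned char c)
{
#ifndef LATIN1
    if (c >= 0xE0)                  /* CP1251 lowercase а..я: 0xE0..0xFF */
        return koi8r_lower[c - 0xE0];
    if (c >= 0xC0)                  /* CP1251 uppercase А..Я: 0xC0..0xDF */
        return (unsigned char)(koi8r_lower[c - 0xC0] + 0x20);
    if (c == 0xA8) return 0xB3;     /* Ё */
    if (c == 0xB8) return 0xA3;     /* ё */
#endif
    return c;                       /* ASCII and everything else: unchanged */
}

/* Translate a whole buffer in place (hypothetical helper). */
void cp1251_to_koi8r_buf(unsigned char *buf, size_t len)
{
    size_t i;
    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++)
        buf[i] = cp1251_to_koi8r(buf[i]);
}
```

A table indexed by alphabetical position is used because CP1251 stores the Russian alphabet in alphabetical order, while KOI8-R orders it by Latin transliteration, so no arithmetic offset can map one to the other. A CP 866 variant for DOS would follow the same pattern with a different table.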