;;09-21-86

Eric Gans
French Department UCLA
Los Angeles, CA 90024

WINDEX.DOC v3.01

WINDEX features:

- allows indexing of words or strings up to 49 characters.
- allows free capitalization format in index entries.
- allows a page offset between -254 and 9999.
- allows tagging keys (once!) within the text file.
- allows indexing of (hard-)hyphenated words.
- uses entire free memory (maximum of about 11000 page
  references and a 17 K NDX file for a 60 K system -- e.g.,
  Kaypro-10).

A number of features have been added in versions 2 and 3 at the
request of users, whom I thank for their interest.

Version 3.01 (released 01/10/87)
   Allows straightforward indexing of strings (up to 50
   characters). Streamlined command structure; eliminated all
   internal prompts (for use in batch files). Creates index
   entries exactly as entered.

Version 2.2 (3/20/86)
   Allows words up to 49 characters (suggestion of Walter
   Becker), fixed bug in tree-sort routine; improved version of
   ALPHA.COM (2.0).

Version 2.1 (2/21/86)
   Calculates BIOS addresses so as to work with Wordstar "R"
   command.

*****

WINDEX creates indexes for Wordstar files written in document
mode. It can be used to index a manuscript of any length,
including books of up to 9999 pages, with a maximum of 254 keys.

Command line:   windex [d:]fn.ft [@] [/offset] [*|#]

o  The index file (fn.NDX) will be placed on the same drive as
   the file to be indexed.

o  Include @ in the command line if you want the index output to
   the screen (the file will be created in any case). The output
   can be aborted by typing ^C after the first page.

o  The page offset (if any) should be entered as a decimal number
   between -255 and 9999. All page numbers in the index will be
   increased by this offset. This feature allows you to index
   manuscripts that do not start at page 1 (say, chapters in a
   book). A negative offset may be used if page 1 is preceded by
   prefatory material; index entries that come before page 1
   will be listed as "-#".

o  The keywords to be WINDEXed can be entered in three modes,
   indicated by the last character on the command line:
   (nothing), *, or #.

   1. (default) Direct keyboard entry; you will be prompted at
      the console. This is the simplest approach for short
      indexes.

   2. * (NB: * replaces / used in earlier versions) Keywords will
      be sought in a file fn.KWD on the same drive/user. In
      creating this file, you need only avoid hyphens and the
      exotic punctuation marks [\]^_`

      The character / can be used between words to index strings;
      it will be treated as the equivalent of a space in the
      file. The KWD entry:

            blurk/zap/zlonk

      will search for the string "blurk zap zlonk". To tag this
      string in your text file, it should appear as:
      "^Pblurk zap zlonk^P". (NB: the non-break spaces [^O]
      between words required in previous versions are no longer
      necessary.)

      All other characters, including numbers, periods, commas,
      semicolons or blanks, are permitted as separators: the
      simplest way is to list the words with a CR after each.

      Hyphenated words may be indexed. The program will not find
      indexed words that are contained in hyphenated groups: the
      keyword Mac will not match the text Big-Mac. No other
      internal punctuation (e.g., apostrophes) is permitted.

      Whatever capitalization you choose for entries will be
      respected in the output (the search function will not pay
      attention to capitalization).

      You should avoid entering the words in alphabetical order;
      letting the program alphabetize them will speed up the
      indexing operation.

      The same criteria hold true if you prefer to enter your
      word list from the keyboard.

   3. # Keywords will be tagged in the file to be indexed. This
      allows you to create your index as you go along. ^P
      (entered as ^PP) must precede and follow each keyword or
      string. The maximum string length permitted is 49
      characters (v2.2).

o  You only have to tag keywords ONCE, not every time they
   appear as with STARINDEX. Duplicates will be ignored.

o  Because of the string-indexing feature added in v3.0, all
   index entries indicated within the file itself must be both
   preceded and followed by ^P.

The output file (on the same drive) will be fn.NDX. An
approximate right margin of 65 will be adhered to; CR's will be
added after each line, and second and succeeding lines of index
entries will be tabbed. This file can be edited with Wordstar
and converted if you like to document mode (this doesn't seem
appropriate for an index, however).

If you have more than 254 keywords, you should divide them
alphabetically into two or more groups. (ALPHA.COM will do this
for you.) You can then combine the indexes later in alphabetical
order using PIP or Wordstar's ^KR command.

WINDEX allocates about 2/3 of the free memory to the
page-reference buffer and about 1/3 for the NDX file. This allows
(on a 60 K Kaypro-10) for about 17 K for the file and 34 K for
the buffer, or about 11000 references at 3 bytes each. (This
proportion is based on the fact that many references are
multiple appearances on the same page that do not appear in the
NDX file.) This should be enough for any normal use of the
program (110 references/page in a 100-page manuscript!)

In case you somehow do run out of memory, WINDEX will recognize
when the CCP is overwritten and do a Warm Boot, but it doesn't
check if you go even further. Long before you get to that point,
however, you should divide your keyword list into smaller
alphabetical groups and index them separately. As long as you
keep the different indexes in alphabetical order (you'll also
have to change their names if you keep them on the same disk),
you can PIP them together with no internal editing save removal
of a few headings.

Hyphens:

Wordstar distinguishes between hard hyphens (those you enter
yourself) and soft hyphens (entered for formatting purposes).
WINDEX skips over soft hyphens, since they merely break words at
the end of lines; hard hyphens are treated as letters in the
keyword and in the file. This remains true even when they occur
at the end of a line; the difference is that you entered them as
part of a hyphenated word.

Features:

- makes use of a binary tree for maximum search speed (5-6
  seconds for a 40K file)
- occupies less than 3K on disk

Limitations:

- won't recognize words with internal punctuation other than
  hyphens (apostrophes, accents, &c.)
- only works with files saved in document mode (it needs this
  for its page-count feature)
- all files (fn.ft, fn.KWD, fn.NDX) must be on same drive

Warning:

- WINDEX will create a new NDX file each time it is run and
  delete any previous file of the same name. You must rename old
  index files you want to save before rerunning the program.

Trick:

- You can rename an old NDX file as a KWD file, since the program
  won't notice the numbers following the entries. Before you do
  this, don't forget to delete the heading within the NDX file.

*****************************************************************

ALPHA.COM v2.0 (3/19/86)

Command line:   alpha [d:]fn.ft [/]

Partly in response to a request from Donald Giroux, ALPHA now
makes a file containing the alphabetized list (one word per
line); it now counts words and different words but no longer
keeps track of the number of times each word is used (this was
eliminated to save 2 bytes per word - if protests are heard I can
reintroduce it as an option). Also added is support for hyphens
and non-break spaces (^O). (The latter are no longer required by
WINDEX, but if you use them, ALPHA will alphabetize strings.)

ALPHA is meant to facilitate finding keywords in your files; it
can also be used (as Mr. Giroux remarks) for fast spell-checking.

ALPHA uses a binary tree to sort all the words in a file in
alphabetical order and put the list in a file of type ALP as
well as printing it on the screen.
Up to 3072 different words can be accommodated (this means about
15-20 K total words, or say 90-120K bytes).

The / option will limit the word list to capitalized words. Many
of these will be The or This, but you will also find all the
proper names in your file.

ALPHA allows internal apostrophes as well as hyphens. (I didn't
allow for apostrophes in WINDEX since when you write an index
you usually want to include possessives under the possessor: if
you are indexing "Smith," you want instances of "Smith's" to be
included, not listed separately.)
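
For readers who want to see the idea behind ALPHA in a modern
language, here is a minimal Python sketch of a binary-tree word
sort. It is an illustration only, not ALPHA's actual code. The
names (WORD, Node, insert, in_order, alpha) are hypothetical, and
several details are assumptions: words are compared without
regard to capitalization, the first spelling encountered is the
one kept, and with the capitalized-only option the counts cover
only the capitalized words. The 3072-word limit is not modeled.

```python
import re

# Assumed word shape: letters with internal apostrophes or hyphens.
WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


class Node:
    """One tree node holding a distinct word."""

    def __init__(self, word):
        self.word = word
        self.left = None
        self.right = None


def insert(root, word):
    """Insert word into the tree; return (root, True if it was new)."""
    if root is None:
        return Node(word), True
    key = word.lower()  # assumption: case-insensitive comparison
    node = root
    while True:
        node_key = node.word.lower()
        if key == node_key:
            return root, False  # duplicate: keep the first spelling
        side = "left" if key < node_key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(word))
            return root, True
        node = child


def in_order(root):
    """Yield the stored words in alphabetical order."""
    stack, node = [], root
    while stack or node is not None:
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        yield node.word
        node = node.right


def alpha(text, capitalized_only=False):
    """Return (sorted distinct words, total words, distinct words)."""
    root, total, distinct = None, 0, 0
    for word in WORD.findall(text):
        if capitalized_only and not word[0].isupper():
            continue  # plays the role of the "/" option
        total += 1
        root, is_new = insert(root, word)
        if is_new:
            distinct += 1
    return list(in_order(root)), total, distinct


if __name__ == "__main__":
    sample = "The cat saw Smith's dog. The dog ran; a well-known cat sat."
    words, total, distinct = alpha(sample)
    print("\n".join(words))
    print(total, "words,", distinct, "different")
```

Calling alpha(text, capitalized_only=True) mimics the / option.
Words that arrive already in alphabetical order make such a tree
degenerate into a long chain, which is presumably why the WINDEX
notes advise against entering keywords in alphabetical order.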