Text file INDEX generator (c) T.Jennings 7/21/81 You can do anything you want with this program except sell it. Give it to anyone who wants it. Address bugs, suggestions, etc. to: Tom Jennings 221 W. Springfield St. Boston MA 02118 Leave me a message at NECS CBBS. INDEX is a utility for use with WordStar, and generates an alphabetically sorted index for a file. Words or phrases to be put in the indexed are marked with control characters not used elswhere within WordStar. (At least as of version 1.01) If a file is later edited, invoking INDEX again will remove the old index, produce a new one, and add it to the end of the file. INDEX can also be use with any non-WordStar text editor that can insert control characters into the text. No other assumptions are made about the contents of the file, except that the file is terminated by a control-Z character (correct way) or end of file. INDEX scans the text file for certain WordStar "dot commands", such as page breaks, etc., in order to maintain proper page numbers. If no page "dot" commands are found, as with other editors, pages are counted internally. 1  Text file INDEX generator (c) T.Jennings 7/21/81 There are two different kinds of index entries; WORDS and PHRASES. WORDS are what are normally thought of as words; groups of characters, seperated by spaces, commas carriage returns (called CR from now on) or linefeeds (LF). PHRASES are groups of words, including the spaces that seperate the words. Since words are easy to find, only a single marker is necessary to identify them. This marker is a control-K character, ^K. Phrases must have both ends marked, and control-P is used, ^P. Below are some examples: The sixth word in this ^Ksentence will be put in the index. ^PThis entire phrase will be there^P, also. Since this is page 2 of the manual, the index for these should look like: Sentence...................................... 2 This entire phrase............................ 2 These two examples are actually in the index at the end of this manual. WordStar dot commands INDEX is optimized for use with WordStar. By default, it scans the file for "dot commands"; notably .pa and "..index". .PA is used to count pages, and must be the first word on the line to be counted as a dot command. The "..index" is created and used by INDEX. As defined in the WordStar manual, any line beginning with two dots (..) will be ignored when printed. INDEX uses this to mark the beginning of the index. When INDEX is run, if it finds the "..index" line, it will remove all text following that line. This allows creating an index for an updated file that already has an index. If one was not found, it is added. CAUTION: NEVER put a ".." WordStar dot command followed by index, as described above. All text following this line will be deleted from the file. A single space after the .. will suffice, or use .IG instead. 2  Text file INDEX generator (c) T.Jennings 7/21/81 Sorting As stated before, the index generated is sorted alphabetically. The entire phrase or word is used in sorting, except that case is ignored. If identical entries are found, they are listed on a single line, followed by all page numbers found on. Unfortunately, multiple identical page numbers will be listed. For clarity, some examples of how things work follows. The following two phrases are equivalent, as case is ignored, and will be listed on one line. The first occurence will be the entry on the left side of the page. This is the first phrase THIS IS THE FIRST PHrAsE Since length counts, these next are all in proper order. This This is This is what 3  Text file INDEX generator (c) T.Jennings 7/21/81 Side effects and cautions This is a list of implementation peculiarities, etc. -In general, any group of one or more white-space characters (see below) are converted into a single space character. Phrases with embedded spaces will have all extra spaces (more than one) removed. A phrase may start and end on different lines (or even pages) and will work properly. Leading spaces will be removed from the index entry. -The following characters are converted to and treated as a single ASCII space character. These also mark the end of a word: CR LF tab comma (,) semicolon (;) colon (:) suprise-mark (!) -BUG NOTICE Periods are removed from the character stream. This was a cheap way out since it is a sentence-terminator. The only time this is a problem is when putting things in the index such as filenames. (i.e., FILENAME.TYP) If someone complains, it will probably get fixed. -Words and phrases will have any leading spaces removed. The first character of any word or phrase will be converted to upper case. Note that if a phrase consists of a single blank, it will NOT be removed from the index. This does not count for words, of course, as the next word that comes along will be indexed. -Because of wonderful CP/M, and the fact that some of it's utilities use end-of-file instead of a control-Z character to terminate text, INDEX cannot detect the following read errors: unwriten random record, zero length. -INDEX sorts in ASCII order. Digits, quotes, parenthesis, etc come before letters. -The sort routine used is horrible. It uses a bubble sort, with extra unnecessary exchanges. Didn't require much thought, though. 4  Text file INDEX generator (c) T.Jennings 7/21/81 Colon................................... 4 Comma................................... 4 Control-Z............................... 4 CP/M.................................... 4 CR...................................... 4 Embedded spaces......................... 4 End-of-file............................. 4 Examples................................ 2 Filenames............................... 4 INDEX................................... 1 Leading spaces.......................... 4, 4 LF...................................... 4 Non-WordStar text editor................ 1 Periods................................. 4 PHRASES................................. 2 Semicolon............................... 4 Sentence................................ 2 Side effects and cautions............... 4 Suprise-mark............................ 4 Tab..................................... 4 This entire phrase will be there........ 2 White-space characters.................. 4 WORDS................................... 2 WordStar................................ 1 WordStar "dot commands"................. 1 WordStar dot commands................... 2 ^K...................................... 2 ^P...................................... 2 5