LT (LIBRARY TYPER) VERSION 31

USER MANUAL

Integrated and edited for improved comprehension by

Brian C. Murphy
Vancouver Kaypro Users Group
Mission, British Columbia

December 16, 1991


CONTENTS

Extracting and Decompressing LT31.LBR
New Features of LT Versions 29 - 31
History of Revisions to LT
User Directions
     User Specifications
     User Control Codes
User Patches for Customizing LT Version 31
Support of LZH Compressed (.?Y?) Files & LT31 Assembly Procedures
     Support of LZH Compressed Files
     LT31 Assembly Procedures
          Using LINK.COM
          Using SLRMAC and SLRNKP
          Using M80 and L80 (or SLRNK, non + version)
Steve Greenberg's Original UNC.REL Documentation
UNLZH.REL and CRLZH.REL Documentation
     Abstract
     Version History and Compatibility
     Care and Feeding
     Standard Header Information
     What LZH Compression Does and How it Compares
     A Small History


EXTRACTING AND DECOMPRESSING LT31.LBR
(from LT31.LBR: -README.1ST)

The LT31.COM file is left UNCOMPRESSED and should be used to
extract/decompress everything in this .LBR.
Merely extract the LT31.COM file and enter the command:

     LT31 LT31 x:*.*     (replace x with a drive letter of your choice)

LT31.LBR is based on, and corrects a bug in, LT30.LBR released by:

     Roger Warren, Sysop, The Elephant's Graveyard (Z-Node#9)
     619-270-3148 (PCP area CASDI)

     Brian Murphy, Vancouver Kaypro Users' Group
     Richmond, B.C.   BBS Phone: 271-5934


NEW FEATURES OF LT VERSIONS 29 - 31
(From LT31.LBR: LT31.FOR)

Version 31 corrects a bug in version 30 (and probably all versions back
to 25) that prevented LT from accepting USER specifications as a SOURCE
of data.  (Many thanks to Roger Warren, Sysop, Elephant's Graveyard,
San Diego, California, for the suggestions for fixing this problem.)

Version 31 also shortened the six 7-character labels in the MAC file to
6 characters.  That was done to make LT31.MAC compatible with an ancient
(about 1979) version of M80 that will not accept labels with more than 6
characters.  This change does not in any way impair the use of LT31.MAC
with any other assembler.

     Revised by Brian Murphy
     Vancouver Kaypro Users Group
     Richmond, British Columbia, Canada
     BBS phone: (604) 271-5934

Version 30 (July '91) incorporated Version 2.0 of LZH Encoding.

Excellent for TYPE.COM use.  Can type normal, LZH-encoded, crunched or
squeezed files - whether standalone or in a .LBR.  If wheel is on, can
also easily extract any/all files, and uncrunch and/or unsqueeze at the
same time.  Most useful program for either RCP/M use or for individual
CP/M systems.

     Revised by Roger Warren
     Sysop, Elephant's Graveyard
     San Diego, California

Version 29

LZH-encoded file (.?Y?) handling capability has been added to Mr. C. B.
Falconer's excellent Library Typer program with this update.

Excellent for TYPE.COM use.  Can type normal, LZH-encoded, crunched or
squeezed files - whether standalone or in a .LBR.  If wheel is on, can
also easily extract any/all files, and uncrunch and/or unsqueeze at the
same time.  (This is the missing feature in NULU152.)
Most useful program for either RCP/M use or for individual CP/M systems.

     Revised by Roger Warren
     Sysop, Elephant's Graveyard
     San Diego, California


HISTORY OF REVISIONS TO LT
(From LT30.LBR: LT30.DOC)

LT v31 Documentation - C.B. Falconer - 15 Dec. 91

Changes summary:

LT31 - Modified as follows to enable assembly with an ancient (about
       1979) version of M80 by changing the names of the following
       symbols, which exceeded 6 characters.  (This will not affect
       assembly with any other assembler.)

       OLD LABEL  NEW LABEL    OLD LABEL  NEW LABEL    OLD LABEL  NEW LABEL
       setnam2    setnm2       setfld1    setfd1       setfld2    setfd2
       setfld3    setfd3       setfld4    setfd4       setfld5    setfd5

       Also modified the routine labelled start: to fix a bug dating
       from v25 that prevented LT from accepting USER specifications as
       a SOURCE of data.  (Many thanks to Roger Warren, Sysop,
       Elephant's Graveyard, San Diego, California, for the suggestions
       for fixing this problem.)

       Also revised the User Manual for easier comprehension.
                                        - Brian Murphy

LT30 - Incorporated version 2.0 of LZH encoding.  This program is
       necessary to decode files encoded with version 2.0 of LZH
       compression, but will AUTOMATICALLY handle files encoded with
       version 1.x LZH encoding.  Added appropriate documentation
       changes.  Added instruction to reset MSB in output FCBs.
       Corrected error message pointer for UNC & UNL errors (sent
       garbage to screen).
                                        - R. Warren

LT29 - Added the capability to handle LZH-encoded files (files with
       extensions of the form ?Y?).  No other functional changes.
                                        - R. Warren

LT28 - Reworked - all identification of output files had disappeared.
       Documentation attached to LT27 had the wrong patch information
       (but LT25 was correct).  LT27 was non-fixable, due to
       reformatting, and thus text comparators gave no useful
       information.  Basis for LT28 is LT25.
                                        - C.B. Falconer

LT27 - Rewrote display section when extracting files to disk.  The
       program was triple spacing with no tabulation.  Was unsightly if
       the library had more than 1-2 files.  (Still under 5k.)
       If the TPA is under 46k, you might need to change "REC" from 128
       to 96 or even 64.
                                        - Irv Hoff, Sysop

LT26 - When typing a crunched file with a comment attached, it was
       running off the end of the screen, to display the uncrunched
       file name with comment.
                                        - Ed Minton

LT25 - All options patchable, no need to reassemble for RCPM/ZCPR use.
       OPTION LOCATIONS CHANGED.  Fixed multi-sector file output buffer
       initialization.  Cleaned up.
                                        - C.B. Falconer

LT24 - Fixed bug in PARSE1 routine.  If no USER AREA was specified for
       disk output then it defaulted to the input file user area rather
       than the CURRENT user area.  Modified to allow abort during disk
       output.  Added $U command line option to allow disk output of
       squeezed/crunched files WITHOUT unsqueeze/uncrunch (see the
       NOUQZ byte added to the patch area at 103H).  Added REC equate
       to allow more than one sector in the file output buffer to
       reduce wear and tear on floppy drives.  REC is currently set 3
       for 8 sectors (1k).
                                        - Tom Head

LT23 - FIXES SEVERAL ANNOYING BUGS.  Most notably, it no longer aborts
       when extracting a 0-length file from a library.  Other minor
       changes, see source code.
                                        - Howard Goldstein

LT22 - Adds support for ZCPR/ZCMD page zero maximum user values for a
       more secure RCP/M environment.  Wheel support has been expanded.
       When the wheel is active the line counter restriction is
       defeated, allowing the sysop (or other wheel users) to display
       files of any length.  The wheel also overrides the page zero
       user value and substitutes the hardcoded maximum value.
                                        - Gary Inman

LT21 - Adds the ability to scroll one line at a time with the space bar.
                                        - Irv Hoff

LT20 - Was created for two reasons.  The first reason is so that only
       one copy of LT will be needed for RCP/M systems.  It may also be
       handy for persons who have a secure "local" system which
       utilizes the wheel byte for security.  Only one byte need be
       changed in the table of configurable items if desired should the
       wheel byte not be implemented on your system.
       The wheel byte will enable/disable the ability of the program to
       output to a disk file.  The second reason for this version's
       creation was that it wouldn't assemble properly with M80/L80;
       the values for YES and NO "were" 0FFH and 0, respectively.  Now
       changed to NO equ 0, YES equ NOT NO.  Minor other changes were
       made.  Code is still < 4k.
                                        - George Reding

LT19 - Was modified to prevent possible junk characters from being sent
       to the console while displaying a crunched file header.  LT18
       was modified to allow access to LBRs larger than 512k.  Changes
       were made in LT17 to prevent the generation of junk file names
       under certain usage conditions.  See source code modification
       history for more details about these fixes.

LT16 - Eliminates all the limitations related to extraction of binary
       files to disk (see below).  It will also now uncrunch when
       running on 8080, 8085 and V20 CPUs, besides Z80s.  When
       extracting binary files to disk, the program is uninhibited by
       the "bad types" table or occurrences of 1Ah (EOF) bytes within
       the file.

LT15 - Adds the ability to extract files to disk.  There is a
       limitation of only text files (any 01Ah byte is EOF).  No checks
       are performed, so invalid LBR checksums etc. will not be
       detected.  To use this ability, add a "DU" specification to the
       component name.  Don't try to extract .COM (.CZM, .CQM) etc.
       files with LT15.  This ability was considered necessary since
       NULU (1.51) cannot expand crunched files directly to disk.

LT14 - Is a revision of LT13 to include "uncrunch" capability (but only
       on Z80 CPUs for uncrunching).  As did LT13, LT14 can expand UCSD
       style indentation coding (DLE, count + 020h).  Customization
       etc. remains unchanged, although the defaults have been
       modified.  The "min" form is no longer supplied but you can
       create it by setting the appropriate equates to false (in the
       source) and re-assembling.


USER DIRECTIONS
(Substantially expanded from LT30.LBR: LT30.DOC by Brian Murphy)

LT (Library Type) displays the contents of .LBR files on the console, or
functions as a pausing replacement for TYPE.

LT was adapted from Steven R. Holtzclaw's LUXTYP of (06/10/83) for use
without the complete LUX system.  As does LUXTYP, LT automatically
unsqueezes any squeezed file components for display.

Two versions of LT are distributed.  LTnnMIN is a minimum sized object
file with no security restrictions and no DU style drive/user
specifications.  It is intended to be mounted in a COMMAND.LBR file on
single user systems.  LTnnMAX implements DU drive/user specification,
and various security provisions, for use on RCPM systems.  In particular
LTnnMAX can individually specify drives available (rather than a simple
max value), and expands UCSD style indentation coding (V. 13 up).  This
is useful for compressed PASCALP source files, or transfers from a UCSD
system.

Both versions implement wild card file specification, and can search an
alternate drive for the specified files.  LT executes on 8080 systems
(as opposed to LUXTYP on Z80s only).  LT searches the default and system
disks for the files.  Assembly time constants are available to restore
the original LUX limitations on file types and output line count.  LT
has provisions for stopping CRT page pauses during one file's output,
and will also pause across file (component) boundaries.

Execution of:

     C>LT

produces the version number and a minimum syntax explanation as follows:

     LT31 [d[u]:]lbr/filename [d[u]:][component] [$u]  by C.B. Falconer
       ^S pause, ^C abort, ^X next file, ^Z stops paging
     Type/uncrunch/unsqueeze/unLZH files or LBR members
     Drive/user after lbr/filename causes file output.
     $U at end disables unsq/uncr/unlzh of compressed files
     (wildcards permitted)
     Examples:
       B>LT A3:HELLO SOURCE.AZM          [console] (defaults to HELLO.LBR)
       A>LT LZHENCOD.DYC                 [console] (handles LZH encoding)
       A>LT SQUEEZE.DQC                  [console]
       A>LT HELLO *.*                    [console, all typable]
       B>LT B:HELLO A4:SOURCE.AQM        [file]
       B>LT B:HELLO A4:SOURCE.AQM $U     [file with no unsqueeze]
       A>LT B:CRUNCH.AZM A:              [file]

     C>LT drive[user]:LBRFILE ambigfilereference

where "lbrfile" is always assumed to have the extension .LBR, and the
"ambigfilereference" refers to one or more library components.  It may
contain wild cards (* and ?).

The command above will list to the CRT all ASCII (only) members of the
library which were specified in "ambigfilereference".  There must not be
a drive specification preceding "ambigfilereference".  If the LBRFILE is
on the logged on drive, "drive:" before LBRFILE may be omitted.

     C>LT drive[user]:LBRFILE drive[user]:AFN

will extract all the designated files (including .COM, etc.) from the
SPECIFIED LBRFILE to disk files located on the drive and user specified
prior to "AFN".  Any data after the EOF (01Ah) character will be lost,
including CCITT CRC checks.  There must be a drive specification
preceding "AFN".  If the LBRFILE is on the logged on drive, "drive:"
before LBRFILE may be omitted.

USER SPECIFICATIONS:

The currently assembled distribution copy of LT.COM (LT31.COM) permits
USER specifications for both source of data and destination of data, as
a result of correcting a bug which probably began about version 25.
(The present editor can personally verify that the bug was present in
versions 29 and 30.)

For example:

     LT C0:NULU *.*        is permitted.
     LT C1:NULU *.*        is permitted.
     LT C:NULU C1:*.*      is permitted.
     LT C1:NULU C2:*.*     is permitted.

USER CONTROL CODES (s, S, ^S, c, C, ^C, ^X, ^Z):

The usual CTL-S can pause output, but finishes the current line before
pausing.  (s or S can also be used.)

CTL-C aborts back to the CP/M prompt but again finishes the current line
first.
(c, C, k, K or ^K can also be used.)

CTL-X (or X or x) aborts listing of the current file and advances to the
next file, if any.

CTL-Z stops all "[more]" page pauses until the next file (component).

NOTE: Use CTL-Z after a pause for continuous scrolling.  Force an
      initial pause via S or CTL-S, then a CTL-Z.  This allows scrolling
      without stopping with [more] pauses (not even the first one).

NOTE: For an RCPM version, patch as detailed below.  This allows you to
      specify the exact drives available.

Examples:

     B>LT LT10 *.D?C

lists all components of LT10.LBR, found on either the B or A drives,
with the extensions DOC, DQC, etc.  The B drive is preferred.

     B>LT C:LT10 *.D?C

is similar, but will only look for LT10.LBR on drive C:

     B>LT D:FNAME.FT

will display FNAME.FT, unsqueezing it if it is a squeezed file.  This
mode replaces TYPE.  The absence of the second parameter signals that
this is not a library file.

(The following for LTnnMAX only, or versions assembled with the "DUSPEC"
assembly time option set to YES.)

     B>LT M5:BLAH *.D?C

will display all .DOC/DQC/DZC etc. components of BLAH.LBR on drive M,
user 5.  The disk specification avoids any drive searches being made.
(As noted above under "USER SPECIFICATIONS", the currently assembled
copy of LT.COM, LT31.COM, accepts this use of a user number.)

     B>LT @5:BLAH *.D?C

is similar, but drives B and A are searched for BLAH.LBR on user 5.


USER PATCHES FOR CUSTOMIZING LT VERSION 31
(From LT30.LBR: LT30.DOC)

Address  Default     Purpose
-------  -------     -------
0103h                Standard ZCPR environment, unused in LT now.

010Bh    0000h       Wheel byte implemented for security/remote usage.
           ^^        Set 10Bh to 03Eh if you use the wheel byte at
         ** WORD **  03Eh.  Set NO (00h) if you do not use the wheel
                     byte.  The wheel byte will enable/disable the
                     ability of this program to extract files to disk.
                     THIS IS A WORD and allows the wheel to be anywhere
                     in memory.  0000h eliminates all wheel checking.

010Dh    14h=20      Lines typed between pauses.  0 for no pauses.

010Eh    0           Maximum lines allowed before abort.  A value of
                     zero allows any size file to be typed.  Cannot be
                     set above 255 (1 byte).

010Fh    0FFh        Causes file types listed in the inhibition table
                     at 0119h to be rejected for typing.

0110h    0           Set 0FFh to check for pause on each output
                     character.  Useful for slow Braille terminals.

0111h    0FFh        Absorbs all control characters except CR, LF, TAB,
                     BELL and BCKSP.  A 0 value allows any control
                     character to be typed.

0112h    0           Alternate drive to search when no drive
                     specification is made, and the file is not found
                     on the default drive.  0 for no search, 1 for A,
                     2 for B, etc.

0113h    00Fh=15     Maximum user number accessible (inclusive).  The
                     SPECIAL VALUE 0FFh causes the value to be replaced
                     by that at 03Dh (ZCPR mxusr).

0114h    A,B,M       A one word (2 byte) vector of bits which specifies
                     all the drives accessible.  The most significant
                     bit (in byte 0115h) specifies drive A, the next
                     drive B, etc., until the least significant bit of
                     byte 0114h specifies drive P.

0116h    00          A one byte flag that is toggled by the $U command
                     line option.  If set to 0FFh, it will disable the
                     uncrunch/unsqueeze of files during disk output.

0117h    0FFh        0 to suppress DLE expansion (UCSD style
                     indentation codes).  May be useful if output of
                     control characters (above) is enabled.

0118h                Spare, for future use.

0119h (and up)       Contains a list of 3-byte ASCII entries, each of
                     which specifies a file type that may not be
                     "typed".  The byte at 010Fh can be set to ignore
                     this table.  The space up to, but not including,
                     0155h can be used to add installation specific
                     values.  Setting a high order bit in any entry
                     will inhibit its use.  The "?" character matches
                     any character in the file type.

Security:

The customization values above control the security levels.  However, if
LT is executed by a user already logged onto a drive or user number
outside the specified limits, LT allows access to that area also.  The
theory is that the previous login has established the access rights.

                                        - C.B. Falconer


SUPPORT OF LZH COMPRESSED (.?Y?) FILES & LT31 ASSEMBLY PROCEDURES
(EXPANDED from LT30.LBR: READ.ME by Brian Murphy)

SUPPORT OF LZH COMPRESSED FILES

                                        Updated for use with LT31
                                        1991 Dec. 15

LT version 29 added support for LZH-encoded files (extensions of the
form .?Y?) to Mr. C.B. Falconer's LT program (prior version 28, see
below).  LT versions 30 and 31 incorporate version 2.0 LZH decoding,
which can also decode version 1.x LZH compression.

LT31 is ready for non-RCPM use.  Refer to the LT28 docs and to the
source code for modification for RCPM usage.

This .LBR includes UNLZH.REL, an 8080/8085/V20/Z80 subroutine file which
contains the code for the LZH-encoded file handling.  That file is
copyrighted (c) 1989, 1991 by Roger Warren.  It may be used or
reproduced on a non-profit basis only.

If you assemble your own version, make sure that the LAST file loaded in
the LINK phase is UNC, as the LT program locates the top of the program
by a symbol in that module.

FOR THOSE INTERESTED IN PATCHING FOR DATE STAMPS:

In the pre-assembled version of LT30 in LT30.LBR, 128 bytes of patch
area were left from 1AA0h thru 1B1Fh, inclusive.  These were added in
the link phase and are not an integral part of the program.  If you
re-link...it goes away (unless you provide your own).

-------------------- Updated LT28 documentation follows --------------------

                                        Updated for use with LT31
                                        1991 Dec. 15

LT28 is copyrighted (c) 1986, 1988 by C.B. Falconer.  It may be freely
copied and used for non-commercial purposes.

LT28R.COM (not distributed with versions 29 - 31) is ready for RCPM use.
LT28.COM does not use the wheel byte and is ready for normal (non-RCPM)
use.

This revision of LT (Library Typer) includes UNC.REL, a modification of
Steve Greenberg's UNCREL.REL file, to access crunched files and library
components (in addition to squeezed and uncompressed files/components).
This uncruncher runs on 8080/8085/V20/Z80 processors, and can process
the output of CRUNCH23 or prior.
See above for customization patches.  Sysops can now leave EXTRACT set
to YES, as v20 tied this in with the wheel byte.  v25 also allows UZCPR
to be replaced by a MAXUSR value of 0FFh (255).


LT31 ASSEMBLY PROCEDURES

USING LINK.COM

To link with LINK rather than L80, simply type:

     M80 =LT29.ASM/M
     LINK LT29,UNLZH,UNC

USING SLRMAC AND SLRNKP:

To process this with SLRMAC and SLRNKP (SLRMAC configured to use .MAC
input files, else rename source):

     D>SLRMAC LT29/M
     D>SLRNKP LT29/N/A:100/J,LT29,UNLZH,UNC,/E

The first step assembles to a .REL file; the second links UNLZH.REL and
UNC.REL and creates LT29.COM, ready to use - no other steps needed.

USING M80 and L80 (or SLRNK, non + version):

My early 1980's version of M80, at least, will not accept label symbols
exceeding 6 characters.  One of the authors or revisers of previous
versions of LT31.MAC has used six different 7-character labels.
Consequently, each of those six labels has to be changed to 6 characters
to prevent several "fatal" assembly errors.  In LT31.MAC I have changed
all occurrences of the following six 7-character labels as noted to
correct that deficiency:

     15 Dec 91  Modified to correct error in M80 assembly
                (symbols > 6 char.)

     OLD LABEL  NEW LABEL    OLD LABEL  NEW LABEL    OLD LABEL  NEW LABEL
     setnam2    setnm2       setfld1    setfd1       setfld2    setfd2
     setfld3    setfd3       setfld4    setfd4       setfld5    setfd5

     v31 - Modified by Brian Murphy
           Vancouver Kaypro Users Group

Once those corrections were made to each occurrence of each label, LT31
assembled flawlessly with M80.  Those changes, which have no effect on
other assemblers, have now been made to this distribution version of
LT31.MAC.
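
The renaming in the table above is mechanical, so it can be sketched as a
short script.  The Python below is a hypothetical helper, not part of the
LT31 distribution: the name shorten_labels is an assumption, and the
mapping is copied from the table.  It matches whole symbols only and
ignores case, on the assumption that the source may mix upper and lower
case.

```python
import re

# Old 7-character labels and their 6-character replacements,
# copied from the table in this manual.
RENAMES = {
    "setnam2": "setnm2",
    "setfld1": "setfd1",
    "setfld2": "setfd2",
    "setfld3": "setfd3",
    "setfld4": "setfd4",
    "setfld5": "setfd5",
}

# \b keeps longer symbols that merely contain an old label untouched.
_PATTERN = re.compile(r"\b(" + "|".join(RENAMES) + r")\b", re.IGNORECASE)


def shorten_labels(source_text):
    """Return source_text with each old label replaced by its new name.

    Replacements are emitted in lower case regardless of the case
    found in the input (an assumption made for simplicity).
    """
    def swap(match):
        return RENAMES[match.group(1).lower()]

    return _PATTERN.sub(swap, source_text)
```

Such a helper would be run over a copy of the source before assembly; the
distributed LT31.MAC already contains the shortened labels.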

To re-assemble with M80/L80 (using a source file named LT31.MAC):

     C>M80 =LT31/M

     No Fatal error(s)

The final linking step is a two-pass operation:

     C>L80 /P:100,/D:2000,LT31,UNLZH,UNC,/M

     Link-80  3.43  14-Apr-81  Copyright (c) 1981 Microsoft

     Data     2000    3571    < 5489>
     Program  0100    1A99    < 6553>

(NOTE that "1A99", which is the end of the program code in LT31.COM, is
needed for the 2nd pass and for calculating the amount of space needed
for data before removing unneeded bytes with DDT.)

     CODES   3557    ENDU    3571    ENDUL   1554    GETBYT  03C8
     GLZHUN  03C8    OUT     04E4    PLZHUN  04E4    TOPOF   1A71
     TROOM   356D    UNC     1610    UNCREL  15DE    UNL     12D0
     UNLZH   12D7

     26817 Bytes Free

     */R                                       (Reset L80)
     */P:100,/D:1A99,LT31,UNLZH,UNC,LT31/N     (NOTE that "1A99" from
                                               Pass 1 is used here)

     Data     1A99    300A    < 5489>
     Program  0100    1A99    < 6553>

     26817 Bytes Free

     */E                       (Exit L80 now that Linking is complete.)

     Data     1A99    300A    < 5489>
     Program  0100    1A99    < 6553>

     26817 Bytes Free
     [0100   300A       48]

L80 has added many unneeded bytes to the file.  Load it with DDT and SAVE
only the portion up to the last code byte plus about 80h bytes (1A99H +
80H in this example; it may be different depending upon assembly
options).  The extra pad is needed because the data segment does contain
some initialized data, all of it at the very beginning of that segment.
With DDTZ, use the "K 100,1AE5" command to write it back.  You can zero
the portion after the end of code (unmodified L80 and early SLRNK+ junk
fill, but SLRNK zero-fills).

As noted above in Pass 1 of the Linking process, the L80 Linker tells
you, in the Program row of the table printed on the screen when the
linkage pass is complete, both the beginning and ending locations of
program code in the resulting .COM file.  In our example, while linking
to generate the final form of the LT31.COM file, that row (in each of the
3 screen printouts) states that the beginning of program code is 0100(H)
and the end of program code is 1A99(H).
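
The SAVE page count that follows from these figures can be checked with a
short calculation.  The Python below is a sketch only: the function name
save_pages is hypothetical, the 80h pad is the figure used in this
manual's example, and SAVE is assumed to write whole 256-byte pages
starting at 0100h.

```python
def save_pages(end_of_code, pad=0x80, load_address=0x100, page_size=256):
    """Number of pages to SAVE so the file covers end_of_code + pad.

    end_of_code is the end address from the Program row of the L80
    table (1A99h in the example).
    """
    length = end_of_code + pad - load_address
    # Round up: a partly used page still has to be saved whole.
    return -(-length // page_size)


print(save_pages(0x1A99))  # 27, matching the worked example
```

A different assembly option set gives a different end address; only the
first argument changes.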

Since the end of program code will vary with the version of the program
and the assembly options chosen, you have to look at that table to
determine that location.

Once you know that location, add 80H to it.  (A Public Domain program
variously called HEXDEC11.LBR, HEX-DEC.LBR, or @.LBR facilitates
hexadecimal calculations.)  In our example, 1A99H + 80H = 1B19H.
Subtract the beginning of file address (0100H) from that number.  In our
example, the difference is 1B19H - 100H = 1A19H, or 6681D.  Divide that
decimal number by the decimal number of bytes in a page (256).  In our
example 6681D/256 = 26.098 pages.  Round up to the next integer (27 pages
in our example) and you have the number of pages to SAVE after loading
the COM file with DDT.  Make sure that the calculator you use for this
particular calculation does not truncate the result.  If you do not round
up when telling SAVE how many pages to save, you will cut off a useful
portion of the .COM file.

The following illustrates the final steps to be taken using DDT and SAVE
to eliminate the useless final portion of the file.

     C>DDT LT31.COM
     DDT VERS 2.2
     NEXT  PC
     3080 0100
     -G0

     C>SAVE 27 LT31.COM     (or whatever name you want to use for the
                            final .COM file)

     C>

                                        - enjoy

------------------------------------ end ------------------------------------


STEVE GREENBERG'S ORIGINAL UNC.REL DOCUMENTATION
(From LT30.LBR: UNC.DOC)

UNC v2.1

UNC.REL is an adaptation of Steve Greenberg's UNCREL module to:

a: Execute on 8080/8085/V20 processors.

b: Modify error returns.  Returned values are:

        0 (no carry)   No error, as UNCREL
        1 (carry)      Later version needed.
        2 (carry)      File/module is not crunched.
        3 (carry)      File is corrupt.  Possibly not crunched.
        4 (carry)      Insufficient memory or stack overflow.

c: Added entry point UNC, is used just as UNCREL (described below),
   except that the input file/module has already been processed up to
   and including the initial 0 byte.
   This avoids the necessity of rewinding the file, BUT an uncrunched
   file (no initial id stamp) can cause unknown results.  CAUTION.

d: Added (data relative) entry points CODES and TROOM may be monitored
   (but not modified) by the application to detect "codes assigned" and
   "codes re-assigned" respectively (version 20 crunch format).  TROOM
   monitors codes still unassigned for version 10 format.

e: A single byte at UNCREL-1 contains the revision number.

The remainder of this file is Mr. Greenberg's original document.

                                        C. B. Falconer (86/11/24)

-------------

UNCREL.REL is a Microsoft / DRI compatible .REL file which makes it
extremely easy for an application program to "uncrunch" files.

All the programmer need supply is two routines - one which is capable of
supplying one character at a time in the "A" register and one which will
accept one character at a time.  A single call to external entrypoint
"UNCREL" will do the rest.

This organization was chosen (i.e. UNCREL stays "in control", and calls
the programmer's input and output routines, rather than vice-versa)
because it is more consistent with the nature of the LZW algorithm.  The
algorithm may make 1, 2 or 3 consecutive calls for input without
outputting anything, and then may output any number of characters before
needing more input.

Since an individual call is made for each byte needed or supplied, the
user's routines can refill (or dump out) any input or output buffers at
any appropriate time.  If the programmer decides to terminate the
uncrunch operation before its natural completion, it is simply necessary
to reset the stack pointer and continue rather than RETurning to UNCREL.

More detailed description of operation follows:

Programmer Supplied Routines, declared "PUBLIC".

1. One must be named "GETBYT".  Each time this routine gets called by
   UNCREL, it should supply the next sequential byte in "A".
   The routine may make full use of all registers, but should not assume
   that they will be in any particular state the next time the routine
   is called.  Basically, any pointers, etc., should be memory based.
   Note, however, that all usage of the alternate Z-80 registers has
   been eliminated from UNCREL.  Thus an "alternate" scheme is to
   execute an EXX instruction at the beginning and end of "GETBYT", in
   which case the registers will remain intact.

2. Another routine must be named "OUT".  UNCREL calls this routine each
   time it has a new character available for output.  The character
   will be in register "A".  All other conditions are as described
   above.

What you (the programmer) must do:

Simply make a single call to "UNCREL", which should be declared "EXTRN".
This will initialize all tables, and then all activity will proceed
until the file is completely uncrunched or an error is encountered.  At
that point, UNCREL will return to your program.  If the carry flag is
set, this was an error return.  In this case reg "A" will contain a
[non-zero] code identifying the type of error, if you are interested.
If carry is clear, A will be zero, and this indicates that the entire
operation has terminated normally.

What else you must do:

BEFORE you make the call to UNCREL mentioned above, you should load "HL"
with a pointer to a large block of memory (24k).  The rest of the work is
taken care of - UNCREL will figure the next page boundary and allocate
various pointers for the tables it will build itself.  It will also
allocate a large stack area for itself within this block, and will
initialize any RAM locations it needs.  All of this is done at "run
time".  UNCREL can be re-executed multiple times if desired.

Your stack will be returned to its previous value if UNCREL returns
(either due to completion or error).  If you decide to interrupt
operation by not returning from one of its "GETBYT" or "OUT" calls,
however, don't forget to reset the stack to your own value.
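
As an illustration of the page-boundary step mentioned above, the
arithmetic can be sketched in Python.  This is an assumption about the
rounding, not a transcription of UNCREL's code: it treats a page as 256
bytes and leaves an already aligned address unchanged.  The names
next_page_boundary and usable_bytes are hypothetical.

```python
PAGE = 0x100  # one 256-byte page


def next_page_boundary(addr):
    """Round a 16-bit address up to a multiple of 256 (assumed rule)."""
    return (addr + PAGE - 1) & ~(PAGE - 1) & 0xFFFF


def usable_bytes(block_start, block_size):
    """Bytes of the caller's block that remain after alignment."""
    return block_start + block_size - next_page_boundary(block_start)
```

Under this assumption up to 255 bytes of the block can be lost to
alignment, so a block that starts on a page boundary wastes nothing.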

Another EXTRN value named "ENDU" is made available for possible use.  It
is the end of the data area (DSEG) used by UNCREL.  If you link in such a
way that UNCREL is "on top", then this can be used as a reference for the
beginning of available memory after the program.

More notes:

Every byte of a crunched file, starting with the 76H and 0FEH header
bytes, must be supplied in order.  The header information is checked for
some validity, although the file name is ignored.  UNCREL will determine
the version# when it gets to it and uncrunch accordingly; you needn't
worry about this.  If you need the filename, you must extract it
yourself.  The exact format of a crunched file is defined in a separate
document, LZDEF20.DOC.  UNCREL does NOT read or compute the checksum; if
that interests you, you can do that yourself too.  (Note: CRUNCH.COM &
UNCR.COM v2.0 are not directly based on UNCREL.  They always create and
check the checksum respectively, of course.)

The LZW process is a continuously progressing forward process.  You
cannot go back.  If you leave out just one byte, you will get results
which become stranger and stranger, eventually becoming complete
gibberish.  (This is actually pretty interesting to watch, albeit
frustrating if it's not what you want.)


UNLZH.REL AND CRLZH.REL DOCUMENTATION
For Version 2.0 (with RUNTIME buffer allocation)
(From LT30.LBR: LZHREL.DOC)

July 1991

Abstract

UNLZH.REL and CRLZH.REL contain assembly language routines for the
decoding and generation of LZH-encoded files, respectively.  The routines
need only be supplied with a pointer to a large scratch area and linkages
to the character input and character output routines to be used.  There
are a few easily met functional requirements for the calling routine and
I/O routine.  No Z80 opcodes are used, so these routines may be used on
8080/8085/V20/Z80 based machines.

The coding contained within UNLZH.REL and CRLZH.REL is Copyright (c)
1989, 1991 by Roger Warren and may not be used or reproduced on a
for-profit basis.  Non-profit use is freely permitted.  The author will
not be responsible for any damage, loss, etc. caused by the use of these
programs.

Version History and Compatibility

Version 1.1 was released in Sept. '89 and was the first public offering
of LZH encoding for CP/M.

Version 2.0 (July '91) introduces several improvements/changes:

     More efficient encoding
     Greater speed
     More compact object code

There are NO interface changes from the 1.x version.

Of greatest importance is the encoding improvement.  This change, while
generating even smaller output files, means that files compressed with
version 2.x programs cannot be decoded by old 1.x programs.  However, the
version 2.x UNLZH module DYNAMICALLY ADJUSTS for 1.x encoded input files.
Thus, version 2.x of UNLZH can be used on all LZH-encoded files
regardless of which algorithm version was used to encode the files.

By extensive rewriting for size and speed, a 20% improvement in
performance was achieved.  Since the LZH algorithm is intrinsically slow,
this will be of great interest to many.  The recoding project allowed the
incorporation of version 2.x extensions to the original algorithm while
not appreciably affecting the code module size.

Care and Feeding

The information that follows documents both CRLZH.REL and UNLZH.REL.  One
or both may be in the library this file is in, depending upon the nature
of the program(s) it's bundled with...so ignore the superfluous
information (if any).

UNLZH.REL performs LZH decoding.  Its programming interface is similar to
the UNCR.REL it was made to mimic.

The programmer must provide the program with 8k of buffer space.  If
RUNTIME buffer allocation is selected (it IS selected in the version
supplied with LT, FCRLZH, CRLZH, and UCRLZH), a pointer to the buffer
must be supplied in the H/L register pair when the routine is invoked.
If RUNTIME buffer allocation is not selected, the user must supply a PUBLIC symbol, UTABLE, which is the base of the provided buffer area.

Once invoked, the routine allocates its own stack and 'stays in control' until the decompression is completed (or an error is encountered). The programmer must supply two routines, GLZHUN and PLZHUN, via which UNLZH 'GETS' bytes from the input stream and 'PUTS' bytes to the output stream, respectively. UNLZH *DOES NOT* compute/process checksums, etc. on the input file. Any support of such features must be handled externally.

GLZHUN and PLZHUN should save all registers except the A register and flags. GLZHUN must return the next character from the input stream in the A register. GLZHUN should return with the CARRY flag RESET for a valid character, or with the CARRY flag SET when the end of the input stream is encountered (the content of the A reg should be zero in that case).

Upon exit (return to the caller), UNLZH returns the following information:

     Carry reset (or A reg = 0)  -  Success
     Carry set,  A reg = 1       -  Newer version required
     Carry set,  A reg = 2       -  File not LZH encoded
     Carry set,  A reg = 3       -  Bad or corrupt file
     Carry set,  A reg = 4       -  Insufficient memory

UNLZH has 2 entry points, to be used as the programmer needs:

UNLZH is the 'normal' entry point, which expects the file to be completely REWOUND. At this entry point, the entire file is processed - the standard header is examined, but not reported or acted upon. By examining the return code, the programmer can discern whether the file was, indeed, an LZH-encoded file and act accordingly.

UNL is a secondary entry point which can be used when the programmer needs to process the standard header information (file name and stamp) and cannot (or doesn't want to) rewind the file. When this entry point is invoked, the header (down to and including the zero that terminates the stamp/comment) must already have been processed, so that the next byte in the input stream is the revision level.
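The GLZHUN contract and the UNLZH return codes described above can be modelled in a few lines of Python. This is only an illustrative sketch: the real routines are 8080 assembly language and report status through the CARRY flag and the A register, so a (carry, a) tuple stands in for that CPU state here. The names make_glzhun and describe_unlzh_result are hypothetical and are not part of UNLZH.REL.

```python
# Illustrative model only. A (carry, a) tuple stands in for the CARRY flag
# and the A register used by the real 8080 routines.

UNLZH_ERRORS = {
    1: "Newer version required",
    2: "File not LZH encoded",
    3: "Bad or corrupt file",
    4: "Insufficient memory",
}


def describe_unlzh_result(carry, a):
    """Translate UNLZH's exit state into the message from the table above."""
    if not carry or a == 0:
        return "Success"
    # Codes outside 1..4 are not documented; report them as unknown.
    return UNLZH_ERRORS.get(a, "Unknown error code %d" % a)


def make_glzhun(data):
    """Build a hypothetical GLZHUN-style getter over an in-memory buffer.

    Each call returns (carry, a): carry False with the next byte in a,
    or carry True with a == 0 once the input is exhausted.
    """
    position = 0

    def glzhun():
        nonlocal position
        if position < len(data):
            byte = data[position]
            position += 1
            return (False, byte)
        return (True, 0)

    return glzhun
```

A getter built with make_glzhun(b"\x76\xfd") hands back the two bytes in order with carry clear, and from then on reports end of input with carry set and a zero in A on every call, which is the behaviour the text asks of GLZHUN.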
The revision level of UNLZH.REL is held in the byte at UNLZH-1. A hex value of 11 indicates version 1.1, etc.

CRLZH.REL performs LZH encoding. Its programming interface is similar to the CRUNCH.REL it was made to mimic. The programmer must provide the program with 20k of buffer space. If RUNTIME buffer allocation is selected (it IS selected in the version supplied with LT, FCRLZH, CRLZH, and UCRLZH), a pointer to the buffer must be supplied in the H/L register pair when the routine is invoked. If RUNTIME buffer allocation is not selected, the user must supply a PUBLIC symbol, CTABLE, which is the base of the provided buffer area.

In addition, at invocation time the A register must contain a value for CRLZH to install in the 'CHECKSUM FLAG' portion of the file header (see below). This byte, to be semi-compatible with C.B. Falconer's version of CRN for the 8080, is a subset of CRN's strategy byte:

     value (hex)     meaning
     00              Standard modulo 65536 checksum is used
     10              CRC16 is used
     20,30           Unassigned

SUPPORT FOR CHECK INFORMATION MUST BE EXTERNALLY PROVIDED IN THE USER-SUPPLIED I/O ROUTINES (see below). THIS IS ALSO TRUE OF CRN...BUT WAS NOT EMPHASIZED! CRLZH merely provides the support for posting the value in the output stream, since it happens to 'follow' some of the information posted by CRLZH (see the header description, below).

CRLZH supports no other features of CRN's strategy byte; all other bits are ignored.

Once invoked, the routine allocates its own stack and 'stays in control' until the compression is completed (or an error is encountered). The programmer must supply two routines, GLZHEN and PLZHEN, via which CRLZH 'GETS' bytes from the input stream and 'PUTS' bytes to the output stream, respectively. CRLZH *DOES NOT* compute/process checksums, etc. on the input file. Any support of such features must be handled externally.
Specifically, the GLZHEN routine must provide for the accumulation of check information, and the caller must write that check information to the output stream when CRLZH returns to the caller.

GLZHEN and PLZHEN should save all registers except the A register and flags. GLZHEN must return the next character from the input stream in the A register. GLZHEN should return with the CARRY flag RESET for a valid character, or with the CARRY flag SET when the end of the input stream is encountered (the content of the A reg should be zero in that case).

As a service to the user's output processor, every 256th call to PLZHEN is made with the Z flag set (for monitoring). At all other times the Z flag is reset.

Upon exit (return to the caller), CRLZH returns the following information:

     Carry reset (or A reg = 0)  -  Success
     Carry set,  A reg = 1       -  File already LZH-Encoded, CRUNCHed
                                    or SQueezed
     Carry set,  A reg = 2       -  File empty
     Carry set,  A reg = 3       -  Insufficient memory

CRLZH has a single entry point at the label CRLZH. The user must have placed the standard header information in the output stream and must have the input stream REWOUND prior to invoking CRLZH.

The revision level of CRLZH.REL is held in the byte at CRLZH-1. A hex value of 11 indicates version 1.1, etc.

Since CRLZH and UNLZH allocate their own stacks, the user is reminded not to make heavy use of that stack in the user-supplied I/O routines. In addition, if the user-supplied I/O routines decide to abort the CRLZH or UNLZH operation (due to operator keystrokes, for example), the user must take steps to restore his own stack. Upon a normal (or error) return from CRLZH or UNLZH the user's stack is properly restored.

STANDARD HEADER information

LZH encoding follows Steve Greenberg's CRUNCH file format.
The header contains information identifying compression format, original file name, etc.:

field    size      value     Purpose
-------  --------  --------  ------------------------------------------
   1     1 byte    076h      Signifies compressed form
   2     1 byte    0FDh      Signifies LZH encoding (0FFh is for
                             squeezed and 0FEh is for CRUNCHED)
   3     variable  User      Original file name in the form name.ext.
                   supplied  Trailing blanks on the name portion should
                             be suppressed, but a full 3 characters
                             following the '.' should be used for the
                             extension (i.e. no blank suppression).
   4     variable  User      OPTIONAL. Used for file comment/stamp. If
                             used, the convention is that the comment
                             is placed in square brackets [Like this].
                             Other information may be placed here
                             (e.g., date stamp). The logical
                             restriction is that a binary zero must not
                             be part of the comment and/or other
                             information.
   5     1 byte    00h       Signifies end of STANDARD HEADER

For use of CRLZH, the user must supply all of the information above. For UNLZH, use of the UNLZH entry point causes UNLZH to expect to process the above information. It will discard the file name and optional comment/stamp, but will examine the general form (the first 2 fields for a match, and the general form of the rest of the header). If the user chooses to use the UNL entry point, UNL will expect to process the first byte following the end of the standard header.

The standard header is followed by these fields:

field    size      value     Purpose
-------  --------  --------  ------------------------------------------
   6     1 byte    variable  Identifies generating program revision
                             level (11h signifies a file generated by
                             program version 1.1).
   7     1 byte    variable  Significant revision level. Indicates
                             major revision level of the algorithm for
                             decoding program compatibility (10h
                             indicates significant revision 1.0).
   8     1 byte    variable  Check type. 0=checksum, 1=CRC16, others
                             currently undefined.
   9     1 byte    05h       Currently a SPARE, set to 05h by
                             convention.

Following this is the compressed file itself.
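The two header tables above can be turned into a short Python sketch that builds fields 1 to 5 (what a CRLZH caller has to place in the output stream) and splits an existing file's header back into fields 1 to 9. It is a hedged illustration, not code taken from CRLZH.REL or UNLZH.REL: the function names are hypothetical, the rule that the file name ends three characters after the first '.' is one reading of the table, and treating a revision byte as two nibbles (11h as 1.1) is an assumption based on the two examples given.

```python
# Sketch of the header layout described above. All function names are
# hypothetical; nothing here is part of CRLZH.REL or UNLZH.REL.

SIGNATURE = 0x76   # field 1: signifies compressed form
LZH_TYPE = 0xFD    # field 2: LZH encoding (0xFE = crunched, 0xFF = squeezed)


def build_standard_header(name, ext, comment=""):
    """Return fields 1-5 for the file name.ext as a bytes object."""
    name = name.rstrip(" ")          # trailing blanks on the name are dropped
    ext = ext.ljust(3)[:3]           # extension is always a full 3 characters
    text = name + "." + ext
    if comment:
        text += "[" + comment + "]"  # square-bracket comment convention
    body = text.encode("ascii")
    if 0 in body:
        raise ValueError("header text must not contain a binary zero")
    return bytes([SIGNATURE, LZH_TYPE]) + body + b"\x00"


def revision_text(value):
    """Render a revision byte as text, e.g. 0x11 -> '1.1'.

    Assumption: high nibble is the major digit, low nibble the minor digit.
    """
    return "%d.%d" % (value >> 4, value & 0x0F)


def parse_header(data):
    """Split the start of an LZH-encoded file into fields 1-9."""
    if len(data) < 2 or data[0] != SIGNATURE or data[1] != LZH_TYPE:
        raise ValueError("not an LZH-encoded file")
    end = data.index(0, 2)           # field 5: the terminating zero
    text = data[2:end].decode("ascii")
    dot = text.find(".")
    # Assumption: the file name ends 3 characters after the first '.';
    # whatever follows is the optional comment/stamp (field 4).
    split = len(text) if dot < 0 else min(len(text), dot + 4)
    trailer = data[end + 1:end + 5]  # fields 6-9
    if len(trailer) < 4:
        raise ValueError("header is truncated")
    return {
        "filename": text[:split],
        "stamp": text[split:],
        "program_revision": revision_text(trailer[0]),
        "significant_revision": revision_text(trailer[1]),
        "check_type": trailer[2],    # 0 = checksum, 1 = CRC16
        "spare": trailer[3],         # 05h by convention
        "data_offset": end + 5,      # index of the first compressed byte
    }
```

The data_offset entry marks where the compressed data begins. A caller that wants to use the UNL entry point would stop earlier, just past the terminating zero, so that the revision level byte is the next one UNL reads.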
What LZH compression does and how it compares

FIRST - It's SLOW. Much slower than CRUNCH. About even with the old SQueeze. It's the nature of the algorithm, but the current implementation contributes somewhat (more on that later).

The most impressive aspect of the algorithm is that it compresses further than CRUNCH. The nature of material being compressed is important - prose and high level language code will compress further. Since the algorithm depends, in part, on patterns within the file being compressed, I was somewhat surprised to discover that it does a better job (in general) on .COM files than CRUNCH. Personally, I was surprised to discover that LZH compression of CRUNCHed files is possible (but I've disabled that ability in this release)!

Examples:

     CRUNCH of SLR180.COM     106% ratio (actually made a larger file)
     CRLZH  of SLR180.COM      84% ratio

     CRUNCH of TYPELZ22.Z80    45% ratio
     CRLZH  of TYPELZ22.Z80    40% ratio

     CRUNCH of 'C' source      45% ratio (typical 'C' src selected
                                          at random)
     CRLZH  of 'C' source      33% ratio (same file as above)

A small history

I am NOT the originator of the LZH encoding. The program that started my whole involvement in the introduction of this method of compression to the 8-bit world bears the following opening comments:

     /*
      * LZHUF.C English version 1.0
      * Based on Japanese version 29-NOV-1988
      * LZSS coded by Haruhiko OKUMURA
      * Adaptive Huffman Coding coded by Haruyasu YOSHIZAKI
      * Edited and translated to English by Kenji RIKITAKE
      */

This 'C' program implemented the compression algorithm of the LHARC program which arrived on the US scene in the spring of '89. Being of a curious nature, I figured I'd play with the algorithm just to understand it (the internal comments were, indeed, sparse - leaving MUCH to the reader's contemplation/reverse engineering) while 'better minds' than I tackled it in earnest.

Months passed.
I found that I was 'mastering' the algorithm (read that as demonstrating to myself that I understood it) by converting it piece-wise to assembly language. After a while, I was left with a 'C' language main program, run time library, and I/O with the business end of the compression and decompression implemented entirely in assembly language.

Since the expected event of one of the 'heavies' in the PD and/or compression world releasing a CP/M version of the compression algorithm hadn't come to pass, I set about making a version myself. The natural choice was to prepare an analog to the CRUNCH.REL and UNCR.REL of Mr. Steven Greenberg and Mr. C.B. Falconer and append to/substitute in the existing, widely known programs for handling SQueezed and/or CRUNCHed files.

I saw no reason to tamper with the format CRUNCH uses on the output file. Therefore, with the exception of taking the 'next' file type in sequence (SQueezed files begin with a 76h,FFh sequence; Crunched files with 76h,FEh; so LZH encoded files begin with 76h,FDh) and setting the revision levels in the header to appropriate values, there's no difference in the output file format. So, you can probably coax your time/date stamping into operating on LZH encoded files.

     R. Warren
     Sysop, The Elephant's Graveyard (Z-Node#9)
     619-270-3148 (PCP area CASDI)

5MBHD-C/PDCPM9:LT31.LBR; LT31.DOC revised BCM 2057 16/12/1991