;*********************************************************************** ;* * ;* CRUNCH * ;* v2.4 15 Sept 1987 * ;* - Steven Greenberg * ;* * ;*********************************************************************** .Z80 .SALL TITLE 'Crunch v2.4 15 Sept 1987' EXTRN PARSEU CSEG ; ;======================================================================= ; MEMPAG EQU 1A00H ; <=== set! [see comment near end of program] ; ;======================================================================= ; ;....................................................................... ; ; v2.4 Update Note: As explained in the CRUNCH24 general release .LBR, ; v2.4 will generate identical files (except embedded revision level ; byte) to CRUNCH v2.3. The great majority of the changes are user ; interface related, and are described in the CRUNCH24 documentation ; files. Some changes were made in the implementation of the "core" of ; the algorithms for both CRUNCH and UNCRunch - in the case of CRUNCH, ; conditionals were removed by splitting into three separate loops. In ; the case of UNCRunch, an unnecessary chase to the end of"virtual ; links" was eliminated by aborting the search as soon as an available ; reassignments lot is found. Other performance improving changes in- ; clude less time updating the screen and dynamic I/O buffer sizing. ; Non-time-critical "user-interface" changes (eg. the "tag mode" code, ; etc.) were coded in as straightforward a manner as possible, with ; little regard to code space minimization and even less to speed. ; ; While some documentation of the code has been cleaned up in the sev- ; eral month interim between the CRUNCH24.LBR release and the release of ; this source code, I have been very careful to avoid any temptation to ; change any of the code itself, thus insuring that this source code can ; be used to create the identical COM files included in the v2.4 release ; of CRUNCH. ; ;....................................................................... ; NO EQU 0 YES EQU NOT NO CRUNCH EQU YES ; Yes for CRUNCH, No for UNCR (for common) REV EQU 24H ; Program revision level SIGREV EQU 20H ; "Significant" revision level (compatibility) NOPRED EQU 0FFFFH ; "No predecessor" IMPRED EQU 07FFFH ; Pred that can't be matched or bumped SCRUPT1 EQU 03H ; Screen update speeds SCRUPT2 EQU 0FH ; ; --- Reserved codes --- ; EOFCOD EQU 100H ; EOF code RSTCOD EQU 101H ; Adaptive reset code NULCOD EQU 102H ; Null code SPRCOD EQU 103H ; Spare code ; ; --- Ascii equates --- ; CTRLC EQU 03H ; ^c BELL EQU 07H ; Beep BS EQU 08H ; Backspace LF EQU 0AH ; Linefeed CR EQU 0DH ; Carriage return ; ; --- CP/M address equates --- ; DFCB EQU 5CH ; Default FCB #1 DFCB2 EQU 6CH ; Default FCG #2 DDMA EQU 80H ; Default DMA address BDOS EQU 0005H ; BDOS entry point ; ; --- BDOS function equates --- ; CONIN EQU 1 ; Input a character from the console CONOUT EQU 2 ; Output single char to console PRTSTR EQU 9 ; Print string to console CONST EQU 11 ; Get console status GETVER EQU 12 ; Get CP/M version# SELDSK EQU 14 ; Select disk OPEN EQU 15 ; Open file CLOSE EQU 16 ; Close file SFIRST EQU 17 ; Search for first file SNEXT EQU 18 ; Search for next file ERASE EQU 19 ; Erase file READ EQU 20 ; Read file (sequential) WRITE EQU 21 ; Write file (sequential) MAKE EQU 22 ; Make file GETDSK EQU 25 ; Get currently logged drive SETDMA EQU 26 ; Set DMA address SETATR EQU 30 ; Set file attributes GSUSER EQU 32 ; Get/set user code RSTDRV EQU 37 ; Reset disk drive SETMS EQU 44 ; Set multi-sector count (CP/M+ only) ; ;....................................................................... ; ; Macros to facilitate "horizontal" movement through the table. ; See "Table structure" comment near "initbl" for more information. ; RIGHT1 MACRO LD A,H ; } ADD A,14H ; } Move "right" one column (same row) LD H,A ; } ENDM ; ;....................................................................... ; START: JP STRT ; <--- entry DB 'Z3ENV',01H ; ZCPR3 environment descriptor Z3ED: DB 00H,00H ; ;----------------------------------------------------------------------- ; Z3FLG: DB 0 ; ZCPR flag ARCHIV: DB 0 ; Archive bit mode flag INSREV: DB 23H ; Program rev for install purposes QUIFL: DB 0 ; Quiet mode flag NPROFL: DB 0 ; No prompt before overwrite flag TRBOFL: DB 0 ; Defeat multi-sector i/o flag CNFRFL: DB 0 ; Confirm every file flag WRMFLG: DB 0 ; Warm boot flag BIGFLG: DB 0 ; Bigger file prompt flag MAXDRV: DB 0FFH ; Maximum drive allowed by program MAXUSR: DB 0FFH ; Maximum user# allowed by program SPARFL: DB 0FFH ; Spare flag (or value) ; ;....................................................................... ; ; File type exclusion list. Must end with zero. ; |<-1->|<-2->|<-3->|<-4->|<-5->| EXTBL: DB 'ARC','ARK','LBR',0,0,0,0,0,0 ; |<-6->|<-7->|<-8->|<-9->|<10->| DB 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 DB 0 ; Must leave this terminating zero. ; ;=*=-=*=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-= ; CPYRT: DB 'CRUNCH v2.4 Copyright (c) S. Greenberg 09/15/87',CR,LF DB 'May be reproduced for non-profit use only','$' DB ' 201-670-8724' ; ;=*=-=*=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-= STRT: SUB A ; Z-80 test [RAF] JP PO,Z80 ; LD DE,WRNGUP ; "Program requires Z-80 Processor" JP MESS80 ; Special no frills exit w/ message Z80: LD (OLDSTK),SP ; Save os's stack LD SP,TOPSTK ; Set local stack CALL STRTUP ; Does a lot of stuff ; ;....................................................................... ; ; ***** Re-enter here for each matching file ***** ; ; General wildcard operation: When the program is first invoked, all ; matching filenames are put end to end in FNBUFF, 12 bytes each, in ; alphabetical order. Since a filename is only 11 characters long the ; spare byte, which precedes each filename, is used as a "tag/flag". By ; the time file processing starts (now), a number of routines have al- ; ready run (parts of the STRTUP routine). These routines set the tag- ; flag which indicates to us now in what manner the file should be pro- ; cessed: "00" = "skip it", "01" = process it", "02" = "perform a direct ; copy (if possible)", "FF" = "no more files". ; NXTFIL: LD SP,TOPSTK ; Reset SP LD A,(QUIFM) OR A CALL Z,CRLF ; Extra CR/LF if not in "quiet" mode LD DE,INFCB ; Input file's fcb CALL CLRFCB ; Init it to blanks and zeroes INC DE ; Leave "DE" pointing at "INFCB+1" for below LD HL,(BUFPTR) ; Pntr to name of next file from expansion bfr NXTSEL: LD A,(HL) OR A ; If zero, the file is "unselected" JR NZ,ISSEL ; Br if it is selected LD BC,12 ; Else just quietly skip to the next file ADD HL,BC LD (BUFPTR),HL JR NXTSEL ; ;............................... ; ; The file is "selected"; prepare to process it ; ISSEL: CP 0FFH ; (FF means done) JP Z,RETCCP ; Br if that is the case PUSH AF ; Save stat (to see if file is "excluded" blw) INC HL ; Skip to 1st filename char LD BC,11 ; Filename character count LDIR ; Put next file name into input fcb LD (BUFPTR),HL ; Save new pointer for next file CALL INTRAM ; Initialize all ram LD A,01H ; This loc req's diff init for crunch vs uncr LD (CSAVE),A ; Goes there POP AF ; Get file's status byte back again CP 02H ; 02 if file matched the "exclusion" list JR NZ,COPNIN ; If not, definitely attempt to compress it LD A,(WLDFLG) ; If so, see if prgm was invoked w/ wildcards OR A JR Z,COPNIN ; If not, go attempt compression LD A,(DIFDU) ; Else see if a direct copy is in order OR A ; (flag set if data flow is to distinct DU:'s) JP Z,NXTFIL ; If not, forget the whole thing ; ;....................................................................... ; ; Perform a direct straight copy of the file ; LD DE,DASHES ; "-----" for visual separation CALL MESAG2 JP COPY9 ; Performs the copy ; ;....................................................................... ; ; Normal Processing; Prepare to compress the input file. First, open ; the input file. A failure here is unusual, since the file existed at ; the time the filename expansion took place. There are "normal" series ; of events which could lead up to this, however. ; COPNIN: LD DE,DASHES ; "-----" for visual separation CALL MESAG2 CALL OPNIN ; Attempt to open the next input file "INFCB" JR NC,OPOK ; Br if ok LD DE,ERR1 ; "input file not found" JP SKIP1 ; Skip to next file OPOK: CALL GETC ; "gtlogc" needs initialization to get started JR NC,NOTMT ; If carry is set on 1st byte, file is empty LD DE,ERR0 ; "input file empty" JP SKP991 NOTMT: CP 76H ; If file starts with "76FF" or "76FF", it is JR NZ,NOTSQ ; - already crunched or squeezed respectively PUSH AF ; 1st byte was "76H", take advance peek at 2nd EXX ; Carefully check next byte, w/o norm call LD A,(HL) EXX INC A ; Well? JR Z,ALRDSQ ; Br if already squeezed INC A JR Z,ALRDCR ; Br if already crunched POP AF NOTSQ: LD (LIMBO),A ; Else ok; put 1st char there manually JR CBL ; Continue below ; ;....................................................................... ; ALRDCR: POP AF LD DE,MSGCR ; "Already crunched" JP SKP991 ALRDSQ: POP AF LD DE,MSGSQ ; "Agalready squeezed" JP SKP991 ; ;....................................................................... ; ; So far the input file is open. The output file is not. ; CBL: LD A,' ' ; For aesthetic alignment purposes CALL TYPE LD HL,INFCB ; Print input filename to console CALL PRNFIL LD DE,OUTFCB ; Now for the output fcb CALL CLRFCB ; Clear it INC DE ; Leave "DE" pointing at filename area CALL CPYNAM ; Copies filename from input fcb to output fcb LD HL,(OUTFCB+9) ; Get 1st & 2nd letters of ext for analysis LD A,' ' ; See if extension is blank CP L JR Z,FORZZZ ; If so, force an extension of "ZZZ" LD A,'Z' ; See if middle letter is "Z" CP H JR NZ,NZZZ ; Normal condition- simply force 2nd ltr to z ; ;............................... ; ; Middle letter is Z, use "ZZZ" if possible ; CP L ; Make sure it isn't "ZZZ" already! JR NZ,FORZZZ ; Ok... LD HL,(OUTFCB+10) CP H JR NZ,FORZZZ ; Ok... LD DE,ERR7 ; If this happens, user better rename his file JP SKP991 ; But give him a straight copy, anyway ; ;............................... ; NZZZ: LD HL,OUTFCB+10 ; Normal condition- force 2nd letter to "Z" JR NORMZ ; (note- "A" already has a "Z" in it) ; ;............................... ; FORZZZ: LD HL,OUTFCB+9 ; Come here if an extension of "ZZZ" is needed LD A,'Z' LD (HL),A ; (A future version will rename "XZY" files INC HL ; To "XZZ" rather than "ZZZ", a better idea) LD (HL),A INC HL ; ;............................... ; NORMZ: LD (HL),A ; ;....................................................................... ; ; Now open the output file. "OPNOUT" will check for duplicate filenames, ; and prompt if indicated. If carry is set on return, the file was not ; opened. DE points to an appropriate error message, if any. The rou- ; tine also types an arrow to the screen, followed by a "PRNFIL" call to ; type the DU: and filename to the screen. ; CALL OPNOUT ; Do all that JP C,SKIP2A ; Skips to next file if so deemed by "OPNOUT" ; ;....................................................................... ; ; Now both files are open. Eventually either both will be closed, or the ; input closed and the output deleted. CALL INITBL ; Initialize the lzw table LD A,76H ; Output the "76FE" header CALL OUTB ; Each call to "OUTB" outputs one byte LD A,0FEH CALL OUTB LD HL,INFCB ; Pointer to original (input) file's name CALL OUTFIL ; Embed it into the output file at bytes 2+ LD HL,STAMP ; Pointer to possible additional "stamp" chars IDOULP: LD A,(HL) ; Possibly get a stamp char INC HL ; Incr bfr pntr CALL OUTB ; (output at least one zero no matter what) OR A ; End of stamp bfr? JR NZ,IDOULP ; Loop till so LD A,REV ; Output revision level of this program CALL OUTB LD A,SIGREV ; Output "significant revision" level CALL OUTB XOR A CALL OUTB ; Output a checksum flag byte of zero LD A,5 CALL OUTB ; Output a spare byte of "5" ; ;....................................................................... ; LD A,(QUIFM) ; Print "heading" if in verbose mode OR A JR NZ,QUIET1 LD DE,HEADNG ; (the "in / out ca cr" stuff) CALL MESAGE QUIET1: LD IY,STATE0 ; Set the initial state of the "input machine" LD HL,NOPRED ; Initialize "pred" to "NOPRED" ; ;======================================================================= ; ; *** Main encoding loop *** ; ; "Match" will determine if the combination { , }, as ; supplied in { HL, A }, is already in the table. If it is, the match- ; ing index value is returned in DE. If it isn't, it will be added to ; the table in an appropriate location (assuming the table is not yet ; filled). If the table is filled, it may or may not still be added. ; The carry flag will be set to indicate when a match was NOT found. ; MAINLP: CALL GTLOGC ; A <-- next byte from "logical" input stream JR C,FINISH ; Branch on end-of-file MAINL2: CALL MATCH ; Is { pred, suffix } in the table? JR NC,FOUND ; Branch if found CALL OUTPUT ; If not, output that pred (still in hl) LD HL,RSTFLG ; See if an adaptive reset has been requested SRL (HL) ; Check (& zero if set) the adaptive rst flag LD HL,NOPRED ; Meanwhile, reset pred to "NOPRED" JR NC,MAINL2 ; Loop without getting another char (normally) JP ADPRST ; (unless an adaptive reset was indicated) FOUND: EX DE,HL ; Match- discard old pred & replace with new JP MAINLP ; Get a new character and loop ; *** End of main encoding loop *** ;....................................................................... ; ; --- End-of file processing --- ; FINISH: CALL PREINC ; Update the count for the upcoming output CALL OUTPUT ; Output the "leftover" code CALL PREINC ; Update again LD HL,EOFCOD ; Send an (otherwise disallowed) "EOF" code CALL OUTPUT ; That does that LD A,(CSAVE) ; Get the var that accumulates bits until 8 CP 01H ; The 1 out of 8 chance we're on a byte bndry JR Z,ONBND XOR A CALL OUTB ONBND: LD A,(CHKSUM+0) ; Now output the checksum CALL OUTB ; (lo byte) LD A,(CHKSUM+1) CALL OUTB ; (high byte). This completes all output. CALL DONE ; Writes out partial output bfr, thru cur loc CALL CLSOUT ; Close the output file CALL CLSIN ; Close the input file (prevents inadvertent ; Accumulation of open files). ; ;....................................................................... ; ; Now we are done with the file. The size of the resulting file will be ; compared with the original. If the resulting file is larger, the file ; will be erased and the original will be copied in uncompressed format ; instead. This will only be done if the source and destination DU:'s ; are different (obviously a direct copy to the same drive and user is ; nonsensical). When this is the case, the user will be given the op- ; tion of saving the "crunched" file - if he doesn't, then it will be ; erased. ; LD A,(BIGFLG) ; Get size question override flag AND A ; Check if non-0, clear carry at same time JP NZ,NEXT ; Skip if bigger LD DE,(INCTR) ; Size of input file LD HL,(OUTCTR) ; Size of resulting file SBC HL,DE ; Compare JP C,NEXT ; (normally the case) LD A,(DIFDU) ; Dest du: differ from origin? OR A JP Z,ASKHIM ; If not, give option of saving larger file LD DE,MSG998 ; "not smaller..." CALL MESAG2 CALL ERAOUT ; Erase the output file COPY9: CALL COPY ; Perform a straight copy JP C,NXTFIL ; If the copy did not actually take place JR NEXT ; If it did, count it ; ;....................................................................... ; SKP991: CALL MESAG2 ; Type predefined message LD A,(DIFDU) ; Dest du: differ from origin? OR A JP Z,NXTFIL JR COPY9 ; ;....................................................................... ; ASKHIM: LD DE,QUES1 ; Result file not smaller than original CALL MESAGE ; Ask the guy if he wants it anyway CALL RSPNSE ; Get his response PUSH AF ; Nec? CALL CRLF POP AF JR NZ,SKIP4A ; "skip4a" erases output file, goes to next ; ;....................................................................... ; NEXT: LD HL,NFP ; Increment #of files processed INC (HL) CALL ARCIT ; Flag input file as archived JP NXTFIL ; Repeat if still more files ; ;....................................................................... ; ;............................... ; SKIP1: CALL MESAGE ; Entry if neither input nor output files ; have been opened yet SKIP1A: JP NXTFIL ; (Entry here if no error text desired) ; ;............................... ; SKIP2: CALL MESAGE ; Entry here if only input file open SKIP2A: CALL CLSIN ; (Entry here for no message) JP NXTFIL ; ;............................... ; SKIP3: CALL MESAGE ; Entry here if both input and output ; files need to be closed SKIP3A: CALL CLSIN CALL CLSOUT JP NXTFIL ; ;................................ ; SKIP4: CALL MESAGE ; Entry here to erase output & close input file SKIP4A: CALL CLSOUT ; (Entry here for no message) LD DE,OUTFCB ; Close, then erase output file LD C,ERASE CALL BDOSAV CALL CLSIN ; Close input file as well JP NXTFIL ; ;............................... ; ;....................................................................... ; ; --- Perform an adaptive reset --- ; ADPRST: LD (SAVSUF),A ; Save the suffix which has yet to be output LD HL,RSTCOD ; Send an (otherwise disallowed) reset code CALL OUTPUT LD HL,0000 ; Reset entry# prior to table re-initialization LD (ENTRY),HL LD (TTOTAL),HL ; Also reset "codes reassigned" to zero XOR A LD (FULFLG),A ; Reset the adaptive reset flag back to zero CALL INITBL ; Re-initialize the entire lzw table LD A,9 ; Reset the code length to "9" LD (CODLEN),A ; LD (CODLE0),A ; This gets that also LD A,02H ; Reset the target mask value accordingly LD (TRGMSK),A ; LD A,0FFH ; Init the target compression ratio to max LD (LOWPER),A ; Goes there LD HL,NOPRED ; Set pred to "nopred" LD A,(SAVSUF) ; Restore the suffix char, patiently waiting JP MAINL2 ; And continue where we left off ; ;======================================================================= ; ; Find a match for { }, as supplied in { HL, A }. Does ; one of the following two things: ; ; (1) Returns the index# of a match in DE, with carry flag clear ; (2) Sets carry flag & adds new combo to to the appropriate place in ; "table". ; ENTERX: MATCH: LD B,A ; Suffix will stay in b for the duration LD A,(FULFLG) ; Use separate search loop if table full OR A ; Is it? JP NZ,MATCH2 ; Yes, use "match2" rather than "match1" ; ;....................................................................... ; ; "Match1": Table is not yet full; find a matching entry or else make a new ; one in the next available location. No code reassingnment here. ; PUSH HL PUSH HL ; This will be popped into "DE" below CALL HASH ; Get initial hash value into "HL" POP DE ; "de" <-- "pred" (pushed as hl above) MTCHL1: LD C,H ; C <-- extra copy of h LD A,(HL) ; Check if any entry exists at that location CP 80H ; "80" is indicative of an empty entry JP Z,EMPT11 ; If empty, use the spot to create a new entry JR NC,SKIPD1 ; If carry, must be "FF"- leave it alone AND 0DFH ; Else mask out flag bit (5) before matching SKIPD1: CP D ; Does entry match pred (hi) ? JR NZ,NM1 ; Br if not RIGHT1 ; Move to pred (lo) LD A,E CP (HL) ; Match? JR NZ,NM1 ; Oh well RIGHT1 ; Alright then, move to suffix LD A,B CP (HL) ; Well? JR NZ,NM1 ; 2 out of 3 aint bad ; ;....................................................................... ; ; Match found! Return the entry# (from the next two columns of the ; table). ; RIGHT1 ; To entry#, hi-byte LD D,(HL) ; Get it RIGHT1 ; Move to entry#, lo byte LD E,(HL) ; Get that LD H,C ; Normalize. (ie reverse all those "right"'s) SET 5,(HL) ; Flag the entry as "referenced" with this bit LD A,B ; Restore "a" to its value on entry POP HL ; Likewise "HL" (won't be used, but gotta pop) AND A ; Clear carry flag (return status) RET ; And return ; ;....................................................................... ; ; Match not found. Perform standard hash collision processing and try ; again. Add "DISP", a variable displacement value, for the "secondary ; probe". DISP was pre-calculated at the time the original hash value ; was computed. ; NM1: LD H,C ; Normalize to beg of entry. PUSH DE ; Save target values in d & e LD DE,(DISP) ; Get pre-computed displacement value ADD HL,DE ; Add displacement to current physical loc LD A,H CP TABLHI ; And check for looping back to beg of table JR NC,NC91 ; (br if no loop) LD DE,5003 ADD HL,DE ; Else 5003 for loop around NC91: POP DE JP MTCHL1 ; Repeat to see if this "link" matches ; ;....................................................................... ; ; All "links" to the hashed entry have been checked and none have ; matched. Since the table is not full, we make a new entry at this ; unused location. ; EMPT11: LD (HL),D ; Put in pred (high) RIGHT1 LD (HL),E ; Pred (low) RIGHT1 LD (HL),B ; Suffix LD DE,(ENTRY) ; Now put the entry's number next to the entry RIGHT1 ; Move to entry# (lo) column LD (HL),D ; Put that in RIGHT1 LD (HL),E ; Likewise entry# (hi) CALL PREIN2 ; Increments "ENTRY" and associated stuff SCFRET: SCF ; Set carry to indicate new entry (no match) LD A,B ; Return with carry set and "HL" & "A" intact POP HL RET ; ;....................................................................... ; ; Subroutine to pre-incr for next code. Called from various places in ; these loops. ; PREINC: LD DE,(ENTRY) ; Pre-incr for next code. PREIN2: INC DE ; (entry here if "DE" already = "ENTRY") LD (ENTRY),DE ; Save the new value LD A,(TRGMSK) ; See if new code length is necessitated CP D ; Check hi-byte against target value RET NZ ; Simply return if not SLA A ; Yes, code length will change LD (TRGMSK),A ; Next target mask LD A,(CODLEN) ; Previous code length value (#of bits) INC A ; Incr code length CP 13 ; Too long? JR Z,FLAGFL ; Yes, this means table just filled. LD (CODLEN),A ; Else just update new length RET ; And return FLAGFL: LD A,0FFH ; If table just filled, flag this fact LD (FULFLG),A ; ( = "FF" ) RET ; And return w/o updating "CODLEN" past 12 ; ;----------------------------------------------------------------------- ; ; "Match2": This loop is executed after the table is full. Continue ; search searching until a match is found. If no match, but the entry ; is suitable for reassignment, save the position and do further search- ; ing in "Match3" loop below which skips the "reassingnment suitability" ; stuff since the candidate slot has already been found. ; ; This loop used after table is full ; MATCH2: PUSH HL ; This save for the benefit of the "caller" PUSH HL ; This will be popped into "DE" below CALL HASH ; Get initial hash value into "HL" POP DE ; "de" <-- "pred" (pushed as hl above) ; ;....................................................................... ; MTCHL2: LD C,H ; C <-- extra copy of h LD A,(HL) ; Check if any entry exists at that location CP 80H ; "80" is indicative of an empty entry JP Z,SCFRET ; Nothin doin' JR NC,SKIPD2 ; If so, leave "FF" intact for matching process AND 0DFH ; Else mask out flag bit (5) before matching SKIPD2: CP D ; Does entry match pred (hi) JR NZ,NM2 ; Branch if not RIGHT1 ; Move to pred (lo) LD A,E CP (HL) ; Match? JR NZ,NM2 ; Oh well RIGHT1 ; Alright then, move to suffix LD A,B CP (HL) ; Well? JR NZ,NM2 ; 2 out of 3 ain't bad ; ;....................................................................... ; ; We have a match! But there is one very important "but" - if the table ; is full and we are in "code reassignment" mode, we must pre-empt the ; possibility of generating the WsWsW *** string here in the cruncher. ; This is because it is impossible to detect these in the uncruncher ; once all codes are defined. ; LD A,(LPR+0) ; If so, see if this whole pred/suffix combo CP E ; - is identical to the last one generated JP NZ,NTUGLY ; Pred (lo) doesn't match, so everything's ok LD A,(LSUFX) ; Check suffix. the order of these 3 checks CP B ; - is intended to optimized speed (most JP NZ,NTUGLY ; - likely "non-matches" first) LD A,(LPR+1) ; 2 out of 3 same- check pred (hi) CP D JR Z,NM2 ; Ugly situation-- pretend there's no match ; ;....................................................................... ; NTUGLY: RIGHT1 ; A good match! LD D,(HL) ; Get the entry# for return. RIGHT1 ; Move to entry#, lo byte LD E,(HL) ; Get that LD H,C ; Normalize. (ie reverse all those "right"'s) SET 5,(HL) ; Flag the entry as "referenced" with this bit LD A,B ; Restore "a" to its value on entry POP HL ; Likewise "HL" AND A ; Clear carry flag (return status) & return RET ; ;....................................................................... ; NM2: LD H,C ; No match yet. normalize to beg of entry. BIT 5,(HL) ; Is entry is available for poss re-assignment? JR NZ,NAVAIL ; Branch if not LD (AVAIL),HL ; Else this physical loc is the candidate JP NAVAI3 ; And jump into the "Match3" loop below ; ;....................................................................... ; ; Standard hash collision processing. Add "DISP", a variable displace- ; ment value, for the "secondary probe". DISP was conveniently pre- ; calculated at the time the original hash value was computed. ; NAVAIL: PUSH DE ; Process standard hash collision. LD DE,(DISP) ; Get pre-computed displacement value ADD HL,DE ; Add displacement to current physical loc LD A,H CP TABLHI ; And check for looping back to beg of table JR NC,NC92 ; (Branch if no loop) LD DE,5003 ADD HL,DE ; Else 5003 for loop around NC92: POP DE JP MTCHL2 ; Repeat to see if this "link" matches ; (end of "Match2") ; ;======================================================================= ; ; "Match3": Like "Match2" above, but don't bother checking for a reas- ; signable entry, we already have one. If all matches fail, perform ; that reassingment. ; MTCHL3: LD C,H ; C <-- extra copy of h LD A,(HL) ; Check if any entry exists at that location CP 80H ; "80" is indicative of an empty entry JP Z,EMPTY3 ; If empty, use the spot to create a new entry JR NC,SKIPD3 ; If so, leave "FF" intact for matching process AND 0DFH ; Else mask out flag bit (5) before matching SKIPD3: CP D ; Does entry match pred (hi) JR NZ,NM3 ; Branch if not RIGHT1 ; Move to pred (lo) LD A,E CP (HL) ; Match? JR NZ,NM3 ; Oh well RIGHT1 ; Alright then, move to suffix LD A,B CP (HL) ; Well? JR NZ,NM3 ; 2 out of 3 ain't bad ; ;....................................................................... ; ; We have a match! But there is one very important "but" - if the table ; is full, and we are in "code reassignment" mode, we must pre-empt the ; possibility of generating the WsWsW *** string here in the cruncher. ; This is because it is impossible to detect these in the uncruncher ; once all codes are defined. ; LD A,(LPR+0) ; If so, see if this whole pred/suffix combo CP E ; - is identical to the last one generated JP NZ,NTUGL3 ; Pred (lo) doesn't match, so everything's ok LD A,(LSUFX) ; Check suffix. the order of these 3 checks CP B ; - is intended to optimized speed (most JP NZ,NTUGL3 ; - likely "non-matches" first) LD A,(LPR+1) ; 2 out of 3 same- check pred (hi) CP D JR Z,NM3 ; Ugly situation-- pretend there's no match ; ;....................................................................... ; NTUGL3: RIGHT1 ; A good match! LD D,(HL) ; Get the entry# for return. RIGHT1 ; Move to entry#, lo byte LD E,(HL) ; Get that LD H,C ; Normalize. (ie reverse all those "right"'s) SET 5,(HL) ; Flag the entry as "referenced" with this bit LD A,B ; Restore "a" to its value on entry POP HL ; Likewise "HL" AND A ; Clear carry flag (return status) & return RET ; ;....................................................................... ; NM3: LD H,C ; No match yet, normalize to beg of entry. ; ; Standard hash collision processing. Add "DISP", a variable displace- ; ment value, for the "secondary probe". DISP was conveniently pre- ; calculated at the time the original hash value was computed. ; NAVAI3: PUSH DE ; Process standard hash collision. LD DE,(DISP) ; Get pre-computed displacement value ADD HL,DE ; Add displacement to current physical loc LD A,H CP TABLHI ; And check for looping back to beg of table JR NC,NC93 ; (Branch if no loop) LD DE,5003 ADD HL,DE ; Else 5003 for loop around NC93: POP DE JP MTCHL3 ; Repeat to see if this "link" matches ; ;....................................................................... ; ; All "links" to the hashed entry have been checked and none of them ; have matched. We therefore make a new entry. ; EMPTY3: LD HL,(TTOTAL) ; Incr "codes reassigned" ("cr") INC HL LD (TTOTAL),HL LD HL,(AVAIL) ; Was defined during "Match2" loop LD (LPR),DE ; Save last entry made for "ugly" detection LD A,B ; LD (LSUFX),A ; "lpr" <-- last pred, "lsufx" <-- last suffix LD (HL),D ; Re-assign the entry. leave it's # alone. RIGHT1 LD (HL),E ; Pred (low) RIGHT1 LD (HL),B ; Suffix JP SCFRET ; ;======================================================================= ; ; Insert the pred now in HL into the output stream. ; OUTPUT: PUSH AF ; Save caller's "A" ADD HL,HL ; Must always pre-shift left at least 4 times ADD HL,HL ; (for case of left justifying 12 bit codes) ADD HL,HL ; 3 of those are done here. LD A,(CODLE0) ; Compute number of additional pre-shifts (+1) LD C,A ; This value is (13 - codelength) NEG ; Also leave code length in "C" for use below ADD A,13 ; (the +1 simply ensures at least one execution LD B,A ; - of the loop below) ADDHLP: ADD HL,HL ; Additional necessary pre-shifting DJNZ ADDHLP LD A,(CSAVE) ; Get "leftover" bits from last time LD B,C ; Put code length, still in "C", in "B" PUTLP1: ADD HL,HL ; Now we start shifting out bits for real RLA ; Bits coming out of "HL" go into "A" JR NC,ENDLP1 ; Skip if not time to dump the contents CALL OUTB ; Dump when necessary LD A,01H ; Re-init to flag bit only ENDLP1: DJNZ PUTLP1 ; Loop for as many bits as need to be output LD (CSAVE),A ; Leftover bits get saved here LD A,(CODLEN) ; "codle0" is always equal to "codlen" delayed LD (CODLE0),A ; -by one code output call. update here. POP AF ; Restore callers "A" & return RET ; ;======================================================================= ; ; Subroutine gets a character from the input stream and adds its value ; to running checksum. ; GETC: CALL GETCHR ; Get a character into A RET C ; Don't add in the garbage char recv'd on eof CALL CKSUM ; Add it in AND A ; Guarantee clear carry when no eof RET ; That's it ; ;======================================================================= ; ; Subroutine to initialize the table to contain the 256 "atomic" entries ; { "NOPRED", }, for all values of from 0 thru 255. ; INITBL: CALL PRESET ; "pre-initializes" the table (mostly zeroes) XOR A ; Start with 0 INITLP: PUSH AF LD HL,NOPRED ; Will stay at this value for all 256 loops CALL ENTERX ; Make the entry { hl, a } POP AF INC A ; Incr the suffix char JR NZ,INITLP ; Loop 256 times LD HL,IMPRED ; "impossible pred". Not bumpable or matchable. CALL ENTERX ; Reserve entries 100h thru 103h CALL ENTERX ; (namely eof, reset, null & spare) CALL ENTERX CALL ENTERX XOR A ; Put this back to zero for normal execution LD (FFLAG),A RET ; ;....................................................................... ; ; Low-level pre-preset called from INITBL above ; PRESET: LD HL,TABLE ; Beginning of table (1st entry, first column) LD DE,TABLE+1 LD A,80H ; Initialize whole 1st column to empty flags LD BC,1400H LD (HL),A ; Initialize 1st location LDIR ; And the rest LD (HL),0 ; Next 4 x 1400h locs all get zeroes LD BC,4*1400H-1 ; "-1" so we don't go one too far LDIR RET ; ;----------------------------------------------------------------------- ; ; Hash subroutine. ; ; Notes about the hashing. The "open-addressing, double hashing" scheme ; used, where the actual codes output are the logical entry#, contained ; in the table along with the entry itself, would normally make the ; codes output independent of the exact hashing scheme used (codes are ; simply assigned in order, their physical location is irrelevant). ; However, with code reassignment implemented, the re-assignments are ; obviously not made in any particular order and are hash function de- ; pendent. Thus the hash function must not be changed. ; ; Called with pred in HL (3 nybble quantity) and suffix in A (2 nybbles). ; Exclusive OR's the upper 2 nybbles of the pred with the suffix for the ; two least significant nybbles of the result. The lower nybble of the ; pred becomes the highest of 3 nybble result. Adds one to that as well ; as the table offset, resulting in a usable address, returned in HL. ; Also compute "DIFF", the secondary hash displacement value, as a nega- ; tive number. ; HASH: LD A,B LD E,L ; Save so low nybble of pred can be used below ADD HL,HL ADD HL,HL ADD HL,HL ADD HL,HL ; Shift whole pred value left 4 bits XOR H ; Xor hi-byte of that with suffix LD L,A ; Goes there as lo-byte of result LD A,E ; Get pred(lo) saved above AND 0FH ; Want only low nybble of that ADD A,TABLHI ; Convenient time to add in table offset LD H,A ; Goes here as hi-byte of result INC HL ; Except add one. this eliminates poss. of 0. PUSH HL ; Save hash val for return LD DE,-5003-TABLE ; Compute displacement value, - (5003-hash) ADD HL,DE ; (displacement has table offset removed again) LD (DISP),HL ; Secondary hashing value, a negative number. POP HL ; Get back orig hash address RET ; And return it ; ;----------------------------------------------------------------------- ; ; Like "PRNFIL", but send chars to the output stream instead of typing. ; This routine WILL explicitly output blanks in the filename extension. ; OUTFIL: LD BC,0C20H ; B = loop counter, c = blank character CHARL2: INC HL ; Pre-incr pointer LD A,(HL) ; Get a char CP C ; Blank? JR Z,SKPTY2 ; Suppress them (but not in the .ext) TYPEI2: CALL OUTB ; Send char to the output stream SKPTY2: DEC B ; Loop counter RET Z ; Return when done LD A,B ; Check loop counter CP 4 ; At this point, type a "." JR NZ,CHARL2 LD A,"." ; This is also a convenient char to set "C" to LD C,A ; A "." cannot be found in an fcb filename JR TYPEI2 ; Type the ".". do no incr hl. ; ;======================================================================= ; ; Return one "logical" character from the input stream. The logical in- ; put stream consists of the characters from the physical (actual) input ; stream after RLL (repeat byte) encoding has been performed. ; GTLOGC: PUSH HL ; Call "GETLOG" from here if hl must be saved CALL GETLOG POP HL RET ; ;....................................................................... ; ; Entry here similar to "GTLOGC" (above) except HL is not saved. ; ; --- Common entry point for all states --- ; GETLOG: LD A,(LIMBO) ; Last physical character read, hasn't been outputed yet LD D,A ; (All states want "limbo" in "d") JP (IY) ; Go to the appropriate state ; ;....................................................................... ; ; normal state ; STATE0: CALL GETC ; Get next byte from physical input stream JR C,EOF ; Branch if no more data CP 90H JR Z,IS90H CP D ; Compare to last char JR Z,SWTO1 ; Br if same. will change to RETURN: LD (LIMBO),A ; Update "limbo" with new byte just read LD A,D ; And output the old val of "limbo" AND A ; Clear carry flag RET ; Return, leaving at ; ;............................... ; SWTO1: LD IY,STATE1 ; Set next state to RET ; Need not update "limbo" or ld a, (are same) ; ;............................... ; EOFS: DEC D ; (Entry here if "d" contained a count) EOF: LD IY,STATEX ; Set next state to (spec. eof state) LD A,D AND A ; Return with clear carry one more time RET ; ;............................... ; IS90H0: DEC D ; (Entry here if "d" contained a count) IS90H: LD IY,STA9A ; Set next state to (spec. 90 state) LD A,D AND A RET ; ;....................................................................... ; ; A second occurrence of the same character has already been ; detected. So far only one occurrence has been output. ; STATE1: CALL GETC ; Get new byte from input stream JR C,EOF CP 90H ; (repeats of 90h cannot be packed) JR Z,IS90H CP D ; Another repeat (3rd contiguous occurrence)? JR Z,SWTO2 ; If so, switch to LD IY,STATE0 ; Else switch back to JR RETURN ; Rest is same as above ; ;............................... ; SWTO2: LD A,90H ; Don't get any new input now, but output "90H" LD IY,STATE2 ; Change to RET ; ;....................................................................... ; ; Three contiguous occurrences of a byte been detected. The ; byte itself and the 90H have already been output. Now it is time to ; suck up characters (up to 255 of them). ; STATE2: LD E,D ; Byte to be matched will be kept in e LD D,3 ; Init d, repeat byte counter, to 3 RPTLP: CALL GETC ; Get next byte INC D ; & incr repeat byte counter JR C,EOFS ; Branch on EOF from "GETC" call JR Z,RETRN3 ; In case of more than 255 contig occurrences CP 90H JR Z,IS90H0 ; Branch out if 90h is encountered CP E ; Still the same? JR Z,RPTLP ; Loop if so ; ;............................... ; RETRN3: DEC D ; Adjust count LD IY,STATE3 ; Change to (final state) JR RETURN ; Rest is same as above ; ;....................................................................... ; ; Like State zero, but don't look for a match (because the ; last byte output was a count). ; STATE3: CALL GETC ; Get next character JR C,EOF ; Branch on end-of-file CP 90H JR Z,IS90H ; Branch if 90h encountered LD IY,STATE0 ; Next state will be "0" JR RETURN ; Rest is same as above ; ;....................................................................... ; ; 90H has been encountered, byte before it has been output. ; Now output 90H, next output "0". ; STA9A: LD A,90H ; Note this state doesn't get another phys char LD IY,STA9B ; Next state will be outputs the "0" AND A ; Be sure to return with clr carry flag RET ; ;....................................................................... ; ; 90H has been encountered, & 90H has been output. Now output "00" ; STA9B: CALL GETC ; Get next physical char for "limbo" LD D,1 ; Will get decr'd and cause a zero output JR C,EOFS ; (Branch on end-of-file) CP 90H ; JR Z,IS90H0 ; Branch if another 90h is encountered JR RETRN3 ; Rest is same as above ; ;....................................................................... ; ; EOF has been encountered, and all bytes have been output. ; Set carry flag and return. STATEX: SCF ; As described above. RET ; ;======================================================================= ; ; "Stamp" processing. ; PRCSTM: PUSH DE ; Called w/ "HL" pointing to text of "stamp" LD DE,STAMP ; Buffer for holding the date stamp or text LD B,7FH ; Put a limit on its length STMPLP: LD A,(HL) ; Get a character LD (DE),A ; Put it in the buffer INC DE OR A ; Zero denotes end of cmnd tail, ending stamp JR Z,PRCDN1 INC HL SUB ']' ; The "proper" way the stamp should end JR Z,PRCDN2 DJNZ STMPLP ; Get more chars ERR8: LD DE,PRSER8 ; Stamp overflow, probably impossible JP FATALU PRCDN1: POP DE ; Come here if null terminated the stamp RET ; Return with the null in "A" & z set PRCDN2: LD (DE),A ; Make sure a null (a has one now) gets here NBLP: LD A,(HL) ; Advance to first non-blank after stamp CP ' ' JR NZ,NBC ; Branch if we have one INC HL ; Else advance DJNZ NBLP ; And continue JR ERR8 ; Overflow error NBC: POP DE ; Rtn with "HL" pointing to 1st non-blank char OR A ; (Return z stat if that character is null) RET ; ;....................................................................... ; ; Flag files matching the "exclusion list" ; EXCLUD: LD BC,12 ; Leave 12 in bc for incrementing ix LD IX,FNBUFF ; Points to beg of filenames OUTLP: LD A,(IX+0) ; Get flag byte for this entry CP 0FFH ; Final [non-] entry? RET Z ; (return if so) OR A ; Is it an untagged filename? JR Z,NXTFN ; If so, leave it that way & move to next LD HL,EXTBL-3 ; Beginning of "exclusion" list INRLP0: INC HL INRLP1: INC HL ; (If HL already incremnted once) INRLP2: INC HL ; ( " " twice) LD A,(HL) ; Get a char from list OR A ; End of list? JR Z,NXTFN ; If so, move on to next filename in "fnbuff" CP '?' ; Wildcard? JR Z,AUTOM1 ; Yes, automatically matches CP (IX+9) ; Else see if it matches first ft char JR NZ,INRLP0 ; No match, forget it and move to next filename AUTOM1: INC HL LD A,(HL) ; Repeat twice more for other 2 chars CP '?' JR Z,AUTOM2 CP (IX+10) JR NZ,INRLP1 AUTOM2: INC HL LD A,(HL) ; As above CP '?' JR Z,AUTOM3 CP (IX+11) JR NZ,INRLP2 AUTOM3: LD A,02H ; File type matches; flag file as "excluded" LD (IX+0),A NXTFN: ADD IX,BC ; Move to next filename in "fnbuff" JR OUTLP ; ;======================================================================= ; ; All ASCII centralized here as a service to disassembly hobbyists. ; VUNITS EQU (REV/16)+'0' ; Version, units dig, in ascii VTNTHS EQU (REV AND 0FH)+'0' ; Version, tenths dig, in ascii INTRO: DB 'GEL Cruncher v',VUNITS,'.',VTNTHS,CR,LF,'$' ERR7: DB ' [ Can''t crunch .ZZZ files ]$' MSGCR: DB ' [ Already crunched ] $' MSGSQ: DB ' [ Already squeezed ] $' MSG998: DB CR,LF,' [ Result not smaller ] $' QUES1: DB 'Result not smaller. Save anyway? : ',BELL,'$' USAGE: DB CR,LF,LF,'Usage:',CR,LF,LF DB ' Filename Date, etc. ' DB 'Option letters',CR,LF DB ' / / /',CR,LF DB 'CRUNCH {du:} {du:} { [id] } { / }' DB CR,LF DB ' \ \ |',CR,LF DB ' Source Destination (space)' DB CR,LF,LF DB ' is up to 4 letters immediately following ' DB 'a " /".',CR,LF DB ' "Q" = Quiet mode "C" = Confirm (tag) mode', DB CR,LF DB ' "O" = Overwrite mode "A" = Archive bit mode',CR,LF DB ' Option letters toggle (reverse) the corresponding ' DB 'default setup.',CR,LF DB CR,LF DB ' Both "du:" are of form DU:, UD:, D:, or U:',CR,LF DB ' "[id]" is date or any text enclosed in "[ ]".',CR,LF DB CR,LF DB ' Everything is optional except filename.',CR,LF,'$' ; ;======================================================================= ; ; ** Include file begins here ** ; INCLUDE COMMON.LIB ; ; ** Include file ends here ** ; ;======================================================================= ; ; Additional misc ram locs which need not be initialized, or are init- ; ialized by the routines which use them. ; LIMBO: DS 1 ; Storage for 1 char in pipeline delay AVAIL: DS 2 LPR: DS 2 LSUFX: DS 1 SAVSUF: DS 1 FFLAG: DS 1 CSAVE: DS 1 ; ;............................... ; SAFETY: DS 16 ; Safety region beyond stack limit check ENDPRG EQU $ ; (approx bottom of stack) ; ;_______________________________________________________________________ ; STKSZ EQU 8 ; Minimum stack size (pages) IBUFSZ EQU 8 ; Input buffer size (pages) ; ;======================================================================= ; ; ===> All tables will begin at "MEMPAG", defined at the top of the ; program. This should be set to a page aligned value i.e., ad- ; dress that ends in "00") which is ABOVE the end all program and ; data segments. You may have to do one test link to determine the ; proper value (changing "MEMPAG" will not change the length of the ; segments on the subsequent link). ; ; "MEMPAG" is defined at the beginning of this program to remind you to ; set it properly. If you set it higher than necessary, there will be ; no negative effect other than an increase in the TPA required to run ; the program. If you set it too low, you will be in big trouble. The ; value must be set manually because most linkers cannot resolve an ; "and", "shift" or "hi" byte extraction at link time to determine the ; page boundary. ; ;======================================================================= ; ; "MAXFLS" is buffer size (in files) for wildcard expansions. Room for ; this many files will be allocated. ; MAXFLS EQU 256 TOPSTK EQU MEMPAG+(STKSZ*256) ; Top of stack IBUF EQU TOPSTK ; (= beginning of input buffer) EIBUF EQU IBUF+(IBUFSZ*256) ; End of input buffer TABLE EQU EIBUF ; (= beginning of table) EOTBL EQU TABLE+(5*20*256) ; End of table FNBUFF EQU EOTBL ; (= beginning of wildcard expansion buffer) ENDFNB EQU FNBUFF+(12*MAXFLS) ; End of expansion buffer STAMP EQU ENDFNB ; File "stamp" buffer ** size temp *** ENDALL EQU STAMP+100H ; End of everything, except output buffer OBUF EQU ENDALL ; Beginning of dynamically sized output buffer ; ;----------------------------------------------------------------------- ; IBUFHI EQU HIGH IBUF ; Input buffear address, high byte (low byte = 0) EIBFHI EQU HIGH EIBUF ; End of input buffer address, high byte, likewise TABLHI EQU HIGH TABLE ; Beginning of table, high byte, likewise ETBLHI EQU HIGH EOTBL ; End of table, high byte, likewise EFNBHI EQU HIGH ENDFNB ; End of expansion buffer, likewise ENDHI EQU HIGH ENDALL ; OBUFHI EQU HIGH OBUF ; Output buffer addrress, high byte likewise END