title 'UNC & UNCREL uncrunch module' ; Original version by Steven Greenberg. Revised by C.B. Falconer to: ; Execute on 8080/8085/V20 Check for memory availability ; Correct "sigrev" check Correct error responses. ; Generally compacted code module. ; Entry UNC avoids input file rewind, i.e. after initial 0 read. ; Entries CODES and TROOM allow monitoring codes assigned/re-assigned ; (for version 2). TROOM is also version 1 codes free. ; ; Following kept in separate library module. Thus linking without ; search will include it, but with search removes from object. ; db 'Copyright (c) 86/11/24 by' ; db ' Steven Greenberg 201-670-8724' ; db ' and C.B. Falconer 203-281-1438.' ; db ' May be reproduced for non-profit use only.' ; ; error codes (0 for no error) version equ 1; newer uncrunch version required isnotcr equ 2; File is not crunched fouled equ 3; File is fouled memory equ 4; Memory or stack overflow ; ver equ 21; Kept at UNCREL-1 for referance ; ; move right n columns, same row right macro n mov a,h adi n*10h mov h,a endm ; memtop equ 6; CPM keeps top of memory avail here ; sigrev equ 20h; significant revision level impred equ 07fffh; impossible pred, can never be matched nopred equ 0ffffh; no predecessor code vacant equ 080h; value for a vacant entry guard equ 07fh; protect table entry from use crnchid equ 0fe76h; initial crunched file id wd (lsb 1st) escape equ 090h; repeated byte encoding ; ; For version 2 algorithm initw equ 9; initial cell width maxwide equ 12; max width of cells tblsize equ 5003 ; ; Version 2 special codes eofcod equ 0100h rstcod equ 0101h; adaptive reset signal nulcod equ 0102h; NOP sprcod equ 0103h; spare for future use ; n01 equ 1 n02 equ 2 n08 equ 8 n0f equ 000fh n10 equ 0010h n14 equ 0014h n20 equ 0020h n28 equ 0028h n30 equ 0030h ndf equ 00dfh nfe equ 00feh nff equ 00ffh t0fff equ 0fffh t1000 equ 1000h t2000 equ 2000h t2800 equ 2800h t4000 equ 4000h ; ; TABLE STRUCTURE: The 4096 x 5 table is realized as five in- ; dividual 4096 by one tables. The first (and hence the rest) are ; page aligned. This organization simplifies address calculation ; and "vertical" searching at the expense of slightly increased ; complexity when moving "horizontally" through the table. More ; specifically, converting an entry# (index#) to an address is ; simply a matter of adding "hitabl" (table base addr, hi-byte) to ; the hi-byte of the index#. Subtraction reverses the process- no ; multiplication or divide by five necessary. Horizontal movement ; is accomplished by adding 10H, 20H, 30H, or 40H to move "right" 1 ; to 4 columns respectively. Obviously subtracting the same amounts ; moves "left". ; ; For discussion purposes, however, the table should be ; thought of as having 4096 rows (entries) of five bytes each ; (columns). For any given row, the first two columns contain a ; "pred" value (hi-byte, then lo), the 3rd column contains a "suf- ; fix" char, and the last two columns may contain a link to another ; entry (again, hi-byte first). The link is not conceptually part ; of the algorithm but is needed to resolve hash collisions. Note ; that both the "pred" and the "link" columns contain a value re- ; ferring to another entry (or contain a null reference). In the ; case of the "pred", the number entered is a 12-bit index number ; (0-4095), with "FFFF" indicating "NOPRED". In the case of the ; link columns, the reference is achieved with an actual address, ; with 0000 meaning "no link" (the difference is one of conven- ; ience). Note that link pointers point to the first column of the ; intended entry. ; ; The first column ("pred", hi byte) also flags whether the ; entry is vacant. Legal values for this column normally range ; from 00H - 0FH, plus "0FFH" in the case of "NOPRED". "80H" has ; been selected to represent a vacant entry. Entry #0 contains a ; "7FH" in this position, insuring that entry is neither vacant nor ; "matchable". This prevents code 000 from ever being generated ; under normal circumstances, allowing its use as a special EOF ; code. ; ; TROOM or CODES is used to keep track of the number of remaining ; vacant entries. TROOM is decremented from 4095 to 0 (entry #0 is ; not used), CODES is incremented for ver 2 use, when TROOM counts ; code re-assignments (for reference only). ; ___________________________________________________________________ ; cseg extrn out, getbyt entry uncrel, unc; program area entry codes, troom, endu; data area ; ; This byte kept at UNCREL-1, for external referance db ver; current revision level ; ; Main entry. Read from start of file/module. uncrel: call malloc; returns hl = new stack jc xstkov sphl; ok, now switch stacks call inbyte; Check really is crunched file cpi (crnchid AND 0ffh) jnz xnotcr call inbyte cpi (crnchid shr 8) jz uncra; validated, go do it from start ; " " xnotcr: mvi a,isnotcr jmp error ; xstkov: mvi a,memory jmp error ; xbadf: mvi a,fouled jmp error ; xnewv: mvi a,version ; " " error: stc ; " " exit: lhld spsave sphl ret ; exitok: xra a jmp exit ; ; Entry here if application has already read the header, and ; validated the initial id bytes. This avoids rewinding the file. ; The next input byte MUST be the revision level. unc: call malloc; returns hl = new stack jc xstkov sphl; ok, now switch stacks jmp uncrb ; ; Set up memory allocation. base pointer in hl ; Carry if insufficient space (stack overflow incipient) malloc: xchg lxi h,2; allow for call malloc dad sp shld spsave; save return from main lxi h,255 dad d; round up to page boundary mvi l,0 shld @table mvi a,n30; '0' add h mov h,a shld xlatbl; for version 2 system mvi a,n28; '(' add h mov h,a push h cma mov h,a; 4 LESS bytes than Z80 coding inr h; l was zero shld stklim pop h mov a,h adi n08; proposed stack page mov h,a; check stack page a suitable mvi l,0 xchg; Check memory against memtop and for lhld memtop; input stack within range mov a,h; @table thru newstack sub d; (can exit because stack saved) rc; not enough system memory lda spsave+1 lhld @table cmp h jc stkck1; input stack below table, ok mov h,a; input stack page mov a,d; new stack page cmp h stkck1: cmc xchg ret; with carry if stack overflow ; ; Reading from start of file. uncra: call inbyte; skip over id area ora a jnz uncra ; " " ; Entry here allowable after reading to initial 0. Thus applications ; can first extract the initial info, and then enter here via "uncr" uncrb: call init; variables etc call inbyte; ignore revision level call inbyte; significant revision level push psw call inbyte; ignore checksum flag call inbyte; and spare byte pop psw cpi sigrev+1 jnc xnewv; need newer version cpi sigrev jnc uncrc; ver 20 uncrunching ; " " ; Ver 10 uncrunching call unc1i jmp unc1 uncrc: call unc2i; Ver. 2, initialize tables jmp unc2 ; ; Version 10 uncrunching initialize. Returns de := nopred unc1i: lxi h,t0fff shld troom call clrmem mvi a,12 sta width; ver 10 tokens are 12 bits xra a sta kind; 0 for version 10 operation " " ; initialize atomic entries. Set de := nopred atoms: xra a lxi h,nopred atoms1: push psw push h call enterx; make entry { hl, a } pop h pop psw inr a jnz atoms1 xchg; de := nopred ret ; ; Version 20 setup. Returns de := nopred unc2i: call clrtbl mvi a,1 sta kind; Version 20 signal mvi a,n20; force non-bumpable atomic entries sta ffflag call atoms; init atomic entries mvi b,(sprcod+1) and 0ffh unc2i2: push b lxi h,impred; impossible pred xra a call enterx; reserve eof thru sprcod pop b; unmatchable and unbumpable dcr b jnz unc2i2; xra a sta ffflag; reset flag mov h,a mov l,a shld troom; re-used as re-assignment counter lxi d,nopred ret ; ; Ver 10 uncrunching loop unc1: xchg shld lastpr call getok; new 12 bit code to de jc exitok; eof or eof node push d call decode lxi h,entflg mov a,m mvi m,0 ora a cz entlast; make new table entry if not done pop d lda fulflg ora a jz unc1; continue ; " " ; Speed up when table full, no more entries need be made/checked unc1b: call getok jc exitok push d call decode pop d jmp unc1b; continue ; ; Version 2 uncrunching unc2: xchg shld lastpr call getkn jc unc2c; eof or reset etc. push d call decode lxi h,entflg mov a,m mvi m,0 ora a cz entlast; if not made, then make entry pop d lda fulflg ora a jz unc2; adaptive system reset cpi nfe; when this becomes 0ffh all done. First jnz unc2b; it becomes 0feh, when one more loop inr a; is required, and set it to 0ffh. sta fulflg jmp unc2; do the extra loop ; table is full. No new entries needed unc2b: xchg shld lastpr call getkn jc unc2c; eof etc push d call decode lhld lastpr lda char call recod; check for code re-assignment pop d jmp unc2b ; ; here for input codes in range 100h..103h (eof..sprcod). unc2c: mov a,e; special code, (eof or adaptive reset) cpi eofcod and 0ffh jz exitok; done cpi rstcod and 0ffh jnz xnotcr ; " " ; adaptive reset xra a mov h,a mov l,a shld codes; init current code to 0 sta fulflg; clear call unc2i mvi a,initw sta width; reset input code width mvi a,n02 sta trgmsk mvi a,n01 sta entflg; 1st entry always a special case jmp unc2 ; ; VAR b : byte; (* global *) ; ; PROCEDURE decode(x : index); ; ; VAR ix : index; (* index is a record *) ; ; BEGIN (* decode *) ; ix := lookup(x); ; IF ix.pred = NIL THEN enter(x, b); ; IF ix.pred = nopred THEN b := ix.byte ; ELSE decode(ix.pred); ; send(ix.byte); ; END; (* decode *) ; ; The char associated with the bottomost recursion level is saved in ; "char" and is used later to make the next table entry. ; ; The code at "ugly" has to do with a peculiar string sequence where ; the encode "knows" about a string before the decoder so the decoder ; has to make an emergency entry. Fortunately there is enough inform- ; ation available to do this. It has been shown that this case is ; unique and that the assumptions are valid. To understand the LZW ; algorithm the "ugly" code may be ignored. ; ; Universal decoder ; a,f,b,c,d,e,h,l decode: lda kind ora a jz dcda; version 1, no setup needed push d xchg lda @table+1 add h mov h,a; convert code to table adr. mov a,m ori 020h; mark referenced (not bumpable) mov m,a pop d ; " " ; decode/output the index in de. Recursive ; a,f,b,c,d,e,h,l dcda: lhld stklim dad sp jnc xstkov; stack overflow lda @table+1; Convert index de to address hl add d mov h,a mov l,e mov a,m ani ndf; (for 2 only) cpi vacant jnz dcda1; not vacant, normal case ; " " ; The "ugly" exception. Term due to K. Williams mvi a,n01 sta entflg push h mvi a,n20; (for 2 only) sta ffflag call entlast; make emergency entry xra a sta ffflag; (for 2 only) pop h mov a,m cpi vacant jz xbadf; If vacant file is invalid ; " " dcda1: mov d,m; get "pred" (hi) right 1; move to "pred" (lo) mov e,m; get it. If msb of hi byte is set value mov a,d; must be ff (nopred) because not 80h ani not 20h jm decodx; nopred, terminate recursion mov d,a; (for 2, remove any accessed flag) push h call dcda; recursive pop h right 1; move ahead to "suffix" byte mov a,m jmp send; output suffix byte & exit ; ; Exit from decoding recursion. Unloads all the stacked items. decodx: right 1; move ahead to "suffix" byte mov a,m; get & save as 1st char of decoded sta char; string. Used later to make a new ; " " table entry. Send & exit ; Send char with repeat expansion etc. ; a,f,b,c,h,l send: mov c,a; output char lhld outflg inr h dcr h jnz send2; repeat flag set cpi escape jz send1; escape char, set flag mov l,a; save char for possible repeat coming dcr h; cancel coming inr, not repeat call outbyt send1: inr h; set repeat flag shld outflg ret send2: mvi h,0; clear repeat flag shld outflg; save result (with l = repeat char) ora a jz send4; escape 0 represents escape dcr a rz; take care of repeat = 1 mov h,a; set repeat count mov a,l; repeaat char send3: call outbyt dcr h jnz send3 ret send4: mvi a,escape jmp outbyt ; ; Enter lastpr/char into table ; a,f,b,c,d,e,h,l entlast: lhld lastpr lda char ; " " ; enter { , } into table, passed in {hl, a} regs. ; a,f,b,c,d,e,h,l enterx: mov b,a lda kind ora a mov a,b jnz ent2x; version 2 decoding ; " " else version 1 decoding ; enter { , } into table, passed in {hl, a} regs. ; a,f,b,c,d,e,h,l ent1x: push psw push h call midsq; hash index into al mov h,a lda @table+1; page address add h mov h,a; into address pop d; pred pop psw; suffix mov c,a ; " " ent1x1: mov b,h; check for match mov a,m cpi vacant jz ent1x3; Entry does not exist, make it right 3; move to link column mov a,m; link(hi) ora a jz ent1x2; no link mov b,a; save right 1; move to link(lo) field mov l,m mov h,b; hl := link address jmp ent1x1; and repeat ent1x2: mov h,b; restore h to left hand column call ffree; Find new spot and link in. Returns ; " " hl pointing to new entry. ent1x3: call link; make the entry. pred(hi) right 1 mov m,c; suffix lhld troom dcx h shld troom mov a,h ora l rnz; not full dcr a sta fulflg; else set full flag ret ; ; link entry de at location hl^ link: mov m,d; high right 1 mov m,e; lo ret ; ; Version 2 table entry ent2x: push psw push h call tbladr; to physical loc only, affects nothing pop d; and check width etc?? lhld codes lda @table+1 add h mov h,a; convert to address ; " " ; Entry is made here, but normally flagged as "unreferenced" (until ; received by decode). Until then entries are "bumpable". If ffflag ; is 020h the reference is flagged now, to protect atomic entries and ; WsWsW string emergency entries (from decode, despite not received) lda ffflag ora d; may set "referenced" bit mov m,a; pred(hi) right 1 mov m,e; pred(lo) right 1 pop psw mov m,a; suffix lhld codes; advance entry counter inx h shld codes inx h; Allow for crunch/uncrunch skew delay lda trgmsk; See if new code length needed cmp h rnz ral; carry was clear. Change to new length sta trgmsk; new target mask lda width inr a cpi maxwide+1 jz ent2x1; mark table full sta width; advance to new width ret ent2x1: mvi a,nfe; mark table full, at max width sta fulflg ret ; clrmem: lhld @table mvi m,guard; disallow entry #0 inx h; (used, but unmatchable) mvi e,vacant lxi b,t1000; mark entries vacant call fill lxi b,t4000 ; " " ; Fill hl^ for bc with zero fillz: mvi e,0 ; " " ; fill hl^ for bc with e fill: mov m,e inx h dcx b mov a,b ora c jnz fill ret ; ; find a free entry in the event of a hash collision. Algorithm is to ; first add 101 (decimal) to the current (end-of-chain) entry. If ; that entry is not free keep adding 1. When a free entry is found ; the link pointer of the original entry is set to the found entry. ; ; Called with adr of an entry in hl, returns hl = adr of new entry. ; a,f,h,l ffree: push b push d push h; save pointer to old entry for update mov a,l adi 101; relatively prime to table size mov l,a jnc ffree1; no carry, thus no wrap inr h lda @table+1 adi n10 cmp h jnz ffree1; no wrap-around lda @table+1; set to table bottom mov h,a ffree1: lda @table+1; compute # of remaining entries, adi n0f; counting up (last entry + 1 sub h; - current entry) mov b,a mov a,l; as far as the low byte is concerned cma; we know we are subtracting from 0. inr a jnz ffree2 inr b ffree2: mov c,a; result in bc mov d,h; keep copy mov e,l call cmpm; search for empty entry jnc ffree3; found vacant entry lhld @table; Else wrap to start of table lda @table+1 mov b,a mov a,d sub b; (adr to index# conversion) mov b,a mov c,e; target value call cmpm; continue search jc xnotcr; not found. should not occur ffree3: xchg pop h; original pointer to link right 3; move to link(hi) field call link; link to new entry xchg; returned in hl pop d pop b ret ; ; Search for vacant entry from hl^ up. Carry if not found ; Carry clear if found when hl points to found entry ; a,f,b,c,h,l cmpm: mov a,m cpi vacant rz inx h dcx b mov a,b ora c jnz cmpm stc; signal not found ret ; ; Return the mid-square of number of "pred" + "suffix" (actually the ; mid-square of # OR 0800h). Entry a = suffix, hl = pred. Returns ; result in a|l (not hl), ready to add a table offset. ; ; mid-square means the midddle n bits of the square of an n-bit num. ; Here n is 12. Results accumulate in a 16 bit register, with ; extraneous information overflowing off both ends of the register. ; ; Hash via mid-square of 12 bit input or'd with 800h. ; input is hl + a. Output in al registers. ; NOTE anomalous results for input out of range. Special handling ; since really needs to operate on 13 bit words to match the original. ; The algorithm is due to Robert A. Freed. This runs on 8080s, takes ; the identical code space as Mr. Freeds Z80 implementation, and has ; miniscule or no average performance penalty. By C.B. Falconer. ; ; Entry: a = suffix; hl = pred. Exit al = midsq ; a,f,b,c,d,e,h,l midsq: add l; hl := hl + a mov l,a; max result fffh+0ffh=010feh adc h; (normal, except special case) sub l mov d,a; save for special test ori 8; or with 800h. Max 18feh ; following should be 0fh, but modified to agree with original ani 1fh; mask to 13 bits. Max 1fffh rar mov h,a; max 7ffh mov b,a; m := bc := hl := input DIV 2 mov a,l; using n*n = 4 * (m * m) (n even) rar; or 4 * m * (m+1)+1 (n odd) mov l,a; and any final "1" gets discarded. mov c,a jnc midsq1; even, use m inx h; hl := m+1 ; " " ; special case test, input = 0ffffh+0 must hash to 800h ; from initial 1 byte string prefix = nopred, suffix = 0. mov a,d ora a; did input have high bit? mov a,h; holds 800h in this case rar; because using 13, not 12 bits rm; yes, return 0800h ; " " ; Multiplication. hl := bc * hl (12 lo bits of hl only) midsq1: mvi a,12; bits in m * m' multiplication dad h dad h; reposition multiplier dad h dad h; using 12, not 16 bit multiply xchg; multiplier to de mvi l,0; clear necessary portion midsq2: dad h; left shift accum. Main loop. xchg; discarding overflow past 16 bits dad h; left shift multiplier xchg jnc midsq3; multiplier bit = 0 dad b; =1, add in midsq3: dcr a jnz midsq2; more bits dad h; reposition 12 bit result ral dad h; shift 4 bits to A ral dad h ral dad h ral mov l,h; move down low 8 bits of result ani 0fh; mask off. result in a & l ret ; ; Get input token, variable width. Check nops etc ; Carry for eof ; a,f,b,c,d,e getkn: call getok mov a,d dcr a ana a; clear any carry rnz; code not 01xx mov a,e cpi (sprcod+1) and 0ffh; codes used rnc cpi nulcod and 0ffh jnc getkn; ignore null and spare codes, nop ret; must be rstcod or eof, cy set ; ; Get input token, variable width ; a,f,b,c,d,e getok: lxi d,0 lda width mov b,a lda lftovr mov c,a getok1: mov a,c add a; bit to cy, flags on remainder cz morein; lftovr was empty, get more mov c,a; and keep the remainder mov a,e ral mov e,a; shift into de mov a,d ral mov d,a dcr b jnz getok1; more bits to unpack mov a,c sta lftovr; save any remainder mov a,d ora e rnz stc; carry for 0 value (eof) ret ; ; subroutine for getok. Next input byte positioned etc. morein: call inbyte stc ral; bit to carry, set end marker ret ; ; clear version 2 tables ?? clrtbl: lhld @table; 4096 rows * 3 cols, main table lxi b,t1000 mvi e,vacant call fill lxi b,t2000 call fillz lhld xlatbl; Physical to logical translation table mvi m,guard inx h lxi b,t2800; 1400h * 2 entries mvi e,vacant jmp fill ; ; Figure out what physical loc'n the cruncher put its entry at by ; reproducing the hashing process. Insert the entry # into the ; corresponding physical location in xlatbl. tbladr: mov b,a call hash; to hl tblad1: mov c,h mov a,m cpi vacant jz tblad2; no entry, make it call rehash jmp tblad1 tblad2: xchg lhld codes; logical entry # xchg mov m,d mov a,h; right 1 for this table adi n14 mov h,a mov m,e lda xlatbl+1 mov h,a mov a,c sub h mov h,a ret ; ; rehash rehash: xchg lhld nextx; displacement from hash dad d lda xlatbl+1; page address mov d,a mov a,h cmp d rnc lxi d,tblsize dad d ret ; ; Check for code reassignment? recod: mov b,a mvi a,nff sta avail+1 mov a,b call hash; to hl recod1: mov c,h mov a,m cpi vacant jz recod4; end chain. Try make entry (elsewhere) lda avail+1 cpi nff jnz recod3; have an entry push h; physical table pointer mov d,m; entry # (hi) mov a,h adi n14; right 1 mov h,a mov l,m; entry # (lo) lda @table+1; convert to addres add d mov h,a mov a,m ani 020h jnz recod2; not bumpable, try next shld avail; save resulting entry # for later use recod2: pop h recod3: call rehash; to next link in chain jmp recod1 recod4: lhld avail; reassign the entry pointed to by avail mov a,h; (if any), redefine "last pred entrd" cpi nff; and "last suffix" vars. rz; none available xchg lhld troom inx h shld troom; keep track of codes re-assigned lhld lastpr xchg lda char mov b,a call link right 1 mov m,b ; " " hash: mov e,l dad h dad h dad h dad h xra h mov l,a mov a,e ani n0f mov h,a lda xlatbl+1; add in table offset add h mov h,a inx h; eliminate 0 case push h xchg lhld tbltop dad d; make index dependant, not address shld nextx; rehash value, -ve no. pop h ret ; ; get next byte to a ; a,f inbyte: push b push d push h call getbyt pop h pop d pop b ret outbyt: push psw push b push d push h call out pop h pop d pop b pop psw ret ; ; Initialize variables, pointers, limits init: lhld xlatbl; hi byte is 0 lxi d,-tblsize mov a,e sub l mov l,a mov a,d sbb h mov h,a shld tbltop; -(xlatbl + tblsize) lxi h,itable lxi d,fulflg; copy the "shadow" lxi b,itbsize ; " " ; Equivalent (almost) to ldir. de^ := hl^ for bc ; a,f,b,c,d,e,h,l move: mov a,m stax d inx h inx d dcx b mov a,b ora c jnz move ret ; initializing table ("shadow") for data area itable: db 0 dw nopred db 1 dw 0 db vacant db initw; initial cell width db 2 dw 0 itbsize equ $-itable ; dseg; data area ; ; area is reinitialized on each call fulflg: ds 1; "ff" when table is full lastpr: ds 2; last pred entflg: ds 1; "already entered in table" flag codes: ds 2; current code? lftovr: ds 1; previous input bits still unused width: ds 1; current encoded cell width trgmsk: ds 1 outflg: ds 1; last output char ds 1; repeat flag for output (at outflg+1). 0 or 1 ; spsave: ds 2 kind ds 1; version to be decoded. 0..1 for 10/20 resp. char: ds 1; last char of the previously decoded string avail: ds 2; hi byte is reassigning code flag when 0ffh ffflag: ds 1; force entries to be marked referenced nextx: ds 2; next hashed index ?? @table: ds 2; base of decoding table xlatbl: ds 2; translation table, hash to entry tbltop: ds 2; negated address troom: ds 2; space left in table stklim: ds 2; Stack usage limit endu equ $; available storage above here end