                     CP/M Assembly Language
                        Part X: Debugging
                          by Eric Meyer

     This month, we're going to take a look at a very useful
utility that comes with every CP/M system, but usually has little
or no documentation: your debugger.


1. What Are Debuggers For?
     Up to this point, we have concentrated on the basics of
writing your own assembly code.
     However, the quickest route to assembly language skills (as
well as to getting the software you already have to do exactly
what you want) is studying and modifying code written by others.
     Much public domain software is available as source code, but
some is not, and of course commercial software (probably
including various utilities that came with your computer) is not.
     Many programs can be modified to customize their operation
according to your taste. (WordStar, for example, is notorious for
its complexity in this regard.)
     You may be given a list of "patch addresses" and values for
various options, and you will have to go in and change a byte
here and there.
     Often there will be some task specific to your computer
involving function keys, video display, clock, etc., that you
would like to perform in one of your own programs.
     Probably one of your system utilities already does it, but
you don't know how.
     There are various tools that allow you to study (and change)
the way code is put together, starting from just a COM file.
     One such tool came with your CP/M system: the "debugger" DDT
(or SID, for CP/M Plus).
     This is basically a program to read and modify bytes of
memory, and it works either on what's already in memory (like
your CP/M BIOS) or on a COM file that you ask it to read in.
     As usual, there are more sophisticated debugging tools
available, but everyone has DDT or SID, so we'll use these.
     The two programs are extremely similar, but differ in some
respects (mostly regarding reading and writing files).


2. Examining Memory
     All you need to do to satisfy your curiosity about CP/M is
to enter DDT/SID and take a look around. If you don't have a copy
of the program available already, dust off your original system
disk and make one; then type

              A>ddt<cr>
     or       A>sid<cr>
 
     You will see a version message, then a prompt like "-" or
"#". When you get bored, just type ^C here to exit the program.
     Meanwhile, there is a whole alphabet of commands available.
Some take numerical arguments; these are in hexadecimal by
default, though in SID you can also type # followed by decimal
input. (DDT accepts hex only.)
     The simplest thing to do is to DISPLAY memory. This produces
a listing in both hex and ASCII, between any two addresses:

         -d0000,0007
         0000: C3 03 E3 81 00 C3 00 BD ........

     (Note, while we're here, that 0000 is the warm boot vector.
Here, it reads JMP E303, which tells you where your BIOS jump
table is. 0003 is the IOBYTE under CP/M 2.2, controlling I/O
redirection. 0005 is the BDOS vector; here it reads JMP BD00.
Ordinarily you would see something like D806, your BDOS entry
point. DDT has changed this, for reasons that will become
apparent.)
     Like most commands, DISPLAY assumes natural defaults if you
don't specify the addresses: start address where you left off
last, end address 192 bytes (C0) later. Examples:


-d0,ff       display from 0000 to 00FF
-d100        display from 0100 (to about 01C0)
-d           display the next bunch
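
     To make the layout of a DISPLAY line concrete, here is a
rough sketch in Python (not anything DDT itself runs) that
formats a few bytes the same way: address, hex bytes, then an
ASCII column with a period standing in for anything unprintable.
The function name dump_line is made up for this illustration.

```python
def dump_line(addr, data):
    """Format bytes roughly the way a DISPLAY line looks:
    address, hex bytes, then printable ASCII (others shown as '.')."""
    hexpart = " ".join("%02X" % b for b in data)
    text = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)
    return "%04X: %s %s" % (addr, hexpart, text)

# The eight bytes at the bottom of memory from the example above.
page_zero = bytes([0xC3, 0x03, 0xE3, 0x81, 0x00, 0xC3, 0x00, 0xBD])
print(dump_line(0x0000, page_zero))
```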

     If what you're looking at is text or other data, the display
will make a fair amount of sense. If it's machine code, however,
it will look like gibberish. (It can be a challenge to tell the
difference between code and data, when you don't know beforehand
what you're working with.)
     To make sense of what may be program code, use the LIST
command. This works the same way as DISPLAY, but it produces
output in assembler mnemonics!  Example:


-l100   list from 0100 (about 11 lines worth)

     You'll see familiar things like "RET" on screen. If what
you're looking at really is code, the sequence should make some
sense.
     If it's actually data, the "code" produced by the LIST
command will be nonsense. (ASCII text, for example, tends to
produce inane strings like "MOV H,B; MOV H,C; . . . ")
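     Why text turns into a parade of MOVs: on the 8080, nearly
every byte from 40 to 7F hex is a MOV instruction, and that is
exactly where most ASCII letters live. The following Python
sketch (an illustration only; decode_mov and list_text are
made-up names) decodes that one block of opcodes.

```python
# Register order used by the 8080's three-bit register fields.
REGS = ["B", "C", "D", "E", "H", "L", "M", "A"]

def decode_mov(byte):
    """Return the 8080 mnemonic for a byte in the MOV block
    (40-7F hex), or None for anything outside that block."""
    if not 0x40 <= byte <= 0x7F:
        return None
    if byte == 0x76:              # the one non-MOV opcode in this range
        return "HLT"
    dst = REGS[(byte >> 3) & 7]   # bits 5-3 select the destination
    src = REGS[byte & 7]          # bits 2-0 select the source
    return "MOV  %s,%s" % (dst, src)

def list_text(text):
    """Decode each character of an ASCII string as if it were code."""
    return [decode_mov(b) for b in text.encode("ascii")]

for line in list_text("abc"):
    print(line)
```
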
     The LIST command makes it possible to reconstruct source
code -- with many limitations. You have to guess what's code and
what's data, where one routine starts and another ends. Z80
instructions aren't supported, so when encountered they generate
several bytes of garbage.
     There are no labels or comments. Where someone may once have
written:

NEWLN:  CALL SPMSG      ;line is full...
        DB   CR,LF,0    ;...start a new one
        JR   LOOP       ;and look for next entry

you are now going to see merely something like

08BC CALL 1C04     (hmm . . .  what's that?)
08BF DCR  C        (garbage generated by
08C0 LDAX B          interpreting the text
08C1 NOP             CR,LF,0 as machine code)
08C2 ??=  18       (DDT can't understand the
08C3 CMA             Z80 "JR" either)

     But at least it's a start!
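     If you get confused about numerical values, SID's HEX VALUE
command will straighten you out. Given an argument in hex,
decimal, or ASCII, it tells you the rest. (DDT's "h" command is
more limited: it takes two hex values and prints their sum and
difference.) Example:

#h41<cr>
0041 #65 'A'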
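
     If you have no debugger at hand, the same conversion is easy
to sketch in Python. This is only an imitation of the output
shown above, and hex_value is a made-up name.

```python
def hex_value(n):
    """Show a 16-bit value as hex, as decimal after '#', and as an
    ASCII character in quotes if it is a printable one."""
    n &= 0xFFFF
    out = "%04X #%d" % (n, n)
    if 0x20 <= n <= 0x7E:
        out += " '%s'" % chr(n)
    return out

print(hex_value(0x41))
```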


3. Making Changes
     There is a similar pair of commands for changing what you
find. Obviously you should use them with caution; it is easy to
make a mistake and crash your system.
     The SET MEMORY command allows you to change a byte at a
time, in hex or ASCII. It takes one argument, a start address.
     Byte by byte, it shows the address and its current contents.
You can hit <cr> to leave a value alone, or type in a new value,
in hex, decimal (with "#"), or ASCII (in single quotes). To stop,
type a period ".". Example:

-s200<cr>
0200 44 <cr>       (leave this alone)
0201 53 52<cr>     (change this)
0202 73 'w'<cr>    (and this)
0203 00 .<cr>      (all done)

     It also is possible to input assembler mnemonics, with the
ASSEMBLE command. This turns DDT/SID into an instant interactive
assembler, though rather limited in features. Example:

-a186<cr>
0186 call 1124<cr>
0189 .<cr>

(You don't get to see the original values; use LIST first if
needed.)
     You can even manipulate the individual registers of the 8080
while debugging a program!  The EXAMINE command will show you the
contents of the CPU registers, or allow you to change them:

-x<cr>
-Z-E- A=00 B=0000 D=0000 H=0000 S=0100 P=0100  NOP

     This is telling you: the Zero and Parity (Even) flags are
set, while the rest are clear.
     Next come the contents of registers A, BC, DE, and HL, then
SP and PC (the stack pointer and program counter), and finally
the instruction at the PC, which will be executed next. To change
the value in a CPU register, just specify the register:

-xh<cr>
H=0000 1234

     Now when you tell DDT to execute a piece of code, it will
start out with register HL=1234h.


4. Running and Tracing Code
     You can actually test out programs, or bits of code that you
write directly with the ASSEMBLE command. (I say "bits" because
it isn't practical to do large programs without the convenience
of labels.)  The GOTO command will jump to, and execute, any
block of code.
     Before you do this, you should see that the outermost
routine ends with a special instruction that will return control
to DDT:  RST 7
     If it ends with a warm boot (like JMP 0000) it will kick you
out of DDT, and if it ends with just a RET it will wind up who
knows where.
     Try out this little GOTO routine:

-a100<cr>                 (create a small routine)
0100 mvi e,7<cr>
0102 mvi c,2<cr>
0104 call 5<cr>
0107 rst 7<cr>            (it will return to DDT)
0108 .<cr>
-g100<cr>                 (run it)

     (Do you understand what happened?)

     In fact, DDT/SID can do a good deal more. You can set
"breakpoints", so that execution of a long program will stop at
certain points for you to examine what has happened, or you can
request a "trace" that shows exactly how the contents of the CPU
registers are changing, if you're having problems. This is an
excellent way to learn how assembly language works.
     The whole debugging system is too complex to explain here,
but it's useful to know at least how to trace a routine.
     Let's go back and take a closer look at the little
bellringer we wrote above. To use the TRACE command, you must
first set the PC to point to the code you want to run, then tell
DDT how many instructions to execute.
     If you want to be really cautious you can just type "t" over
and over again to trace a single instruction at a time; otherwise
you can use something like "t10" to go 16 at a time (the count,
like other arguments, is in hex).
     First let's make sure everything is set up:

-l100,107<cr>
0100 MVI  E,07
0102 MVI  C,02
0104 CALL 0005
0107 RST  7
-x<cr>
-Z-E- A=00 B=0007 D=1000 H=0000 S=0100 P=0107  RST

     So the routine is still there. Note the state the CPU was
left in from running it the first time: the Zero and Parity flags
are set, probably from something having zeroed the A register.
There's an "07" in register C because that's where the "bell"
character was moved for the BIOS CONOUT routine. The PC is at
0107, where the routine ended.
     To run it again in TRACE mode, we set the PC back to 0100
and then use the "t" command.

     Here's what I see (and hear) on my computer:

-xp<cr>
P=0107 0100
-t10<cr>
-Z-E- A=00 B=0007 D=1000 H=0000 S=0100 P=0100 MVI  E,07
-Z-E- A=00 B=0007 D=1007 H=0000 S=0100 P=0102 MVI  C,02
-Z-E- A=00 B=0002 D=1007 H=0000 S=0100 P=0104 CALL 0005
-Z-E- A=00 B=0002 D=1007 H=0000 S=00FE P=0005 JMP  BD00
-Z-E- A=00 B=0002 D=1007 H=0000 S=00FE P=BD00 JMP  C3A4
-Z-E- A=00 B=0002 D=1007 H=0000 S=00FE P=C3A4 XTHL
-Z-E- A=00 B=0002 D=1007 H=0107 S=00FE P=C3A5 SHLD D6F2
-Z-E- A=00 B=0002 D=1007 H=0107 S=00FE P=C3A8 XTHL
-Z-E- A=00 B=0002 D=1007 H=0000 S=00FE P=C3A9 JMP  D806
(beep!)
-Z-E- A=00 B=0007 D=1000 H=0000 S=0100 P=0107 RST  07


     You can follow everything that happened, line by line. First
an 07 (ASCII Bell) is put in the E register, and the PC is bumped
to 0102 to point to the next instruction.
     Then 02 is put in C, and we point to 0104. Then we CALL 0005
(BDOS): a return to the next program address (0107) is placed on
the stack, so the SP gets bumped down to 00FE (remember how the
stack grows top down?), where 0107 is stored, and then the PC is
made to point to 0005.
     There we should find a JMP to the BDOS, which would
ordinarily have been something like D806, but in this case is
BD00 because we're passing through some DDT code in high memory
(more on this later).
     The next few things are being done by DDT itself: we JMP
again to C3A4; there, XTHL exchanges what's on the top of the
stack (which is 0107, our return address) with the current HL
register (which is 0000).
     The SHLD D6F2 stores this return address, for some internal
DDT purpose.
     Then we restore it again with another XTHL, and finally JMP
to the "real" BDOS at D806. (Note that DDT does NOT trace the
actual workings of the BDOS, only of your code, and a bit of its
own.)
     At this point the bell rings, and we return. Note that the
BDOS has moved the 07 from E into C, in order to call the BIOS
CONOUT routine which expects to find it there. Also, popping the
return address (0107) off the stack returns the SP to 0100.
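     The stack bookkeeping in that trace can be mimicked in a few
lines of Python. This is a toy model of CALL and RET only, not an
8080 emulator; the names call and ret and the dictionary used as
"memory" are assumptions made for the illustration.

```python
def call(mem, sp, pc, target):
    """8080 CALL: push the address of the next instruction, then
    jump. CALL is three bytes long, so that address is pc + 3."""
    ret_addr = (pc + 3) & 0xFFFF
    sp = (sp - 2) & 0xFFFF              # the stack grows downward
    mem[sp] = ret_addr & 0xFF           # low byte at the lower address
    mem[(sp + 1) & 0xFFFF] = ret_addr >> 8
    return sp, target

def ret(mem, sp):
    """8080 RET: pop the return address back into the PC."""
    pc = mem[sp] | (mem[(sp + 1) & 0xFFFF] << 8)
    return (sp + 2) & 0xFFFF, pc

mem = {}
sp, pc = call(mem, 0x0100, 0x0104, 0x0005)
print("%04X %04X" % (sp, pc))           # SP and PC after the CALL
sp, pc = ret(mem, sp)
print("%04X %04X" % (sp, pc))           # SP and PC after the RET
```
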
     (Note: DDT/SID can fill up the screen with data far faster
than you can comfortably read; but like many programs, it will
pause if you type a ^S, and resume on a ^Q.)
     If you're still shaky on understanding some of the routines
we've already written and used, tracing them is a great way to
figure out how they really work.


5. Reading and Writing Files
     Fortunately, you're not limited to playing transient games
in memory. You can also read and write disk files, allowing you
to save what you've created, or to permanently modify ("patch")
an existing program.
     Unfortunately the methods DDT and SID use for file I/O
differ, so we'll run the examples in parallel columns, DDT on the
left, SID on the right.
     To read in an existing (usually COM) file is simple:

   -iFILENAME.COM<cr>  or  #rFILENAME.COM<cr>
   -r<cr>

     You will see a message listing several addresses. In SID you
can safely ignore this; in DDT you'd better remember what the
"NEXT" address given was.
     Once a file is read in, you can examine or change it with
all the commands mentioned above. Even a simple DISPLAY can be
amusing. You may find messages and features you didn't know were
there, such as "ILLEGAL ATTEMPT, NOW REFORMATTING HARD DISK". (Or
just curious text left in the code by the programmer: WordStar's
"Nosey, aren't you?" is a classic example.)

     Writing out the file is equally simple in SID:

     #wFILENAME.COM<cr>

     If this was not an already existing file, but something you
just created, you will have to specify the address range you want
to write out, usually starting at 0100. For example, to save our
little bell program you might use:

     #wBELL.SID,100,107<cr>

     This will actually write out everything from 0100 to 017F,
since files are written in whole records (128 bytes). (Note that
I didn't call it "BELL.COM" because it ends with a RST 7, so it
will do odd things if you try to run it under CP/M. If you first
change this to a RET, you can write BELL.COM too.)
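     The rounding to whole records is simple arithmetic. Here is
a Python sketch of it, assuming only the 128-byte record size
mentioned above; written_range is a made-up name.

```python
RECORD = 128  # bytes per CP/M record

def written_range(start, end):
    """Given inclusive start and end addresses, return the first
    address, the last address actually written once the length is
    rounded up to whole records, and the number of records."""
    length = end - start + 1
    records = (length + RECORD - 1) // RECORD
    return start, start + records * RECORD - 1, records

first, last, records = written_range(0x0100, 0x0107)
print("%04X-%04X, %d record(s)" % (first, last, records))
```
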
     Unfortunately DDT has NO mechanism for writing a disk file!
You have to EXIT from DDT (with a ^C) and then use the CP/M 2.2
SAVE command to create a disk file from what's in memory:

     -^C
     A>save 1 BELL.DDT<cr>

     Hmm, what was that "1"?  The SAVE command works in units of
memory pages (256 bytes, 2 records). It always starts at 0100,
and it needs to know HOW MANY pages to save.
     Obviously BELL is just one page long, but in general, you
will have to do an ugly hex calculation to get the page length
from that "NEXT" address that you remembered when the file was
loaded into DDT. Each 100H is a page, and 1000H is 16 pages.
     Suppose you saw "NEXT 1B80" when you first read the file
into DDT. That means the file runs from 0100 to 1B7F, so it's
1A80 bytes (1B80-0100) long. Breaking that up, we get:

1000 =  1 x 1000 =  1 x 16 pages = 16 pages
 A00 =  A x  100 = 10 x 1 page   = 10 more pages
  80 = an extra half page        =  1 more page

so you would want to tell SAVE to write 27 pages. (Phew.)
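     If you would rather not do this in your head, the same
calculation can be sketched in Python. It assumes, as above, that
the file starts at 0100 and that any partial page counts as a
whole one; save_pages is a made-up name.

```python
def save_pages(next_addr, start=0x0100):
    """Number of 256-byte pages to give SAVE, from the NEXT
    address reported when the file was loaded."""
    length = next_addr - start          # bytes from 0100 up to NEXT
    return (length + 0xFF) // 0x100     # round up to whole pages

print(save_pages(0x1B80))               # the NEXT 1B80 example
```
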
     (By the way, how often have you wanted to create a 0k "file"
to serve just as a disk or user area label?  Under CP/M 2, just:

     A>SAVE 0 --MAIL--.87<cr>

     Under CP/M 3 the SAVE command is quite different, and the
easiest thing to do is instead to type

     A>PIP --MAIL--.87=CON:<cr>

and then type ^Z.)


6.  More Commands
     I don't want to go into great detail on all the DDT/SID
commands (like call, passpoint, untrace), but here briefly are a
few more:

-fxxxx,yyyy,vv      FILL memory xxxx-yyyy with vv
-mxxxx,yyyy,zzzz    MOVE memory xxxx-yyyy to address zzzz
-iTEXT . . .        INPUT line (see below)

     Besides being needed before an "r" command in DDT, the INPUT
command also sets up the FCBs and DMA in Page Zero just as the
CCP would on encountering the arguments TEXT . . .  So you can
run a program that expects command line arguments under DDT/SID,
by setting them up first with "i" before your "g" or "t" command.
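     For the curious, here is a simplified Python sketch of what
"setting up Page Zero" amounts to for a single filename argument:
the 11-byte name field that goes into the FCB at 005C, and the
length-prefixed command tail that the CCP leaves at 0080. It
ignores drive prefixes and wildcards, and the function names are
made up for the illustration.

```python
def fcb_name(arg):
    """Pack 'name.ext' into the 11-byte, space-padded, upper-case
    field of an FCB (8 bytes of name, then 3 of extension)."""
    name, _, ext = arg.upper().partition(".")
    return name[:8].ljust(8) + ext[:3].ljust(3)

def command_tail(args):
    """A length byte followed by the upper-cased tail, including
    the space that separated it from the command name."""
    tail = (" " + args.upper()) if args else ""
    return bytes([len(tail)]) + tail.encode("ascii")

print(repr(fcb_name("bell.sid")))
print(command_tail("bell.sid"))
```
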
     After you've used DDT for a while, you will probably notice
that despite the variety of commands, there are some pretty
obvious ones that are missing.
     Worst, perhaps, is that there is no "search" command to find
a string of bytes. I have written a small RSX (SIDRSX11, from
FOG-CPM.164) that can easily be attached to SID on a CP/M 3.0
system, and that adds the commands "?" and "!":

#?data<cr>            FIND a data string
#!xxxx,data<cr>       WRITE a data string at address xxxx

     In each case, the data can be any mixture of hex digits
(like EB412C00) and ASCII strings (like "Yes"). Unfortunately, an
RSX will not work with DDT under CP/M 2.2. You may want to look
through an array of simple public domain programs with names like
SEARCH or FIND that can do this, although it is annoying to have
to switch back and forth from these to DDT.


7. How Does It Do It?
     By this point, I hope you've begun to wonder how DDT/SID can
allow you to load program code and run it at 0100, just as if DDT
weren't there.
     The answer: DDT "relocates" itself to the high end of
memory, right under the BDOS, in order to free up the usual
program area for your debugging efforts. (Remember above, when we
did a CALL BDOS, how the call passed through some extra code
en route?  That was DDT, keeping its own record of the return
address before passing the call along to the BDOS.)
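     This is also why the address at 0006 matters: a program
finds the top of its usable memory by reading the JMP target
stored there, so it automatically stays clear of the debugger. A
small Python sketch, using the bytes displayed in section 2 and
the D806 figure from the author's machine (tpa_top is a made-up
name):

```python
def tpa_top(page_zero):
    """Read the 16-bit address stored, low byte first, at 0006-0007
    (the target of the JMP at 0005)."""
    return page_zero[6] | (page_zero[7] << 8)

# Bytes 0000-0007 as displayed while the debugger is loaded.
page_zero = bytes([0xC3, 0x03, 0xE3, 0x81, 0x00, 0xC3, 0x00, 0xBD])

top = tpa_top(page_zero)
print("%04X" % top)          # top of program memory under the debugger
print(top - 0x0100)          # bytes available from 0100 up to there
print(0xD806 - top)          # bytes given up, compared with plain CP/M
```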


8. Heavy Duty Disassembling
     You probably can see that trying to thoroughly break apart
and understand a whole program of some size with DDT needs a LOT
of work.
     While a debugger is useful for small scale tasks, if you
really want to understand a large piece of a program written by
someone else, you need a "disassembler" -- a program designed to
take any COM file and produce readable assembly language output.
     In principle, the task is still complicated by all the
difficulties of interpretation mentioned above, but in practice a
disassembler can do a remarkably good job.
     There are both public domain and commercial programs
available -- a really fine one, at least to start with, is the
public domain Z80DIS.
     This is a Z80 disassembler, written by Kenneth Gielow;
version 2.1 (Jan. 1987) is available on FOG-CPM.022 (6/87
revision).
     It's menu-driven, and reasonably fast. Most importantly,
though, it does a very good job of guessing what's code and
what's data, so that its first attempt at a reconstruction will
already be nearly correct. (Many other programs get much less
right the first time, meaning more work for you.)
     You may still have trouble figuring out exactly what the
code you're looking at DOES, but a good disassembler makes the
task as easy as humanly possible. Compare what you get from DDT
with the output of a disassembler:

ORIGINAL CODE
newln: call spmsg
       db   cr,lf,0
       jr   loop

DDT OUTPUT
08BC CALL 1C04
08BF DCR  C
08C0 LDAX B
08C1 NOP
08C2 ??=  18
08C3 CMA

DISASSEMBLER OUTPUT
J#08BC: CALL C.1C04
  DEFB 0D,0A,00
  JR   J.08F3

     A good disassembler will give you a source code file that
can be used to understand and modify a program in ways that go
beyond the capabilities of a simple debugger. (Note, if you
haven't already, that you usually can't insert or move code with
a debugger. This would change all the addresses, and the code
would crash if you tried to run it. But if you have source code
to edit and reassemble, this isn't a problem.)
     As an exercise, if you get hold of a disassembler, run it on
one of the programs (like FILTER.COM) you have already created.
See whether you could figure out what the program does, on the
basis of that very basic source code alone.


9. Caveats
     For safety's sake, keep an unmodified backup copy of all
software. You may find that your brilliantly clever modifications
suddenly aren't working, and it would be a shame to forget how to
put the program back together again.
     Some legal and ethical questions are inevitably raised by
the use of debugging tools on copyrighted (both commercial, and
some public domain) software.
     Though personal feelings may differ, it seems to me that
most anything you do with a debugger in the privacy of your own
home (at least, in states other than Georgia) is fine. It's no
crime to try to understand code, or to modify it to suit your
needs or tastes. But as a rule, don't distribute copies of what
you produce by disassembling or patching somebody else's code,
either under their name or your own, as source code or COM file,
for free or for money.
     That much said, have fun.