CP/M Assembly Language
                      Part II: The 8080 CPU
                          by Eric Meyer

     Last time we discussed the assembler itself, some basic
assembler directives, and their use in patching (modifying)
existing programs. Now we'll begin to investigate writing our own
programs, which will first require learning something about the
8080 chip itself. We will be using the standard Intel mnemonics
for 8080 instructions.
     (All of what follows will apply equally well to the Z80,
which simply has more registers and instructions. Unfortunately
there's a further complication: the Zilog instruction set,
besides being larger, also uses different names for all the CPU
instructions. We may deal with this in a later installment -- for
now everything applies to an 8080 assembler.)
     I'm assuming you have some programming experience, so I
won't explain such concepts as loops, subroutines, etc., though I
will discuss ways in which assembly language differs from the
higher-level languages with which you may be familiar.


1. 8080 Registers
     The first thing to become accustomed to is the idea of a
register. There are no variables and types (real number,
character string) in assembly language, just numerical values and
the place where they are stored. At any given time, a program
will have most of its data stored someplace in memory, but what
it is working on immediately must be fetched into the CPU itself.
The CPU contains a number of registers for this purpose. By
convention they have one-letter names, and each can hold one byte
of data. Often they are depicted like this:

+-------+=======+
|   A   |   F   |
+-------+=======+
|   B   |   C   |
+-------+-------+
|   D   |   E   |
+-------+-------+          +-------+
|   H   |   L   | - - - >  |  (M)  |
+-------+-------+          +-------+

     Seven of the eight registers "A-L" are used to hold data
bytes; the "F" (Flag) register serves a different purpose, which
we'll discuss later.
     The "A" register (accumulator) is where most of the
arithmetic and logical operations take place, though it also can
simply hold a byte for a while. The "B-C", "D-E", and "H-L"
registers can function either separately or together in pairs to
hold a word (two bytes) of data, often a memory address. The "M"
register isn't really in the CPU at all -- it represents the
contents of the memory at the address in the H-L register pair.
Thus you can operate with a byte of data stored in memory just as
you would one stored in the CPU itself, by putting its address in
the H-L registers and referring to it as "register M".
     Typically the H-L pair is used as a pointer in this fashion.
The B, C, D, and E registers are used for a variety of tasks: as
pointers, holding counter values, storing bytes temporarily.


2. The MVI and MOV Instructions
     The simplest things you can do with a CPU register are put a
data byte into it, and move a byte from one register to another.
The MVI (move immediate) instruction puts a byte into a register:
MVI  A,13 puts the value 13 (decimal, unless followed by "H"
for hex or "B" for binary) into the A register.
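     As a small illustration of those number formats (the values
are chosen only as an example), each of the following lines loads
the same byte into A. Most assemblers want a hex constant to
start with a digit, hence the leading zero:

MVI  A,13          ;decimal
MVI  A,0DH         ;hex
MVI  A,00001101B   ;binary

Whichever way you write it, the register ends up holding 0DH.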
     The MOV (move) instruction simply moves whatever's in one
register into another: MOV  B,A takes the contents of the A
register (at the moment, 13) and puts that value in the B
register too. (The A register remains unchanged.)  Note that in
both cases, the destination comes first, then the source. Thus
you read "MOV B,A" as "MOVe into B, the value in A".
     Let's consider what's going on at the gut level. The assembler
will translate each "MOV" instruction into one instruction byte;
e.g., "MOV B,A" turns out to be 47H. Each "MVI" is also one byte,
followed by a second byte to tell it what value is being put in
the register, e.g., "MVI A," is 3EH so the instruction "MVI A,13"
will become the two bytes 3EH, 0DH. This is typical of how all
the assembly language instructions wind up in the executable
program. The sequence of hex bytes 3E,0D,47 in a COM file would,
when the program runs, put the value 13 into the A register, and
then into the B register too. While we will mostly be discussing
things at the level of the mnemonics (like "MVI") that you write,
it helps to know what's ultimately happening.
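     To make that concrete, here is the same three-byte sequence
laid out as an assembler listing would show it. The starting
address 0100H is only an assumption for the sketch (it is where
CP/M COM files normally load):

0100 3E0D    MVI  A,13          ;3EH = "MVI A,", 0DH = the value
0102 47      MOV  B,A           ;one byte, no operand follows

The MVI occupies addresses 0100-0101H, so the MOV lands at 0102H.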


3. The LXI and XCHG Instructions
     Similar instructions let you manipulate 16-bit values in
register pairs, though the Intel mnemonics give these instructions
totally different names. The LXI instruction (load register pair
immediate) puts a word value into a register pair, referred to by
the first register name: LXI  H,801H loads the H-L register pair
with the value 801 hex.
     The high byte (08H) goes into the first register (H), the
low byte (01H) into the second (L). (Note that this differs from
the "backward" order in which 16-bit values are stored in memory,
which we discussed last time.)  In fact, you could have done
exactly the same thing with the pair of instructions:

MVI  H,8
MVI  L,1

except that the "LXI" instruction makes it clearer what you're
doing (and in fact takes only 3 bytes of code, as opposed to 4
for the two "MVI"s).
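     A sketch of the bytes involved, again assuming the code
starts at 0100H purely for illustration, shows both the size
difference and the byte order:

0100 210108  LXI  H,801H        ;3 bytes, value stored low byte first
0103 2608    MVI  H,8           ;2 bytes
0105 2E01    MVI  L,1           ;2 bytes

Either way H ends up holding 08H and L holding 01H; only the
instruction bytes in memory have the low byte (01H) ahead of the
high byte (08H).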
     When you need to move 16-bit values around, you do in fact
have to use pairs of "MOV" instructions, e.g., to move this value
from the H-L register pair into B-C now would require:

MOV  B,H
MOV  C,L

     There is one exception. There are times when you would find
it convenient to exchange the contents of the D-E and H-L
register pairs, and this can be done by the simple instruction
XCHG.
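     For instance (the values here are arbitrary, picked only so
the swap is easy to see):

LXI  H,1234H       ;H-L = 1234H
LXI  D,5678H       ;D-E = 5678H
XCHG               ;now H-L = 5678H and D-E = 1234H

XCHG is a single byte (EBH), whereas doing the same swap with
MOVs would need a spare register and six instructions.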


4. The INR, DCR and INX, DCX Instructions
     The need to "increment" and "decrement" (add and subtract 1)
is very common. The INR and DCR instructions increment and
decrement a single register, e.g., INR  A adds one to whatever
value was in the A register previously.
     A similar pair, INX and DCX, work with 16-bit values in
register pairs: DCX  H would subtract one from the value in the
H-L register pair. If H-L still contained the value 801H we put
in a moment ago, it would now contain 800H (i.e., H would still
contain 08H, and L would contain 00H). Note that this is not the
same as DCR  H, which would decrement the H register as a single
byte and not affect the L register at all. (If H-L had contained
801H, it would now contain 701H.)
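     The difference shows up most clearly when the low byte
wraps around. In this sketch (the starting value 00FFH is just a
convenient example):

LXI  H,00FFH       ;H = 00H, L = FFH
INX  H             ;H-L = 0100H: the pair is treated as one word
LXI  H,00FFH       ;start over with the same value
INR  L             ;H-L = 0000H: L wraps to 00H, H is untouched

INX carries from L into H; INR knows nothing about the other half
of the pair.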
     All arithmetic is cyclical here: values wrap around, and
negative numbers are represented by their two's complements. For
example, if you do this:

MVI  A,0
DCR  A

the A register will contain the value FFH, which you may
interpret as either 255 or -1, depending on the circumstances. If
you increment that, of course you will get zero again. Most
assemblers in fact allow you to write a statement like MVI  A,-1
which is actually exactly equivalent to "MVI A,255".
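     A short sketch of the wraparound in both directions, on
the assumption that your assembler accepts negative constants as
described:

MVI  A,-1          ;same byte as MVI A,0FFH
INR  A             ;A = 00H (FFH + 1 wraps around)
DCR  A             ;A = FFH again
DCR  A             ;A = FEH, i.e. 254 or -2

The byte in the register is the same either way; signed or
unsigned is purely a matter of how you choose to read it.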


5. Moving Bytes Around
     It's time to see how these instructions fit together to
accomplish something potentially useful. Let's consider moving
several bytes of data (this could easily be text) from one place
to another.

             ORG  0100H         ;code begins here
0100 211101  LXI  H,SOURCE      ;point to source with H-L
0103 111301  LXI  D,DEST        ;point to destination with D-E
0106 7E      MOV  A,M           ;fetch byte from memory into A
0107 EB      XCHG               ;exchange so H-L is now dest
0108 77      MOV  M,A           ;store byte at destination
0109 EB      XCHG               ;now H-L is source again
010A 23      INX  H             ;point to the next byte
010B 13      INX  D             ;and the next destination
010C 7E      MOV  A,M           ;get another byte
010D EB      XCHG               ;switch to destination again
010E 77      MOV  M,A           ;store the byte
010F EB      XCHG               ;switch back to source
0110 C9      RET                ;all done, return.
0111 6869    SOURCE: DB   'hi'  ;data: the bytes we will move,
0113 3F3F    DEST:   DB   '??'  ;and where we'll put them
0115         END                ;end of source file

     In the middle is the source code, with comments to the
right. On the left I have put the actual addresses and
instruction bytes that will result. (Most assemblers can produce
a "listing" [LST or PRN] file just like this, for your
reference.) If you actually assembled this "program" into a COM
file, it would contain the 21 bytes of instructions and data
shown on the left, at addresses 0100-0114H.
     What's going on here?  "SOURCE:" and "DEST:" are labels --
the assembler will figure out the actual addresses where they
will wind up and will keep track of those (16-bit) values. When
you refer to SOURCE, for example, the assembler will substitute
the address of the label -- in this case 0111H -- so the
statement "LXI H,SOURCE" is actually "LXI H,0111H". Similarly for
DEST. The "DB" directive will accept text characters in single
quotes as shown. (We could have written "DB 68H,69H" instead of
"DB 'hi'", since those are in fact the ASCII codes for these
letters, but it wouldn't have been as clear what we meant.)
     Remember that the "M register" actually refers to whatever's
in memory at the address in the H-L register. Putting the address
SOURCE in the H-L pair automatically makes "M" refer to the byte
at that address, namely the 'h'. So "MOV A,M" fetches that 'h'
into the A register. Then the program exchanges D-E and H-L, so
that it's now DEST in the H-L pair, and "M" refers to the byte
there (the first '?'); and it stores the 'h' there. Then it
exchanges back again, increments both "pointers" so that they
point to the next byte of data (the 'i') and the next destination
(the second '?'), and does it again. Then we're finished, and the
program returns control to the operating system.
     You can write, assemble, and run this little program. You
might even try modifying it, e.g., extending it to move three or
four bytes instead of just two. But it is ridiculously easy to
crash your computer when doing assembly language programming, so
don't leave disks in the machine that you don't have copies of,
and don't be afraid to push the Reset button if disaster strikes.
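     As one possible way to do that exercise, here is a sketch
of a three-byte version. The text 'hi!' and the extra '?' are
chosen only for the example; the only new code is a third copy of
the fetch-and-store steps:

        ORG  0100H         ;code begins here
        LXI  H,SOURCE      ;point to source with H-L
        LXI  D,DEST        ;point to destination with D-E
        MOV  A,M           ;fetch the first byte
        XCHG               ;H-L is now the destination
        MOV  M,A           ;store it
        XCHG               ;H-L is the source again
        INX  H             ;advance both pointers
        INX  D
        MOV  A,M           ;second byte, same steps
        XCHG
        MOV  M,A
        XCHG
        INX  H             ;advance both pointers again
        INX  D
        MOV  A,M           ;third byte, same steps
        XCHG
        MOV  M,A
        XCHG
        RET                ;all done, return
SOURCE: DB   'hi!'         ;three bytes to move
DEST:   DB   '???'         ;three bytes of room for them
        END

The labels take care of themselves: the assembler recomputes the
addresses of SOURCE and DEST when the code in front of them
grows.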
     If you had used some higher-level language to write
something like:

100 DEST$="??"
110 SOURCE$="hi"
120 DEST$=SOURCE$

your compiler would have generated machine instructions similar
to those above, but perhaps ten times as many. (When you program
in assembly language, you don't have to include a single byte
that you don't need.)
     This program was no big deal and nothing visible happened.
But be patient; there's a lot more to learn.