CP/M Assembly Language
Part VI: Disk Files
by Eric Meyer

By now you have a good understanding of the basics of 8080 assembly
language. It's time to learn to do something truly useful with it: how
to read and write disk files.

1. BDOS File Access

The CP/M BDOS is set up to make access to disk files as easy as
possible. You don't have to worry about where the file is on the disk,
or any of the mechanics of reading and writing blocks of data. A small
handful of BDOS functions take care of all these tasks:

   1. SET DMA -- designate a 128 byte buffer to hold data;
   2. OPEN a file -- locate file and prepare to read or write;
      or MAKE a file -- create a new file to write;
   3. READ or WRITE -- a record (128 bytes) of data
      (repeat as many times as needed);
   4. CLOSE a file -- finish updating a file written to.

Involved here are two "data structures", used to communicate with these
BDOS functions: the DMA and the FCB.

The DMA (Direct Memory Access) is simply a small area of memory where
the BDOS is to put (or find) the data that will be read (or written)
when the file is accessed. Data is read in units of 128 bytes, called
"records". There are eight of these in 1K. You simply declare a region
of memory to be the DMA area, by passing its address to the BDOS with
the appropriate function call. As it happens, the default CP/M DMA is at
address 0080h near the bottom of memory, and you can just use that if
you only need one DMA at a time; otherwise, you need to set your own
DMA, which is as simple as:

        LXI  D,MYDMA     ;point to the region you want
        MVI  C,26        ;function 26 sets the DMA
        CALL BDOS        ;ask BDOS to do it

(Remember to declare BDOS EQU 0005H.) You can reserve space for this (or
any other) purpose with the assembler instruction DS, which we may not
have mentioned before. For example,

MYDMA:  DS   128         ;one record for DMA

sets up a 128-byte buffer at that point for use as a DMA. The initial
contents of a DS block are undefined (unpredictable).

2. The FCB

The FCB (File Control Block) is a structure used by the BDOS to keep
track of your position in a file when it's open. Basically it is a
working copy of the directory entry for the file, and if you've ever
snooped around a disk directory with DU or some such utility, it will
look very familiar:

FCB:   d F I L E N A M E T Y P e x x x
       x x x x x x x x x x x x x x x x
       c r r r

The FCB is 36 bytes long. The first byte ("d") is the number of the
drive (00=logged drive, 01=A:, 02=B:, etc). The next 11 bytes are the
filename and type. The next byte ("e") is the current extent number (an
"extent" is a unit of 16K; long files have several extents). All the "x"
bytes are used by the BDOS to keep track of the physical location of
data on the disk. Their values don't concern us. The next byte ("c") is
the current record number, within the current extent. And finally, the
three bytes "rrr" are for a random record address (not ordinarily used
when accessing a file sequentially, as we will be doing here).

You can set up any number of FCBs in your own program, to read and write
files as needed. There is also a default CP/M FCB at address 005Ch,
which CP/M fills in with the name of any file argument you give when you
invoke a program. For example, when you enter WordStar with a command
like A>ws mytext.fil, CP/M sets up the default FCB at 005Ch in the
following way:

005C:  0 M Y T E X T _ _ F I L 0 x x x
       x x x x x x x x x x x x x x x x
       0 0 0 0

Thus WordStar can look at this location to see whether you have given it
a filename to open and edit. In this case it sees that you have: on the
logged drive ("00"), in this case A:, the file MYTEXT.FIL. It can read
this file from the beginning (note extent 0, record 0) just by pointing
to this FCB and asking the BDOS to open the file.
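
As a minimal sketch of the layout just described, a program might
declare an FCB of its own, and test whether a filename was given on the
command line, roughly as follows. The labels DFCB, MYFCB and NONAME are
hypothetical names chosen for illustration, and DB ("define byte") is
the assembler's companion to DS: it stores the values given instead of
just reserving space. The test assumes CP/M's usual behavior of filling
the name field with spaces when no argument is given.

DFCB    EQU  005CH       ;default CP/M FCB

        LDA  DFCB+1      ;first character of the filename field
        CPI  ' '         ;still a space?
        JZ   NONAME      ;then no name was given (hypothetical label)

;elsewhere, among the program's data:
MYFCB:  DB   0           ;"d": 0 = logged drive
        DB   'MYTEXT  '  ;filename, 8 bytes, padded with spaces
        DB   'FIL'       ;type, 3 bytes
        DB   0           ;"e": start at extent 0
        DS   19          ;"x" bytes, filled in by the BDOS
        DB   0           ;"c": start at record 0
        DS   3           ;"rrr": random record address, unused here

The byte counts add up to the full 36: 1+8+3+1+19+1+3.
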

A further complication arises when you run a program with two arguments,
typically an input and an output file, for example:

        A>pip newfil.txt=b:oldfil.txt

The first name goes into the FCB at 005Ch; the second name, if present,
is put at 006Ch.

005C:  0 N E W F I L _ _ T X T 0 x x x
006C:  2 O L D F I L _ _ T X T 0 x x x
       0 0 0 0

As you can see, the second name is sitting in the middle of the default
FCB! What PIP has to do is to copy the second filename out to another
FCB, which it constructs elsewhere. Then the default FCB at 005Ch can be
used to write NEWFIL.TXT, and the other FCB (in PIP someplace) can be
used to read B:OLDFIL.TXT.

3. Opening an Existing File

In order to read an existing disk file, all you have to do is "open" it.
After constructing an appropriate FCB (or using the default one, if
possible), you simply:

        LXI  D,FCB       ;point to the FCB with D-E reg.
        MVI  C,15        ;Function 15 opens a file
        CALL BDOS        ;ask BDOS to do it

At this point, if the specified file existed, it can now be accessed
with BDOS read/write functions, described below. However, it's also
possible that the file didn't exist, in which case you do not want to go
on and try to read or write to it! All the BDOS file functions leave a
"return code" in the A register, which you can then examine. For the
OPEN, CLOSE, and MAKE functions, a code from 00-03 indicates success,
while FFh indicates an error. So immediately after the BDOS call you
should check this return code, and if you see FFh, exit with an error
message of some sort. For example, you could do this:

        CPI  0FFH        ;is the error code FF?
        JZ   IOERR       ;jump to i/o error routine if so

In the case of the OPEN function, an error generally means the file was
not found.

4. Making a New File

If you're creating a new file, the process is very similar: after
creating the appropriate FCB, you simply:

        LXI  D,FCB       ;point to the FCB with D-E
        MVI  C,22        ;Function 22 makes a new file
        CALL BDOS        ;ask BDOS to do it

Again, you had better check to see that the return code is not FFh
before continuing to write data to your new file. An error means there
was no room in the directory for another filename.

5. Closing a File

This is a bit out of order here, but: after you have finished writing
data to a file, you need to "close" the file, to ensure that all the new
data has actually been written to disk, and the directory has been
updated accordingly. By now you can probably guess how this is done:

        LXI  D,FCB       ;point to the FCB with D-E
        MVI  C,16        ;Function 16 closes a file
        CALL BDOS        ;ask BDOS to do it

Afterwards, check the return code. If it was FFh, the disk (or possibly
the directory) filled up and some of your data could not be written.

NOTE: you only need to close a file that has been written to. Files
opened for reading only need not (and probably shouldn't) be closed.

6. Reading and Writing Sequentially

Once a file is "open" (or "made") it can be read or written to. This can
be done either randomly or sequentially. Random access is used when
quick access to data is needed, and you are going to maintain some table
or index to tell you in what record in the file a given individual entry
can be found. (This is often done by database programs.) Sequential
access is far more common, though, and just goes from beginning to end
of the file, reading or writing the whole thing. All you do is:

        LXI  D,FCB                    LXI  D,FCB
        MVI  C,20  ;Read Seq.   OR    MVI  C,21  ;Write Seq.
        CALL BDOS                     CALL BDOS

Each time, one record of data is read from the file into the current DMA
(for the moment, at 0080h), or written from the DMA to the file, and the
record count is updated to point to the next record. Again there are
return codes. A return code of 00 indicates success.
For READ, an error code of 01 means you've passed the end of the file,
and there are no more records to read. Any other error code indicates a
physical error (such as disk full).

7. Character Input: GETCH

Although the BDOS organizes disk i/o in terms of records of 128 bytes,
this is almost never the unit of interest to you. If you are reading a
text file, for example, you will be interested in the individual
character, or possibly line, of text. Thus it will be convenient to
write a pair of functions to "hide" the BDOS i/o activity, and just
pretend that you're reading and writing one character at a time. We will
call them GETCH and PUTCH. The usage will be:

;Reading a character                ;Writing a character
        CALL GETCH  ;get char in A          MVI  A,xx   ;put char in A
        JC   IOERR  ;jump if error          CALL PUTCH  ;write to file
        CPI  1AH    ;is it EOF?             JC   IOERR  ;jump if error
        JZ   ISEOF  ;jump if EOF

For now, both routines can only be used to read/write ONE file at a
time. Here is the read routine:

;Routine to get character from open file at GCFCB
;Returns char or EOF in A, and C for Error
GETCH:  PUSH H           ;save registers
        PUSH D           ; (so GETCH will be easy
        PUSH B           ; to use)
        LDA  GCFLG       ;check EOF flag
        CPI  0           ;is it clear?
        JNZ  GCEOF       ;if not, at EOF
        LDA  GCPOS       ;get position count
        CPI  80H         ;is it up to 128?
        JC   GCCHR       ;no, just go get char
        LXI  D,GCDMA     ;yes, need to read another record
        MVI  C,26        ;set the DMA to GCDMA
        CALL BDOS        ;ask BDOS to do it
        LXI  D,GCFCB     ;use the FCB at GCFCB
        MVI  C,20        ;read a record sequentially
        CALL BDOS        ;ask BDOS to do it
        CPI  1           ;check return code
        JZ   GCEOF       ;oops, 1: no more (end of file)
        JNC  GCERR       ;argh, >1: physical error
        STA  GCPOS       ;0: read successful, set GCPOS to 0
GCCHR:  LXI  H,GCDMA     ;point to DMA with H-L
        MOV  E,A         ;move GCPOS into E
        MVI  D,0         ;now D-E is 16-bit version of GCPOS
        DAD  D           ;GCPOS+DMA points to next char
        MOV  A,M         ;get that char into A
        LXI  H,GCPOS     ;point to GCPOS with H-L
        INR  M           ;increment GCPOS to point to next
        CPI  1AH         ;is it ^Z?
        JZ   GCEOF       ;if so, record it
        STC
        CMC              ;clear Carry
        JMP  GCRET       ;and return the character
GCEOF:  MVI  A,1AH       ;EOF, get a ^Z
        STA  GCFLG       ;set flag for future reference
        STC
        CMC              ;clear Carry
        JMP  GCRET       ;return the ^Z
GCERR:  MVI  A,1AH       ;ERROR, get a ^Z
        STA  GCFLG       ;set flag
        STC              ;and set Carry
GCRET:  POP  B           ;restore
        POP  D           ; the
        POP  H           ; registers
        RET              ;and return
GCFLG:  DS   1           ;flag says EOF reached
GCPOS:  DS   1           ;keep track of position in record
GCDMA:  DS   128         ;DMA for GETCH, 128 bytes
GCFCB:  DS   36          ;FCB for GETCH

Let's discuss how this works. The key is the byte variable GCPOS, which
runs from 00 to 7FH to keep track of which is the next byte in the
record (in the DMA) to be read. GCPOS starts out at 80H (the open
routine in section 9 sets it), which indicates that a new record must be
read in. Once this is done, GCPOS is reset to 0. Then each time a
character is needed, it is found by adding GCPOS to the DMA address, and
GCPOS is incremented to point to the next. When GCPOS reaches 80H again,
the whole record has been used, and a new one must once again be read
in.

Ordinarily, GETCH simply returns with the character read in A, and the C
flag clear. But once the end of the file has been reached (either no
more records, or the EOF character 1Ah), GETCH returns with an EOF.
(Note the variable GCFLG, a flag that is set non-zero once this occurs,
so that GETCH will continue to return EOFs thereafter.) And if an error
is encountered trying to read the file, GETCH returns with the C flag
set.

You have seen most of these instructions before, but there are a couple
of new ones here. LDA and STA are rather like MOV A,M and MOV M,A: they
load A from, or store A to, a memory address, but it is the address you
specify directly, instead of the address in H-L. Thus STA GCPOS stores
the value in A to address GCPOS, and LDA GCPOS gets it back.

STC and CMC are the instructions to "set Carry" and "complement Carry";
they only affect the Carry flag.
There is no simple "clear Carry" instruction, which is what we really
want to do here; you have to first set it, then complement it.
("Complement" means, roughly, "reverse".)

8. Character Output: PUTCH

Here now is the complementary character output routine:

;Routine to write character in A to open file at PCFCB
;Returns C for write Error
PUTCH:  PUSH H           ;save registers
        PUSH D           ; (so PUTCH will be
        PUSH B           ; easy to use)
        MOV  C,A         ;save the character in C
        LDA  PCPOS       ;get position count
        CPI  80H         ;is it up to 128?
        JC   PCCHR       ;no, just go write char
        PUSH B           ;preserve character in C
        LXI  D,PCDMA     ;need to write the record out
        MVI  C,26        ;set the DMA to PCDMA
        CALL BDOS        ;ask BDOS to do it
        LXI  D,PCFCB     ;use the FCB at PCFCB
        MVI  C,21        ;write a record sequentially
        CALL BDOS        ;ask BDOS to do it
        POP  B           ;restore character in C
        CPI  0           ;check return code
        JNZ  PCERR       ;argh, >0: physical error
        STA  PCPOS       ;0: write successful, set PCPOS to 0
PCCHR:  LXI  H,PCDMA     ;point to DMA with H-L
        MOV  E,A         ;move PCPOS into E
        MVI  D,0         ;now D-E is 16-bit version of PCPOS
        DAD  D           ;PCPOS+DMA points to next char
        MOV  M,C         ;put outgoing char in its place
        LXI  H,PCPOS     ;point to PCPOS with H-L
        INR  M           ;increment PCPOS to point to next
        SUB  A           ;clear Carry flag, all is OK
        JMP  PCRET
PCERR:  STC              ;write error, set Carry
PCRET:  POP  B           ;restore
        POP  D           ; the
        POP  H           ; registers
        RET              ;and return
PCPOS:  DS   1           ;keep track of position in record
PCDMA:  DS   128         ;DMA for PUTCH, 128 bytes
PCFCB:  DS   36          ;FCB for PUTCH

By and large, this is very similar to GETCH. Note the need to PUSH and
POP the outgoing character, so it isn't lost when we do the BDOS calls
to write a record periodically. Also note that PCPOS starts off at 0,
not 80H, because you don't need to write a record until it's full
(whereas in GETCH, we needed to begin by reading a record).

9. Using GETCH and PUTCH

To use GETCH and PUTCH, we will need a few more routines to open, make,
and close files for them.
First, for GETCH, we need a routine to open the file for reading:

;Routine to open a file for GETCH (takes DE=FCB)
;Returns C if file not found
GCOPEN: MVI  A,80H       ;initialize GCPOS to 80H
        STA  GCPOS       ;so first record will be read
        SUB  A           ;and zero the EOF flag
        STA  GCFLG
        XCHG             ;put FCB address in HL
        LXI  D,GCFCB     ;we'll move it into GCFCB
        MVI  B,12        ;the drive/filename is 12 bytes
GCL1:   MOV  A,M         ;fetch a byte from FCB
        STAX D           ;store it in GCFCB
        INX  D           ;point to next
        INX  H           ;in both places
        DCR  B           ;count down on 12 bytes
        JNZ  GCL1        ;loop if more
        SUB  A           ;now get a 0
        STAX D           ;and put it into extent ("e")
        STA  GCFCB+32    ;and also into record ("c")
        LXI  D,GCFCB     ;point to start of GCFCB again
        MVI  C,15        ;function 15 opens a file
        CALL BDOS        ;ask BDOS to do it
        CPI  0FFH        ;check return code
        CMC              ;now Carry is set if it was FFH
        RET              ;return with Carry set if error

This routine simply copies the filename from the FCB address given to it
in the D-E register, into the GCFCB that will be used by GETCH; then it
tries to open the file, returning with the C flag set if it could not.

There is one new instruction here. STAX D (and its relative, STAX B) are
just like MOV M,A, except that they use the D-E (or B-C) registers as
pointers instead of H-L. Similarly, there are instructions LDAX D and
LDAX B, which load values as does MOV A,M. (Unfortunately, these
arbitrary names make them look completely different.)

The routine to make a new file for use by PUTCH will be very similar.
For now, it simply erases any pre-existing file of the given name; later
we may add code to preserve such a file instead, possibly renaming it to
a ".BAK" file.

;Routine to make a file for PUTCH (takes DE=FCB)
;Returns C if cannot make file
PCOPEN: SUB  A           ;initialize PCPOS to 0
        STA  PCPOS       ;for beginning to write with PUTCH
        PUSH D           ;preserve FCB address
        MVI  C,19        ;function 19 ERASES a file
        CALL BDOS        ;ask BDOS to do it
        POP  H           ;recover FCB address into HL
        LXI  D,PCFCB     ;we'll move it into PCFCB
        MVI  B,12        ;the drive/filename is 12 bytes
PCL1:   MOV  A,M         ;fetch a byte from FCB
        STAX D           ;store it in PCFCB
        INX  D           ;point to next
        INX  H           ;in both places
        DCR  B           ;count down on 12 bytes
        JNZ  PCL1        ;loop if more
        SUB  A           ;now get a 0
        STAX D           ;and put it into extent ("e")
        LXI  H,20        ;point ahead 20 bytes
        DAD  D           ;HL now points to record ("c")
        MOV  M,A         ;put 0 there, too
        LXI  D,PCFCB     ;point to start of PCFCB again
        MVI  C,22        ;function 22 makes a file
        CALL BDOS        ;ask BDOS to do it
        CPI  0FFH        ;check return code
        CMC              ;now Carry is set if it was FFH
        RET              ;return with Carry set if error

Note the use of BDOS function 19 to delete any file with the same name
before we try to make the new file. We don't care about any error that
may result: if the file did exist, it's now gone; if it didn't, the
attempt to erase it failed, but that's OK too. Use function 19 with
caution; we'll say no more about it here.

Finally, we need a routine to clean up after PUTCH, when we're all done
writing, and close the file. The task is complicated by the fact that
there may still be characters sitting in the PCDMA buffer that need to
be written before we close the file. Here is the routine:

;Routine to close a file for PUTCH (no arguments)
;Returns C if cannot close file
PCLOSE: MVI  A,1AH       ;get a 1AH (EOF char)
        CALL PUTCH       ;and write it to the file
        LDA  PCPOS       ;now look at position in PCDMA
        CPI  0           ;is it 00?
        JZ   PCLFIL      ;if so, no data, just close file
        LXI  D,PCDMA     ;need to write the last record out
        MVI  C,26        ;set the DMA to PCDMA
        CALL BDOS        ;ask BDOS to do it
        LXI  D,PCFCB     ;use the FCB at PCFCB
        MVI  C,21        ;write a record sequentially
        CALL BDOS        ;ask BDOS to do it
        CPI  0           ;check return code
        JNZ  PCLERR      ;warn if error
PCLFIL: LXI  D,PCFCB     ;all set, point to the FCB
        MVI  C,16        ;function 16 closes a file
        CALL BDOS        ;ask BDOS to do it
        CPI  0FFH        ;examine return code
        CMC              ;Carry now set if was 0FFH
        RET              ;return
PCLERR: STC              ;set Carry for write error
        RET              ;and return

By now this should all be pretty self-explanatory.

10. Things To Come

Take a deep breath! Next time we'll use these routines to construct
"filter" programs that read and process text files.
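
In the meantime, here is a minimal sketch of how the routines from
section 9 might be strung together into a program that copies one text
file to another, invoked with two names (source first, then
destination). It is only an illustration under some assumptions: GETCH,
PUTCH, GCOPEN, PCOPEN and PCLOSE are assembled into the same program,
and the labels COPY, CLOOP, CDONE, CFAIL, CMSG and CSTK are hypothetical
names. It relies on the fact that GCOPEN and PCOPEN each copy the 12
name bytes out of the FCB they are given, so the second name at 006Ch
can be used where it sits. Since GETCH stops at the first ^Z, the sketch
suits text files only.

BDOS    EQU  0005H       ;BDOS entry point
        ORG  0100H       ;CP/M programs load and start here

COPY:   LXI  SP,CSTK+64  ;use a private stack (it grows downward)
        LXI  D,005CH     ;first name: the default FCB
        CALL GCOPEN      ;open it for GETCH
        JC   CFAIL       ;file not found
        LXI  D,006CH     ;second name, inside the default FCB
        CALL PCOPEN      ;make it for PUTCH
        JC   CFAIL       ;could not make the file
CLOOP:  CALL GETCH       ;next character, or EOF, in A
        JC   CFAIL       ;read error
        CPI  1AH         ;is it EOF?
        JZ   CDONE       ;yes, finish up
        CALL PUTCH       ;no, write it out
        JC   CFAIL       ;write error
        JMP  CLOOP       ;and go around again
CDONE:  CALL PCLOSE      ;write EOF, flush last record, close
        JC   CFAIL       ;could not write or close
        JMP  0           ;warm boot back to CP/M
CFAIL:  LXI  D,CMSG      ;point to the message
        MVI  C,9         ;function 9 prints a string ending in "$"
        CALL BDOS        ;ask BDOS to do it
        JMP  0           ;and give up

CMSG:   DB   'File error$'
CSTK:   DS   64          ;room for 32 stack entries

The private stack is a precaution: PCLOSE calls PUTCH, which pushes four
register pairs before calling the BDOS, and the stack a program inherits
from CP/M is small.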