The following is a message I recently received on MicroNet from Russ Ranshaw of CIS. It covers some of the problems encountered with various network facilities.

----forwarded message follows----

Sb: File transfer protocols      16-Mar-82 13:55:18
Fm: Wiz-10 70000,1
To: Keith Petersen 70535,1245

There are several layers of network and switch gear between the user and the program running on one of our PDP-10 hosts. First there is the local node to which he is connected. If it is a CIS node, then it is a PDP-11 or some variant. That -11 is connected via a long-line to another -11. Depending on the location of the local node, there may in fact be several -11's by the time the connection ends up in Columbus. Once here, the termination is in another -11 which is cross-bar connected to a set of PDP-15's. Each -15 services four KI-10 hosts. Normally, the delays from node to host are reasonably small.

Once at the host, you are subjected to the problem that there are several jobs running, not just yours, and the monitor will schedule your job to run under any of several circumstances. If it is waiting for input, then the job will awaken when input is ready from the terminal.

Here is another difficulty. To really do what you indicate, the job would have to run in what we call "break on all characters" mode, where each character input from the terminal is sent immediately to the host. The problem with this is that it dilutes the bandwidth of the 9600 baud long-lines if there are a lot of jobs doing that. The normal mode is that characters are assembled into "packets" of 24 bytes before being sent, and a "break" character (CR, LF, ESC, BEL, and a few others) will terminate a packet and send it along. We prefer to run in this mode in order to better utilize the long-lines.

Running in 8-bit image mode is no problem on CIS nodes, but it brings its own difficulties. There are two ways to run in 8-bit image mode: break on all characters, or buffered. Break on all characters has the same difficulty as above, so we don't want to do it. Buffered mode suffers from the fact that there is no "break" character, since all bit patterns must be treated as data; the node therefore has to "dummy up" a condition which will terminate a packet. We chose to send the packet if there is a 2-character-time delay with no input.

Now for the effect on uploading and downloading. A block of data is transmitted. The far end (be it host or local system) sends its ACK. If uploading, the ACK enters the node, waits for 2 character times, and is on its way. It might take, say, 1/2 second to arrive at the host. The job running on the host has to wake up to process the ACK. If it happens to be swapped out, it has to be swapped back in before it can run. If there is a lot of terminal I/O going on, then the scheduler queue which services terminal-bound jobs is rather long, and our transfer program's run request gets stacked on the bottom of the queue. Finally our job can run again, and manages to send out the next block of data, which the network can usually digest quite rapidly due to the buffering action of the intermediate nodes. It now takes maybe another 1/2 second for the data to traverse all the nodes and begin arriving at your system. Typically, the overall delay runs to about 1.5 to 2.5 seconds, depending on system and network loading. That delay is going to be there regardless of what protocol you are running.
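The node buffering described above (24-byte packets, break characters, and the two-character-time idle flush in 8-bit image mode) amounts to roughly the following sketch. It is illustrative Python only, not CIS code; the packet size, break characters, and idle rule come from the message, while every name and structural detail is an assumption.

    # Illustrative sketch of the node packetizing described above.
    # Real CIS nodes are PDP-11s; nothing here is actual CIS code.

    PACKET_SIZE = 24                         # bytes per packet
    BREAK_CHARS = {0x0D, 0x0A, 0x1B, 0x07}   # CR, LF, ESC, BEL ("and a few others")

    def packetize(chars, idle_gaps, char_time, image_mode=False):
        """chars: byte values typed at the terminal; idle_gaps[i]: seconds
        of silence after chars[i]; char_time: seconds to send one character.
        Yields the packets the node would forward up the long-line."""
        packet = []
        for ch, gap in zip(chars, idle_gaps):
            packet.append(ch)
            full = len(packet) >= PACKET_SIZE
            brk = (not image_mode) and ch in BREAK_CHARS
            # In 8-bit image mode there is no break character, so the node
            # "dummies up" a terminator: two character times of silence.
            idle = image_mode and gap >= 2 * char_time
            if full or brk or idle:
                yield packet
                packet = []
        if packet:
            yield packet

In normal mode a carriage return flushes the packet at once; in image mode nothing leaves the node until 24 bytes accumulate or the line goes quiet for two character times, which is where the extra latency on single-character traffic such as ACKs comes from.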
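The effect of that 1.5 to 2.5 second turnaround on a send-a-block, wait-for-ACK protocol is simple arithmetic. The sketch below assumes a 300 baud user line (about 30 characters per second); that rate is an assumption for illustration and is not stated in the message.

    # Stop-and-wait effective throughput: each block must wait out the
    # round-trip turnaround before the next one can be sent.
    # The 30 CPS line rate is an assumed figure, not from the message.

    LINE_CPS = 30.0   # assumed raw user line rate, characters per second

    def effective_cps(block_size, turnaround):
        """Characters per second delivered when every block of block_size
        bytes must wait `turnaround` seconds for its ACK."""
        return block_size / (block_size / LINE_CPS + turnaround)

    for block in (128, 256, 1024):
        for delay in (0.2, 1.5, 2.5):
            print(f"{block:5d}-byte blocks, {delay:.1f} s turnaround: "
                  f"{effective_cps(block, delay):4.1f} CPS")

With 256-byte blocks the near-zero-delay case comes out close to the 29.5 CPS best-case figure quoted below, and the 1.5 to 2.5 second cases bracket the observed 25 CPS; larger blocks claw some of that loss back, which is the point made next.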
What can be done? One thing is to transmit larger blocks. I picked up a lot when I went from line-at-a-time to 256 bytes at a time; in fact, from an average of 8-10 CPS to about 25 CPS. The block size could be made GIGANTIC, like several thousand bytes. However, the error detection capability begins to deteriorate at large block sizes, and worse, if you have a noisy local telephone system/modem/???, you will likely encounter frequent retransmission, and it takes a long time to transmit long blocks. Incidentally, with the current A-protocol and 256-byte blocks, the effective throughput is at best 29.5 CPS; the above delays account for the difference.

Another thing which we can try is to employ a synchronous protocol instead of an asynchronous one (the A-protocol and the Ward Christensen protocol are asynchronous). This means that we would transmit a block, then immediately transmit the next one, hoping to receive the ACK or NAK for the first one before the second one has finished. But now we have other problems. What do we do if there is an error in the first block? We have the choice of holding the second one (if it is okay) until the first one gets correctly transmitted, or we can toss it away. If we toss it away, making the sender resend it, we have diluted our throughput. If we keep it for later, we are opening a Pandora's box of troubles. If you are familiar with IBM Bisync, there are lots of situations where this scheme falls apart due to misunderstood ACKs or NAKs. The "toss it away" attitude is the consensus of most networkers today. DDCMP, one of the most reliable and widespread network protocols, does just this. They opt to reduce their throughput in favor of greater reliability.

I hope all of this indicates to you that our protocols and procedures are not random or capricious!

I think that the best way for us to go is to get compressed files working. If we can achieve a 30-40% reduction in file size over the comm link, the effective throughput for the actual data will in fact exceed the link's raw bandwidth.

I forgot the "masking" which we do. Many of our users (about 25%) use Tymnet. Tymnet does a poor job of handling control characters and bit-7 characters at times. In particular, control-B and control-O cause problems on the Tymcom node. Telenet has its own problems. In order to get around some of these difficulties, we "mask" control codes by sending them as . I am currently looking into bit-7 masking as well, although it will be employed only in the event of two successive retransmissions.

I do hope this gives you some insight into our situation and problems. Your comments are of course welcome.

Russ
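The receiver's side of the "toss it away" approach Russ describes is easy to sketch. The block format, the callbacks, and the names below are hypothetical, invented only for this illustration; the policy itself, discard anything that arrives out of order and let the sender back up and resend, is the one the message attributes to DDCMP.

    # Receiver side of a "toss it away" (go-back-N style) scheme.
    # The (seq, data, ok) block format and the deliver/send_ack/send_nak
    # callbacks are hypothetical, invented only for this sketch.

    def receive(blocks, deliver, send_ack, send_nak):
        """blocks: iterable of (seq, data, ok) tuples, where ok means the
        block arrived with a good check.  The sender is pipelining: it
        keeps transmitting without waiting for each individual ACK."""
        expected = 0
        for seq, data, ok in blocks:
            if ok and seq == expected:
                deliver(data)        # in order and intact: accept it
                send_ack(seq)
                expected += 1
            else:
                # A garbled block, or a good block that followed a garbled
                # one, is simply tossed away.  The sender backs up to
                # `expected` and resends from there, trading some
                # throughput for never having to untangle which ACK or
                # NAK refers to which outstanding block.
                send_nak(expected)

Keeping good out-of-order blocks for later is the Pandora's box Russ mentions: receiver and sender must then agree on exactly which blocks each side believes are outstanding, which is where schemes like Bisync get into trouble.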
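Russ's closing point about compression also reduces to arithmetic. The sketch uses the roughly 25 CPS effective rate from the message and the same assumed 30 CPS raw line rate as the earlier sketch; that raw rate is an assumption.

    # Payoff from compressing files before transfer.  The 30 CPS raw
    # line rate is assumed; the 25 CPS effective rate and the 30-40%
    # size reduction are the figures given in the message.

    RAW_LINE_CPS  = 30.0    # assumed raw line rate
    EFFECTIVE_CPS = 25.0    # compressed characters actually moved per second

    for reduction in (0.30, 0.40):
        original_cps = EFFECTIVE_CPS / (1.0 - reduction)
        print(f"{int(reduction * 100)}% smaller file: about "
              f"{original_cps:.0f} CPS of original data, versus a raw "
              f"line rate of {RAW_LINE_CPS:.0f} CPS")

A 30% reduction delivers about 36 CPS worth of the original file and a 40% reduction about 42 CPS, both more than the line could ever carry uncompressed, which is the sense in which the actual data can exceed the link's raw bandwidth.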