The following is a message I recently received on MicroNet from Russ Ranshaw of CIS. It covers some of the problems encountered with various network facilities.

----forwarded message follows----

Sb: File transfer protocols      16-Mar-82 13:55:18
Fm: Wiz-10 70000,1
To: Keith Petersen 70535,1245

There are several layers of network and switch gear between the user and the program running on one of our PDP-10 hosts. First there is the local node to which he is connected. If it is a CIS node, then it is a PDP-11 or some variant. That -11 is connected via a long-line to another -11. Depending on the location of the local node, there may in fact be several -11's by the time the connection ends up in Columbus. Once here, the termination is in another -11 which is cross-bar connected to a set of PDP-15's. Each -15 services four KI-10 hosts. Normally, the delays from node to host are reasonably small.

Once at the host, you are subjected to the problem that there are several jobs running, not just yours, and the monitor will schedule your job to run under any of several circumstances. If it is waiting for input, then the job will awaken when input is ready from the terminal.

Here is another difficulty. To really do what you indicate, the job would have to run in what we call "break on all characters" mode, where each character input from the terminal is sent immediately to the host. The problem with this is that it dilutes the bandwidth of the 9600 baud long-lines if there are a lot of jobs doing that. The normal mode is that characters are assembled into "packets" of 24 bytes before being sent, and a "break" character (CR, LF, ESC, BEL, and a few others) will terminate a packet and send it along. We prefer to run in this mode in order to better utilize the long-lines.

Running in 8-bit image mode is no problem on CIS nodes, but it brings its own difficulties. There are two ways to run in 8-bit image mode: break on all characters, or buffered. Break on all characters has the same difficulty as above, so we don't want to do it. Buffered mode suffers from the fact that there is no "break" character, since all bit patterns must be treated as data; the node therefore has to "dummy up" a condition which will terminate a packet. We chose to send the packet if there is a 2-character-time delay with no input.

Now for the effect on uploading and downloading. A block of data is transmitted. The far end (be it host or local system) sends its ACK. If uploading, the ACK enters the node, waits for 2 character times, and is on its way. It might take, say, 1/2 second to arrive at the host. The job running on the host has to wake up to process the ACK. If it happens to be swapped out, it has to be swapped back in before it can run. If there is a lot of terminal I/O going on, then the scheduler queue which services terminal-bound jobs is rather long, and our transfer program's run request gets stacked on the bottom of the queue. Finally our job can run again, and manages to send out the next block of data, which the network can usually digest quite rapidly due to the buffering action of the intermediate nodes. It now takes maybe another 1/2 second for the data to traverse all the nodes and begin arriving at your system. Typically, the overall delay runs to about 1.5 to 2.5 seconds, depending on system and network loading. That delay is going to be there regardless of what protocol you are running.
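The node buffering described above (24-byte packets, break characters, and the two-character-time idle flush in 8-bit image mode) amounts to roughly the following sketch. It is illustrative Python only, not CIS code; the packet size, break characters, and idle rule come from the message, while every name and structural detail is an assumption.

    # Illustrative sketch of the node packetizing described above.
    # Real CIS nodes are PDP-11s; nothing here is actual CIS code.

    PACKET_SIZE = 24                         # bytes per packet
    BREAK_CHARS = {0x0D, 0x0A, 0x1B, 0x07}   # CR, LF, ESC, BEL ("and a few others")

    def packetize(chars, idle_gaps, char_time, image_mode=False):
        """chars: byte values typed at the terminal; idle_gaps[i]: seconds
        of silence after chars[i]; char_time: seconds to send one character.
        Yields the packets the node would forward up the long-line."""
        packet = []
        for ch, gap in zip(chars, idle_gaps):
            packet.append(ch)
            full = len(packet) >= PACKET_SIZE
            brk = (not image_mode) and ch in BREAK_CHARS
            # In 8-bit image mode there is no break character, so the node
            # "dummies up" a terminator: two character times of silence.
            idle = image_mode and gap >= 2 * char_time
            if full or brk or idle:
                yield packet
                packet = []
        if packet:
            yield packet

In normal mode a carriage return flushes the packet at once; in image mode nothing leaves the node until 24 bytes accumulate or the line goes quiet for two character times, which is where the extra latency on single-character traffic such as ACKs comes from.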
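The effect of that 1.5 to 2.5 second turnaround on a send-a-block, wait-for-ACK protocol is simple arithmetic. The sketch below assumes a 300 baud user line (about 30 characters per second); that rate is an assumption for illustration and is not stated in the message.

    # Stop-and-wait effective throughput: each block must wait out the
    # round-trip turnaround before the next one can be sent.
    # The 30 CPS line rate is an assumed figure, not from the message.

    LINE_CPS = 30.0   # assumed raw user line rate, characters per second

    def effective_cps(block_size, turnaround):
        """Characters per second delivered when every block of block_size
        bytes must wait `turnaround` seconds for its ACK."""
        return block_size / (block_size / LINE_CPS + turnaround)

    for block in (128, 256, 1024):
        for delay in (0.2, 1.5, 2.5):
            print(f"{block:5d}-byte blocks, {delay:.1f} s turnaround: "
                  f"{effective_cps(block, delay):4.1f} CPS")

With 256-byte blocks the near-zero-delay case comes out close to the 29.5 CPS best-case figure quoted below, and the 1.5 to 2.5 second cases bracket the observed 25 CPS; larger blocks claw some of that loss back, which is the point made next.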
What can be done? One thing is to transmit larger blocks. I picked up a lot when I went from line-at-a-time to 256 bytes at a time; in fact, from an average of 8-10 CPS to about 25 CPS. The block size could be made GIGANTIC, like several thousand bytes. However, the error detection capability begins to deteriorate at large block sizes, and worse, if you have a noisy local telephone system/modem/???, you will likely encounter frequent retransmission, and it takes a long time to transmit long blocks. Incidentally, with the current A-protocol and 256-byte blocks, the effective throughput is at best 29.5 CPS; the above delays account for the difference.

Another thing which we can try is to employ a synchronous protocol instead of an asynchronous one (the A-protocol and the Ward Christensen protocol are asynchronous). This means that we would transmit a block, then immediately transmit the next one, hoping to receive the ACK or NAK for the first one before the second one has finished. But now we have other problems. What do we do if there is an error in the first block? We have the choice of holding the second one (if it is okay) until the first one gets correctly transmitted, or we can toss it away. If we toss it away, making the sender resend it, we have diluted our throughput. If we keep it for later, we are opening a Pandora's box of troubles. If you are familiar with IBM Bisync, there are lots of situations where this scheme falls apart due to misunderstood ACKs or NAKs. The "toss it away" attitude is the consensus of most networkers today. DDCMP, one of the most reliable and widespread network protocols, does just this. They opt to reduce their throughput in favor of greater reliability.

I hope all of this indicates to you that our protocols and procedures are not random or capricious!

I think that the best way for us to go is to get compressed files working. If we can achieve a 30-40% reduction in file size over the comm link, the effective throughput for the actual data will in fact exceed the link's raw bandwidth.

I forgot the "masking" which we do. Many of our users (about 25%) use Tymnet. Tymnet does a poor job of handling control characters and bit-7 characters at times. In particular, control-B and control-O cause problems on the Tymcom node. Telenet has its own problems. In order to get around some of these difficulties, we "mask" control codes by sending them as . I am currently looking into bit-7 masking as well, although it will be employed only in the event of two successive retransmissions.

I do hope this gives you some insight into our situation and problems. Your comments are of course welcome.

Russ
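The receiver's side of the "toss it away" approach Russ describes is easy to sketch. The block format, the callbacks, and the names below are hypothetical, invented only for this illustration; the policy itself, discard anything that arrives out of order and let the sender back up and resend, is the one the message attributes to DDCMP.

    # Receiver side of a "toss it away" (go-back-N style) scheme.
    # The (seq, data, ok) block format and the deliver/send_ack/send_nak
    # callbacks are hypothetical, invented only for this sketch.

    def receive(blocks, deliver, send_ack, send_nak):
        """blocks: iterable of (seq, data, ok) tuples, where ok means the
        block arrived with a good check.  The sender is pipelining: it
        keeps transmitting without waiting for each individual ACK."""
        expected = 0
        for seq, data, ok in blocks:
            if ok and seq == expected:
                deliver(data)        # in order and intact: accept it
                send_ack(seq)
                expected += 1
            else:
                # A garbled block, or a good block that followed a garbled
                # one, is simply tossed away.  The sender backs up to
                # `expected` and resends from there, trading some
                # throughput for never having to untangle which ACK or
                # NAK refers to which outstanding block.
                send_nak(expected)

Keeping good out-of-order blocks for later is the Pandora's box Russ mentions: receiver and sender must then agree on exactly which blocks each side believes are outstanding, which is where schemes like Bisync get into trouble.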
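Russ's closing point about compression also reduces to arithmetic. The sketch uses the roughly 25 CPS effective rate from the message and the same assumed 30 CPS raw line rate as the earlier sketch; that raw rate is an assumption.

    # Payoff from compressing files before transfer.  The 30 CPS raw
    # line rate is assumed; the 25 CPS effective rate and the 30-40%
    # size reduction are the figures given in the message.

    RAW_LINE_CPS  = 30.0    # assumed raw line rate
    EFFECTIVE_CPS = 25.0    # compressed characters actually moved per second

    for reduction in (0.30, 0.40):
        original_cps = EFFECTIVE_CPS / (1.0 - reduction)
        print(f"{int(reduction * 100)}% smaller file: about "
              f"{original_cps:.0f} CPS of original data, versus a raw "
              f"line rate of {RAW_LINE_CPS:.0f} CPS")

A 30% reduction delivers about 36 CPS worth of the original file and a 40% reduction about 42 CPS, both more than the line could ever carry uncompressed, which is the sense in which the actual data can exceed the link's raw bandwidth.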