
2 The TCP/IP Protocol Suite

2.1 History

The history of TCP/IP began in the mid 1970s when the Advanced Research Projects Agency (ARPA) started working on the protocols and architecture that today form the foundation of the Internet [2]. TCP/IP took its current form around 1977-79. By 1979 so many researchers were involved in the TCP/IP project that ARPA formed an internal committee to coordinate and guide the design of the protocols and architecture of the emerging Internet. The committee was called the Internet Control and Configuration Board (ICCB) and it was active until 1983, when it was reorganized.

The Internet Architecture Board (IAB) was created from the reorganization of the ICCB. The primary task of the IAB is to set official policies. Another of its duties is to decide which protocols are a required part of the TCP/IP suite.

In 1992, when the Internet was moved away from the U.S. government, the Internet Society was created. It is an international organization formed to encourage participation in the Internet around the world.

The Internet we have today began around 1980 when ARPA started converting its machines to the TCP/IP protocols. To encourage usage of the new protocols, ARPA made a low-cost implementation available to university researchers. ARPA was able to reach over 90% of the university computer science departments in the US. The timing for the protocol suite was perfect because the departments were just acquiring second and third computers and connecting them together. The departments needed protocols for communication, and no others were available.

The success of the TCP/IP technology and the Internet among computer science researchers led other groups to adopt it. The National Science Foundation soon realized the importance of computer communication, and took an active role in expanding the use of the TCP/IP Internet among scientists. In 1985 networks were established around six supercomputer centers. In 1986 the effort expanded and a new wide area backbone network, the NSFNET, was created. The NSFNET eventually reached all the supercomputer centers and tied them to the ARPANET.

Adoption of the TCP/IP protocols and the growth of the Internet are no longer limited to scientists and government-funded projects. By 1994 the global Internet reached over 3 million computers in 61 countries. Today the Internet is growing very rapidly; it connects computers almost everywhere, and all of them use TCP/IP.

2.2 General Overview of TCP/IP

Why use TCP/IP? Most networks are independent entities, established to serve the needs of a single group. Each group uses hardware that suits its needs. Some may want a high-speed network to connect their machines, but such networks do not work over long distances. Others may want to communicate over long distances, settling for a slower network. TCP/IP provides a set of communication conventions that makes it possible to interconnect all these different kinds of networks.

The name TCP/IP refers to many different protocols, not only TCP and IP. Some of the better-known ones are UDP, ICMP, and ARP.

The overview below (see figure 1) shows the general layout of how these protocols are related to each other. The parts comprising the protocol stack are the protocols found on the transport and network layers.

Figure 1: General Overview of TCP/IP

2.2.1 IP

The Internet Protocol (IP) provides unreliable transmission of blocks of data, called datagrams, from one host to another [4]. Unreliable here means that IP does not guarantee delivery: a packet may be lost, duplicated, or corrupted without IP recovering from it. IP relies on higher level protocols to ensure reliable transmission.

The primary tasks of IP are addressing, routing and fragmentation. IP uses the address carried in the internet header to transmit the datagram towards its destination. The selection of the path for transmission is called routing. Fragmentation is needed when a packet is too large for the network below IP to carry. If a large packet arrives at the IP level, IP divides the packet into smaller fragments before sending them. When the fragments arrive at the destination, IP is responsible for reassembling the packet.

Another duty of the IP protocol is to remove old packets from the network. Each packet has a time to live (TTL) field which indicates the maximum number of router hops the packet may traverse before it is discarded. Without this limit, unroutable packets would remain on the network forever, filling it with useless data.

The best way to get a better understanding of the IP protocol is to take a look at the format of an IP packet (see figure 2).

Figure 2: An IP Header

Version (4 bits):: The version field indicates the format of the internet header. Currently, the only two supported formats are "standard" IP (represented by 4), and IPv6 (represented by 6).
Internet Header Length (IHL) (4 bits):: The length of the Internet header measured in 32 bit words. The minimum header length is 5 words.
Type of Service (8 bits):: This field is used to specify reliability, precedence, delay and throughput parameters.
Total Length (16 bits):: The total length of the datagram, measured in bytes including the internet header and data.
Identification (16 bits):: The identification is assigned by the sender to aid in assembling the fragments of a datagram.
Flags (3 bits):: One bit, the More Fragments (MF) flag, is used for fragmentation and reassembly. It indicates whether the fragment is the last fragment or whether more fragments follow. Another bit is the Don't Fragment (DF) bit. It specifies whether the datagram may be fragmented or not. The last of the three bits is currently not used.
Fragment Offset (13 bits):: This field indicates where in the original datagram the fragment belongs.
Time to Live (8 bits):: Indicates the maximum time the datagram is allowed to remain in the internet system. If this field contains the value zero, the datagram is destroyed.
Protocol (8 bits):: This field indicates the next level protocol which is to receive the data field at the destination.
Header Checksum (16 bits):: A checksum of the header only. Since some of the header fields change (for example Time To Live), the checksum has to be recomputed and verified at each point where the internet header is processed. The checksum computation is explained in more detail below.
Source address (32 bits):: The source address.
Destination address (32 bits):: The destination address.
Options (variable):: The option field is variable in length. There may be zero or more options. The field encodes the options requested by the sender.
Padding (variable):: The padding is used to ensure that the internet header ends on a 32-bit boundary.
Data (variable):: The data field is a multiple of eight bits in length. The maximum length of data field plus header is 65,535 bytes.
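
The field layout above can be illustrated with a short decoding sketch. It assumes Python's standard struct module and decodes only the fixed 20-byte part of the header, ignoring options. The names IPv4Header and parse_ipv4_header, and the sample bytes, are illustrative and not part of any implementation discussed in this report.

```python
import struct
from collections import namedtuple

# Illustrative container for the fixed header fields (hypothetical name).
IPv4Header = namedtuple(
    "IPv4Header",
    "version ihl tos total_length identification flags fragment_offset "
    "ttl protocol checksum source destination",
)

def parse_ipv4_header(raw: bytes) -> IPv4Header:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, tos, total_length, identification, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return IPv4Header(
        version=ver_ihl >> 4,             # upper 4 bits of the first byte
        ihl=ver_ihl & 0x0F,               # header length in 32-bit words
        tos=tos,
        total_length=total_length,
        identification=identification,
        flags=flags_frag >> 13,           # 3 flag bits (unused, DF, MF)
        fragment_offset=flags_frag & 0x1FFF,
        ttl=ttl,
        protocol=protocol,
        checksum=checksum,
        source=".".join(str(b) for b in src),
        destination=".".join(str(b) for b in dst),
    )

# Hand-made sample header: version 4, IHL 5, DF set, TTL 64, protocol 6 (TCP).
sample = bytes.fromhex("4500003c1c4640004006b1e6ac100a63ac100a0c")
header = parse_ipv4_header(sample)
print(header)
```

The "!" in the format string selects network (big-endian) byte order, which is how all multi-byte header fields are transmitted.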

Two topics deserve additional explanation: the fragmentation/reassembly process and the checksum computation.

When a datagram arrives from a higher level protocol, IP first checks that the datagram is not too large for the underlying network. If it is, IP has to split it into a number of separate datagrams. The fragmentation process starts by checking the Don't Fragment (DF) flag. If the DF flag is set then fragmentation of this datagram is not allowed; the datagram is discarded and an error message is returned to the sender. If fragmentation is allowed, the data is divided into smaller chunks and a header is created for each chunk of data. The header fields affected by fragmentation are the More Fragments flag, the Fragment Offset, the Internet Header Length, the Total Length, the Checksum and the options (if present).
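
The splitting step can be sketched as follows. The sketch relies on one detail not stated above: the Fragment Offset field counts units of 8 bytes, so every fragment except the last must carry a multiple of 8 data bytes. The function name fragment, its parameters, and the use of an exception in place of the error message are assumptions made for illustration only.

```python
def fragment(payload: bytes, mtu: int, header_len: int = 20,
             dont_fragment: bool = False):
    """Split a payload into (offset_in_8_byte_units, more_fragments, data) tuples."""
    if header_len + len(payload) <= mtu:
        return [(0, False, payload)]      # fits as is, no fragmentation needed
    if dont_fragment:
        # Stands in for discarding the datagram and reporting an error.
        raise ValueError("datagram too large and DF is set")
    # Largest data size per fragment that is a multiple of 8 bytes.
    chunk = (mtu - header_len) // 8 * 8
    if chunk <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    for start in range(0, len(payload), chunk):
        data = payload[start:start + chunk]
        more = start + chunk < len(payload)   # MF is clear only on the last one
        fragments.append((start // 8, more, data))
    return fragments

# A 4000-byte payload over a network with a 1500-byte maximum packet size.
frags = fragment(bytes(4000), mtu=1500)
for offset, more, data in frags:
    print(offset, more, len(data))
```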

If the IP level receives a fragmented datagram from the lower level protocol it has to reassemble it before passing it to the protocol above. When a fragment arrives, IP checks the Identification, Source and Destination fields. If those fields match the fields in another already received fragment then they belong to the same datagram. Due to the way IP routes packets (i.e. each packet is routed separately, and may take a different route), the packets may arrive out of order. The Offset field in the header tells IP where in the original datagram this fragment belongs. IP just inserts the arriving fragments at their correct positions in the datagram until all fragments have arrived. When all fragments have arrived the reassembly is complete, and the datagram is passed on to the next higher protocol level.
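
A minimal sketch of the reassembly logic described above is given below. It keys fragments on the Identification, Source and Destination fields as in the text, assumes that fragments do not overlap, and omits the timeout a real implementation would need to drop incomplete datagrams. The class name Reassembler and its method are hypothetical.

```python
from collections import defaultdict

class Reassembler:
    """Collect fragments per datagram and return the payload once complete."""

    def __init__(self):
        self._pending = defaultdict(dict)   # key -> {byte offset: data}
        self._total = {}                    # key -> full payload length, once known

    def add(self, ident, src, dst, offset_units, more_fragments, data):
        key = (ident, src, dst)
        start = offset_units * 8            # Fragment Offset counts 8-byte units
        self._pending[key][start] = data
        if not more_fragments:
            # Only the last fragment reveals the total payload length.
            self._total[key] = start + len(data)
        if key not in self._total:
            return None
        # Walk the fragments in offset order and look for gaps.
        payload = b""
        for pos in sorted(self._pending[key]):
            if pos != len(payload):
                return None                 # a fragment is still missing
            payload += self._pending[key][pos]
        if len(payload) != self._total[key]:
            return None
        del self._pending[key], self._total[key]
        return payload

original = bytes(range(40))
r = Reassembler()
# Fragments arrive out of order: last, first, middle.
first = r.add(7, "10.0.0.1", "10.0.0.2", 4, False, original[32:])
second = r.add(7, "10.0.0.1", "10.0.0.2", 0, True, original[:16])
third = r.add(7, "10.0.0.1", "10.0.0.2", 2, True, original[16:32])
```

Only the third call returns the complete payload; the first two return None because the datagram still has gaps.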

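The IP checksum is computed on the header only. The checksum field is the 16-bit one's complement of the one's complement sum of all 16-bit words in the header. While the checksum is being computed, the checksum field itself is taken to be zero.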
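
The checksum computation can be sketched in a few lines. The function name internet_checksum and the sample header are illustrative; the odd-length padding is only relevant when the same routine is reused for data of arbitrary length.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                     # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold the carries back in (end-around carry).
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Sample 20-byte header with the checksum field (bytes 10-11) set to zero.
header = bytearray.fromhex("4500003c1c46400040060000ac100a63ac100a0c")
checksum = internet_checksum(bytes(header))
header[10:12] = checksum.to_bytes(2, "big")

# A receiver verifies by summing the header including the stored checksum.
verification = internet_checksum(bytes(header))
print(hex(checksum), verification)
```

For this sample the computed checksum is 0xb1e6, and running the routine over the completed header yields zero, which is how a receiver recognizes an undamaged header.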

2.2.2 TCP

The Transmission Control Protocol (TCP) is designed to provide a reliable connection between two hosts across different, more or less reliable, networks using the Internet Protocol [5]. The data sent with TCP is considered to be a stream of bytes, even though the data has to be split into smaller packets due to the underlying protocols. TCP must recover from data that is damaged, lost, duplicated, or delivered out of order by the internet communication system. TCP also provides a way for the receiver to govern the amount of data sent by the sender, i.e. flow control.

Another responsibility of the TCP protocol is connection handling. TCP can manage a large number of connections at the same time. The TCP header contains all the information necessary to ensure safe transfer, to handle flow control and to handle connections.

The TCP header is shown in figure 3; its fields are described below.

Figure 3: The TCP Header

Source Port (16 bits):

The number of the port on the source computer (see Port).

Destination Port (16 bits):

The number of the port on the destination computer.

Sequence number (32 bits):

The sequence number of the first data byte in this segment. When the SYN flag is set, the SYN itself occupies one sequence number: the field then carries the initial sequence number (ISN), and the first data byte has sequence number ISN+1.

Acknowledgement number (32 bits):

If the ACK bit is set this field contains the value of the next sequence number the sender of the segment is expecting to receive. Once a connection is established this is always sent.

Data offset (4 bits):

The number of 32-bit words in the header. This indicates where the data begins.

Reserved (6 bits):

Reserved for future use.

Flags (6 bits):

The following flags are available to affect TCP transmissions:

URG: The urgent pointer field is significant.
ACK: The acknowledgement field is significant.
PSH: The push function.
RST: Resets the connection if set.
SYN: Synchronizes the sequence numbers.
FIN: No more data from the sender.

Window (16 bits):

Used for flow control. It contains the number of data bytes beginning with the one indicated in the acknowledgement field which the sender is willing to receive (see Sliding Window).

Checksum (16 bits):

The checksum on the TCP level calculated from the TCP header, the data and the pseudo header (see below).

Urgent Pointer (16 bits):

The urgent pointer points to the byte following the urgent data. This allows the receiver to know how much urgent data is coming. The urgent pointer field is only interpreted if the URG flag is set.

Options (variable):

The option field may be of variable length depending on which options are used. Currently TCP only implements one option, which specifies the maximum segment size that will be accepted.

Padding (variable):

The TCP header padding is used to ensure that the TCP header ends and the data begins on a 32-bit boundary. The padding is composed of zeroes.
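
As an illustration of the field layout above, the following sketch decodes the fixed 20-byte part of a TCP header with Python's struct module. The function name parse_tcp_header, the dictionary keys and the sample segment are assumptions made for this example; options are not decoded.

```python
import struct

# Flag names ordered from the least significant bit upwards.
FLAG_NAMES = ("FIN", "SYN", "RST", "PSH", "ACK", "URG")

def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    data_offset = offset_flags >> 12        # header length in 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if offset_flags & (1 << bit)]
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_bytes": data_offset * 4,    # where the data begins
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }

# Hand-made SYN segment from port 49152 to port 80 with ISN 1000.
syn = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
parsed = parse_tcp_header(syn)
print(parsed)
```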

Connection establishment and handling are an important part of the TCP protocol. The opening of the connection can be either active or passive. A passive open waits for incoming connections rather than attempting to initiate a connection. In either case the connection is opened by a three-way handshake. The handshake is important because it synchronizes the sequence numbers on both sides of a connection. The sequence numbers are a fundamental part of TCP. Every byte sent over a TCP connection has a sequence number. Since every byte is sequenced, each of them can be acknowledged. The acknowledgement mechanism employed is cumulative, so that an acknowledgement of sequence number X indicates that all bytes up to, but not including, X have been received.

The three-way handshake is the procedure used to establish a connection. This procedure is normally initiated by one TCP and responded to by another TCP. The procedure also works if two TCPs initiate it simultaneously. The simplest form of handshake is when one TCP is listening and another connects to it. TCP A sends a SYN to TCP B, which is listening. TCP B sends a SYN and an acknowledgement of the SYN from TCP A, and finally TCP A acknowledges the SYN sent by TCP B. Now a connection is open, and data transfer can begin.

Figure 4: The Connection Procedure
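
The sequence and acknowledgement numbers exchanged in the simple case can be traced with a small sketch. It only models the numbers, not real segments, and the function name and tuple layout are hypothetical. Each SYN consumes one sequence number, which is why the acknowledgements carry ISN+1; sequence numbers wrap around modulo 2^32.

```python
def three_way_handshake(isn_a: int, isn_b: int):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    mod = 2 ** 32                            # sequence numbers are 32 bits wide
    syn = ("A", "SYN", isn_a, None)          # A -> B, no acknowledgement yet
    syn_ack = ("B", "SYN+ACK", isn_b, (isn_a + 1) % mod)
    ack = ("A", "ACK", (isn_a + 1) % mod, (isn_b + 1) % mod)
    return [syn, syn_ack, ack]

segments = three_way_handshake(isn_a=100, isn_b=300)
for sender, flags, seq, ack in segments:
    print(sender, flags, seq, ack)
```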

The amount of data transferred at a time is controlled by the window function. The window sent in each segment indicates the range of sequence numbers the sender of the window (the data receiver) is prepared to accept. If the receiving TCP is busy, the window will shrink, and when the receiving TCP cannot accept any more data the window will be zero. If more data arrives than can be accepted, it will be discarded.

Even when the window is zero, the sending TCP must be prepared to send at least one byte of new data, which lets it discover when the window reopens. When the receiving TCP has a zero window and a segment arrives, it must still send an acknowledgement showing its next expected sequence number and current window (zero).
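
How the advertised window limits the sender can be shown with a small calculation. This is a simplified sketch that ignores sequence number wrap-around; the function name and parameters are hypothetical.

```python
def sendable(next_seq: int, last_ack: int, window: int, pending: int) -> int:
    """Number of bytes the sender may transmit right now.

    next_seq: sequence number of the next byte to send
    last_ack: most recent acknowledgement number received
    window:   window advertised together with that acknowledgement
    pending:  bytes the application has queued for sending
    """
    in_flight = next_seq - last_ack          # sent but not yet acknowledged
    return max(0, min(pending, window - in_flight))

# 500 bytes are in flight, the receiver advertised 2000, so 1500 more may go out.
print(sendable(next_seq=1500, last_ack=1000, window=2000, pending=5000))
# A zero window stops the sender regardless of how much data is queued.
print(sendable(next_seq=1500, last_ack=1500, window=0, pending=5000))
```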

When TCP is sending data, every segment has a sequence number. The receiving TCP acknowledges (ACK) every segment received. If a segment does not get acknowledged, the sending TCP assumes that it has been lost somewhere and resends it. If two segments arrive with the same sequence number, the later one is discarded.
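
The receiving side of this mechanism can be sketched as follows. The sketch assumes that retransmitted segments have the same boundaries as the originals (no partial overlap) and ignores sequence number wrap-around; the class name Receiver and its attributes are hypothetical.

```python
class Receiver:
    """Deliver bytes in order and answer each segment with a cumulative ACK."""

    def __init__(self, initial_seq: int):
        self.next_expected = initial_seq     # next sequence number wanted
        self.stream = bytearray()            # in-order data for the application
        self.buffered = {}                   # out-of-order segments by seq

    def receive(self, seq: int, data: bytes) -> int:
        """Accept a segment and return the acknowledgement number to send."""
        if seq >= self.next_expected and seq not in self.buffered:
            self.buffered[seq] = data
        # Otherwise the segment is a duplicate and is silently discarded.
        while self.next_expected in self.buffered:
            chunk = self.buffered.pop(self.next_expected)
            self.stream += chunk
            self.next_expected += len(chunk)
        return self.next_expected

r = Receiver(initial_seq=101)
acks = [
    r.receive(101, b"abc"),   # in order
    r.receive(107, b"ghi"),   # arrives early, a gap remains at 104
    r.receive(101, b"abc"),   # duplicate of the first segment
    r.receive(104, b"def"),   # fills the gap
]
print(acks, bytes(r.stream))
```

The acknowledgement number stays at 104 until the missing segment arrives, which is what eventually prompts the sender to retransmit it.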

The checksum on the TCP level is calculated on the whole packet, i.e. the TCP header and all the data. That way the receiving TCP can tell if some of the data was damaged during the transfer. The checksum also covers a pseudo header. This pseudo header contains the Source Address, the Destination Address, the Protocol, and the TCP length (these values are taken from the IP header). This gives TCP protection against misrouted segments. The checksum field is otherwise calculated in the same way as in IP, i.e. it is the 16-bit one's complement of the one's complement sum of all 16-bit words in the pseudo header, the TCP header, and the data.
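
A sketch of the computation with the pseudo header is shown below. It assumes the 12-byte pseudo header layout of source address, destination address, a zero byte, the protocol number (6 for TCP) and the TCP length. The function names and the sample segment are illustrative only.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over pseudo header + TCP header + data."""
    pseudo = struct.pack("!4s4sBBH",
                         socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                         0, 6, len(segment))   # zero byte, protocol 6, TCP length
    return internet_checksum(pseudo + segment)

# Segment with flags ACK+PSH, two data bytes and a zeroed checksum field.
segment = struct.pack("!HHIIHHHH", 49152, 80, 1001, 301,
                      (5 << 12) | 0x18, 65535, 0, 0) + b"hi"
value = tcp_checksum("10.0.0.1", "10.0.0.2", segment)
# The checksum field occupies bytes 16-17 of the TCP header.
complete = segment[:16] + value.to_bytes(2, "big") + segment[18:]

ok = tcp_checksum("10.0.0.1", "10.0.0.2", complete)          # 0 when intact
misrouted = tcp_checksum("10.0.0.1", "10.0.0.3", complete)   # non-zero
print(ok, misrouted)
```

The second verification uses a different destination address and fails, which illustrates how the pseudo header protects against misrouted segments.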

2.2.3 UDP

The User Datagram Protocol (UDP) provides an unreliable connectionless delivery service using IP to transport messages between machines [6]. It does not use acknowledgements to assure that messages arrive, it does not order incoming messages, and it does not provide feedback to control the rate at which incoming information flows between machines. Thus UDP messages can be lost, duplicated, or arrive out of order. This means that it is up to the application using UDP to make the transfer reliable. Furthermore, packets can arrive faster than the recipient can process them. The UDP header reflects the simplicity of the protocol (see figure 5).

Figure 5: The UDP Header

Source Port (16 bits):: The number of the port on the source computer.
Destination port (16 bits):: The number of the port on the destination computer.
Length (16 bits):: The length in bytes of this datagram including UDP header and data.
Checksum (16 bits):: The checksum calculated on the header, a pseudo header and the data. A checksum value of zero indicates that the sender did not compute a checksum, so the receiver should not verify it. This is in contrast to the TCP checksum, where zero is a legal checksum value.
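
Because the UDP header consists of only four 16-bit fields, building and decoding it takes very little code. The sketch below leaves the checksum at zero, meaning that none was computed; the function names and dictionary keys are hypothetical.

```python
import struct

UDP_HEADER_LEN = 8   # four 16-bit fields

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes,
                       checksum: int = 0) -> bytes:
    """Prefix the payload with a UDP header; checksum 0 means 'not computed'."""
    length = UDP_HEADER_LEN + len(payload)   # length covers header and data
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

def parse_udp_datagram(raw: bytes) -> dict:
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", raw[:UDP_HEADER_LEN])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum_used": checksum != 0,
        "payload": raw[UDP_HEADER_LEN:length],
    }

datagram = build_udp_datagram(5353, 53, b"query")
info = parse_udp_datagram(datagram)
print(info)
```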

2.2.4 ARP

The Address Resolution Protocol (ARP) allows a host to find the hardware address of a target host on the same physical network, given only the IP address of the target [7]. Unlike most protocols, ARP does not have a fixed-format header. Because ARP supports a variety of network technologies, the length of the fields that contain addresses depends on the type of network. Figure 6 shows the header in the case of IP over Ethernet.

Figure 6: The ARP Header

Hardware Protocol (16 bits):: The type of network technology being used.
Network Protocol (16 bits):: The type of high level protocol being used.
Hardware Address Length (8 bits):: The length of the hardware address.
Network Address Length (8 bits):: The length of the network address.
Operation (16 bits):: The type of operation.
Sender Hardware Address (48 bits):: The ethernet address of the sending host.
Sender Network Address (32 bits):: The IP address of the sending host.
Target Hardware Address (48 bits):: The ethernet address of the target host.
Target Network Address (32 bits):: The IP address of the target host.
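
For the IP over Ethernet case, the fields above add up to 28 bytes. The following sketch packs an ARP request using the customary values 1 for Ethernet, 0x0800 for IP and 1 for the request operation. The function name and the sample addresses are illustrative.

```python
import socket
import struct

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Pack an ARP request for IP over Ethernet (28 bytes)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                              # hardware protocol: Ethernet
        0x0800,                         # network protocol: IP
        6,                              # hardware address length in bytes
        4,                              # network address length in bytes
        1,                              # operation: 1 = request, 2 = reply
        sender_mac,
        socket.inet_aton(sender_ip),
        b"\x00" * 6,                    # target hardware address is unknown
        socket.inet_aton(target_ip),
    )

packet = build_arp_request(bytes.fromhex("02005e100001"), "10.0.0.1", "10.0.0.2")
print(len(packet), packet.hex())
```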

The IP address is assigned to a host independent of the machine's hardware address. To send an internet packet across a hardware network from one computer to another, a way is needed to map the IP address to a hardware address. The Address Resolution Protocol (ARP) performs dynamic address resolution, using low level network communication. ARP permits a machine to resolve addresses without keeping a permanent record.

A machine uses ARP to find the hardware address of another machine by broadcasting an ARP request. The request contains the IP address of the machine for which a hardware address is needed. All machines on the network receive the ARP request. If the request matches a machine's IP address, the machine responds by sending an ARP reply that contains the needed hardware address. The reply is sent directly to the asking computer; no broadcast is needed.
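
The request and reply exchange can be modelled without any real network traffic. In the sketch below a list of hosts stands in for the physical network, and a per-host dictionary acts as a temporary cache of resolved addresses. The cache, the class name Host and the function resolve are assumptions made for illustration.

```python
class Host:
    def __init__(self, ip: str, mac: str):
        self.ip = ip
        self.mac = mac
        self.cache = {}                      # temporary IP -> hardware address map

    def answer_request(self, target_ip: str):
        """Reply with our hardware address only if the request is for us."""
        return self.mac if target_ip == self.ip else None

def resolve(asker: Host, target_ip: str, network: list):
    """Broadcast a request on the network and return the hardware address."""
    if target_ip in asker.cache:
        return asker.cache[target_ip]        # no traffic needed
    for host in network:                     # every machine sees the broadcast
        if host is asker:
            continue
        mac = host.answer_request(target_ip)
        if mac is not None:                  # the reply goes straight to the asker
            asker.cache[target_ip] = mac
            return mac
    return None                              # nobody owns that IP address

a = Host("10.0.0.1", "02:00:5e:10:00:01")
b = Host("10.0.0.2", "02:00:5e:10:00:02")
c = Host("10.0.0.3", "02:00:5e:10:00:03")
network = [a, b, c]
print(resolve(a, "10.0.0.2", network))
print(resolve(a, "10.0.0.9", network))
```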

2.2.5 ICMP

Normal communication across an internet involves sending messages from one application on some host to an application on some other host. However, routers, for example, may need to communicate directly with the network software on a particular host to report abnormal conditions or to send the host new routing information.

The Internet Control Message Protocol (ICMP) is a required part of IP, and it provides for extranormal communications among routers and hosts [8]. An ICMP message travels in the data field of an IP datagram. Each message begins with three fixed-length fields: an ICMP message type field, a code field, and an ICMP checksum field. The message type defines the format of the rest of the message as well as its meaning. There are several types of ICMP messages: Echo Reply, Destination Unreachable, Source Quench, Redirect, Echo, Time Exceeded, Parameter Problem, Timestamp, Timestamp Reply, Information Request and Information Reply.
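
As an example of the message format, the sketch below builds an Echo message (type 8, code 0). The identifier and sequence number fields that follow the three common fields are specific to the Echo and Echo Reply types. The checksum is the same one's complement sum used by IP, computed over the whole ICMP message. The function names are illustrative.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

ICMP_ECHO = 8          # message type for Echo; Echo Reply is type 0

def build_echo(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP Echo message with a valid checksum."""
    # First pass with a zero checksum field, then fill in the real value.
    unchecked = struct.pack("!BBHHH", ICMP_ECHO, 0, 0, identifier, sequence) + payload
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBHHH", ICMP_ECHO, 0, checksum, identifier, sequence) + payload

message = build_echo(identifier=1, sequence=1, payload=b"ping")
print(message.hex())
```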

2.2.6 Other TCP/IP Protocols

In addition to the protocols already mentioned, there are a few more that are not as commonly used. A couple of protocols handle communication between routers, such as EGP (Exterior Gateway Protocol) and RIP (Routing Information Protocol).

Multicast messages (messages that are sent to a specific set of computers) are controlled with IGMP (Internet Group Management Protocol).

There are also quite a number of protocols used on the application level above TCP and UDP. They may also be considered part of the TCP/IP suite. These protocols include:

SMTP (Simple Mail Transfer Protocol):: used to send electronic mail
FTP (File Transfer Protocol):: handles file transfers between two computers
telnet:: used to login to a remote computer and use it as if sitting at its console
rlogin:: similar to telnet
rsh:: similar to telnet and rlogin
DNS (Domain Name System):: used to look up the IP address of a host given its name
SNMP (Simple Network Management Protocol):: used to manage routers, debug problems, and find computers that violate protocol standards
RPC (Remote Procedure Call):: used to run a function on a remote server
NFS (Network File System):: used to have files on a remote server appear as if they were stored on the local machine

2.3 Future

The TCP/IP technology has worked well for two decades, but it is beginning to grow old.

The Internet has experienced many years of exponential growth, doubling in size every nine months or faster. By early 1994, a new host appeared on the Internet on average every 30 seconds, and the rate has increased since then. The traffic over the Internet has increased even faster than the number of hosts. The increased traffic can be attributed to several causes. First, the Internet population is shifting from academicians and scientists to the general public. Consequently, people now use the Internet after business hours for activities such as shopping and entertainment. Second, new applications that transfer images and real-time video generate more traffic than plain text. Third, automated search tools generate a substantial amount of traffic as they relentlessly probe Internet sites to find data.

The growth of the Internet is what has forced a change in IP: the address space of the IP protocol is too small. Other factors have contributed as well, such as real-time applications and the need for secure communication.

IPv6 is the next generation of IP. It retains many of the basic concepts of IPv4, but changes most details. Like IPv4, IPv6 provides a connectionless, best-effort delivery service. The format of an IPv6 datagram, however, is completely different.

An IPv6 address is 128 bits long, making the address space so large that each person on earth could have his or her own internet as large as the current Internet. IPv6 divides addresses into types analogous to the way IPv4 divides addresses into classes.

An IPv6 datagram consists of a series of headers followed by data. A datagram always begins with a 40-byte base header, which contains source and destination addresses and a flow identifier. The base header may be followed by zero or more extension headers, followed by data. Extension headers are optional; IPv6 uses them to hold much of the information that IPv4 encodes in options.
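
The 40-byte base header can be decoded much like the IPv4 header. The sketch below assumes the field layout of the current IPv6 specification (4-bit version, 8-bit traffic class, 20-bit flow label, payload length, next header, hop limit, and two 128-bit addresses), which differs slightly in its first word from earlier drafts. The function name and the sample values are illustrative.

```python
import ipaddress
import struct

def parse_ipv6_base_header(raw: bytes) -> dict:
    """Decode the 40-byte IPv6 base header (extension headers are not parsed)."""
    (first_word, payload_length, next_header, hop_limit,
     src, dst) = struct.unpack("!IHBB16s16s", raw[:40])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,      # type of the header that follows
        "hop_limit": hop_limit,
        "source": str(ipaddress.IPv6Address(src)),
        "destination": str(ipaddress.IPv6Address(dst)),
    }

# Hand-made base header: version 6, flow label 0x12345, next header 6 (TCP).
sample = struct.pack(
    "!IHBB16s16s",
    (6 << 28) | 0x12345, 20, 6, 64,
    ipaddress.IPv6Address("2001:db8::1").packed,
    ipaddress.IPv6Address("2001:db8::2").packed,
)
base = parse_ipv6_base_header(sample)
print(base)
```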


Peter Kjellerstedt
Thu Jun 5 00:52:23 MET DST 1997