Pages

Subscribe:

Tuesday, 19 July 2011

How Does the Internet Work?

Introduction

How does the Internet work? Good question! The Internet's growth has become explosive and it seems impossible to escape the bombardment of www.com's seen constantly on television, heard on radio, and seen in magazines. Because the Internet has become such a large part of our lives, a good understanding is needed to use this new tool most effectively.
This whitepaper explains the underlying infrastructure and technologies that make the Internet work. It does not go into great depth, but covers enough of each area to give a basic understanding of the concepts involved. For any unanswered questions, a list of resources is provided at the end of the paper. Any comments, suggestions, questions, etc. are encouraged and may be directed to the author at the email address given above.

Where to Begin? Internet Addresses

Because the Internet is a global network of computers each computer connected to the Internet must have a unique address. Internet addresses are in the form nnn.nnn.nnn.nnn where nnn must be a number from 0 - 255. This address is known as an IP address. (IP stands for Internet Protocol; more on this later.) The picture below illustrates two computers connected to the Internet; your computer with IP address 1.2.3.4 and another computer with IP address 5.6.7.8. The Internet is represented as an abstract object in-between. (As this paper progresses, the Internet portion of Diagram 1 will be explained and redrawn several times as the details of the Internet are exposed.)

Diagram 1
Diagram 1

If you connect to the Internet through an Internet Service Provider (ISP), you are usually assigned a temporary IP address for the duration of your dial-in session. If you connect to the Internet from a local area network (LAN) your computer might have a permanent IP address or it might obtain a temporary one from a DHCP (Dynamic Host Configuration Protocol) server. In any case, if you are connected to the Internet, your computer has a unique IP address.

Internet Protocol


Unlike TCP, IP is an unreliable, connectionless protocol. IP doesn't care whether a packet gets to it's destination or not. Nor does IP know about connections and port numbers. IP's job is too send and route packets to other computers. IP packets are independent entities anay arrive out of order or not at all. It is TCP's job to make sure packets arrive and are in the correct order. Abd mout the only thing IP has in common with TCP is the way it receives data and adds it's own IP header information to the TCP data. The IP header looks like this:

Diagram 8

Above we see the IP addresses of the sending and receiving computers in the IP header. Below is what a packet looks like after passing through the application layer, TCP layer, and IP layer. The application layer data is segmented in the TCP layer, the TCP header is added, the packet continues to the IP layer, the IP header is added, and then the packet is transmitted across the Internet.

Diagram 9

Transmission Control Protocol


Under the application layer in the protocol stack is the TCP layer. When applications open a connection to another computer on the Internet, the messages they send (using a specific application layer protocol) get passed down the stack to the TCP layer. TCP is responsible for routing application protocols to the correct application on the destination computer. To accomplish this, port numbers are used. Ports can be thought of as separate channels on each computer. For example, you can surf the web while reading e-mail. This is because these two applications (the web browser and the mail client) used different port numbers. When a packet arrives at a computer and makes its way up the protocol stack, the TCP layer decides which application receives the packet based on a port number.
TCP works like this:

  • When the TCP layer receives the application layer protocol data from above, it segments it into manageable 'chunks' and then adds a TCP header with specific TCP information to each 'chunk'. The information contained in the TCP header includes the port number of the application the data needs to be sent to.
  • When the TCP layer receives a packet from the IP layer below it, the TCP layer strips the TCP header data from the packet, does some data reconstruction if necessary, and then sends the data to the correct application using the port number taken from the TCP header.
This is how TCP routes the data moving through the protocol stack to the correct application. TCP is not a textual protocol. TCP is a connection-oriented, reliable, byte stream service. Connection-oriented means that two applications using TCP must first establish a connection before exchanging data. TCP is reliable because for each packet received, an acknowledgement is sent to the sender to confirm the delivery. TCP also includes a checksum in it's header for error-checking the received data. The TCP header looks like this:


Diagram 7

Notice that there is no place for an IP address in the TCP header. This is because TCP doesn't know anything about IP addresses. TCP's job is to get application level data from application to application reliably. The task of getting data from computer to computer is the job of IP.
Check It Out - Well Known Internet Port Numbers
Listed below are the port numbers for some of the more commonly used Internet services.
FTP20/21
Telnet23
SMTP25
HTTP80
Quake III Arena27960

Internet Infrastructure

The Internet backbone is made up of many large networks which interconnect with each other. These large networks are known as Network Service Providers or NSPs. Some of the large NSPs are UUNet, CerfNet, IBM, BBN Planet, SprintNet, PSINet, as well as others. These networks peer with each other to exchange packet traffic. Each NSP is required to connect to three Network Access Points or NAPs. At the NAPs, packet traffic may jump from one NSP's backbone to another NSP's backbone. NSPs also interconnect at Metropolitan Area Exchanges or MAEs. MAEs serve the same purpose as the NAPs but are privately owned. NAPs were the original Internet interconnect points. Both NAPs and MAEs are referred to as Internet Exchange Points or IXs. NSPs also sell bandwidth to smaller networks, such as ISPs and smaller bandwidth providers. Below is a picture showing this hierarchical infrastructure.


Diagram 4
Diagram 4

This is not a true representation of an actual piece of the Internet. Diagram 4 is only meant to demonstrate how the NSPs could interconnect with each other and smaller ISPs. None of the physical network components are shown in Diagram 4 as they are in Diagram 3. This is because a single NSP's backbone infrastructure is a complex drawing by itself. Most NSPs publish maps of their network infrastructure on their web sites and can be found easily. To draw an actual map of the Internet would be nearly impossible due to it's size, complexity, and ever changing structure.

Networking Infrastructure

So now you know how packets travel from one computer to another over the Internet. But what's in-between? What actually makes up the Internet? Let's look at another diagram:


Diagram 3
Diagram 3

Here we see Diagram 1 redrawn with more detail. The physical connection through the phone network to the Internet Service Provider might have been easy to guess, but beyond that might bear some explanation. The ISP maintains a pool of modems for their dial-in customers. This is managed by some form of computer (usually a dedicated one) which controls data flow from the modem pool to a backbone or dedicated line router. This setup may be referred to as a port server, as it 'serves' access to the network. Billing and usage information is usually collected here as well.
After your packets traverse the phone network and your ISP's local equipment, they are routed onto the ISP's backbone or a backbone the ISP buys bandwidth from. From here the packets will usually journey through several routers and over several backbones, dedicated lines, and other networks until they find their destination, the computer with address 5.6.7.8. But wouldn't it would be nice if we knew the exact route our packets were taking over the Internet? As it turns out, there is a way...

How the Internet Works

The internet is a world-wide network of computers linked together by telephone wires, satellite links and other means. For simplicity's sake we will say that all computers on the internet can be divided into two categories: servers and browsers.
Servers are where most of the information on the internet "lives". These are specialised computers which store information, share information with other servers, and make this information available to the general public.
Browsers are what people use to access the World Wide Web from any standard computer. Chances are, the browser you're using to view this page is either Netscape Navigator/Communicator or Microsoft Internet Explorer. These are by far the most popular browsers, but there are also a number of others in common use.
When you connect your computer to the internet, you are connecting to a special type of server which is provided and operated by your Internet Service Provider (ISP). The job of this "ISP Server" is to provide the link between your browser and the rest of the internet. A single ISP server handles the internet connections of many individual browsers - there may be thousands of other people connected to the same server that you are connected to right now.
The following picture shows a small "slice" of the internet with several home computers connected to a server:
ISP servers receive requests from browsers to view webpages, check email, etc. Of course each server can't hold all the information from the entire internet, so in order to provide browsers with the pages and files they ask for, ISP servers must connect to other internet servers. This brings us to the next common type of server: the "Host Server".
Host servers are where websites "live". Every website in the world is located on a host server somewhere (for example, MediaCollege.Com is hosted on a server in Parsippany, New Jersy USA). The host server's job is to store information and make it available to other servers.
The picture below show a slightly larger slice of the internet:
To view a web page from your browser, the following sequence happens:
  1. You either type an address (URL) into your "Address Bar" or click on a hyperlink.
  2. Your browser sends a request to your ISP server asking for the page.
  3. Your ISP server looks in a huge database of internet addresses and finds the exact host server which houses the website in question, then sends that host server a request for the page.
  4. The host server sends the requested page to your ISP server.
  5. Your ISP sends the page to your browser and you see it displayed on your screen.