C SC 481.20 Lecture 3: Intro to Application Layer and HTTP
major resource: Computer Networking (4th Edition),
Kurose and Ross, Addison Wesley, 2008
Introducing the Application Layer
Our focus will be on
- Networked applications structured for the Internet
- Application layer of Internet protocol stack
- Applications that rely on Transport layer services (TCP, UDP)
- Network edge issues (not network core)
- Client-server paradigm
- Peer-to-peer (P2P) paradigm
- Hybrid client-server and P2P
Some basic questions
- What are some example networked applications?
- What kinds of devices do networked applications run on?
- How much do you need to know about the Internet's structure and protocols, to
write a networked application?
- What is the difference between a networked application and Application Layer
protocols?
Client-Server paradigm
- application consists of two major software components: client, and server
- Name a couple such applications
- What are the characteristics of a server?
- What are the characteristics of a client?
- Put another way, what concerns does a server have that a client does not?
(consider scalability and availability)
P2P paradigm
- application consists of one major software component: peer
- Name a couple such applications
- Peers take on characteristics of both client and server when communicating with each other
- Comment on the scalability of a P2P system versus client-server
Client-Server vs. P2P
- Advantages of client-server over P2P?
- Advantages of P2P over client-server?
Hybrid of Client-Server and P2P
- application consists of two major software components: servers and peers
- Name a couple such applications -- structure not obvious from user view
- Peers act as both client and server when communicating with each other
- Peers act as client when communicating with server
Inter-process communication through sockets
- A process is a running program
- How does a client process communicate with server process to make request?
- How does a server process communicate with client process to give response?
- Both happen through sockets, the API to the transport layer
- Sockets are a message-passing mechanism between the application and transport layers
- This also applies to P2P, because each P2P exchange is fundamentally client-server: one peer initiates a request and the other responds
- Sockets API (library) is fairly easy to use in C, C++, Java and other languages
- Java provides high-level support (more later)
- How to address the process at the other end?
- IP address (32 bits in IPv4, 128 bits in IPv6): a network-layer address that identifies the machine
- Port number: a 16-bit integer that identifies the process
- There could be several Internet processes running at that IP address
- Each standard service has a well-known port number, so the sender knows what to use (e.g. 80 for HTTP)
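The request/response exchange through sockets can be sketched in a few lines of Python. This is only a minimal illustration, assuming both processes run on the same machine (loopback address 127.0.0.1) and that the operating system picks a free port; the function names serve_once and request_response are made up for this example.

```python
import socket
import threading

HOST = "127.0.0.1"  # loopback IP address: identifies the machine


def serve_once(listener: socket.socket) -> None:
    """Accept one client, read its request, and send back a response."""
    conn, _client_addr = listener.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"response to: " + request)


def request_response(message: bytes) -> bytes:
    """Run a one-shot server and a client in one process; return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind((HOST, 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]  # port number: identifies the process

        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()

        # The client addresses the server process by (IP address, port).
        with socket.create_connection((HOST, port)) as client:
            client.sendall(message)
            reply = client.recv(1024)

        server.join()
        return reply


if __name__ == "__main__":
    print(request_response(b"hello"))
```

A real server would bind to its service's standard port (e.g. 80 for HTTP) and loop over accept() rather than serving a single client.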
Application layer protocols
- Recall that a protocol defines message formats, the sequence in which messages are exchanged, and the processing done when they are sent and received
- Some protocols (e.g. HTTP) are public standards, see e.g. http://www.ietf.org/rfc.html
- Other protocols (e.g. Skype) are proprietary
- Consider advantages and disadvantages of each approach
- Analogy to situation in 1980s with open IBM-PC architecture versus closed Macintosh architecture
- Analogy to situation today with open MP3 format and closed iTunes DRM (digital rights management)
Transport Layer services (Application Layer protocols are built on them)
- You will learn much more about this in the next chapter
- TCP provides a reliable, connection-oriented stream service on top of a packet-switched network
- UDP provides an unreliable, connectionless service on top of the same network
- Neither provides throughput guarantees, although TCP has flow control (local, between the two endpoints) and congestion control (global, across the network)
- Neither provides timing guarantees; TCP's flow and congestion control can in fact delay delivery
- Neither provides security through encryption or other means
- SSL (Secure Sockets Layer, since succeeded by TLS) provides security as an intermediate layer between the socket and TCP
- Consider transport needs of applications
- What applications require reliability but not throughput, timing or security?
- What applications require reliability and security but not throughput or timing?
- What applications require throughput but not the other three?
- What applications require throughput and timing but not reliability or security?
- Why use UDP?
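For contrast with TCP's connection-oriented service, here is a minimal sketch of UDP's connectionless service, again assuming both sockets live on the loopback interface of one machine. The function name udp_round_trip is made up for this example. Note there is no connection setup at all: the datagram simply carries its destination address.

```python
import socket

HOST = "127.0.0.1"


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a receiver socket on the same machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind((HOST, 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)  # UDP may drop data, so never wait forever
        destination = receiver.getsockname()

        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            # No connect/accept step: each datagram names its own destination.
            sender.sendto(payload, destination)

        data, _sender_addr = receiver.recvfrom(2048)
        return data


if __name__ == "__main__":
    print(udp_round_trip(b"ping"))
```

On loopback the datagram will normally arrive, but nothing in UDP guarantees it; an application that needs reliability over UDP has to add its own acknowledgements and retransmission.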
Application layer protocol: HTTP
- HTTP: HyperText Transfer Protocol
- Hypertext originated with Vannevar Bush and Ted Nelson; the term was coined by Nelson
- See RFC 1945 http://www.ietf.org/rfc/rfc1945.txt
- See RFC 2616 http://www.ietf.org/rfc/rfc2616.txt
- Uses TCP transport service
- HTTP requests and responses travel over a TCP connection
- "3-way handshake" to establish the connection
- sender --> REQUEST (SYN) --> receiver
- receiver --> ACKNOWLEDGE (SYN-ACK) --> sender
- sender --> ACKNOWLEDGE (ACK) --> receiver
- The REQUEST step sends a SYN packet; flooding a server with these is the "SYN flood" denial-of-service (DoS) attack
- In non-persistent HTTP, each file making up a web page is sent over a separate TCP connection
- In persistent HTTP, the files making up a web page are sent over the same TCP connection
- HTTP is stateless: each request is seen by the server as independent of all others
- Stateless is a separate issue from persistent. Persistence only determines whether the TCP connection is reused across requests; the server still treats each request independently.
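The difference between persistent and non-persistent HTTP can be observed by counting TCP connections at the server. The sketch below uses Python's standard http.server and http.client modules against a throwaway server on the loopback address; the names make_handler and fetch_all, and the file paths in the usage lines, are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(stats: dict):
    """Build a handler class that counts TCP connections and HTTP requests."""

    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # lets the server keep a connection open

        def setup(self):
            super().setup()
            stats["connections"] += 1  # setup() runs once per TCP connection

        def do_GET(self):
            stats["requests"] += 1
            body = f"contents of {self.path}".encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the demo quiet

    return Handler


def fetch_all(paths, persistent: bool) -> dict:
    """Fetch every path from a throwaway local server; return the counters."""
    stats = {"connections": 0, "requests": 0}
    server = HTTPServer(("127.0.0.1", 0), make_handler(stats))
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        if persistent:
            conn = http.client.HTTPConnection("127.0.0.1", port)
            for path in paths:  # every request reuses the same TCP connection
                conn.request("GET", path)
                conn.getresponse().read()
            conn.close()
        else:
            for path in paths:  # a fresh TCP connection for each file
                conn = http.client.HTTPConnection("127.0.0.1", port)
                conn.request("GET", path)
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return stats


if __name__ == "__main__":
    page_files = ["/index.html", "/logo.png", "/style.css"]
    print(fetch_all(page_files, persistent=True))
    print(fetch_all(page_files, persistent=False))
```

In both modes the server handles the same number of requests, each independently of the others (stateless); only the number of TCP connections, and so the number of handshakes, changes.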
HTTP protocol specifics
- HTTP commands are plain text
- GET command used to request a page
- First line contains GET command, subsequent lines contain header field name-value pairs
- header field values give the server information about the client (which browser, preferred
language, TCP persistence, etc)
- Response header also plain text
- First line contains the response status code and phrase (e.g. 200 OK, 404 Not Found, etc)
- Followed by header lines with info from server (server type, file type, file modification time, file length)
- Header followed by file contents
- Other commonly-used HTTP commands include POST (request containing data from form as request body) and
HEAD (request the response header but not file contents)
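Because HTTP messages are plain text, the request and response layout described above can be built and taken apart with ordinary string operations. The sketch below is an illustration only: the helper names build_get_request and parse_response and the host www.example.com are assumptions for this example. It also adds a Host: header line, which HTTP/1.1 requires on every request.

```python
def build_get_request(host: str, path: str, headers: dict) -> str:
    """Format a GET request: command line first, then name-value header lines."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Each line ends with CR LF; an empty line marks the end of the header.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_response(raw: str):
    """Split a response into (status code, phrase, header dict, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, phrase = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")  # split at the first colon only
        headers[name.strip()] = value.strip()
    return int(code), phrase, headers, body


if __name__ == "__main__":
    print(build_get_request("www.example.com", "/index.html",
                            {"User-Agent": "demo", "Accept-Language": "en"}))
    sample = ("HTTP/1.1 200 OK\r\n"
              "Server: demo\r\n"
              "Content-Type: text/html\r\n"
              "Content-Length: 5\r\n"
              "\r\n"
              "hello")
    print(parse_response(sample))
```

A HEAD request would use the same layout with HEAD in place of GET, and the response would stop after the empty line.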
Maintaining State
- As indicated above, HTTP does not maintain state
- Clearly, "stateful" activities occur on the web, e.g. e-commerce, so how?
- Here are a couple techniques
- Client request contains parameter values appended to URL. Also used to pass form data
to server. Example:
http://www.google.com/search?hl=en&q=http+state+maintenance&btnG=Google+Search
is generated when "http state maintenance" is typed into the Google search
field.
- Cookies: small files on the client's disk, created by the browser at the server's direction, that hold state info (typically an I.D. that the server maps to its own records)
- More on cookies; they are established and used through this sequence
- Initially, client has no cookie for networknut.com
- Client (browser) connects to networknut.com with a regular GET command
- networknut.com notices the GET header did not contain a Cookie: header line,
so it generates a new cookie number, creates a database entry for that cookie number,
and returns a response message containing a Set-Cookie: header line with the number.
- Client parses the Set-Cookie: line and, if permitted, creates the cookie file
- User clicks a link and client sends another GET command; this time the request header
contains a Cookie: line with the I.D. number provided by the server and stored in the cookie
- Server now knows who the request came from and can proceed accordingly
- Next time client goes to networknut.com, the initial GET request
contains a Cookie: line with the same I.D. and the server immediately
knows who is calling and can customize its response.
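Both techniques can be sketched in Python. The first function rebuilds the search URL shown above with urllib.parse.urlencode. The two classes then act out the cookie sequence entirely in memory, with no real network traffic. Everything here is a simplified assumption: the names search_url, CookieServer and Browser are made up, the starting I.D. of 1000 is arbitrary, and real Cookie: and Set-Cookie: values are name=value pairs rather than a bare number.

```python
import itertools
from urllib.parse import urlencode


def search_url(query: str) -> str:
    """Append form parameters to the URL, as in the Google example."""
    params = {"hl": "en", "q": query, "btnG": "Google Search"}
    return "http://www.google.com/search?" + urlencode(params)


class CookieServer:
    """In-memory stand-in for the server side of the cookie sequence."""

    def __init__(self):
        self._next_id = itertools.count(1000)  # hypothetical starting I.D.
        self.visits = {}  # the "database": cookie I.D. -> requests seen

    def handle_get(self, request_headers: dict) -> dict:
        """Return the response headers for one GET request."""
        cookie_id = request_headers.get("Cookie")
        response_headers = {}
        if cookie_id is None or cookie_id not in self.visits:
            # No usable Cookie: line, so issue a new I.D. and record it.
            cookie_id = str(next(self._next_id))
            self.visits[cookie_id] = 0
            response_headers["Set-Cookie"] = cookie_id
        self.visits[cookie_id] += 1  # the server now knows who is calling
        return response_headers


class Browser:
    """In-memory stand-in for the client side."""

    def __init__(self):
        self.cookie_jar = {}  # site name -> stored cookie I.D.

    def get(self, site: str, server: CookieServer) -> dict:
        """Send a GET; return the request headers that were sent."""
        request_headers = {}
        if site in self.cookie_jar:
            request_headers["Cookie"] = self.cookie_jar[site]
        response_headers = server.handle_get(request_headers)
        if "Set-Cookie" in response_headers:
            self.cookie_jar[site] = response_headers["Set-Cookie"]
        return request_headers


if __name__ == "__main__":
    print(search_url("http state maintenance"))
    server, browser = CookieServer(), Browser()
    print(browser.get("networknut.com", server))  # first GET: no Cookie: line
    print(browser.get("networknut.com", server))  # later GETs carry the I.D.
```

Note that HTTP itself stays stateless throughout: the state lives in the server's database and the client's cookie file, and the I.D. in the header is what ties them together.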
Caching and the Web
- Caching, as you know, is keeping a "local" copy of something that may be requested in
the near future, so that request can be serviced faster. The risk is that the local copy may
become outdated, or "stale"
- One technique for caching Web content is the browser cache. This
retains copies of previously-downloaded files on the browser's file system.
See
http://www.microsoft.com/windows/ie/ie6/using/howto/customizing/clearcache.mspx
for more information concerning Internet Explorer 6 browser cache.
- Another technique is the Web cache, or proxy server. This is a server that sits
somewhere between your browser and the desired Web server. The browser request goes to the proxy
server, which responds if it has the requested object or forwards the request to the real server if
it does not.
- Note that in both techniques, quicker responses and less network traffic result from a "cache hit"
- A client or proxy server can ensure its copy is up to date by using the conditional GET technique
- A "conditional GET" is just a GET command with an If-Modified-Since: header line. Its value is
a time-stamp.
- A server receiving this GET will compare that time-stamp with the modification time
for the requested file.
- If the file's modification time is more recent, its contents will be included as the response body.
- Otherwise, the response will have code 304 (Not Modified) and the file contents are not included.
- Successful Web caching (high hit rate) is critical to Web performance and lots of research has been
directed to it. The text says typical hit rates span a wide range, 20% to 70%. I've also seen a 30%-50% range cited.
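The server's side of the conditional GET decision can be sketched as a single function. This is a simplified illustration, assuming the time-stamp arrives in the usual HTTP date format (e.g. "Tue, 09 Sep 2008 15:11:03 GMT") and that the file's modification time is already available as a timezone-aware datetime; the name conditional_get is made up for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(if_modified_since, file_mtime: datetime, file_body: bytes):
    """Return (status code, response body) for a GET request.

    if_modified_since is the value of the If-Modified-Since: header line,
    or None when the request is an ordinary (unconditional) GET.
    """
    if if_modified_since is not None:
        cached_time = parsedate_to_datetime(if_modified_since)
        if file_mtime <= cached_time:
            return 304, b""  # Not Modified: the cached copy is still good
    return 200, file_body  # send the (possibly newer) file contents


if __name__ == "__main__":
    mtime = datetime(2008, 9, 9, 15, 11, 3, tzinfo=timezone.utc)
    stamp = format_datetime(mtime, usegmt=True)  # what a cache would send
    print(stamp)
    print(conditional_get(stamp, mtime, b"<html>...</html>"))
```

On a 304 the cache serves its stored copy, so only the headers cross the network; on a 200 it replaces its copy with the new contents.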
Peter Sanderson (PSanderson@otterbein.edu)