C SC 481.20 Lecture 4: Application Layer - email and DNS
major resource: Computer Networking (4th Edition),
Kurose and Ross, Addison Wesley, 2008
Email application and its application layer
Major application layer components
- user agent : the software you use to read, compose, and organize email
- mail server : the server that interacts with user agents and other mail servers to
deliver and store email
- Snail-mail analogy: Each post office is a mail server, post boxes are user agents.
- Note that a mail server acts as both client and server: client when it sends to another mail server, server when it receives from another mail server
- Mail servers communicate using application layer protocol SMTP
- Note that email is "push" oriented, while Web is "pull" oriented
SMTP : Simple Mail Transfer Protocol
- Protocol used by mail servers to communicate with each other
- Defined in RFC 2821 (since superseded by RFC 5321)
- It is old (the original specification, RFC 821, dates to 1982), and indeed very simple!
- Based on 7-bit ASCII codes (codes 0-127)
- Anything other than ASCII text has to be translated into ASCII by sending server, and back by receiver
- Not fun for transmitting multimedia!
- Basic protocol for message delivery
- sender's user agent sends message (M) to sender's mail server (SMS)
- SMS places M in its message queue
- SMS attempts TCP connection with recipient's mail server (RMS)
- If RMS is valid but connection is not possible, SMS backs off for a while and tries again later
- Once connected, SMS and RMS exchange pleasantries and SMS transmits M to RMS
- RMS places M into recipient's mailbox, located in RMS.
- Recipient accesses mailbox via user agent and reads M.
- Protocol takes place in plain text, see textbook example
- Client sends SMTP commands such as HELO, MAIL FROM, RCPT TO, DATA, QUIT
- Server responds to each command with one-liner containing response code and message
- It is possible to pose as a sending mail server by connecting directly to port 25 of a receiving mail server (you have to know the mail server's name)
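The command/response exchange above can be sketched in Python. This is only an illustration of the plain-text dialogue, not a mail client: the function names client_commands and parse_reply and the example.org/example.com addresses are made up for this sketch, no network connection is opened, and only one-line server replies (as described above) are handled.

```python
def client_commands(helo_name, sender, recipient, message_lines):
    """Return the lines an SMTP client would send, in order, for one message."""
    lines = [
        f"HELO {helo_name}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
    ]
    lines.extend(message_lines)  # header lines, a blank line, then body lines
    lines.append(".")            # a lone period marks the end of the message
    lines.append("QUIT")
    return lines


def parse_reply(line):
    """Split a one-line server reply into (numeric code, message text)."""
    code, _, text = line.partition(" ")
    return int(code), text


if __name__ == "__main__":
    for line in client_commands(
        "mail.example.org",
        "alice@example.org",
        "bob@example.com",
        ["Subject: hello", "", "Hi Bob, this is the body."],
    ):
        print("C:", line)
    print(parse_reply("250 OK"))
```

A real client would wait for the server's reply after each command and check the code (for example, 354 after DATA) before sending the next line.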
SMTP mail message format
- Message starts with first character sent after receiving code 354 response to DATA command
- Message ends with period character '.' on a line by itself!
- Message consists of header and body, separated by a blank line.
- Header line syntax resembles HTTP header line: attribute then colon then value (e.g. To: PSanderson@otterbein.edu)
- Example header lines are: To:, From:, Subject:
- The header line values are the ones displayed by mail reader
- Note that header lines are not part of SMTP delivery protocol, so you can give them any value!
- MIME, Multipurpose Internet Mail Extensions, is used to transmit non-ASCII content
- Header lines describe the content type so receiver knows how to interpret the content
- The Content-Transfer-Encoding: line tells receiver how original content was translated to ASCII, so it will know how to undo the translation
- The Content-Type: line tells receiver what kind of content it is: HTML, JPEG, Word document, whatever
- Receiver's mail agent is responsible for rendering the content
- Multiple MIME components can be included in the same message
- Non-text email message bodies and attachments are both handled this way
- For more info about MIME, see RFC 2045 and RFC 2046
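As a rough illustration of the header lines and MIME components described above, Python's standard email package can build a message with a text body and a binary attachment. The addresses, subject, and the fake "JPEG" bytes are placeholders invented for this sketch.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "bob@example.com"
msg["Subject"] = "Picture"
msg.set_content("See the attached image.")  # becomes a text/plain component

# Placeholder bytes standing in for a real image file (includes non-ASCII values).
fake_jpeg = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"not a real image"
msg.add_attachment(fake_jpeg, maintype="image", subtype="jpeg",
                   filename="photo.jpg")

# Each component carries its own Content-Type and Content-Transfer-Encoding.
for part in msg.walk():
    print(part.get_content_type(), part.get("Content-Transfer-Encoding"))

# The serialized message is plain ASCII text: headers, blank line, body.
print(msg.as_string())
```

The attachment travels as base64 text; the receiver's mail agent uses the two header lines to decode it and decide how to render it.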
Mail Access Protocols
- Delivery of received message from receiver's mail server to receiver's user agent is a different matter
- For one thing, this final delivery is a "pull" not a "push" operation and SMTP is push-only
- For another, authentication is of utmost importance
- Several such protocols have been defined: POP3, IMAP, HTTP-based
- Post Office Protocol (POP)
- The original mail access protocol, see RFC 1939
- Very primitive; only 12 possible commands and little security
- Server only maintains in-box, any saved-message organization must be performed by the user agent (client) - very inflexible
- Session is established by TCP connection to port 110
- Session proceeds through several states:
- AUTHORIZATION - entered upon TCP connect, establish client identity as valid mailbox owner (name and password transmitted in plain text)
- TRANSACTION - entered upon authorization, interact with server to read/delete messages
- UPDATE - entered upon QUIT command, resource clean-up on server side and TCP disconnect
- No state information is maintained between sessions
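The three session states can be modeled as a tiny state machine. This is a sketch for illustration only: the event names login_ok and cleanup_done and the extra CLOSED state are inventions of this example, not POP3 commands. The QUIT-before-login transition follows RFC 1939, which ends the session without entering UPDATE in that case.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("AUTHORIZATION", "login_ok"): "TRANSACTION",
    ("AUTHORIZATION", "QUIT"): "CLOSED",   # quitting before login skips UPDATE
    ("TRANSACTION", "QUIT"): "UPDATE",
    ("UPDATE", "cleanup_done"): "CLOSED",
}


class Pop3Session:
    """Tracks which state a single POP3 session is in."""

    def __init__(self):
        self.state = "AUTHORIZATION"  # entered upon TCP connect

    def handle(self, event):
        # Events with no listed transition (e.g. reading a message)
        # leave the state unchanged.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state


if __name__ == "__main__":
    session = Pop3Session()
    for event in ["login_ok", "RETR", "DELE", "QUIT", "cleanup_done"]:
        print(event, "->", session.handle(event))
```

Because nothing survives past CLOSED, each new connection starts over in AUTHORIZATION, which matches the point that no state is kept between sessions.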
- Internet Message Access Protocol (IMAP)
- Significantly more features, complexity, and security than POP; see RFC 3501
- Server maintains user-created folders for organizing and searching messages
- Maintains state, such as folder organizations, between sessions
- Like other protocols, uses plain-text commands from client and responses from server
- Session is based on TCP connection and has several states
- Here is the state-transition diagram, copied directly from RFC 3501
                   +----------------------+
                   |connection established|
                   +----------------------+
                              ||
                              \/
            +--------------------------------------+
            |          server greeting             |
            +--------------------------------------+
                      || (1)       || (2)        || (3)
                      \/           ||            ||
            +-----------------+    ||            ||
            |Not Authenticated|    ||            ||
            +-----------------+    ||            ||
             || (7)   || (4)       ||            ||
             ||       \/           \/            ||
             ||     +----------------+           ||
             ||     | Authenticated  |<=++       ||
             ||     +----------------+  ||       ||
             ||       || (7)   || (5)   || (6)   ||
             ||       ||       \/       ||       ||
             ||       ||    +--------+  ||       ||
             ||       ||    |Selected|==++       ||
             ||       ||    +--------+           ||
             ||       ||       || (7)            ||
             \/       \/       \/                \/
            +--------------------------------------+
            |               Logout                 |
            +--------------------------------------+
                              ||
                              \/
                +-------------------------------+
                |both sides close the connection|
                +-------------------------------+
(1) connection without pre-authentication (OK greeting)
(2) pre-authenticated connection (PREAUTH greeting)
(3) rejected connection (BYE greeting)
(4) successful LOGIN or AUTHENTICATE command
(5) successful SELECT or EXAMINE command
(6) CLOSE command, or failed SELECT or EXAMINE command
(7) LOGOUT command, server shutdown, or connection closed
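- Here's an example interactive IMAP session I logged back in 2001. My commands are the telnet line and the first line carrying each tag (A002 through A007); everything else is server output.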
telnet csc.smsu.edu 143
Trying 146.7.45.212...
Connected to csc.smsu.edu.
Escape character is '^]'.
* OK [CAPABILITY IMAP4 IMAP4REV1 LOGIN-REFERRALS AUTH=LOGIN] csc.smsu.edu IMAP4rev1 2000.284 at Tue, 6 Feb 2001 20:53:10 -0600 (CST)
A002 LOGIN pete myPassword
* CAPABILITY IMAP4 IMAP4REV1 NAMESPACE IDLE MAILBOX-REFERRALS SCAN SORT THREAD=REFERENCES THREAD=ORDEREDSUBJECT MULTIAPPEND
A002 OK LOGIN completed
A003 SELECT INBOX
* 31 EXISTS
* 0 RECENT
* OK [UIDVALIDITY 915631092] UID validity status
* OK [UIDNEXT 4315] Predicted next UID
* FLAGS (\Answered \Flagged \Deleted \Draft \Seen)
* OK [PERMANENTFLAGS (\* \Answered \Flagged \Deleted \Draft \Seen)] Permanent flags
A003 OK [READ-WRITE] SELECT completed
A004 SEARCH ALL
* SEARCH 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
A004 OK SEARCH completed
A005 SEARCH UNSEEN
* SEARCH
A005 OK SEARCH completed
A006 FETCH 29 (FLAGS BODY[HEADER.FIELDS (DATE FROM)])
* 29 FETCH (FLAGS (\Seen) BODY[HEADER.FIELDS ("DATE" "FROM")] {89}
Date: Mon, 22 Jan 2001 16:17:26 -0500
From: Scott Grissom <grissom@GVSU.EDU>
)
A006 OK FETCH completed
A007 LOGOUT
* BYE csc.smsu.edu IMAP4rev1 server terminating connection
A007 OK LOGOUT completed
Connection closed by foreign host.
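The session above shows two kinds of server lines: untagged ones beginning with "*" and tagged ones that echo the client's tag (A002, A003, ...). A small Python sketch of telling them apart follows. The function names classify and search_results are made up here, and the sketch ignores literal data such as the header lines returned by FETCH.

```python
def classify(line):
    """Return (kind, tag, rest) for one line of IMAP server output."""
    first, _, rest = line.partition(" ")
    if first == "*":
        return ("untagged", None, rest)
    return ("tagged", first, rest)


def search_results(line):
    """Extract message numbers from an untagged SEARCH response."""
    kind, _, rest = classify(line)
    words = rest.split()
    if kind != "untagged" or not words or words[0] != "SEARCH":
        raise ValueError("not a SEARCH response")
    return [int(word) for word in words[1:]]


if __name__ == "__main__":
    print(classify("A004 OK SEARCH completed"))
    print(search_results("* SEARCH 1 2 3"))
    print(search_results("* SEARCH"))  # no matches, as for SEARCH UNSEEN above
```

The tag is what lets a client match a completion line to the command that caused it.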
- HTTP-based mail access protocols
- Allow email access through web browser
- User agent is web-based
- Typical arrangement:
- Commands and responses communicated between user agent and webmail server using HTTP
- Webmail server communicates with backend email server using IMAP
- Most email interaction is now through HTTP: Hotmail, Gmail, Yahoo, Outlook Web Access, etc.
DNS: the Domain Name System
Overview of DNS
DNS network organization
- Service must be distributed because its database is so large and dynamic
- Organization is a very shallow but wide tree (few levels, many siblings)
- Root Servers at the root of the tree. There are currently 13, each replicated
- TLD Servers (Top Level Domain) just below. These represent the last component of a server's name (organization and country codes such as com, org, fr, us, jp, and so forth)
- Authoritative Servers provided by each organization to hold DNS database entries for its servers and to perform DNS functions concerning them
- Every host has the IP address of a local server to which it sends its DNS requests
- The process is addressed next.
Pure Recursive DNS query
- Each client in request chain issues DNS request on behalf of original requester and gets definitive response - resource record(s) or error.
- Request travels from local server up the hierarchy and back down to authoritative server, then recurses back to local server
- Example: A worst case recursive query for “www.pitt.edu” is this:
- Host asks local server for IP address
- Local server doesn't know but asks a root server
- Root server notes the "edu" and asks the TLD server for "edu" domain
- TLD server for "edu" doesn’t know but asks authoritative server for "pitt.edu".
- Authoritative server for "pitt.edu" retrieves the entry from its DNS database, packages it into a message, and returns it to the TLD server
- TLD server relays the response message to the root server
- Root server relays the response message to the local server
- Local server relays the response to the original host
- Note the large amount of network traffic and demand on the root server. We'll address both issues.
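The worst-case recursive chain can be imitated with a toy model in Python. Everything here is an assumption for illustration: the Server class, the server names, the single fixed delegate per server (a real root server would choose a TLD server from the name), and the address 192.0.2.10, which comes from a range reserved for documentation.

```python
class Server:
    """A toy DNS server that either knows the answer or asks its delegate."""

    def __init__(self, name, records=None, delegate=None):
        self.name = name
        self.records = records or {}
        self.delegate = delegate

    def resolve(self, hostname, log):
        log.append(f"query to {self.name}")
        if hostname in self.records:
            answer = self.records[hostname]
        elif self.delegate is not None:
            # Recursive step: this server asks on the requester's behalf.
            answer = self.delegate.resolve(hostname, log)
        else:
            answer = None
        log.append(f"reply from {self.name}")
        return answer


# Toy hierarchy: local -> root -> TLD -> authoritative
authoritative = Server("authoritative pitt.edu",
                       records={"www.pitt.edu": "192.0.2.10"})
tld = Server("TLD edu", delegate=authoritative)
root = Server("root", delegate=tld)
local = Server("local", delegate=root)

log = []
address = local.resolve("www.pitt.edu", log)
print(address)
for entry in log:
    print(entry)
```

The log holds one query and one reply per server in the chain, so the root and TLD servers each handle a message in both directions.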
Iterative DNS query
- All requests are made by local server and all responses go directly to the local server.
- Response will contain either the DNS record or identity of another server to ask
- Example: A worst case iterative query for “www.pitt.edu” is this:
- Host asks local server for IP address
- Local server doesn't know but asks a root server
- Root server doesn't know but responds to local server with the address of a TLD server for "edu"
- Local server asks the TLD server for "edu"
- TLD server for "edu" doesn’t know but responds to local server with the address of the authoritative server for "pitt.edu"
- Local server asks the authoritative server for "pitt.edu"
- Authoritative server for "pitt.edu" retrieves the record from its DNS database, packages it into a message, and returns it to the local server
- Local server relays the response to the original host
- Same number of messages, but less demand on the root and TLD servers since they don't have to relay responses.
- Note this is not purely iterative, since there is no response to the original requesting host until the end
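The iterative example can be sketched as a loop run by the local server: ask a server, and if the reply is a referral, ask the server it names. The SERVERS table, the server names, the max_hops guard, and the documentation address 192.0.2.10 are all hypothetical values chosen for this sketch.

```python
# Toy data: each server has records it can answer and suffixes it refers onward.
SERVERS = {
    "root": {"referrals": {"edu": "tld-edu"}, "records": {}},
    "tld-edu": {"referrals": {"pitt.edu": "auth-pitt"}, "records": {}},
    "auth-pitt": {"referrals": {}, "records": {"www.pitt.edu": "192.0.2.10"}},
}


def ask(server_name, hostname):
    """One request/response: returns ("answer", ip), ("referral", server) or ("error", None)."""
    server = SERVERS[server_name]
    if hostname in server["records"]:
        return ("answer", server["records"][hostname])
    for suffix, next_server in server["referrals"].items():
        if hostname == suffix or hostname.endswith("." + suffix):
            return ("referral", next_server)
    return ("error", None)


def local_server_lookup(hostname, start="root", max_hops=10):
    """Follow referrals from the local server; return (answer or None, servers asked)."""
    asked = []
    current = start
    for _ in range(max_hops):
        asked.append(current)
        kind, value = ask(current, hostname)
        if kind != "referral":
            return value, asked
        current = value  # the referral names the next server to ask
    return None, asked


if __name__ == "__main__":
    print(local_server_lookup("www.pitt.edu"))
```

Every request in the loop originates at the local server, so the root and TLD servers only ever send a single short referral each.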
Caching is Critical to DNS Performance
- Response containing an answer (DNS record) is cached by recipient
- More precisely, DNS record containing name-to-IP mapping is cached (more below)
- If multiple requests for same server are made over a short period of time, all but the first will be handled by local server
- Caching obviously reduces network traffic, reduces DNS server loads, reduces response time
- Local server commonly caches TLD server addresses so it can bypass the root server entirely
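A minimal sketch of such a cache in Python, assuming each cached entry expires after a time-to-live given in seconds. The class name DnsCache, its methods, and the injectable clock are choices made for this example, not part of any DNS software.

```python
import time


class DnsCache:
    """Caches name -> value mappings until their time-to-live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock    # injectable so expiry can be tested without waiting
        self._entries = {}     # name -> (value, expiry time)

    def put(self, name, value, ttl_seconds):
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached value, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # stale: forget it so a fresh query is made
            return None
        return value


if __name__ == "__main__":
    cache = DnsCache()
    cache.put("www.pitt.edu", "192.0.2.10", ttl_seconds=60)
    print(cache.get("www.pitt.edu"))
```

A local server would check get() first and only send a query up the hierarchy when it returns None.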
Resource Record (RR)
- A Resource Record is a DNS database or cache entry. Fields are:
- Name -- the name used for matching DNS request
- Value -- result for DNS response, depends on record type (below)
- Type -- can have one of several values. Most frequent are:
- A -- (address) name is a hostname and value is its IP address
- NS -- (name server) name is domain name and value is its name server
- MX -- (mail exchange) name is alias for mail server and value is mail server name
- CNAME -- (canonical name) name is alias and value is real name.
- TTL -- Time To Live; how long the record may be kept in a cache.
- The complete RR is packaged into response messages and cached by recipients
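The four fields and the way record types interact can be sketched in Python. The ResourceRecord class, the resolve_address helper, its max_aliases limit, and all example.edu names and addresses are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    name: str    # the name used for matching a DNS request
    value: str   # meaning depends on the record type
    type: str    # "A", "NS", "MX", "CNAME", ...
    ttl: int     # seconds the record may stay cached


RECORDS = [
    ResourceRecord("www.example.edu", "web1.example.edu", "CNAME", 3600),
    ResourceRecord("web1.example.edu", "192.0.2.10", "A", 3600),
    ResourceRecord("example.edu", "mail.example.edu", "MX", 3600),
    ResourceRecord("example.edu", "dns1.example.edu", "NS", 86400),
]


def resolve_address(name, records, max_aliases=8):
    """Follow CNAME aliases until an A record is found; return its IP or None."""
    for _ in range(max_aliases):
        by_type = {rr.type: rr for rr in records if rr.name == name}
        if "A" in by_type:
            return by_type["A"].value
        if "CNAME" in by_type:
            name = by_type["CNAME"].value  # alias: retry with the real name
            continue
        return None
    return None  # too many aliases in a row; give up


if __name__ == "__main__":
    print(resolve_address("www.example.edu", RECORDS))
```

The same Name can appear in several records of different types, which is why the lookup filters on both name and type.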
DNS Security and Resilience are Paramount
- DNS service is used by all applications that require name-to-IP translation
- What would happen if root servers were brought down by denial-of-service, or infected?
- What if the DNS system, or any of its servers, were poisoned to give bogus responses (pharming)?
- DNS is heavily fortified and has survived all attacks so far with little negative consequence
Alternative Root Servers
- There are renegade "alternative root servers" out there, supporting ICANN’s TLDs plus a slew of others.
- OpenNIC is a prominent one, see www.opennicproject.org
- This is not very difficult to implement, but ICANN does not like it one bit...
- For more info, see for example the Wikipedia entry for Alternative DNS root (how authoritative is Wikipedia?)
Peter Sanderson (PSanderson@otterbein.edu)