Chapter 3 : Using the Internet (selected notes)
Origins
- US Dept of Defense (DoD), late 1960s
- ARPANET, named after the DoD agency ARPA (Advanced Research Projects Agency, later renamed DARPA)
- designed for resilience: would continue to function if some nodes and links were taken out
- curiously, not designed for security; intended to carry non-classified data
Growth
- Slow growth early on
- Computing and telecom companies developed competing proprietary networks
- late 70s and early 80s brought convergence based on commonly accepted communication protocols
- Became known as ARPA Internet (network of networks), later shortened to Internet
- Exponential growth (defined by access points and usage) since early 80s
Internet and Web are not the same thing!
- the Internet is not the same thing as the Web!
- Think of the Internet as being like the shipping service FedEx
- FedEx delivers packages without knowing their contents
- the Internet delivers data packets without (necessarily) knowing their contents
- some packets are for Web communication, others are for email, IM, VoIP, DNS, etc
- Internet addressing is based on the IP standards, and all delivery services must follow it
- Internet delivery services are based on either TCP standards for reliable delivery or
UDP standards for unreliable but cheap delivery
- Email has been around since the early 70s (Queen Elizabeth first used it in 1976)
- Web has been around since the early 90s (developed by Sir Tim Berners-Lee in Europe) and you can still access the
original web site
- Web browser is now user interface to most Internet services, thus the confusion
Communication Services
- Many services pre-date the Web but have been adapted to browser access
- FTP originally (File Transfer Protocol)
- Email early on and Listserv later
- Usenet and newsgroups
- Chat rooms
- multiplayer online games
- Instant Messaging (now in decline, along with AOL, because of the rise of texting)
- Prior to the web, each service had its own user interface
- Other services have been developed since the Web
- Webcasts
- Blogs
- Wikis (online collaboration)
- VoIP (voice over IP) telephone service such as Skype
- Social networking (Friendster was a Facebook predecessor)
- E-commerce (look for https)
Web Browsers
- Several are available: IE, Firefox, Chrome, Opera, Safari (Apple), etc
- Because the communication protocols are public standards, anyone can develop a new one
- Addresses are called URLs, Uniform Resource Locators
- URL consists of: protocol, domain name (server), path
- Some protocols: http, https, ftp, file
- Browser plug-in extends capabilities to display newly-developed kinds of content (e.g. Flash)
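As a rough illustration of the URL parts listed above, Python's standard urllib.parse module can split a URL into its pieces. The URL used is the Google "Advanced Search" address that appears later in these notes; the variable names are chosen only for illustration.

```python
from urllib.parse import urlsplit

# Example URL from these notes (Google "Advanced Search")
url = "http://www.google.com/advanced_search?hl=en"

parts = urlsplit(url)

protocol = parts.scheme        # the protocol: http, https, ftp, file, ...
domain_name = parts.hostname   # the server's domain name
path = parts.path              # which resource on that server
query = parts.query            # optional extra parameters after the "?"

print(protocol, domain_name, path, query)
# prints: http www.google.com /advanced_search hl=en
```

The query string is not one of the three parts named above; many URLs carry one after the path, so it is shown here for completeness.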
"Annoyances" (textbook description)
- cookies (most are friendly!)
- Spam
- adware
- spyware
- malware (viruses, worms)
- phishing
Searching for Information on the Web
- At first, Berners-Lee was able to maintain a central directory of web sites
- By 1993, new web sites were being developed so quickly the central directory could not be kept up to date
- A search engine is a software product capable of finding and indexing Web content, using web spiders that crawl for information
- Simplified spider crawling algorithm: given a "seed" web page, the algorithm will "crawl" (read) that page, index its contents in a database, make a list of
its hyperlinks (links to other pages), then recursively crawl each of the hyperlinks
- Crawling goes on continually in the background
- the mechanics of crawling are pretty easy; the value of a search site is how well it presents its results to you
- your search query is answered from the database (the index), not by searching the web directly
- Text has nice primer on refining your Web searches (p 130-131) to get fewer and more relevant hits
- Use Boolean operators (AND, OR, NOT) and quote marks to refine
- See for instance Google "Advanced Search" link (http://www.google.com/advanced_search?hl=en)
- There are many search engines! Partial list and history at http://en.wikipedia.org/wiki/Web_search_engine
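The simplified spider algorithm and the Boolean operators described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real search engine is built: the "web" is a made-up dictionary held in memory (TOY_WEB), the page names and texts are invented, and the recursion is written as a to-do list with a "visited" set so that pages linking back to each other do not cause an endless crawl.

```python
from collections import defaultdict

# A made-up, in-memory "web": page name -> (page text, list of hyperlinks).
# Real spiders fetch pages over HTTP; this toy stands in for that step.
TOY_WEB = {
    "home":  ("welcome to the otter page", ["news", "about"]),
    "news":  ("otter spotted near river", ["home", "river"]),
    "about": ("this page is about web search", ["home"]),
    "river": ("the river is home to fish", []),
}


def crawl(seed, web):
    """Crawl from the seed page and return an index: word -> set of pages."""
    index = defaultdict(set)
    visited = set()       # pages already crawled, so link cycles end
    to_visit = [seed]     # pages discovered but not yet crawled

    while to_visit:
        page = to_visit.pop()
        if page in visited or page not in web:
            continue
        visited.add(page)
        text, links = web[page]
        for word in text.split():   # index the page contents
            index[word].add(page)
        to_visit.extend(links)      # then follow its hyperlinks
    return index


def search_and(index, *words):
    """Boolean AND: pages containing every word."""
    results = [index.get(w, set()) for w in words]
    return set.intersection(*results) if results else set()


def search_or(index, *words):
    """Boolean OR: pages containing at least one of the words."""
    return set().union(*(index.get(w, set()) for w in words))


def search_not(index, wanted, unwanted):
    """Boolean NOT: pages with `wanted` but without `unwanted`."""
    return index.get(wanted, set()) - index.get(unwanted, set())


index = crawl("home", TOY_WEB)
print(sorted(search_and(index, "otter", "river")))   # ['news']
print(sorted(search_or(index, "fish", "search")))    # ['about', 'river']
print(sorted(search_not(index, "otter", "river")))   # ['home']
```

Notice that the three searches never touch TOY_WEB: they are answered entirely from the index built by the crawl, which is the point made above about queries going to the database rather than to the web.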
Evaluating Search Results (from textbook)
- Consider authority; the source
- Will site purpose lead to biased information?
- How current is the information? Unless you are researching specifically for history, more recent is better. Many pages have a time stamp. If it has "dead" links it is not current.
- How relevant is the page to your focus?
- Is the page/site targeted to the audience you need?
- Follow the links on the page and apply the same criteria to them
How the Internet Works, Briefly
- Computers are connected every which way, but generally hierarchical
- Data packets travel this network and hierarchy much like we travel roads and highways
- My route to school, and its network analogy:
- Down the driveway (local network)
- East on North Broadway Street (regional network)
- North on I-71 and east on I-270 (national or backbone network)
- North on Cleveland Ave and east on Main St (regional network)
- Into the parking lot (local network)
- Consider speed and carrying capacity on each of those roads
- The Internet is a point-to-point packet-switched network based on client-server communication
- Point-to-point: at every intersection the next section ("hop")
of the route is determined. If traffic is too heavy or slow on one outgoing
path a different one is selected
- Packet-switched: Long communications are broken up into pieces called packets, which are sent separately and
reassembled at the destination
- Client-Server: Service is available at well-known address, clients contact the server to request that service and
the server responds. Best example: Web browser as client, Web server as server
- "well-known address" means known name or IP address. Internally, IP addresses are numbers!
- type 205.133.226.114 into a web browser and see what happens!
- An Internet service called DNS (Domain Name System) does the translation
- Imagine the chaos if hackers were able to take over DNS!
- Each service at a server has a unique assigned port number. Web service is port 80. DNS uses port 53. FTP uses 20 and 21.
- Each packet carries, along with its data, the destination IP address + port number and the return IP address + port number
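To make the packet idea above concrete, here is a minimal Python sketch of breaking a message into packets and reassembling it at the destination. Several things are assumptions made for illustration: the Packet class and its field names, the 8-byte packet size, the client address 192.0.2.10 and port 49152, and the use of a simple sequence number to restore order. Real Internet packets have more elaborate headers.

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    src_ip: str      # return IP address
    src_port: int    # return port number
    dst_ip: str      # destination IP address
    dst_port: int    # destination port number (80 = Web service)
    seq: int         # position of this piece in the original message
    data: bytes      # the piece of the message itself


def to_packets(message, src, dst, size=8):
    """Break a message into packets carrying at most `size` bytes each."""
    return [
        Packet(src[0], src[1], dst[0], dst[1], seq, message[i:i + size])
        for seq, i in enumerate(range(0, len(message), size))
    ]


def reassemble(packets):
    """Put packets back in order and join their data."""
    return b"".join(p.data for p in sorted(packets, key=lambda p: p.seq))


message = b"GET /index.html please, and thank you"
client = ("192.0.2.10", 49152)      # made-up client address and port
server = ("205.133.226.114", 80)    # address from the notes, Web port

packets = to_packets(message, client, server)
random.shuffle(packets)             # packets may arrive out of order
assert reassemble(packets) == message
```

Each packet carries both address-and-port pairs, so any packet can be routed on its own and the server always knows where to send the reply.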
Getting Access to the Internet
Internet Service Providers (ISP)
- Purchase cheap low speed dial-up service from any such provider with a local phone number
- High speed DSL, Cable, Satellite, Cell service provided by signal carrier (DSL by phone company, etc) or other company such as WOW
- Boundaries between phone, cable, satellite companies are blurring; can get multiple bundled services from one
- See for example www.high-speed-internet-access-guide.com/
- Non-metered service: unlike other utilities (water, electricity, gas) it is flat-rate pricing at given performance level
- This is related to but not the same as net neutrality, which is about whether a telecommunication provider can
give better service for certain kinds of content over other kinds. This would be like FedEx opening your package and
delaying it if they didn't like the contents. Net neutrality says "nuh-uh, can't do that".
Wired Connections to Network
(Mbps means mega-bits per second, where a megabit is 1,000,000 bits)
(download means from ISP to you, upload means from you to ISP)
Dialup Modem:
- Uses telephone voice channel. It dials number to your ISP and ties
up the line.
- Limited to 56 Kbps (kilobits per second) download; less for upload
ADSL (a.k.a. DSL) Asymmetric Digital Subscriber Line
- Uses telephone network, an example of broadband.
- Requires special equipment at telephone switching stations and special
modem usually external to your PC and connected to it through Ethernet
port
- Here’s the asymmetric part: You get typically from 1 to 2 Mbps download, and less than 1 Mbps upload.
- Speed depends on how close you are to telephone switching station.
Further away means slower. Range: a few miles
- Does not use telephone voice channel, but does require filter on phones
Cable modem:
- Uses television cable network, an example of broadband.
- Requires special modem usually external to your PC and connected to it
through Ethernet port
- Download can be very fast, typical is 1-2 Mbps, upload usually much slower.
- Cable is a "tree" network topology in your neighborhood
and so you have to share the cable capacity with your neighbors.
LAN direct connection
- Normally within a building or campus, an example of broadband.
- Requires Ethernet card (as do DSL and cable modem).
- Maximum throughput is 10, 100, or 1000 Mbps; the maximum cannot be achieved in practice because the channel is shared and messages can collide!
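To get a feel for what these speeds mean, here is a small worked calculation in Python. The 5 MB file size is a hypothetical example, and the rates are picked from the figures quoted above (56 Kbps dial-up, the 1 Mbps low end of DSL or cable, a 100 Mbps LAN). The times are ideal: they ignore sharing, collisions, and other overhead, so real downloads take longer.

```python
def transfer_seconds(size_bytes, rate_bits_per_second):
    """Ideal time to move a file: bits to send divided by bits per second."""
    return size_bytes * 8 / rate_bits_per_second   # 8 bits in every byte


FILE_SIZE = 5_000_000   # a hypothetical 5 MB file (5,000,000 bytes)

RATES = {               # download rates taken from these notes
    "dial-up, 56 Kbps": 56_000,
    "DSL or cable, 1 Mbps": 1_000_000,
    "LAN, 100 Mbps": 100_000_000,
}

for name, rate in RATES.items():
    print(f"{name}: {transfer_seconds(FILE_SIZE, rate):.1f} seconds")
# dial-up, 56 Kbps: 714.3 seconds
# DSL or cable, 1 Mbps: 40.0 seconds
# LAN, 100 Mbps: 0.4 seconds
```

Watch the units: file sizes are usually given in bytes, while connection speeds are given in bits per second, hence the multiplication by 8.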
Wireless Connections to Network
Satellite
- Requires outdoor transceiver (transmitter/receiver)
- Download speeds comparable to DSL or Cable, but with about 0.25 second lag time:
the signal travels over 22,000 miles each way!
- Available "everywhere", only limit is line-of-sight to satellite
- Satellites are in "geosynchronous orbit" over the equator, a little over 22,000 miles altitude.
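The quarter-second lag follows from the distance and the speed of light, as this short Python calculation shows. It assumes the satellite is directly overhead (the shortest possible path) and uses 22,236 miles for the altitude and 186,282 miles per second for the speed of light; the variable names are just for illustration.

```python
ALTITUDE_MILES = 22_236            # geosynchronous orbit altitude
LIGHT_MILES_PER_SECOND = 186_282   # radio signals travel at the speed of light

# A request goes up to the satellite and back down to the ground station.
up_and_down_miles = 2 * ALTITUDE_MILES
lag_seconds = up_and_down_miles / LIGHT_MILES_PER_SECOND

# The reply has to make the same trip in the other direction.
round_trip_seconds = 2 * lag_seconds

print(f"one way: {lag_seconds:.2f} seconds")            # one way: 0.24 seconds
print(f"round trip: {round_trip_seconds:.2f} seconds")  # round trip: 0.48 seconds
```

No amount of bandwidth can remove this delay, which is why satellite service can download large files quickly yet still feel sluggish for interactive use.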
Cell wireless (3G)
- Wireless Wide Area Network (WWAN)
- Distinguished from Wi-Fi (below), which is wireless access to Local Area Network (WLAN)
- Performance comparable to DSL/Cable/Satellite, somewhat more expensive
- Typically available from cell-phone service provider
Wi-Fi (802.11)
- indirect! Wi-Fi does not connect you directly to the ISP, it connects you to an existing connection (such as DSL or cable)
- limited range of 100-200 feet.
- commonly used for in-house networks (private) and "hot spots" (private or public)
What a Web Page received by the browser really looks like
Web pages are plain text files formatted using a markup language called HTML
- Web pages can be either static or dynamic
- "static" web pages are marked up and stored as a file on the web server and that file is delivered to
your browser
- "dynamic" web pages are created by the server when requested by the browser
- It's like buying food from the grocery shelf versus at a fine restaurant
- HTML = HyperText Markup Language
- you literally "mark up" plain text to tell the browser how to display it
- you mark up the text using tags
- most tags come in pairs: one to mark the start of the markup the other to
mark the end
- To mark text as bold, use <b> to mark the beginning of the bolded
text and </b> to mark the end
- For example, to produce "that is a very funny joke" you'd write
that is a <b>very funny</b> joke
- From the browser, you can look at the current web page in its HTML form by selecting
View -> Source
Here is an example HTML page. How the browser displays
it is shown after the listing.
<html>
<head>
<title>This is my page!</title>
</head>
<body>
What you <b>type</b> in HTML will be rendered by
the <i>browser</i>
<ul>
<li>most tags are used as bookends like begin and end</li>
<li>other tags have only a beginning with implicit end</li>
<li>a few tags can be used either with or without an end</li>
</ul>
<table border=1>
<tr>
<th>name</th><th>age</th>
</tr>
<tr>
<td>Lauren</td><td>27</td>
</tr>
<tr>
<td>Charles</td><td>18</td>
</tr>
</table>
<img src="http://faculty.otterbein.edu/PSanderson/pete2004.jpg">
<br>
<a href="http://www.otterbein.edu">Otterbein Home Page</a>
<br><br>
<a href="http://www.evergreen.edu/biophysics/technotes/misc/bin_math.htm">
<img SRC="images/binadddigits.jpg"></a>
<hr>
</body>
</html>
The body of the above page is everything between the <body>
tag and the </body> tag. The browser will display the body
like this:
What you type in HTML will be rendered by
the browser
- most tags are used as bookends like begin and end
- other tags have only a beginning with implicit end
- a few tags can be used either with or without an end
name    | age
Lauren  | 27
Charles | 18
Otterbein Home Page
Static web pages are very boring! Web content is very dynamic!
There are a number of tools and techniques for generating HTML dynamically, but they are beyond the scope of this course!
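Just to give the flavor of the idea, here is a minimal Python sketch of generating HTML dynamically: a function builds the name/age table from the example page out of whatever data it is handed. The function name render_people_table and the hard-coded list of people are assumptions for illustration; a real site would typically use a web framework and pull the rows from a database when the browser makes its request.

```python
from html import escape


def render_people_table(people):
    """Build the HTML for a name/age table from a list of (name, age) pairs."""
    rows = ["<tr>", "<th>name</th><th>age</th>", "</tr>"]
    for name, age in people:
        rows.append("<tr>")
        # escape() keeps characters like < and & in the data from being
        # mistaken for markup
        rows.append(f"<td>{escape(str(name))}</td><td>{escape(str(age))}</td>")
        rows.append("</tr>")
    return "<table border=1>\n" + "\n".join(rows) + "\n</table>"


# In a real site these rows might come from a database at request time.
page = render_people_table([("Lauren", 27), ("Charles", 18)])
print(page)
```

Hand the function a different list and it produces a different table, with no HTML file edited by hand. That is the restaurant side of the grocery-versus-restaurant comparison.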