What we're going to talk about
- Be a good user first
- The big picture: how the WWW works
- URLs
- HTTP
- HTML
- More advanced topics
The Big Picture
- Let's say a client wants a web page (for example, http://www.silerfamily.net/~fms/).
- The client looks up the server's address using DNS (in this case, 64.198.252.57).
- The client opens a connection to the server and makes a request (in this case, it asks for /~fms/).
- The server responds to the request. At this point a lot of things can happen.
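The steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: plain HTTP on port 80, no redirects, and no error handling. The function names build_get_request and fetch are made up for this sketch.

```python
import socket
from urllib.parse import urlsplit


def build_get_request(host: str, path: str) -> bytes:
    """Build the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to hang up when it is done
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Look up the address, connect, send the request, read the raw reply."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80
    path = parts.path or "/"

    address = socket.gethostbyname(host)  # name -> number

    # Open a connection to the server and make the request.
    with socket.create_connection((address, port), timeout=timeout) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)

    # Status line, headers, and body, exactly as the server sent them.
    return b"".join(chunks)
```

Calling fetch("http://www.silerfamily.net/~fms/") would need a live network connection, so the sketch does not run it on its own. The reply it returns begins with a status line, which is where the response codes come in.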
Names and Numbers
- Networked computers have numbers (IP Addresses) associated with them, and in many cases they also have names assigned to them.
- We use a service called DNS (the Domain Name System) to translate between names and numbers.
- Think of a phone book: we can go from names to numbers using it, but DNS also allows us to go from numbers back to names in most cases.
- It's important to note that when you're talking about www.amazon.com, all you're really talking about is the name of a computer (or in Amazon's case, lots of computers which appear to you as one).
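Both directions of the phone-book lookup are available from Python's standard socket module. The sketch below is minimal and assumes IPv4 only. The helper names name_to_number and number_to_name are invented here.

```python
import socket
from typing import Optional


def name_to_number(name: str) -> str:
    """Forward lookup: host name -> IPv4 address in dotted form."""
    return socket.gethostbyname(name)


def number_to_name(address: str) -> Optional[str]:
    """Reverse lookup: IP address -> host name, or None if none is registered."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(address)
    except OSError:  # covers socket.herror and socket.gaierror
        return None
    return hostname
```

For a large site, the forward lookup may return a different address each time you ask, which is one way many computers can appear as one name.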
TLDs
- .com - Commercial
- .org - Organizations
- .edu - Education
- .net - Networks
- .mil - Military
- .us - United States
There are many other TLDs; the largest class is the two-letter ISO country codes, such as .us.
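The TLD is simply the last dot-separated piece of a host name, so picking it out takes very little code. The table below holds only the TLDs listed here, and the function names are made up for this sketch.

```python
# Only the TLDs listed above; the real list is much longer.
KNOWN_TLDS = {
    "com": "Commercial",
    "org": "Organizations",
    "edu": "Education",
    "net": "Networks",
    "mil": "Military",
    "us": "United States",
}


def tld_of(hostname: str) -> str:
    """Return the last dot-separated label of a host name, lowercased."""
    return hostname.rstrip(".").rsplit(".", 1)[-1].lower()


def describe(hostname: str) -> str:
    """Look the TLD up in the small table above."""
    return KNOWN_TLDS.get(tld_of(hostname), "some other TLD")
```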
Clients
The term client is very broad and, in the case of the World Wide Web, applies to a vast array of programs which are designed for various uses.
- Browsers are typically designed for human interaction with a website. Keep in mind that there is a huge array of browsers. Some of them support a wide range of media and are used by clueful users with fast computers; others support only text. Some run on devices as simple as a cell phone, and others read sites aloud to blind users.
- Robots or crawlers are programs generally designed to run with little human interaction. These are used by search engines (Google, MSN, Yahoo!, et al.) to make indexes of web pages that can be searched by their users.
- There are also specialty clients, such as wget and curl, which are usually used to retrieve a few files here or there and save them to disk. Some of them (wget, for instance) also have features which make them a bit more robot-like, such as the ability to recursively get all the files on a site.
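To give a feel for what such a client does, here is a rough Python stand-in for the "fetch one file and save it to disk" job. It is not how wget or curl are implemented, and save_url is a name chosen for this sketch.

```python
import shutil
from urllib.request import urlopen


def save_url(url: str, destination: str) -> int:
    """Fetch one URL, write the body to a file, and return the bytes written."""
    with urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)  # stream it; do not hold it all in memory
        return out.tell()
```

A recursive, robot-like mode would go on to read each saved page, find the links in it, and call save_url again for each one.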
The network
- A network can be as small as a given machine, or as large as the entire Internet.
- Businesses often use local web sites, commonly referred to as intranet sites.
Your server
- In this context, a server is simply a piece of software that sits and waits for requests.
- Apache is one of the most widely used servers.
- OS X users are particularly lucky: there's a switch you turn on in Sharing and it's all set up.
- Windows users have the option of installing IIS, though it's crippled on desktop versions.
- Experienced users often use their old desktop machines as personal web and mail servers: they needn't be fast for this job.
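To make "a piece of software that sits and waits for requests" concrete, here is a toy server built on Python's standard library. It is a sketch for experimenting on your own machine, not a substitute for Apache or IIS. The names HelloHandler and make_server and the default port 8000 are choices made for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the same small HTML page."""

    def do_GET(self):
        body = b"<html><body><p>Hello from your own server.</p></body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real server logs its requests


def make_server(port: int = 8000) -> HTTPServer:
    """Create a server that listens on this machine only (127.0.0.1)."""
    return HTTPServer(("127.0.0.1", port), HelloHandler)
```

Running make_server().serve_forever() does the sitting and waiting; pointing a browser at http://127.0.0.1:8000/ then makes a request.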
URLs
http://stuff.com/foo.html is an example of a URL.
http: specifies the protocol.
stuff.com specifies which server is involved.
/foo.html is the path on the server to the goods.
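The same three pieces can be pulled apart with urlsplit from Python's standard library. The wrapper split_url is just a name for this sketch.

```python
from urllib.parse import urlsplit


def split_url(url: str):
    """Return (protocol, server, path) for a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


protocol, server, path = split_url("http://stuff.com/foo.html")
# protocol is "http", server is "stuff.com", path is "/foo.html"
```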
Back to our example
What happens when the server receives the request:
The server will look at the request and see what it can do. The following are common responses:
- 200 OK - the server gives you your data with this one.
- 304 Not Modified - that page hasn't changed since the last time you asked for it.
- 301 Moved Permanently or 307 Temporary Redirect - the page has moved, either permanently or temporarily, respectively.
- 404 Not Found - The server doesn't have that page. This is usually caused by someone removing or moving something without leaving a redirect; it's considered tacky and very annoying. Put redirects in if you must move something.
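From the client's side, each of these codes calls for a different reaction. The sketch below uses the standard HTTPStatus table for the official phrases. The advice strings and the function name explain are written for this example and cover only the codes listed above.

```python
from http import HTTPStatus


def explain(code: int) -> str:
    """Describe what a client would typically do with a response code."""
    status = HTTPStatus(code)  # raises ValueError for codes it does not know
    if code == 200:
        action = "use the data in the response body"
    elif code == 304:
        action = "reuse the copy you already have"
    elif code in (301, 307):
        action = "ask again at the address in the Location header"
    elif code == 404:
        action = "give up on this URL; the page is not there"
    else:
        action = "not covered in this sketch"
    return f"{code} {status.phrase}: {action}"
```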
HTML
The Hypertext Markup Language (HTML) is used to make up web pages. Think of HTML as little more than plain text with some layout commands. In it, you delimit paragraphs, emphasized or strong text, lists, and so forth with tags. Tags are surrounded by angle brackets, like this: <a href="index.html">. This example opens a link to index.html and would have text following it. This text would appear as a link in a web browser. Following this text, the link would be closed with a closing tag like this: </a>.
There are a lot of HTML tutorials out there, so I don't want to bore you too much. The important thing is that HTML is supposed to read a lot like text; formatting information should be included in CSS.
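Because tags are just marked-up text, a program can read them as easily as a browser can. The sketch below uses Python's standard HTMLParser to pull out each link and the text between its opening and closing tags. It is roughly what a crawler does first, though real ones are far more careful. The names LinkCollector and find_links are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from <a href="..."> ... </a>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # set while we are inside an <a href=...> tag
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_links(html: str):
    """Return every (href, text) pair found in a piece of HTML."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```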
HTML Standards
- HTML 2, 3, 4
- XHTML 1.0 Transitional
- XHTML 1.0 Strict
It's a lot of alphabet soup, isn't it? The most important thing is that you use a validator to make sure that your site meets at least some of the standards.
Practices to avoid
- Opening new windows or altering existing ones
- Excessive or gaudy graphics and animations (and especially sounds)
- Moving material without putting in appropriate redirects (remember that you will likely get a lot of hits from search engines, and changes you make won't be reflected immediately by a search engine)
- Annoying form validators
- Gratuitous use of cookies
- Linking to files in proprietary formats
- 1997 called. They want their frames back.
- In short, don't do anything to mess with the user's browser or annoy them. Remember, they are your customers. If you build meaningful content into a standards-compliant, simple site, and it's fast, you will do well.
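On the point about moving material: a redirect is just a response with a 301 code and a Location header naming the new address. Real servers such as Apache do this through their configuration. The toy handler below only shows what goes over the wire, and its paths and names (MOVED, RedirectingHandler) are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical table of old path -> new path for this sketch.
MOVED = {"/old-page.html": "/new-page.html"}


class RedirectingHandler(BaseHTTPRequestHandler):
    """Serve one page and redirect requests for its old address."""

    def do_GET(self):
        new_path = MOVED.get(self.path)
        if new_path is not None:
            self.send_response(301)  # moved permanently
            self.send_header("Location", new_path)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        if self.path == "/new-page.html":
            body = b"<p>The page now lives here.</p>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            return
        self.send_error(404)  # anything else really is missing

    def log_message(self, format, *args):
        pass  # keep the sketch quiet
```

Browsers and search engine robots that ask for the old address are sent on to the new one instead of hitting a 404.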
More advanced topics
- CSS
- JavaScript
- CGI: the Common Gateway Interface
- PHP
- Databases
- Other media: Flash, MPEG, etc.