Mac OS Journal

Simply Web
October 2000 || Volume 01, Issue 03

Last month I introduced the subject of this column and gave you some history of the web. I promised to do up a glossary this month which will act as a reference for all the columns to follow. And next month? We get down to basics. In the meantime, here are some terms you need to know...


Glossary

The web as we know it today depends on three essential elements: HTTP, HTML and URL.

HTTP - The HyperText Transfer Protocol is a standardized method defining how servers and clients communicate. The client (your web browser) sends a request to the server for a document at a particular address. It also tells the server what kinds of documents the browser can understand. The server then loads the document into its memory (if it's not already there) and sends it off to the client. This next point will become important in understanding client/server communications later: the server then forgets about the client! Remember that point -- a test will be given. (see the W3C's Protocols page for much more on HTTP) (see below for more about protocols)
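To make the request half of that exchange a little more concrete, here is a minimal sketch in Python of the kind of text a browser might send. The function name build_request and the particular Accept values are my own illustrative assumptions; real browsers send more headers than this.

```python
def build_request(host, path, accepted_types):
    """Return the text of a simple HTTP/1.0 GET request (illustrative helper)."""
    lines = [
        f"GET {path} HTTP/1.0",                    # which document we want
        f"Host: {host}",                           # which server we are asking
        "Accept: " + ", ".join(accepted_types),    # what the browser can understand
        "",                                        # a blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


# The Accept list here is an assumption chosen for the example.
request = build_request("www.thistledance.com", "/index.html",
                        ["text/html", "image/gif"])
print(request)
```

Notice that nothing in the request refers to any earlier request. Each one has to stand entirely on its own, which is exactly why the server can afford to forget about the client afterwards.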

HTML - The Hypertext Markup Language is the code which specifies the appearance of the document in a way that a browser can understand. Originally a very simple code intended to allow the display of typical academic-type documents -- mainly text in various paragraph formats -- it has evolved into a convoluted soup of partially supported code allowing highly complex documents that may contain almost any kind of content to be displayed by browsers, albeit not necessarily displayed the same in any two browsers. HTML is in a state of flux at the moment. I'll have much more to say about it later in this series. (see the W3C's Hypertext Markup Language page for a lot more on HTML and its siblings) (see below for more about languages).

URL - A Uniform Resource Locator is the address of the document your browser is requesting from the server. It's in the form <protocol://node/address>, where the protocol is usually "http", the node is often a domain name, and the address is a file path to the document you want. For example, the home page of the publishing company, of which I am V.P., is http://www.thistledance.com/index.html. This is a web page, so the http protocol is appropriate; the domain name is thistledance.com; the file is at the root of that domain and is called index.html. (Sorry for the shameless plug. It won't happen again. Well, not too often anyway.) URLs were originally known as UDIs (Universal Document Identifier). A new way of identifying resources is called URN (Uniform Resource Name). Together URLs and URNs are called URIs (Uniform Resource Identifiers). Aren't you just thrilled to be learning this?
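If you'd like to see those three pieces pulled apart, Python's standard library can do it. This is only a small sketch; the comments mapping each field to "protocol", "node" and "address" follow the wording used above rather than Python's own terminology.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.thistledance.com/index.html")

print(parts.scheme)   # the protocol: http
print(parts.netloc)   # the node: www.thistledance.com
print(parts.path)     # the address: /index.html
```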

Those are the basic triumvirate on which everything else on the web depends, but let's go back a little further to define where the web is in the larger Internet. What? You didn't know there was a difference between the web and the Internet? That's why I'm writing this glossary.

The web, short for World Wide Web, is just one part of the Internet, though it's the part that's been getting the most press lately. The Internet, on the other hand, is essentially a collection of servers (nodes) wired together. Some of these nodes provide services to other nodes, while the rest provide services to us, for example serving the documents we are seeking. All are tied together in a vast network which, it has been said, would survive even a nuclear war because of its massive redundancy. (Let's not put that to the test, okay?) This immense network uses a variety of protocols to categorize and deliver information...


Protocols

FTP - File Transfer Protocol is the means by which static information, often programs or images, can be placed on or retrieved from servers. There are special FTP programs, such as Fetch or Transmit, for doing this. But since many public FTP sites don't require authorization (they're "anonymous FTP"), you can just use your web browser to retrieve files if you prefer. Often, when you download an update to a piece of software, your browser automatically switches to FTP mode without your ever noticing.

NNTP - Network News Transfer Protocol provides the basis for newsgroups (USENET as it's sometimes still called) and their many threaded discussions. It rests on the same foundations that support email, so it's not too surprising that you use an email program (or, if you prefer, a special news program) to access news servers.

POP/SMTP/IMAP - These three are the protocols that give us email. At least the first two do for most of us, and the third does too for some of us. Post Office Protocol is what you use when you receive email, with Simple Mail Transfer Protocol for sending email. The Internet Message Access Protocol also handles receiving email, but in addition it lets you work with mail messages in remote mailboxes, which is why more and more email packages are offering support for IMAP. ISPs (Internet Service Providers) don't seem to be jumping on the IMAP bandwagon very fast, though, so it's still not that common. But it's definitely the email protocol of the future.

MIME - Multipurpose Internet Mail Extensions specify and describe the format of Internet message bodies. Eh? Well, basically the MIME information contained in the header of any message flowing over the Internet tells the software receiving the message what it consists of. So your browser for example can tell whether to display the message itself or to call upon a suitable helper application, such as Acrobat Reader, Quicktime or even, say, Microsoft Word.
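Here is a rough sketch of that decision in Python. The HELPERS table, the choose_handler function and the "save to disk" fallback are all hypothetical, invented for this example; a real browser's rules are more elaborate.

```python
from email import message_from_string

# Hypothetical table of helper applications, keyed by MIME type.
HELPERS = {
    "application/pdf": "Acrobat Reader",
    "video/quicktime": "QuickTime",
    "application/msword": "Microsoft Word",
}


def choose_handler(raw_headers):
    """Decide what to do with a message based on its Content-Type header."""
    message = message_from_string(raw_headers)
    mime_type = message.get_content_type()   # e.g. "text/html"
    if mime_type.startswith(("text/", "image/")):
        return "display in browser"
    # Anything else goes to a helper, or is saved if we know of none.
    return HELPERS.get(mime_type, "save to disk")


print(choose_handler("Content-Type: text/html; charset=iso-8859-1\n\n"))
print(choose_handler("Content-Type: application/pdf\n\n"))
```

The first call reports that the browser would display the message itself; the second hands it off to Acrobat Reader.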


Languages

In large part the web runs on HTML, and since that's the subject of this column, I'll be sticking pretty close to it in future. But you should know a little more than that, at least as background information.

XHTML - HTML has ended. It's still in very widespread use of course, and will be for years to come. But it's no longer in development. Instead it has been superseded by XHTML. Here, because I can't say it better myself, and because I don't want you to think this is just my opinion, is the official word about XHTML vs HTML from on high...

"XHTML 1.0 is W3C's recommendation for the latest version of HTML, following on from earlier work on HTML 4.01, HTML 4.0, HTML 3.2 and HTML 2.0. With a wealth of features, XHTML 1.0 is a reformulation of HTML 4.01 in XML, and combines the strength of HTML4 with the power of XML."

That's from the HTML page at the W3C. But what's that about the power of XML?

XML - The eXtensible Markup Language is rapidly becoming the lingua franca of communications. It provides a mechanism for creating your own tags to describe your data rather than relying on the predetermined tag set of HTML. Of course this doesn't do you any good as an individual unless you like playing around with this stuff. But industry and trade groups of various sorts are using XML to create tag sets specific to their group, thereby making communication between trading partners much more reliable, and even automatic. Think business to business eCommerce. Think dollars - lots of dollars. Also, XML is fated to become the foundation of all sorts of languages used in various wireless and handheld devices. WML, for example, is an XML-derived language used in WAP-capable cell phones.
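To show what "creating your own tags" looks like in practice, here is a small Python sketch. The invoice tag set is entirely made up for this example (it is not any trade group's actual standard), but the parsing calls are from Python's standard library.

```python
import xml.etree.ElementTree as ET

# A made-up tag set: none of these tags come from a real standard.
document = """
<invoice number="1001">
  <customer>Example Books Ltd.</customer>
  <item quantity="3">Widget</item>
  <item quantity="1">Gadget</item>
</invoice>
""".strip()

invoice = ET.fromstring(document)

customer = invoice.findtext("customer")
total_items = sum(int(item.get("quantity")) for item in invoice.findall("item"))

print(customer)      # Example Books Ltd.
print(total_items)   # 4
```

Because the tags say what the data is rather than how it should look, a trading partner's software can pick out the customer and the quantities without any guesswork.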

SGML - HTML is an "application of" SGML, the Standard Generalized Markup Language, and XML is a simplified subset of it (which makes XHTML, in turn, an application of XML). SGML is itself descended from an earlier GML. (No points for guessing what that means.) More than that you do not want to know. Trust me. The key issue for our purposes is that SGML lacks a linking mechanism, obviously fundamental to the web. And it is orders of magnitude more complex than XML, and therefore much harder for mere mortals like myself to learn. Hence, XML.


Miscellaneous

Web Browser - a piece of software which sometimes displays documents the way the author intended them to appear. (I'll have much more to say about web browsers as the column proceeds. For now, the only thing you need to know is that - preferences aside - the latest version of Internet Explorer for the Mac is the most standards compliant, therefore the best browser currently available.)

Tags - defined in a particular language, such as XHTML; otherwise known as markup; the code that tells a "user agent" what the enclosed data is and what to do with it. More clearly - and specifically in regard to HTML - code that tells your web browser how to display the content found between the relevant start and end tags. Later columns will go into the most common tags in some detail.
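For the curious, here is a sketch of how a very simple "user agent" sees start tags, end tags and the content between them. It uses Python's built-in HTML parser; the TagLister class is my own invention for this example and does nothing more than record what it encounters.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Record each start tag, end tag and piece of content, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():                      # ignore whitespace-only runs
            self.events.append(("content", data))


lister = TagLister()
lister.feed("<p>Hello, <b>web</b>!</p>")
lister.close()

for kind, value in lister.events:
    print(kind, repr(value))
```

The word "web" turns up between the start and end of the b tag, which is how a browser knows to show that word, and only that word, in bold.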


It's a Wrap

All that may be as clear as mud, but at least it covers the ground. I'll make additions (and, if necessary, corrections) to the above in subsequent columns, revising the glossary as I go so it remains complete and current.

Tune in next month as we continue our exploration of all things web.

Rob Stevenson - rstevenson@macosjournal.com
