- HTML By Example -

Chapter 1

What Is HTML?


The explosive growth of the World Wide Web is nearly unprecedented, although it resembles the desktop publishing revolution of the early and mid-1980s. As personal computers became more common in homes and offices, people began to learn to use them for document creation and page layout. Although early word processing programs were not terribly intuitive and often required memorizing bizarre codes, people still picked them up fairly easily and managed to create their own in-house publications.

Suddenly, the same kind of growth is being seen as folks rush to create and publish pages of a different sort. To do this, they need to learn to use something called the Hypertext Mark-up Language (HTML).

HTML at a Crossroads

HTML and the World Wide Web in general are currently in a stage of development similar to that of the desktop publishing revolution. Still working to reach maturity as a standard, HTML is feeling the same growing pains that early word processing programs did—as more users flock to HTML, there is a growing need to standardize it and make it less complex to implement.

These days, word processors are much more intuitive than they were fifteen years ago. There are fewer codes and special keystrokes required to get something done. The applications have matured to the point where most of the low-level formatting is kept hidden from the user of the application. At the same time, the printed page is now more completely mirrored on the computer screen, with accurately represented fonts, emphasis, line breaks, margins, and paragraph breaks.

Although programs are quickly being developed to offer similar features for HTML development, these tend to be less than ideal solutions. Currently, then, anyone who decides to learn HTML is going to have to know some codes, memorize some syntax, and develop pages for the World Wide Web without the benefit of seeing all the fonts, emphasis and paragraph breaks beforehand.

But anyone who has had any success with word processing programs of ten or fifteen years ago (or desktop publishing programs as recently as five years ago) will have little or no trouble learning HTML. Ultimately, you'll see that HTML's basic structure makes a lot of sense for this emerging medium, the World Wide Web. And, as with most things computer-oriented, you'll find that once you've spent a few moments with it, HTML isn't nearly as difficult as you might have originally imagined.

HTML Is Not a Programming Language

There's nothing I'd like more than to say: "Yes, HTML is a very difficult programming language that it's taken me years to master. So I'll have to charge $75 an hour to develop your Web pages for you." Unfortunately, it's simply not the case. As I've already hinted, creating an HTML document is not much more difficult than using a ten-year-old copy of WordPerfect with the Reveal Codes setting engaged.

Remember the definition of HTML: Hypertext Mark-up Language. In HTML itself, there is no programming—just the "marking up" of regular text for emphasis and organization.

In fact, I prefer to call people who work with HTML "designers" or "developers," and not programmers. Actually, there's only limited design work that can be accomplished with HTML (especially the most basic standards of HTML), and anyone used to working with FrameMaker, QuarkXPress, or Adobe PageMaker will be more than a little frustrated. But the best pages are still those created by professional artists, writers, and others with a strong sense of design.

As Web page development matures, we are starting to see more concessions to the professional designers, as well as an expansion into realms that do require a certain level of computer programming expertise. Creating scripts or applets (small programs) in the Java language, for instance, is an area where Web page development meets computer programming. It's also a relatively distinct arena from HTML, and you can easily be an expert in HTML without ever programming much of anything.

The basics of HTML are not programming, and, for the uninitiated in both realms, HTML is much more easily grasped than are most programming languages. If you're familiar with the World Wide Web, you've used a Web browser like Netscape, Mosaic, or Lynx, and if you have any experience with a word processor or text editor like WordPad, Notepad, SimpleText, or Emacs, then you're familiar with the basic tools required for learning HTML.

A Short HTML History

HTML was developed a few years ago as an application of SGML (Standard Generalized Mark-up Language), a higher-level mark-up language that has long been a favorite of the Department of Defense. Like HTML, SGML describes formatting and hypertext links, and it defines different components of a document. HTML is definitely the simpler of the two, and although they are related, there are few browsers that support both.

Because HTML was conceived for transmission over the Internet (in the form of Web pages), it is much simpler than SGML, which is more of an application-oriented document format. While it's true that many programs can load, edit, create, and save files in the SGML format (just as many programs can create and save files in the Microsoft Word format), SGML is not exactly ideal for transmission across the Internet to many different types of computers, users, and browser applications.

HTML is more suited to this task. Designed with these considerations in mind, HTML lets you, the designer, create pages that you are reasonably sure can be read by the entire population of the Web. Even users who are unable to view your graphics, for instance, can experience the bulk of what you're communicating if you design your HTML pages properly.

At the same time, HTML is a simple enough format (at least currently) that typical computer users can generate HTML documents without the benefit of a special application. Creating a WordPerfect-format document would be rather difficult by hand (including all of the required text size, font, page break, column, margin, and other information), even if it weren't a "proprietary"—that is, nonpublic—document format.

HTML is a public standard, and simple enough that you can get through a book like this one and have a very strong ability to create HTML documents from scratch. This simplicity is part of a trade-off, as HTML-format documents don't offer nearly the precision of control or depth of formatting options that a WordPerfect- or Adobe PageMaker-formatted document would.

Marking Up Text

The most basic element of any HTML page (and, therefore, any page on the Web) is ASCII text. In fact, although it's slightly bad form, a single paragraph of regular text—generated in a text editor and saved as a text file—could be displayed in a Web browser with no additional codes or markings (see fig. 1.1). An example of this might simply be:

Welcome to my home on the World Wide Web. As you can see, my page isn't
completely developed yet, but there were some things I simply had to say
before I could get anything else done. My name is Emmanuel Richards, and
I'm a real estate developer located in the San Fernando Valley. If you'd
like, you can reach my office at 555-4675.

Although this is possible, you would never want to display plain text on the Web without conforming to certain HTML conventions, which are explained in Chapter 6, "Creating a Web Page and Entering Text."

Fig. 1.1

Text is so basic to HTML that it can be displayed in a Web browser with no additional commands or codes.

Remember that HTML-formatted documents aren't that far removed from documents created by a word processing program, which are also basically text. Marking up text, then, simply means you add certain commands, or tags, to your document in order to tell a Web browser how you want the document displayed.
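As a rough sketch of what that marking up can look like for a whole page, part of the paragraph from the earlier example might be wrapped in a few structural tags. The tags shown here (HTML, HEAD, TITLE, BODY, and P) are standard HTML 2.0 elements, but the title text is a made-up placeholder, and the conventions this book actually recommends are left to Chapter 6.

```html
<HTML>
<HEAD>
<!-- The title text is a hypothetical placeholder, not part of the original example -->
<TITLE>Emmanuel Richards, Real Estate Developer</TITLE>
</HEAD>
<BODY>
<!-- P marks the text below as a single paragraph -->
<P>Welcome to my home on the World Wide Web. As you can see, my page isn't
completely developed yet, but there were some things I simply had to say
before I could get anything else done.</P>
</BODY>
</HTML>
```

The text itself is unchanged; only the tags around it are new.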

One of the most basic uses for HTML tags is to tell a browser that you want certain text to be emphasized on the page. The HTML document standard allows for two different types of emphasis: explicit formatting, where you choose to make something italic as opposed to bold, and implicit formatting, where it's up to the browser to decide how to format the emphasized text.

Using part of the example above, then, an HTML tag used for emphasis might look something like this:

Welcome to <EM>my home</EM> on the World Wide Web.

In this example, <EM> and </EM> are HTML tags that tell the Web browser which text (in this example, my home) is to be emphasized when displayed (see fig. 1.2).

Fig. 1.2

HTML tags can be used to mark certain text for emphasis.

The browser isn't just displaying regular text; it has also taken into account the way you want the text to be displayed according to the HTML tags you've added. Tags are a lot like margin notes you might make with a red pen when editing or correcting term papers or corporate reports. After you've entered the basic text in a Web document, you add HTML mark-up elements to tell the browser how you want things organized and displayed on the page.
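The distinction drawn earlier between explicit and implicit formatting can be sketched with the same sentence. In this illustration, <B> and <I> ask for bold and italic outright, while <EM> and <STRONG> only say that the text is emphasized and leave the rendering to the browser. Which of these tags a given browser supports, and how it displays them, is an assumption here rather than something this chapter specifies.

```html
<!-- Explicit formatting: the designer picks the exact style -->
<P>Welcome to <I>my home</I> on the <B>World Wide Web</B>.</P>

<!-- Implicit formatting: the browser decides how the emphasis looks -->
<P>Welcome to <EM>my home</EM> on the <STRONG>World Wide Web</STRONG>.</P>
```

Graphical browsers typically show EM as italic and STRONG as bold, but a text-only browser such as Lynx is free to choose something else, which is the point of the implicit form.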

You'll learn more about the specific types of tags in Chapter 6, "Creating a Web Page and Entering Text," but for now, the most important distinction is between text and HTML tags. All HTML documents will be basically text, as are all word processing documents and most desktop publishing documents. The only difference, then, is how the text is described for display on the screen (or, in many cases, for a hard-copy printout).

In most word processing documents, the "mark-up" that describes the emphasis and organization of text is hidden from the user. HTML, however, is a little more primitive than that, as it allows you to manually enter your text mark-up tags to determine how the text will appear. You can't do this with an MS Word document, but, then again, MS Word documents aren't the standard for all Web pages and browsers on the Internet!

Who Decides What HTML Is?

It's difficult to pin down exactly who is responsible for the HTML standard and its continued evolution. While the most important question may be who uses HTML and how they use it, a number of groups exist to monitor, brainstorm, and try to pin down the standards as they evolve.

The HTML Working Group

The HTML standard is maintained and debated by a group called the HTML Working Group, which, in turn, is a creation of the Internet Engineering Task Force. The Working Group was charged in 1994 with the task of defining the HTML standard that was in widespread use on the Web at the time (known as HTML 2.0), and then submitting proposals for future standards, including the HTML 3.0 standard.

Up until the spring of 1996, the Working Group seemed to be the bearer of the basic standard for HTML around the world, while others worked to agree on standards for Web-oriented technologies that are only loosely related to HTML, such as graphics formats, digital movies, sounds, and emerging Web languages like Java and VRML (Virtual Reality Modeling Language). Now, nearly all responsibility for future Web development will most likely fall to an industry cooperative called the W3 Consortium.

The World Wide Web Consortium

HTML was originated by Tim Berners-Lee, with revisions and editing by Dan Connolly and Karen Muldrow. Up until the time when the Working Group took over responsibility for the standard, it was largely an informal effort.

Still very much involved in the evolution of the standard is Tim Berners-Lee, who now serves as director of the World Wide Web Consortium (W3C)—a group of corporations and other organizations with an interest in the World Wide Web. The group is run by the Laboratory for Computer Science at MIT, and includes members such as AT&T, America Online, CompuServe, Netscape Communications Corp., Microsoft Corp., Hewlett Packard, IBM, and many others.

Here, member organizations get together to iron out differences over Web-related standards and practices while working to maintain some level of standardization between their products. Corporate self-interest can sometimes get in the way, but it is definitely of utmost importance to most of these organizations that their products stay abreast of the most popular standards, and that their customers are able to take full advantage of the Web.

Individual Companies and HTML

In the meantime, HTML continues to evolve, sometimes in spite of standard-bearing organizations. As more and more commercial companies take an interest in the HTML standard, it has become increasingly difficult to know who, exactly, decides what HTML will become in the future.

Some notable deviations from the standard are the extensions, or additional commands, that Netscape Communications Corp. has added to HTML 2.0 (see fig. 1.3). Only Netscape's browsers (and those written to be compatible with Netscape's products) can view all of these extensions, and some of them have yet to be recognized by the HTML Working Group. Netscape can get away with this, though, since it controls somewhere around 60 percent of the World Wide Web browser market.

Fig. 1.3

Aside from being able to view most of the HTML-standard tags recognized by the HTML Working Group, Netscape Navigator can also display text in special ways.

With that sort of influence, Netscape can sway the hearts and minds of members of the W3 Consortium to some degree, and plans for future HTML specifications often take into account the additions made by companies such as Netscape.

Other companies, notably Microsoft, have also distributed Web browsers—in Microsoft's case, the Internet Explorer—that offer enhancements over the agreed-upon HTML standards, and acceptance of those extensions by a majority of Web designers may further sway groups like the HTML Working Group.
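As a hedged illustration of what such vendor extensions look like in practice, the fragment below uses <CENTER> and <FONT SIZE>, two tags commonly cited as Netscape additions of this period. The chapter itself does not name specific extensions, so treat the choice of tags as an assumption made for illustration.

```html
<!-- CENTER and FONT SIZE are Netscape extensions, not part of HTML 2.0 -->
<CENTER>
<FONT SIZE="+2">Welcome to my home on the World Wide Web.</FONT>
</CENTER>
```

A browser that doesn't recognize these tags will generally ignore them and display the sentence as ordinary text, so the content still gets through even when the extra formatting does not.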

What Is the IETF?

Additional Information on HTML Standards and Organizations

Most of the HTML standards bodies and organizations maintain an active presence on the World Wide Web, and information about these groups and their work can be found in many places.

For more on the World Wide Web Consortium, consult the W3C Web site at http://www.w3.org/. This site will probably be the most useful as you continue to learn more about HTML and emerging new standards.

For more information on the IETF, point your Web browser to the URL http://www.ietf.cnri.reston.va.us/home.html. This is the IETF's home on the Web, offering tons of links to related projects as well as information about meetings and other Internet-related groups.

To learn about the HTML Working Group, take a look at http://www.ics.uci.edu/pub/ietf/html/. Here you'll find a little about the history of HTML, who the current members and officers of the Working Group are, and how to contact the group.

Information about Netscape and Netscape's additions to HTML can be found at http://www.netscape.com/.

Summary

HTML is a document format, somewhat like word processing or desktop publishing formats, but considerably less complicated and based on more open standards. Creating HTML documents isn't really programming, although some programming can be necessary in other aspects of Web page creation. There are a few different organizations that make it their business to oversee the HTML standard, but the standard can just as easily be affected by the software companies that write Web browsers. The standard is also influenced very much by what commands and layout features Web designers implement, and what commands they ignore.

Review Questions

  1. Is HTML a programming language?
  2. True or false? HTML documents can be created with nothing more than a text editing program.
  3. What other mark-up language is HTML based on?
  4. What's the difference between explicit formatting and implicit formatting?
  5. True or false? You can directly edit a WordPerfect-format document.
  6. Is the HTML Working Group a subsidiary of the World Wide Web Consortium?
  7. Why is it important that HTML be a public standard?
  8. How can individual Web designers affect the HTML standard?



© 1996, QUE Corporation, an imprint of Macmillan Publishing USA, a Simon and Schuster Company.