Special Edition Using HTML 4



- 6 -
Applying Character Formatting

by Robert Meegan and Mark R. Brown

Text Formatting

Once you've created your document, much of the hard work is done. The text that you've written is neatly broken into paragraphs, headings are in place, and the miscellaneous items such as the title and the author information have been added. At this point you could walk away satisfied, but something still seems to be missing.

One of the primary things that separates documents created on a word processor from those produced on a typewriter is the idea of text formatting. Word processors give the author control over how her text will look. She can choose the font that she likes in the appropriate size, and she can apply one or more of a myriad of options to the text. In HTML, you have this same capability. Your only real restrictions involve the importance of viewer independence.

Logical Formatting

One of the ideas behind HTML is that documents should be laid out in a logical and structured manner. This gives the users of the documents as much flexibility as possible. With this in mind, the designers of HTML created a number of formatting elements that are labeled according to the purpose they serve rather than by their appearance. The advantage of this approach is that documents are not limited to a certain platform. Although they may look different on various platforms, the content and context will remain the same.

These logical format elements are as follows:

<CITE>Tom Sawyer</CITE> remains one of the classics of American literature.
One of the first lines that every C programmer learns is:<BR>
<CODE>puts("Hello World!");</CODE>
The actual line reads, "Alas, poor Yorick. I knew him, <EM>Horatio</EM>."
To run the decoder, type <KBD>Restore</KBD> followed by your password.
The letters <SAMP>AEIOU</SAMP> are the vowels of the English language.
The most important rule to remember is <STRONG>Don't panic</STRONG>!
The sort routine rotates on the <VAR>I</VAR>th element.
<DFN>The aardvark is an ant-eating animal.</DFN>

Note that all of these elements are containers, and as such, they require an end tag. Figure 6.1 shows how these logical elements look when seen in the Netscape viewer.

FIG. 6.1
Samples of the logical format elements are displayed in Netscape.

You have probably noticed that a lot of these format styles use the same rendering. The most obvious question to ask is why use them if they all look alike?

The answer is that these elements are logical styles. They indicate what the author intended, not how the material should look. This is important because future uses of HTML may include programs that search the Web to find citations, for example, or the next generation of Web viewers may be able to read a document aloud. A program that can identify emphasis would be able to avoid the deadly monotone of current text-to-speech processors.
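
As a rough illustration of this point, the sketch below groups several of the logical elements by the way graphical viewers commonly draw them. The groupings named in the comments are assumptions about typical default rendering, not something HTML guarantees; any viewer is free to choose differently.

```html
<HTML>
<HEAD>
<TITLE>Logical Elements That Often Look Alike</TITLE>
</HEAD>
<BODY>
<!-- Commonly drawn in italics, yet each carries a different meaning -->
<EM>emphasis</EM>, <CITE>a citation</CITE>, <VAR>a variable</VAR><BR>
<!-- Commonly drawn in a monospaced font -->
<CODE>program code</CODE>, <KBD>keyboard input</KBD>, <SAMP>sample output</SAMP><BR>
<!-- Commonly drawn in bold -->
<STRONG>strong emphasis</STRONG>
</BODY>
</HTML>
```

A program searching for citations would look only for the CITE element, even though a reader may see the same italics for all three items on the first line.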

The <BLOCKQUOTE> Element

You may have the opportunity to quote a long piece of work from another source in your document. To indicate that this quotation is different from the rest of your text, HTML provides the <BLOCKQUOTE> element. This container functions as a body element within the body element and can contain any of the formatting or break tags. As a container, the <BLOCKQUOTE> element is turned off by using the end tag.

The normal method used by most viewers to indicate a <BLOCKQUOTE> element is to indent the text away from the left margin. Some text-only viewers may indicate a <BLOCKQUOTE> by using a character, such as the "greater than" sign, in the leftmost column on the screen. Because most viewers are now graphical in nature, the <BLOCKQUOTE> element provides an additional service by enabling you to indent normal text from the left margin. This can add some visual interest to the document.

Listing 6.1 shows how a <BLOCKQUOTE> is constructed, including some of the formatting available in the container. The results of this document when read into Netscape can be seen in Figure 6.2.

Listing 6.1  Construction of a <BLOCKQUOTE>

<HTML>
<HEAD>
<TITLE>BLOCKQUOTE Example</TITLE>
</HEAD>
<BODY>
<BLOCKQUOTE>
Wit is the sudden marriage of ideas which before their union were not
perceived to have any relation.
</BLOCKQUOTE>
<CITE>Mark Twain</CITE>
</BODY>
</HTML>
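
The use of BLOCKQUOTE for indentation, and the formatting allowed inside the container, can be sketched as follows. This is a minimal example with made-up text; the amount of indentation at each level is an assumption, since it is decided by the viewer and not by the document.

```html
<HTML>
<HEAD>
<TITLE>Nested BLOCKQUOTE Sketch</TITLE>
</HEAD>
<BODY>
This line starts at the normal left margin.
<BLOCKQUOTE>
This text is usually indented one step, and it may contain
<EM>formatting</EM> elements and line breaks.<BR>
<BLOCKQUOTE>
A second BLOCKQUOTE inside the first is usually indented one step further.
</BLOCKQUOTE>
</BLOCKQUOTE>
</BODY>
</HTML>
```

Each start tag has its own end tag, and the inner container is closed before the outer one.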

Physical Format Elements

Having said that HTML is intended to leave the appearance of the document up to the viewer, I will now show you how you can have limited control over what the reader sees. In addition to the logical formatting elements, it is possible to use physical formatting elements that will change the appearance of the text in the viewer. These physical elements are as follows:

This is in <B>bold</B> text.
This is in <I>italic</I> text.
This is in <TT>teletype</TT> text.
This text is <U>underlined</U>.
This is a <STRIKE>strikethrough</STRIKE> example.
This is <BIG>big</BIG> text.
This is <SMALL>small</SMALL> text.
This is a <SUB>subscript</SUB>.
This is a <SUP>superscript</SUP>.

FIG. 6.2
This is the appearance of the document in Netscape.

If the proper font isn't available, the reader's viewer must render the text in the closest possible manner. Once again, each of these is a container element and requires the use of an end tag. Figure 6.3 shows how these elements look in Internet Explorer.

FIG. 6.3
Samples of the physical format elements are shown in Internet Explorer.

These elements can be nested, with one element contained entirely within another. On the other hand, overlapping elements are not permitted and can produce unpredictable results. Figure 6.4 gives some examples of nested elements and how they can be used to create special effects.

FIG. 6.4
Logical and physical format elements can be nested to create additional format styles. In line three of this example, the <I> and <B> tags have been combined to create bold italic text.
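
The difference between nesting and overlapping can be sketched in a few lines. The example text is made up, and the overlapping version is kept inside a comment so that the sketch itself remains well formed.

```html
<HTML>
<HEAD>
<TITLE>Nesting Sketch</TITLE>
</HEAD>
<BODY>
<!-- Nested: the I element opens and closes entirely inside the B element -->
<B>This is bold, and <I>this is bold italic</I>.</B><BR>
<!-- Nested: a logical element inside a physical one -->
<TT>Type <STRONG>exactly</STRONG> what you see.</TT><BR>
<!-- Overlapping (avoid): here B would be closed while I is still open
<B>This is bold, and <I>this is bold italic.</B></I>
-->
</BODY>
</HTML>
```

A simple check is to read the end tags in order: they should appear in the reverse order of the start tags they close.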


TIP: There is a tag available only in Netscape Navigator that has acquired a particularly bad reputation: the <BLINK> tag is notorious in HTML circles. Unless you want people to speak ill of your documents, it's best to avoid this tag. If you do use it, make absolutely sure you remember to use a </BLINK> tag in the proper place. There's nothing more annoying than a whole page of blinking text.

Fonts

You, as document author, have the ability to control the appearance of the text in your documents. In versions of HTML earlier than 3.2, this capability was restricted entirely to the reader. The problem with this ability is that you can only use fonts that exist on your readers' machines. So how do you know what your user might have available?

Unfortunately, you don't. If you are building documents to be used on an intranet, your organization should set standards as to which fonts should be found on every machine. As long as this is a reasonable set, it will be easy to maintain and you will be able to choose any of the standard fonts for your document. If you are developing for the Web, however, you have a more serious problem. In practice, you really don't know what fonts your readers might have. Even the most basic selection depends greatly upon the hardware that your readers are using. There are no really graceful ways around this problem at the present, although several companies are looking into ways of distributing font information with a document.


NOTE: If you are developing for the Web and you would like to use some different fonts, you should be aware that Microsoft has several fonts available for free download on their Web site. These fonts are available in both Windows and Macintosh formats. If you decide to use any of these fonts in your documents, you might want to put a link to the Web page where your readers can download the fonts, if they don't already have them.
http://www.microsoft.com/truetype/fontpack/default.htm

The FONT Element

The method that HTML uses for providing control over the appearance of the text is the FONT element. The FONT element is a container that is opened with the <FONT> start tag and closed with the </FONT> end tag. Unless attributes are assigned in the start tag, using a FONT element has no effect.

The FONT element can be used inside of any other text container and it will modify the text based upon the appearance of the text within the parent container. Using the FACE, SIZE, and COLOR attributes, you can use FONT to drastically modify the appearance of text in your documents.
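
As a preview, here is a minimal sketch of a FONT element that sets all three attributes inside an ordinary paragraph. The face, size, and color chosen are arbitrary illustrations, and the result assumes that the named font is installed on the reader's machine.

```html
<HTML>
<HEAD>
<TITLE>FONT Attribute Sketch</TITLE>
</HEAD>
<BODY>
<P>
This sentence uses the reader's default font, but
<FONT FACE="Arial" SIZE="5" COLOR="#0000FF">these words ask for
a different face, size, and color</FONT> before the default returns.
</P>
<P>
<!-- A FONT element with no attributes changes nothing -->
<FONT>This text looks the same as its surroundings.</FONT>
</P>
</BODY>
</HTML>
```

Only the text between the start and end tags is affected; the rest of the paragraph keeps the characteristics of its parent container.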

The FACE Attribute

The FACE attribute allows you to specify the font that you would like the viewing software to use when displaying your document. The parameter for this attribute is the name of the desired font. This name must be an exact match for a font name on the reader's machine, or the viewer will ignore the request and use the default font as set by the reader. Capitalization is ignored in the name, but spaces are required. Listing 6.2 shows an example of how a font face is specified and Figure 6.5 shows the page in Microsoft Internet Explorer.

Listing 6.2  An Example of Font Face Selection

<HTML>
<HEAD>
<TITLE>Font Selection Example</TITLE>
</HEAD>
<BODY>
<FONT FACE="Tolkien">
This is an example of font selection. </FONT>
</BODY>
</HTML>

FIG. 6.5
The FACE attribute of the FONT element lets you select the font in which the text will be displayed.

Since you don't know for certain what fonts the user might have on his system, the FACE attribute allows you to list more than one font, with the names separated by commas inside a single quoted value. This is especially useful, since nearly identical fonts often have different names on Windows and Macintoshes. The font list will be parsed from left to right and the first matching font will be used. Listing 6.3 shows an example where the author wanted to use a sans-serif font for his text.

Listing 6.3  Font Face Selection Can Use a List of Acceptable Choices

<HTML>
<HEAD>
<TITLE>Font Selection Example</TITLE>
</HEAD>
<BODY>
<FONT FACE="Verdana, Arial, Helvetica">
This is an example of font selection. </FONT>
</BODY>
</HTML>

In this example, the author wanted to use Verdana as his first choice, but listed Arial and Helvetica as alternatives.

The SIZE Attribute

The SIZE attribute of the FONT element allows the document author to specify character height for the text. Font size is a relative scale from 1 through 7 and is based upon the "normal" font size being 3. The SIZE attribute can be used in either of two different ways: the size can be stated absolutely, with a statement like SIZE=5, or it can be relative, as in SIZE=+2. The second method is more commonly used when a BASEFONT size has been specified.

Listing 6.4 shows how the font sizes are specified and Figure 6.6 shows how they would look.

Listing 6.4  An Example of Font Size Selection

<HTML>
<HEAD>
<TITLE>Font Size Example</TITLE>
</HEAD>
<BODY>
<FONT SIZE=1>Size 1</FONT><BR>
<FONT SIZE=-1>Size 2</FONT><BR>
<FONT SIZE=3>Size 3</FONT><BR>
<FONT SIZE=4>Size 4</FONT><BR>
<FONT SIZE=+2>Size 5</FONT><BR>
<FONT SIZE=6>Size 6</FONT><BR>
<FONT SIZE=+4>Size 7</FONT><BR>
</BODY>
</HTML>

The COLOR Attribute

Text color can be specified in the same manner as the face or the size. The COLOR attribute accepts either a hexadecimal RGB value or one of the standard color names. Listing 6.5 is an example of how colors can be specified.

FIG. 6.6
Text size can be specified with the SIZE attribute of the FONT element.

Listing 6.5  An Example of Font Color Selection

<HTML>
<HEAD>
<TITLE>Font Color Example</TITLE>
</HEAD>
<BODY>
<FONT COLOR="#FF0000">This text is red</FONT><BR>
<FONT COLOR="GREEN">This text is green</FONT><BR>
</BODY>
</HTML>
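
The two ways of writing a color can be compared side by side. In the sketch below, each line names a color and then repeats it as a hexadecimal value, which is read as two digits each for red, green, and blue. The pairings shown are the ones HTML 4 gives for its sixteen standard color names; note that GREEN is the darker #008000, while the brightest green, #00FF00, is called LIME.

```html
<HTML>
<HEAD>
<TITLE>Color Name and Value Sketch</TITLE>
</HEAD>
<BODY>
<FONT COLOR="RED">Red by name</FONT> and
<FONT COLOR="#FF0000">red by value</FONT><BR>
<FONT COLOR="GREEN">Green by name</FONT> and
<FONT COLOR="#008000">green by value</FONT><BR>
<FONT COLOR="LIME">Lime by name</FONT> and
<FONT COLOR="#00FF00">lime by value</FONT><BR>
<FONT COLOR="NAVY">Navy by name</FONT> and
<FONT COLOR="#000080">navy by value</FONT><BR>
</BODY>
</HTML>
```

Hexadecimal values are the safer choice for any color outside the standard names, since a viewer may not recognize other names.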

The <BASEFONT> Tag

The <BASEFONT> tag is used to establish the standard font size, face, and color for the text in the document. The choices made in the <BASEFONT> tag remain in place for the rest of the document, unless they are overridden by a FONT element. When the FONT element is closed, the BASEFONT characteristics are restored. BASEFONT attributes can be changed by another <BASEFONT> tag at any time in the document. Note that BASEFONT is a tag and not a container. There is no </BASEFONT> end tag.

BASEFONT uses the FACE, SIZE, and COLOR attributes just as the FONT element does.

Listing 6.6 is an example of the <BASEFONT> tag. Figure 6.7 shows how the example looks in Internet Explorer.

Listing 6.6  An Example of the <BASEFONT> Tag

<HTML>
<HEAD>
<TITLE>BASEFONT Example</TITLE>
</HEAD>
<BODY>
This text is before the BASEFONT tag.<BR>
<BASEFONT SIZE=6 FACE="GEORGIA">
This text is after the BASEFONT tag.<BR>
Size changes are relative to the BASEFONT <FONT SIZE=-3>SIZE</FONT>.<BR>
</BODY>
</HTML>

FIG. 6.7
The <BASEFONT> tag can be used to control the text characteristics for the entire document.
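
Changing the base font partway through a document can be sketched as follows. The sizes are arbitrary, and the results described in the text assume a viewer that honors <BASEFONT> and applies relative sizes to the base size currently in effect; not every viewer does.

```html
<HTML>
<HEAD>
<TITLE>Changing BASEFONT Sketch</TITLE>
</HEAD>
<BODY>
<BASEFONT SIZE="4">
This text uses base size 4.<BR>
<FONT SIZE="+1">This text is one step above the base, size 5.</FONT><BR>
This text is back at base size 4.<BR>
<BASEFONT SIZE="2">
This text uses the new base size 2.<BR>
<FONT SIZE="+1">The same relative change now gives size 3.</FONT><BR>
</BODY>
</HTML>
```

Because the FONT elements use relative sizes, changing a single <BASEFONT> tag rescales all of them together.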

Text Formatting Tips

Now that you have all of the tools to format your text, you need to decide how you are going to use them. It is possible to use so many different fonts, sizes, and formats that your document will be unpleasant to read. Figure 6.8 is a bad example of how a document can use too many formats.

The following are general tips to keep in mind as you format your documents:

FIG. 6.8
The ability to select formats should not be overused.

Creating Special Characters

It's bound to happen sooner or later--you'll need to include some weird character on your Web page like a copyright sign or trademark symbol. Fortunately, HTML provides an easy way to do this. For example, if you need a trademark symbol, you type the entity &trade; in its place. A Web browser program will interpret this properly and display the symbol ™.

HTML 4.0 adds a whole list of new "entities", or special symbols. They fall roughly into three categories.

First is a set of international typography symbols that are necessary for creating Web sites that are truly world-wide. Though we don't use them in English, most Western languages couldn't get along without the "o" with an umlaut.

The second set of new entities are mathematical symbols. Long demanded by scientists and engineers, these new symbols allow them to put complex formulas inline with regular text. An integral equation is now almost as easy to create and display elegantly as a Shakespeare quote.


NOTE: Though Greek characters are included among the mathematical entities, the set is not adequate for creating documents in Greek. These symbols are intended for use in mathematical formulas only.

The final set of new characters included in the HTML 4.0 specification is a set of special characters that are included in Adobe's Symbol font, like daggers and fancy quotation marks.

Though entities are easy to use, the list of available characters is quite long. The full list is on the Web at http://www.w3.org/TR/WD-entities, but Table 6.1 lists a few popular characters to get you started:

Table 6.1  Some Symbols Defined in HTML 4.0

Entity                                      Symbol
&cent;, &pound;, &yen;                      ¢, £, ¥
&copy;, &reg;                               ©, ®
&deg;                                       °
&frac14;, &frac12;, &frac34;                ¼, ½, ¾
&divide;                                    ÷
&pi;                                        π
&le;, &ge;                                  ≤, ≥
&amp;                                       &
&dagger;                                    †
&spades;, &clubs;, &hearts;, &diams;        ♠, ♣, ♥, ♦

To use one of these entities in an HTML document, just include it inline with text, as in this example:

I like bread &amp; butter, and for dessert I like &pi;.

The &amp; will be displayed as an ampersand ("&") and the &pi; will show up as the mathematical symbol for pi. Note that each entity begins with an ampersand and ends with a semicolon.
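
A slightly fuller sketch of entities inside a complete document follows. The bakery name and the year are placeholders invented for the illustration. Each reference is written with a closing semicolon, and the numeric forals &#169; and &#8482; are shown as an alternative way to request the copyright and trademark symbols by character number.

```html
<HTML>
<HEAD>
<TITLE>Entity Sketch</TITLE>
</HEAD>
<BODY>
<!-- Named entities: an ampersand, the name, and a closing semicolon -->
Bread &amp; butter costs 50&cent;.<BR>
&copy; 1998 Example Bakery&trade;<BR>
<!-- Numeric references for the same copyright and trademark symbols -->
&#169; 1998 Example Bakery&#8482;<BR>
<!-- Escaped angle brackets let a tag name appear as ordinary text -->
Use the &lt;B&gt; tag for bold text.
</BODY>
</HTML>
```

The last line shows why &lt; and &gt; matter: without them, the viewer would treat the characters as the start of a real tag.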




Macmillan Computer Publishing USA

© Copyright, Macmillan Computer Publishing. All rights reserved.