Copyright ©1996, Que Corporation. All rights reserved. No part of this book may be used or reproduced in any form or by any means, or stored in a database or retrieval system without prior written permission of the publisher except in the case of brief quotations embodied in critical articles and reviews. Making copies of any part of this book for any purpose other than your own personal use is a violation of United States copyright laws. For information, address Que Corporation, 201 West 103rd Street, Indianapolis, IN 46290 or at support@mcp.com.

Notice: This material is excerpted from Special Edition Using CGI, ISBN: 0-7897-0740-3. The electronic version of this material has not been through the final proof reading stage that the book goes through before being published in printed form. Some errors may exist here that are corrected before the book is published. This material is provided "as is" without any warranty of any kind.

CHAPTER 17: Person-to-Person Interaction

So far you've seen several FORM-based interactions and a few uses for IMAGE MAPs. Although you already have a good start on understanding what makes CGI tick, you need to look at one area that seems to be overlooked a lot on the Web: the end user, and how to use CGI scripts to interact with this unknown variable.

As you'll see in this chapter, the more invisible a CGI script is, the better off you'll be. You can make your scripts more invisible by making the interaction with them intuitive. When your scripts and their interfaces are intuitive, the end user moves seamlessly within your Web site, which improves the flow of work and knowledge.

After all, isn't this one of the reasons you bought this book? You wanted to go beyond HTML and create true client/server interactivity on your Web site.

In this chapter, you'll learn about the following:

The next step in CGI interaction, including WWW Interactive Talk and HTML-based chat systems
HTTP cookies and possible cookie applications

The Next Step with CGI

You've already seen CGI interaction through basic HTML forms in Chapter 8, "Modifying CGI Scripts," with scripts such as Guestbook and WWWBoard (see figs. 17.1 and 17.2). However, all of us should strive to take this interaction further. How? By enabling multiple users to interact with each other through a CGI script.

Fig. 17.1

Here's the Add Entry HTML Interface to the guestbook CGI script covered in Chapter 8.

Fig. 17.2

Shown here is some of the HTML output from Matt Wright's WWWBoard CGI script.

See "Installing and Modifying a Guestbook CGI Script," for more information on various methods of interacting with the user.
See "Web-Based Bulletin-Board Systems," for more information on WWWBoard.

With Guestbook and WWWBoard, you started letting users interact with each other. In the Guestbook example, users could post simple messages and leave their e-mail addresses, which allowed your users to send notes to one another. In WWWBoard, you took this example a little further, enabling the users not only to read the postings of others, but also to reply to the original posting without starting an e-mail program. Doing this gives you a repository of related information within the message threads, which adds significant value to the archived information.

This allowed your users, in a fairly simple manner, to interact with many other users through the CGI script. However, the communication between users can be taken even further with CGI. What if you could give your users a way of communicating with others on your Web site in pseudo real time? This would provide additional flexibility that some groupware designers only wish their products had.

WWW Interactive Talk

WWW Interactive Talk (WIT) is an HTML forms-based discussion system that's very similar in most cases to the way Lotus Notes can be used for group discussion and comments. (This kind of software is also referred to as groupware.) It was created to allow individuals to comment on various areas within a fairly structured environment. It's a way that you can provide an HTML page that others can append their comments to, so individuals immediately see whether a specific matter has been brought to the surface before it's resolved.

For more information on WIT, check out WIT.

This is a far different approach than what happens in Usenet newsgroups or mailing list servers. In fact, it's far superior for group discussions, such as workgroups. It can handle the process of discussion in a manner that many managers would appreciate. Instead of a drawn-out process, in which everyone must go read the FAQ of that area and follow threads, WIT items are divided into three groups: discussion areas, topic documents, and proposals.

Discussion Areas

Unlike newsgroups, everyone participating in a WIT forum must follow a few rules for everything to work the way it should. First, a discussion area can be created only by the system manager. For example, as the system administrator, I created a discussion area called CGI Discussion Area. This will be the area in which we discuss items related to CGI and, of course, can cross-reference related discussion areas. The CGI discussion area might be cross-referenced to the Using Perl for CGI area. Get the idea? Figure 17.3 shows a typical discussion area.

Fig. 17.3

The WIT user information documentation can be found at this site.

Topics

Under each discussion area is a slightly more specific area called topics (see fig. 17.4). Now the way the CGI script is written, anyone can create a topic document under any of the discussion areas. Here's where the rules come in:

The topic document must be related to the discussion area under which it's placed.
The topic document may describe only an issue that hasn't been resolved elsewhere.

An example of a topic would be, "What should we do about secure Web transactions?" This would be a great topic to be discussed under the CGI Discussion Area, and maybe "What language should we use for CGI?" would also fit.

Fig. 17.4

This figure shows examples of the WWW Interactive Talk's Topics and Proposals HTML output.

Proposals

After you create the basic discussion areas and users start putting topics for discussion within the topic area, the next type of document falls into place: proposals. In this section, people post their ideas for addressing a topic or problem (see fig. 17.4). For this to work with the other areas, these proposals are posed as statements with which users can either agree or disagree, such as the following:

"We should use SSL."
"We need a new, secure Web server, such as Netscape."
"We should restrict access to some parts of the Web site."

The Work Flow

By using such a system as WWW Interactive Talk, the knowledge accumulated over time can be referenced and modified at any time. You can look through the sections and specific documents to see whether a matter has already been resolved. For example, the work flow in a CGI discussion area would go something like the following:

"What should we do about secure Web transactions?"
"We should use SSL."
"Can we afford a new secure Web server, such as Netscape?"
"We should restrict access to some parts of the Web site."
"What language should we use for CGI?"

That's the progression of the simple guest book in a workgroup organizational system. Such a system gives your Web site added value because of the information stored on it. And while we're here, one other environment that this type of CGI application would easily fit into is a corporate intranet Web server. Then you've given management a way to follow ideas and processes from conception to implementation, both easily and affordably.

An intranet is what you get when a company uses Internet technology, such as Web servers and CGI, on the corporate LAN. Access to intranet services is limited to workstations inside the company via the LAN or even a wide area network (WAN). Because so many companies use TCP/IP as their network protocol of choice, especially if they're in a WAN environment, they can use an Internet firewall and Web server configurations to keep outsiders out.

All the CGI scripts in this book can be used in intranet environments. Again, you see the use of widely used, affordable Internet technology to replace proprietary and sometimes very costly communication software.

HTML-Based Chat Systems

Another way of enabling human-to-human interaction is simple communication. One of the most popular features of the Net is Internet Relay Chat, which allows many users in various locations to chat with each other. Think of IRC as a text-based telephone party line through which many people are sometimes talking about the same general thing. Although IRC is as close to real-time human-to-human interaction as is currently available, the communication is very unorganized. Information that has been discussed is lost unless one of those in the conversation has been capturing the flow of text.

Chat systems are an area where CGI scripts have faded a little bit because of the dedicated IRC client software that has become increasingly available on the Net. This isn't to say that chat systems implemented on a Web server are less functional; in fact, I would say just the opposite is true. HTML and CGI have given many people a great form of entertainment. Not only can they attach small pictures of themselves to the text they write, they can even create other little HTML worlds in which to chat with one another (see fig. 17.5).

Fig. 17.5

Shown here is an example of the WebChat interface, along with some actual HTML-based conversations. Notice the cat image that's associated with the user named LYN.

Performance Considerations

Before I get too deep into CGI chat scripts, I feel I should cover the performance of these types of CGI- and HTML-based systems. Because of the nature of real-time chat, you have to update the client's browser by using Netscape's extensions of PUSH/PULL or have the user keep clicking some sort of update button.

The way these scripts work is quite simple. Think of it this way: A user receives an HTML document like the one in figure 17.5. At the bottom of the screen the user can type a message along with a user name of some kind. When the user submits the HTML form, a CGI script takes the submitted message, appends it to a chat file, and normally deletes the first message in the file. It then composes the HTML header for the page and inserts an HTML version of the chat file. Then the CGI script composes the bottom of the HTML file while passing information, such as the user's nickname, back into the form.
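The append-and-trim step just described can be sketched in a few lines of shell. Treat this as an illustration only: the function name add_chat_message, the one-message-per-line file layout, and the 10-message limit are my assumptions, not details taken from any particular chat script.

```shell
#!/bin/sh
# Hypothetical sketch: keep only the newest MAX_MESSAGES lines of a chat file.
MAX_MESSAGES=10

# add_chat_message FILE NICKNAME MESSAGE
# Appends "NICKNAME: MESSAGE" as one line, then drops the oldest lines
# so that at most MAX_MESSAGES remain in FILE.
add_chat_message() {
    chat_file=$1
    printf '%s: %s\n' "$2" "$3" >> "$chat_file"
    tail -n "$MAX_MESSAGES" "$chat_file" > "$chat_file.tmp" &&
        mv "$chat_file.tmp" "$chat_file"
}
```

A real chat script would also need some form of file locking, because several users can submit the form at the same moment.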

It's a pretty simple program flow, except that it's a huge resource hog! Why? Because every user has to get the newly updated screen. Suppose that 50 people are using this chat system. That means the server will need to send the HTML document 50 times so that everyone gets the update. It also means that you have to put HTML extensions in the document to cause the users' browsers to request the chat document every so often, or have the users keep reloading the page. And you would need to do both because not everyone is going to be using a PUSH/PULL-compatible browser. However, those who are using a compatible browser are going to be screaming, "Why don't you use PUSH/PULL technology?"

So if you add PUSH/PULL, how long are you going to wait for the browser to PULL the next update? Five, 10, 20 or more seconds? For the sake of argument, say you set the META header to pull the document every 30 seconds. That means your server hits will be adding up at the rate of 100 per minute, and that doesn't include graphic images. If three of the 10 or so chat file messages have graphics associated with them, you've just added an additional 300 hits per minute on your server. That works out to be 24,000 hits an hour.

All of this is contingent on updating the file only every 30 seconds. It gets even wilder: only 10 messages are in the chat file at any given time, so no more than 20 or so messages can be entered every minute, or else everyone is going to miss a few messages at every refresh. On top of that, the average file, including graphics files, will be around 4K in size, which leads to another problem. You know you'll have 24,000 hits an hour. Multiply 24,000 hits by 4K of data and you're going to use up 96,000,000 bytes, or 96M, of bandwidth an hour (1.6M per minute). A T-1 runs at 1.544 Mbps, which is roughly 11.6M per minute at best, so this one chat page would take about a seventh of the circuit before any other traffic is counted. Let the number of users or graphics grow several times over and a T-1 becomes a wipeout. Then it's time to be adding another T-1 data circuit, or no one will be happy.
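If you want to redo this arithmetic for your own site, a small shell function will do it. This is a sketch under the chapter's example figures (50 users, a 30-second refresh, three graphics per page, and 4K treated as 4,000 bytes per hit); the function name chat_load is made up for illustration.

```shell
#!/bin/sh
# Rough load estimate for a polled chat page. The inputs below are the
# chapter's example figures; change them to match your own site.

# chat_load USERS REFRESH_SECONDS IMAGES_PER_PAGE AVG_BYTES_PER_HIT
# Prints: hits per minute, hits per hour, bytes per hour, bytes per minute.
chat_load() {
    users=$1
    refresh=$2
    images=$3
    avg_bytes=$4
    # Each user pulls the page once per refresh interval.
    page_hits_per_min=$(( users * 60 / refresh ))
    # Every page pull also fetches each associated image.
    hits_per_min=$(( page_hits_per_min * (1 + images) ))
    hits_per_hour=$(( hits_per_min * 60 ))
    bytes_per_hour=$(( hits_per_hour * avg_bytes ))
    bytes_per_min=$(( bytes_per_hour / 60 ))
    echo "$hits_per_min $hits_per_hour $bytes_per_hour $bytes_per_min"
}

chat_load 50 30 3 4000   # prints: 400 24000 96000000 1600000
```

Try it with a 10-second refresh or with 200 users to see how quickly the numbers climb.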

Is it unrealistic to imagine 50 people using a chat system? Maybe a little; it might be even higher. Due to the nature of the Web server, there also is no way of limiting the users reliably. So always do the math on some of these little projects. It doesn't take long for things to get out of hand. Remember two things when calculating bandwidth requirements. First, the IP bandwidth numbers are finite. Second, speed costs money. How fast do you want to go?

WebChat

One of the nicer chat systems available is WebChat. This CGI-based system is very flexible for most Webmasters. It's written in Perl, so it should be fairly easy to modify to suit your specific needs.

See "Flavors of Perl," for some considerations on modifying WebChat for your specific Perl implementation.
See "To Allow or Not To Allow HTML Tags," for information on including images within HTML output.

This CGI chat system consists of a couple of GIF images, two Perl scripts, and an HTML form interface. You can FTP the archived tar file from ftp site. One nice feature of this system is that all the popular Web browsers can be used to interface with it because this system uses only an HTML form for the interface (see fig. 17.6).

Fig. 17.6

This figure shows the HTML form interface to WebChat, and some of the associated Netscape control variables.

Another feature of WebChat is the capability to link to images anywhere in the world. This takes an unnecessary burden off the Web server because it allows the client's browser to get the file from someone else's Web server. The downside of this is that it may take a while for the requested corresponding image to be returned to your browser.

See "Trust No One," if you're interested in referencing HTML and images within CGI scripts.

This area of CGI scripts, chat systems, is becoming more and more commercialized. In fact, many of the publicly available CGI software packages are being taken commercial. WebChat even has a bigger brother that isn't cheap; however, the "commercial" version does do some impressive things, like WebChatCam. Some CGI-based chat systems are selling for well over $1,000 for a 10-user license. I have a hard time justifying that, unless it's going to be used for corporate Web sites.

After you retrieve the software, you'll need to untar the archive into the cgi-bin directory on your Web server. Now, this version works only on UNIX-based Web servers. Another version is available that might be modifiable to run on a Windows NT Web server with Perl.

After you follow the installation directions included with the distribution tar file, the CGI scripts will need only a small amount of modification before you can use them. Mostly, this means editing the paths to files and executables. It shouldn't take you any more than 30 minutes to get this system up and running.

After you install the CGI scripts, you can test the system by loading the URL of the chat form. Enter information into the HTML form and submit it. You should be sent another HTML form similar to that shown in figure 17.7.

Fig. 17.7

The HTML output and form returned from the initial testing of the WebChat CGI script shows that the WebChat CGI script is working and accepting user input.

If you're using Netscape's browser, you'll be glad to know that the WebChat system of CGI scripts uses client PULL to get the new updated HTML document. If your browser doesn't support Netscape's PULL feature, click the chat button to update your page.
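For readers who haven't seen client PULL before, here's a minimal sketch of how a CGI script can ask a compatible browser to reload a page on a timer by using a Refresh META tag. I'm not claiming this is how WebChat itself is written; the function name chat_page_head and the page title are placeholders.

```shell
#!/bin/sh
# chat_page_head SECONDS
# Prints the CGI header and an HTML head that asks a client-pull browser
# to reload the page every SECONDS seconds.
chat_page_head() {
    echo "Content-type: text/html"
    echo ""
    echo "<HTML><HEAD><TITLE>Chat</TITLE>"
    echo "<META HTTP-EQUIV=\"Refresh\" CONTENT=\"$1\">"
    echo "</HEAD>"
}
```

Browsers that don't understand the tag simply ignore it, which is why the form's chat button still matters.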

Now that you've installed a highly interactive CGI system, what can you really use this type of chat system for?

Customer support questions, where other users may be able to help answer questions
New employee education (or even just plain old employee education, for that matter)
An online global conferencing system, with the text from the chat system archived for future reference

No matter what you use a chat system for, I'm sure you'll find even more uses for it than cited here.

Introducing HTTP Cookies

One way you can make your CGI scripts more interactive is by using something fairly new to the world called magic cookies or just cookies.

So what are these cookies? They're just small text files stored on the client side of the Web. That means you can actually have your CGI scripts make a cookie, and then have your Web server send this information to the client's browser. When the client's browser gets the information, it will store the data on the client's hard drive. Then at a later date, when the client revisits your Web site and requests a URL that the cookie applies to, the client's browser will look to see whether it has a matching cookie. If it does, the browser will send the information stored in the cookie along with the request, where your CGI script can read it.

There's a possible downside to using cookies. Currently, only Netscape, Netcruiser v3.0, Microsoft Internet Explorer, and Quarterdeck's Mosaic v2.0 browsers support using cookies. So you'll probably have to make sure that your CGI script is going to be compatible with the other browsers in the world. This shouldn't be a problem, though, if you require the users of your service to use one of these browsers or if you're using cookies on an intranet Web site where the company regulates what browser software is running within the company, and the company chooses to use a "cookie"-compatible browser.

Possible Cookie Applications

Compared with using CGI to build a custom HTML form that has hidden input data for forms, cookies have a much greater prospective use. You could use cookies to support a CGI-based shopping system in which the customers' selected items are put into a virtual shopping cart, which is really stored in the cookie.

For other services, such as those that require registration, you could store your users' registration information in a cookie so that when they return to the service, a CGI script can check to see whether they already have an appropriate cookie. If they do, you could have the CGI script retrieve it from the client side and use the cookie data to build a custom HTML interface. That would seem to the users as though the service already knew who they were. And if a client has rights to only certain features of the service, your CGI script would already have that information.

See "Integrating CGI into Your HTML Pages," for more information on building HTML CGI interfaces.

Think about it this way: The client needs to fill out a registration form only once. This information is stored on the client side rather than on your Web server in some huge data file, which will become unmanageable. Talk about behind-the-scenes invisible CGI user interaction!

You could even use cookies as a kind of virtual coupons. This could be a little incentive for users to fill out a questionnaire. After the form is filled out the way you wanted, you could give users virtual coupons to be redeemed for some type of Web-based service. In fact, you could even set an expiration date so that if a client didn't use the cookie/coupon by a certain date, it would be void.

That's enough about what you could use cookies for; I'm sure you've even thought up a couple of other uses for them.

Cookie Ingredients, er, Specifications

A cookie is made up of several items: a name and its value, an expiration date, a path, a domain, and a secure flag. This information is actually sent in the HTTP header of a document. Now, the format for a cookie is as follows:

Set-Cookie: NAME=VALUE; expires=DATE; path=PATH; domain=DOMAIN_NAME; secure

Set-Cookie: and NAME=VALUE

To break this format down, Set-Cookie: tells the client's browser that a cookie is getting ready to be handed to it. The next attribute is the cookie's name. This name can be anything you want and, of course, the value associated with this name can also be anything, such as NAME_OF_BAKERY=Torlones or ITEM_NUMBER=CC295. There's a limit to how much you can put into a NAME and the associated VALUE. You're limited to 4K of data. That should handle just about anything you'll need. This is also the only required attribute of a Set-Cookie: header for Netscape. However, Microsoft's Internet Explorer requires a full cookie header.

Expiration Date, or How the Cookie Crumbles

The next attribute of a cookie is expires=DATE. This is a plain old expiration date. When this date is reached, the client's browser will delete the associated cookie and no longer give it out. The following is an example of expires=DATE:

Set-Cookie: USERID=Michael_T_Erwin; expires=Tuesday, 31-Dec-96 23:59:59 GMT

In this cookie, my stored USERID name will no longer be valid after 11:59:59 p.m. GMT on Tuesday, December 31, 1996. This cookie will expire at that point, and the browser won't send it out.

If you need to use spaces in the stored value of the cookie, use %20. For example, if I wanted the USERID to actually be Michael T Erwin without the underlines within the value, I could have written the following:

Set-Cookie: USERID=Michael%20T%20Erwin;
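If you'd rather not encode values by hand, a helper along these lines can build the header for you. It's a sketch, not a complete implementation: the name make_cookie is hypothetical, and besides the space it encodes only the percent sign, semicolon, and comma, which is my own choice of characters to protect.

```shell
#!/bin/sh
# make_cookie NAME VALUE [EXPIRES]
# Prints one Set-Cookie header line. Spaces in VALUE become %20; the
# percent sign, semicolon, and comma are encoded too (an assumption of
# this sketch) so they can't be mistaken for part of the header syntax.
make_cookie() {
    name=$1
    value=$(printf '%s' "$2" |
        sed -e 's/%/%25/g' -e 's/ /%20/g' -e 's/;/%3B/g' -e 's/,/%2C/g')
    expires=${3:-}
    if [ -n "$expires" ]; then
        printf 'Set-Cookie: %s=%s; expires=%s\n' "$name" "$value" "$expires"
    else
        printf 'Set-Cookie: %s=%s\n' "$name" "$value"
    fi
}
```

Calling make_cookie USERID "Michael T Erwin" prints Set-Cookie: USERID=Michael%20T%20Erwin, and the script that reads the cookie back has to reverse the encoding.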

path

The path attribute can get a little confusing, so bear with me. It tells the browser what directories are valid for this cookie, as follows:

Set-Cookie: USERID=mikee; path=/bbs

This tells the browser that any time it requests a URL from the site and the URL begins with /bbs, it should send the cookie, USERID=mikee, to the Web server. For example, if you requested /bbs/mainmenu.html, the browser would send USERID=mikee along with the request for the document mainmenu.html. What's more, it would also send USERID=mikee if the URL had been /bbsdocs/index.html, because you told it that the cookie is valid for any URL whose path starts with /bbs.

Now if you had specified path=/, any URL you requested from the cookie's originating Web site would cause the browser to send the cookie to the Web server with the request for any URL at that site. If you hadn't specified a path, the cookie would be sent only if the directory was the same as the originating URL.
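The matching rule is easier to see in code. The following sketch models it as a plain prefix test, which is all the description above calls for; the function name path_matches is my own, and a real browser does this work internally.

```shell
#!/bin/sh
# path_matches COOKIE_PATH REQUEST_PATH
# Succeeds (exit status 0) when REQUEST_PATH begins with COOKIE_PATH,
# the simple prefix rule described in this section.
path_matches() {
    case $2 in
        "$1"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Both of these succeed for a cookie set with path=/bbs:
path_matches /bbs /bbs/mainmenu.html && echo "send cookie"
path_matches /bbs /bbsdocs/index.html && echo "send cookie"
```

If you want the cookie limited to the /bbs directory alone, setting path=/bbs/ with the trailing slash keeps /bbsdocs from matching under this rule.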

There's a bug in Netscape Navigator version 1.1 and earlier. Cookies that have the path attribute set to / will be saved only if they have an expires attribute.

domain

The domain attribute tells the browser what domain names this cookie is valid for. If you set domain=.mcp.com, the browser would send that cookie to any of the Web servers at mcp.com. However, this also depends on the contents of the path attribute.

Another related issue with domain is that only hosts within the same domain may set cookies to be used within that domain. To carry this even further, you have to have at least two periods in the domain attribute, which prevents someone from doing something lame like domain=.com. If you use a regional type domain name, such as .k12.wv.us, you need to have three periods in the domain attribute.

If a browser is requesting a URL that meets the criteria of several stored cookies, it will send every cookie that meets the domain and path criteria with the URL request. That results in the Web server receiving a cookie header like this:
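Here's a small sketch of the two checks this section talks about: counting the periods in a domain attribute and testing whether a host name falls inside that domain. Both function names are hypothetical, and the tail comparison is a simplification of what a browser really does.

```shell
#!/bin/sh
# count_periods STRING
# Prints how many "." characters STRING contains.
count_periods() {
    dots=$(printf '%s' "$1" | tr -cd '.')
    echo "${#dots}"
}

# domain_matches COOKIE_DOMAIN HOST
# Succeeds when HOST ends with COOKIE_DOMAIN (a simple tail match).
domain_matches() {
    case $2 in
        *"$1") return 0 ;;
        *)     return 1 ;;
    esac
}

count_periods .mcp.com      # prints 2
count_periods .com          # prints 1, too few to be accepted
count_periods .k12.wv.us    # prints 3
```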

Cookie: NAME=VALUE; NAME=VALUE; NAME=VALUE;...
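On the server side, a CGI script normally finds this string in the HTTP_COOKIE environment variable. The sketch below shows one way to pull a single value out of it in plain shell; the function name get_cookie is made up, and it assumes pairs are separated by a semicolon and one space, as in the header above.

```shell
#!/bin/sh
# get_cookie NAME COOKIE_STRING
# Prints the value stored under NAME in a "NAME=VALUE; NAME=VALUE" string
# and succeeds; fails (status 1) when NAME is not present.
get_cookie() {
    wanted=$1
    rest=$2
    while [ -n "$rest" ]; do
        pair=${rest%%;*}                # text up to the first semicolon
        case $rest in
            *\;*) rest=${rest#*;} ;;    # drop that pair and its semicolon
            *)    rest= ;;              # that was the last pair
        esac
        pair=${pair# }                  # trim the space that follows "; "
        if [ "${pair%%=*}" = "$wanted" ]; then
            printf '%s\n' "${pair#*=}"
            return 0
        fi
    done
    return 1
}
```

Inside a CGI script you would call it as userid=$(get_cookie USERID "$HTTP_COOKIE") and then decode any %20 sequences in the result.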

Handling Cookies

So how do you set a cookie in your CGI scripts? Well, you'll need to have a section of script that looks something like listing 17.1. In this example of a UNIX shell script, the CGI script writes the Content-type header, then sends the cookies, and then follows them with the HTML document.

Listing 17.1 A UNIX Shell Script for Sending the Cookie
#!/bin/sh
# Send the HTTP headers first: the content type, then one Set-cookie line per cookie.
echo "Content-type: text/html"
echo "Set-cookie: UserId=mikee; expires=Wednesday, 31-Jan-96 12:00:00 GMT"
echo "Set-cookie: Password=guess; expires=Wednesday, 31-Jan-96 12:00:00 GMT"
# A blank line ends the headers; the HTML document follows.
echo ""
echo "<HTML><HEAD><TITLE>Welcome to WWW BBS</TITLE>"

Making the Cookies Chewy

The CGI script in listing 17.1 is hard coded. That means you have to write a new shell script for every cookie you want to send, and you don't want that to happen. First, you must decide what information you need to put in the form of a cookie. After you do that, you need to decide where this information is to come from. Is the information going to be generated on the Web server, from an HTML form the client filled out, or possibly both?
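As a rough idea of what a less hard-coded version might look like, the sketch below takes the user ID from a submitted form instead of typing it into the script. Everything here is an assumption for illustration: the function names, the userid form field, and the decision to strip the ID down to letters, digits, and underscores before echoing it back.

```shell
#!/bin/sh
# form_field NAME QUERY
# Prints the raw (still URL-encoded) value of NAME from a query string
# such as "userid=mikee&level=1".
form_field() {
    printf '%s\n' "$2" | tr '&' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# send_cookie_page USERID EXPIRES
# Prints a complete CGI response that sets a UserId cookie.
send_cookie_page() {
    # Keep only safe characters so form input can't alter the header.
    safe_id=$(printf '%s' "$1" | tr -cd 'A-Za-z0-9_')
    echo "Content-type: text/html"
    echo "Set-cookie: UserId=$safe_id; expires=$2"
    echo ""
    echo "<HTML><HEAD><TITLE>Welcome to WWW BBS</TITLE></HEAD>"
    echo "<BODY><P>Welcome, $safe_id.</P></BODY></HTML>"
}
```

With a form submitted by the GET method, the script's main body could be as short as send_cookie_page "$(form_field userid "$QUERY_STRING")" "Wednesday, 31-Jan-96 12:00:00 GMT".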

Look at figure 17.8. This flow chart shows how the user will interact with a simple CGI registration service using cookies as the form of authentication. The first step is to request a URL, which is really a front door to the service. To keep things simple, make this HTML document a combination of items (see fig. 17.9).

Fig. 17.8

This simple flow chart shows user interaction with a simple CGI cookie BBS application.

Fig. 17.9

The CGI cookie BBS's initial new-user HTML screen warns the user that the user's browser may not support cookies.

See "The Registration Page," for more information on considerations for a registration page.

When the client submits the form shown in figure 17.9, a CGI script is started to process the information. This CGI application takes the information the user entered into the HTML form and generates several cookies, including one you won't tell the user about. The CGI script generates cookies for Userid, Password, and the user's initial security or access level. It then sends a thank-you and welcome HTML document (see fig. 17.10). What your user doesn't know is that, at this point, he has just received three different cookies.

Fig. 17.10

The CGI cookie BBS application will generate a thank-you and welcome screen in real time based on the user's access rights or security level.

Now when a user reloads the welcome screen from his hotlist, the browser will notice that there are three cookies for this URL. It will then send the three cookies with the request to load the URL. When the Web server gets the request, it sees that it's to start up a CGI script.

The CGI script will actually look at the cookies' contents to see whether the user has rights to access this page. The CGI script will also add an entry to the logs for this visit.

Users no longer need to worry about their registration information because it just became seamless to them. As the administrator, you not only see the Web hit, but you can also look at the logs to see who's actually using the system. This gives users the freedom of not worrying about passwords and such.
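The check-and-log step might look something like the following sketch. It assumes the user ID and access level have already been read from the cookies, that the level is a plain number where higher means more access, and that the log is a simple text file; the function name check_access and the log line format are invented for this example.

```shell
#!/bin/sh
# check_access USERID LEVEL REQUIRED LOGFILE
# Appends one line to LOGFILE for the visit, then succeeds only when
# LEVEL is a number greater than or equal to REQUIRED.
check_access() {
    result=denied
    case $2 in
        ''|*[!0-9]*) ;;    # empty or not a plain number: stay denied
        *) [ "$2" -ge "$3" ] && result=granted ;;
    esac
    printf '%s user=%s level=%s %s\n' \
        "$(date -u '+%Y-%m-%d %H:%M:%S')" "$1" "$2" "$result" >> "$4"
    [ "$result" = granted ]
}
```

A page script could then wrap its output in if check_access "$userid" "$level" 1 /tmp/bbs.log and send a polite refusal in the else branch.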

You also get the capability to increase or decrease your user's security level because you've included it in a cookie. This creates a nice flexible system that's easily navigated by the client and manageable by the Webmaster.

A Commercial Shopping Cart

As I stated before, more and more software is becoming commercial. One of the better commercial cookie-based software packages available on the Web is OopShop Shopping Cart System (see fig. 17.11). Because the system is being developed for commercial accounts, expect to pay around $500 for it.

Fig. 17.11

The OopShop home page is located at this site.

Jerry Yang, author of this system, has even released a smaller version of the software that he published under the GNU General Public License, version 2, which is included on the CD-ROM accompanying this book. He calls this software OopShop Free Cart. For more information on this cookie-based system, check out http://www.ids.net/~oops/tech/make-cookie.html.

