Copyright ©1996, Que Corporation. All rights reserved. No part of this book may be used or reproduced in any form or by any means, or stored in a database or retrieval system without prior written permission of the publisher except in the case of brief quotations embodied in critical articles and reviews. Making copies of any part of this book for any purpose other than your own personal use is a violation of United States copyright laws. For information, address Que Corporation, 201 West 103rd Street, Indianapolis, IN 46290 or at support@mcp.com.

Notice: This material is excerpted from Special Edition Using CGI, ISBN: 0-7897-0740-3. The electronic version of this material has not been through the final proof reading stage that the book goes through before being published in printed form. Some errors may exist here that are corrected before the book is published. This material is provided "as is" without any warranty of any kind.

CHAPTER 18: Taking Advantage of Web-Based Audio

When the World Wide Web was launched in 1992, it was a text-based system that was as mute as a stone. It didn't remain text-based for long, and it also didn't remain mute. Although the number of different audio formats you'll find on the Web still trails the number of different graphic formats you can encounter, sound is definitely making a big impact on the Web.

In this chapter, you'll learn the following:

The difference between static and live Web-based audio
The different types of audio players available for different types of Web browsers
What Internet Radio is
What audio compression and audio streams are
How to set up your Web server to deliver audio, and what factors you need to consider for an audio-delivery system

Introducing Web-Based Audio

The World Wide Web was released by CERN in 1992 as a simple text-based, information delivery system. Graphics didn't appear until 1993 with the release of Mosaic, the first graphics-based Web browser. Even then, it should have seemed inevitable that graphics alone would remain about as popular as silent movies were in the 1920s. It wasn't long before the inevitable happened and sound echoed from the Web.

The first audio on the Web appeared as static sound files, either as Windows Waveform files (.WAV), Sun/NeXT audio files (.AU, .SND) or Mac audio files (.AIF, .AIFF). Because early Web browsers couldn't play audio files, you needed some sort of audio player to hear the audio file. And because these first audio files were static or stationary files, as opposed to the streaming audio (or audio-on-demand) files common today, you needed to download the entire file before you could listen to it. This could present an interminable delay, because sound files of any length tend to be rather large. For example, a Windows Waveform (.WAV) file of just 10 seconds will often be more than 220K in size; a one-minute .WAV file can be more than 1.3M in size.
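The file sizes quoted above can be reproduced with a little arithmetic. The sketch below assumes an uncompressed, mono, 8-bit recording sampled at 22,050 Hz (an assumption chosen because it matches the figures in the text; the chapter does not state the recording settings) and ignores the small file header. The function name is illustrative only.

```python
def pcm_size_bytes(seconds, sample_rate=22_050, bytes_per_sample=1, channels=1):
    """Size of uncompressed PCM audio data, ignoring the small file header."""
    return int(seconds * sample_rate * bytes_per_sample * channels)

ten_seconds = pcm_size_bytes(10)   # 220,500 bytes, roughly 220K
one_minute = pcm_size_bytes(60)    # 1,323,000 bytes, roughly 1.3M
print(ten_seconds, one_minute)
```

Doubling the sample width to 16 bits or recording in stereo doubles these numbers again, which is why static files of any length quickly become tedious to download.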

In 1994, a new type of Web-based audio appeared that removed the tedium of downloading the entire audio file before listening to it. In Seattle, RealAudio was born.

RealAudio is a client/server, audio-on-demand delivery system that allows you to hear audio files of virtually any length within seconds rather than minutes. RealAudio works by compressing the sound file and then delivering it in a buffered byte stream.

It wasn't long before RealAudio had competitors, or imitators, in its file-stream delivery method. By most estimates, RealAudio now has the lion's share of the Web's audio-on-demand market. Its chief competitor is ToolVox by Voxware.

Another recent innovation for delivering audio over the Web is Internet Radio. ABC and National Public Radio use RealAudio to broadcast over the Internet, but other broadcasters, including NBC and Bloomberg Information News Radio, are using another byte-stream, audio-on-demand delivery system, StreamWorks by Xing.

Still another Internet byte-stream, audio-on-demand delivery system making headway on the Web is a product called TrueSpeech by DSP Group, Inc.

Finally, the last type of Internet audio system covered in this chapter offers what some consider a much more useful function: turning the Internet into a real-time telephone system. Five Web-based telephone products will be discussed in this chapter. Although Web-based phone systems aren't related to CGI (or at least, not yet), they do fall under the broad category of Web-based audio and thus deserve mention. The fact that five companies (at this writing) are developing their own products also suggests that Web-based or Internet-based telephones are a product we'll probably be seeing more of.

To play back Web-based audio, you need a sound card and speakers. The speaker driver program found on many BBSs and FTP sites won't play Web-based audio. Any good 16-bit sound card (don't waste your money on older 8-bit sound cards, even if it's a good deal) that's SoundBlaster-compatible will suffice. SoundBlaster compatibility isn't an absolute requirement, but it does ensure that you'll be able to play virtually any type of audio file you encounter on the Internet. Your choice of speakers will be driven by your budget and what type your sound card will support.

Static Audio

Static audio, as mentioned earlier, consists of files that must be downloaded in their entirety before you can listen to them. The earliest audio files to appear on the Web, and probably still the majority of the audio cuts you'll encounter, fall under the heading of static audio.

Static audio files that you could encounter on the Web most likely will fall under one of the file formats (and OS platforms) listed in table 18.1.

Table 18.1 Static Audio Formats

Type	Extensions
Windows Waveform	.WAV
Sun/NeXT Audio	.AU, .SND
Mac Audio	.AIF, .AIFF
Mac, PC	.SND, .FSSD
Amiga	.IFF
MIDI Music	.MID, .MIDI, .RMI
Amiga, Atari ST	.MOD, .NST
IRCAM	.SF
SoundBlaster	.VOC

All the audio file formats listed in table 18.1 are digital files, meaning that analog sound has been converted to one of these formats by passing the sound through some type of analog-to-digital converter. One converter type you might be familiar with is the sound card in your PC. By plugging some type of input source (microphone, analog tape recorder, CD, and so on) into the audio input of your sound card, you can use the software that came with your sound card to create your own digital audio files in the formats your card and software support.

Digital audio file formats vary in three ways:

Sampling rate. The sampling rate is the number of times per second the original audio source is sampled as it's converted to a digital format. The sampling rate of PC sound cards is between 8,000 and 44,100 samples per second (a higher rate produces better-quality sound).
Compression. Digital audio files are often compressed to save disk space. When audio files are played back, the playback program will decompress and buffer the files (in memory, not on your disk).
File type organization. Digital audio files organize the digital data in different ways. The files may or may not have header information, and the data may be interleaved from two or more soundtracks (stereo).
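To make these properties concrete, here is a minimal sketch that reads the sampling rate, sample width, and channel count from a .WAV file's header using Python's standard wave module. The describe_wav function is a hypothetical helper, and the test clip is generated in memory (one second of silence) rather than taken from any real recording.

```python
import io
import wave

def describe_wav(file_obj):
    """Return (sample_rate, bits_per_sample, channels, seconds) for a WAV file."""
    with wave.open(file_obj, "rb") as wav:
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8
        channels = wav.getnchannels()
        seconds = wav.getnframes() / rate
    return rate, bits, channels, seconds

# Build a one-second silent clip in memory: 8,000 Hz, 16-bit, stereo.
# In a stereo WAV file the left and right samples are interleaved,
# so each frame here is 4 bytes (2 bytes per channel).
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(2)
    wav.setsampwidth(2)
    wav.setframerate(8000)
    wav.writeframes(b"\x00\x00" * 2 * 8000)
buffer.seek(0)

info = describe_wav(buffer)
print(info)
```

Other formats in table 18.1 organize the same information differently, which is why a player has to understand each file type it claims to support.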

Helper Applications

Web browsers are designed to be hypertext page display tools, not audio players. To hear the audio files included in Web pages you view, you'll need some sort of helper application (or, simply, helper app) designed to play the particular type of audio file. Most Web browsers released in the last year or so can be configured to use helper apps to play audio files, or, for that matter, to work with any type of file (MIME type) the browser isn't specifically designed to display or read. Figure 18.1 shows the dialog box used to configure Mosaic to use various helper apps.

Fig. 18.1

You configure helper apps in Mosaic in its Preferences dialog box.

Some Web browser manufacturers have started shipping helper apps with their products or preconfiguring their browsers to use certain Windows utilities, such as MPLAYER, as helper apps to play the MIME types .WAV, .MID, and .AVI (Microsoft video). Some versions of Mosaic ship a helper app utility called IMAGEVW configured to display GIF, JPEG, MPEG, and TGA MIME types. Netscape includes a helper app utility called NAPLAYER configured to play .AIF and .AU audio files.

WHAM

WHAM is a small Windows-based utility program that you can configure as a helper app for most Web browsers. WHAM allows you to play a variety of audio file types. To get a copy of WHAM, download it from

ftp site

To install and configure WHAM, follow these steps:

Copy the downloaded file WHAM133.ZIP into a temporary directory or folder on your PC and extract (unzip) its archived contents.
Start your Web browser and open the dialog box for configuring helper applications.
Set the file/MIME type as audio and the subtype to x-wav.
To configure WHAM as a helper app in Netscape, the entry in the Preferences dialog box should look like figure 18.2.

Fig. 18.2

In Netscape, you configure WHAM and any other helper app in the Preferences dialog box.

WHAM not only supports .WAV file types, but also .AU, .IFF, .VOC, and .AIF. If you already have helper apps configured for any of these other file types, you might want to consider using WHAM for these audio file types as well. It doesn't matter in Netscape whether you have one helper app or 20, but eliminating a few extra helper apps will save on disk space.
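Conceptually, a browser's helper-app configuration is a pair of lookups: file extension to MIME type, then MIME type to program. The sketch below illustrates that idea only. The tables and program file names are hypothetical stand-ins, not the contents of any browser's actual configuration; the audio/x-wav type is the one set in the steps above, and audio/basic is the standard type for .AU files.

```python
import os

# Hypothetical extension table, as a server or browser might hold it.
EXTENSION_TYPES = {".wav": "audio/x-wav", ".au": "audio/basic"}

# Hypothetical helper-app table: MIME type -> program the browser would launch.
HELPER_APPS = {
    "audio/x-wav": "WHAM.EXE",
    "audio/basic": "NAPLAYER.EXE",
}

def helper_for(filename):
    """Return the helper program for a file, or None if no helper is configured."""
    ext = os.path.splitext(filename)[1].lower()
    mime_type = EXTENSION_TYPES.get(ext)
    return HELPER_APPS.get(mime_type)

print(helper_for("OPENING.WAV"))
print(helper_for("index.html"))
```

Pointing several MIME types at the same program, as suggested above for WHAM, simply means several entries in the second table share one value.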

Netscape version 2.0 introduced a new approach to the concept of helper apps: plug-ins. A Netscape plug-in is basically an add-on module designed primarily to play "live objects," such as audio and video, when encountered in Web pages. Plug-ins are also designed to integrate more closely with Netscape than traditional helper apps. Several of the audio systems discussed in this chapter offer a Netscape plug-in (which will be discussed), in addition to their standard client helper app.

Live Audio

Live audio, also referred to as audio-on-demand, has been implemented on the Internet in several formats to solve the problems and delays of downloading static audio files. But implementing live audio on the Internet posed its own set of problems. The Internet, using the TCP protocol, is designed as a reliable, packet-switching network. The TCP protocol is a connection-based service, meaning that the sending and receiving systems are connected and in communication with each other at all times.

The terms live audio, live objects, and audio-on-demand all refer to the same thing: audio that begins playing almost immediately, regardless of how long the audio clip is or how large the file is. Live audio is produced by generating and transmitting an audio stream (or, more precisely, a byte stream) that you hear as it's being transmitted or downloaded to your PC. This contrasts with static audio, which exists in a stationary file format that can't be streamed and played as it's being transmitted, but must be downloaded in its entirety before you can hear the audio clip.

The Internet wasn't designed for continuous, time-based transmission of data. Part of TCP's reliability stems from its capability to retransmit lost information from a server to a client and await an acknowledgment of its receipt.

Although this setup works quite well for most Internet traffic, it's not the ideal conduit for the transmission of continuous, live audio. You might cope with a retransmission rate as low as even 2 or 3 percent over a high-bandwidth transmission path such as a T1 line, but over a 14.4kbps or even a 28.8kbps dial-up Internet connection, audio-on-demand is brought to its knees.

Each of the live audio systems discussed in this chapter takes a different approach to solving the Internet's packet retransmission problem. Bear in mind, however, that any Web-based system delivering HTML pages as well as live or on-demand audio will need considerably more horsepower than a standard Web server, especially if you anticipate a high demand for the audio you plan to present. You also should look more closely at the type of connection your server has to the Internet. A T1 line might be adequate for a high-demand Web server, but when you add the burden of delivering on-demand audio, you may need to consider multiple, multiplexed T1 lines to prevent a bandwidth bottleneck.
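A rough capacity estimate shows why a single T1 line can become a bottleneck. The sketch below assumes a nominal T1 rate of 1.544 Mbps and, as a simplifying assumption, that each listener's stream consumes the full rate of a 14.4kbps or 28.8kbps modem; protocol overhead is ignored, and the reserve_fraction parameter is a hypothetical allowance for ordinary Web traffic.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate

def max_streams(line_bps, stream_bps, reserve_fraction=0.0):
    """How many simultaneous streams fit on a line after reserving a share for other traffic."""
    usable = line_bps * (1.0 - reserve_fraction)
    return int(usable // stream_bps)

print(max_streams(T1_BITS_PER_SECOND, 14_400))                        # 107
print(max_streams(T1_BITS_PER_SECOND, 28_800))                        # 53
print(max_streams(T1_BITS_PER_SECOND, 14_400, reserve_fraction=0.5))  # 53
```

Under these assumptions, a site that keeps half its line free for HTML pages can serve only about 50 modem listeners at once, so real demand should be measured before choosing a connection.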

RealAudio

RealAudio, by Progressive Networks, is one of the oldest and most widely used audio-on-demand systems on the Internet. Now, more than a hundred organizations, ranging from radio stations and broadcast networks to news and financial services, disseminate audio feeds over the Net using RealAudio. Its list of clients is impressive and includes the following:

ABC News
National Public Radio
BBC Radio 3
CBC (Canadian Broadcasting Corporation) Radio
Commercial Radio Hong Kong
Dow Jones Investor Network
ESPN Sports Zone
The Democratic and Republican national committees

RealAudio has solved the Internet/TCP packet retransmission problem by creating a client/server system that uses a proprietary time-based protocol that's based in part on the User Datagram Protocol (UDP).

The User Datagram Protocol is defined as a connectionless communications service because it doesn't require that the sending and receiving systems be in contact or communication with each other. This protocol is weighted more toward speed than reliability, because an acknowledgment isn't sent by the receiver that the packet stream was ever received.

The RealAudio protocol, as it's called, is a bi-directional, time-based protocol communicating between the RealAudio client and server. The RealAudio server portion of the system is installed on a Web server, and the RealAudio client portion is configured as a helper app with your Web browser. A bi-directional protocol based on UDP may sound like a contradiction in terms, but the bi-directionality of the RealAudio protocol stems from the addition of a "loss-correction system," which is designed, according to developers at Progressive Networks, to minimize any possible breakup in the audio stream by allowing the client to attempt to re-create any missing pieces of the audio stream.
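The RealAudio protocol is proprietary, so its loss-correction method can't be shown here. The sketch below illustrates only the general idea of client-side concealment over an unacknowledged transport, using one very simple strategy that is an assumption of this example: packets carry sequence numbers, and a missing packet is replaced by a copy of the most recent good one instead of being requested again. All names are hypothetical.

```python
def conceal_losses(packets, total):
    """Rebuild a stream of `total` packets from received (sequence, payload) pairs.

    A missing packet is replaced by a copy of the most recent good packet,
    or by a silent placeholder if nothing has arrived yet.
    """
    received = dict(packets)      # also tolerates out-of-order arrival
    last_good = b"\x00"           # silent placeholder
    output = []
    for seq in range(total):
        if seq in received:
            last_good = received[seq]
        output.append(last_good)  # the real payload, or the repeated stand-in
    return output

# Packet 2 never arrived; the client fills the gap instead of waiting.
stream = conceal_losses([(0, b"a"), (1, b"b"), (3, b"d")], total=4)
print(stream)  # [b'a', b'b', b'b', b'd']
```

The trade-off is the one described above: playback never stalls waiting for a retransmission, at the cost of an occasional approximated fragment of sound.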

Installing RealAudio on Your Web Server

The server portion of the RealAudio system operates on a variety of server OS platforms, as follows:

Several flavors of UNIX (Sun Solaris 2.x, SunOS 4.1x, SGI IRIX 5.3 or later, FreeBSD, Linux 1.2.x)
Windows NT 3.5
Apple Mac OS 7.5.x

It operates on any Web server software that supports configurable MIME types, including the following:

Netscape Netsite
O'Reilly Website NT
Mac HTTPD
NCSA HTTPD (v1.3 or v1.4)
EMWAC HTTPS 0.96
CERN HTTPD (v3.0)
Webstar for Macintosh

See "How MIME Became Part of the HTTP Specification" (Ch. 10) for more information on using MIME with CGI.

The RealAudio server package is available in several configurations (10, 40, or 100 audio streams), depending on how many simultaneous users you want to accommodate. Follow the instructions that come with the RealAudio server package to install it on the particular Web server software and operating system platform you're using.

The RealAudio server operates across a variety of networks and bandwidth ranges, from a 56kbps frame relay line up to T3. Keep these numbers in mind when planning your site. If you plan to set up and market a potentially heavy-use site, make sure that you have the hardware and software to support the possible demand; this includes the hardware platform you use as your server (that is, a RISC-based processor), a robust OS (for high-end use, UNIX is still preferred over Windows NT and Mac OS), and the communications line you use to connect your site to the Internet.

Progressive Networks also has a program where you can set up and use RealAudio on your server for a 60-day evaluation period, so you can try RealAudio before making a serious investment. The RealAudio server ranges in price from $2,490 for a 10-user license up to $13,490 for the 100-user version.

Encoding Audio Files with the RealAudio Encoder

When you buy (or evaluate) the RealAudio server package, the RealAudio audio file encoder is included. The encoder is used to convert .WAV and .AU files into RealAudio (.RA) files.

To make good quality RealAudio files, Progressive Networks' tech support department says you should start with good quality .WAV or .AU files. Additional suggestions for making good quality audio files include the following:

Encode from 16-bit sound files.
Digitize at a 22,050 Hz sample rate. (The encoder will also accept files digitized at 8,000 and 11,000 Hz, but 22,050 Hz seems to produce the best quality .RA files.)
When creating (recording) your own audio files, set audio input levels to the full range of available amplitude, but avoid clipping (exceeding maximum input level), which produces popping and clicking in the .RA file.
Use the best sound card you can afford.
Avoid using complex audio sounds (for example, several voices, background music, and so on).

Remember, though, these are just suggestions, not rules chiseled in stone.
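One of the suggestions above, avoiding clipping, is easy to check before encoding. The following sketch counts samples that sit at the extreme values of the 16-bit range, which is a common sign that the input level was too high. It assumes raw 16-bit signed little-endian PCM data (the sample layout used in 16-bit .WAV files); count_clipped is a hypothetical helper and is not part of the RealAudio encoder.

```python
import struct

def count_clipped(pcm_bytes):
    """Count 16-bit little-endian samples that sit at the extreme values."""
    count = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    return sum(1 for s in samples if s in (-32768, 32767))

# Four samples: two ordinary values and two pinned at full scale.
clip = struct.pack("<4h", 0, 32767, -32768, 1200)
print(count_clipped(clip))  # 2
```

A handful of full-scale samples is not proof of audible distortion, but long runs of them usually are, and re-recording at a lower input level is the safer fix.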

To encode a .WAV file to the .RA format, follow these steps:

Launch the RealAudio encoder and select the file to encode.
The encoder prompts you to enter (optional) title, author, and copyright information.
The encoder then begins encoding the file (see fig. 18.3).

Fig. 18.3

You use the RealAudio encoder to convert a Windows Waveform (.WAV) file into a RealAudio (.RA) file.

Embedding RealAudio in Your Web Pages

After you encode your Waveform audio files into the RealAudio format, you're ready to embed the .RA file into Web pages on your RealAudio server. RealAudio files are embedded simply as hypertext links, with the reference being the RealAudio file. If you create a reference to opening.ra, the encoded RealAudio file created from the Waveform file opening.wav, the HTML coding might look something like this:

<A HREF="opening.ra"><IMG SRC="raworld.gif" alt="" border=0 hspace=10 ALIGN="LEFT">Opening Theme</A>

This example would look like the link in figure 18.4.

Fig. 18.4

This is how an embedded RealAudio file would look using the RealAudio icon.

Now that you've set up the RealAudio server, and encoded and embedded your RealAudio files, it's time to install the RealAudio client player to hear the results of your efforts.

Installing the RealAudio Client Player

The RealAudio client player is freely distributed from the RealAudio Web site located at this site (see fig. 18.5). To install the client player, follow these steps:

Follow the prompts to download the RealAudio player. The 2.0 Windows version is named RAWIN200.EXE. This file is a compressed archive file.
Decompress the file and follow the instructions for installing the RealAudio client player.

Fig. 18.5

The RealAudio Web site freely distributes the RealAudio client player for downloading.

RealAudio also offers its client player as a Netscape plug-in for Netscape version 2.0 users. If you're using Netscape version 2.0 as your Web browser, the RealAudio plug-in will work just as well as the standard RealAudio player helper app playing audio, but the plug-in makes a more seamless integration with Netscape 2.0.

Listening to RealAudio

To test your RealAudio player, you need to jump to a Web site that offers RealAudio broadcasts. As mentioned earlier, dozens of radio stations and broadcast networks extend part of their normal broadcast onto the Internet using RealAudio. Follow these steps:

On the RealAudio Web site, under the title Sites and Sounds, select RealAudio Guide to bring up the list of RealAudio sites (see fig. 18.6).

Fig. 18.6

The RealAudio Web site keeps a list of broadcast sites that use the RealAudio technology.

Select a Web site using RealAudio. When the site appears, follow the prompts to select the site's available audio clips.
Within a few seconds the RealAudio player appears, and if you installed the RealAudio client player as a Netscape plug-in, the clip you selected begins to play automatically (see fig. 18.7). If you're using the client player as a helper app with another Web browser, you will need to click the play button (an arrow pointing to the right).

Fig. 18.7

The RealAudio client player plays a sound clip from the National Public Radio Web site.

If you were logged in to the Internet between 9 and 10 p.m. on January 23 and had RealAudio running, you could have listened to President Clinton's State of the Union Address broadcast live (albeit with a processing delay of about 55 seconds, to convert his live presentation to a RealAudio audio stream and route that stream over the Internet to your PC) via ABC News Radio.

ToolVox

Another company trying to add sound to Web sites is Voxware, Inc., located in Skillman, New Jersey (see fig. 18.8).

Fig. 18.8

The Voxware, Inc. home page is located at this site.

Its product, ToolVox, takes a different approach to delivering audio-on-demand over the Web. ToolVox works by using a highly efficient audio encoder, which Voxware says delivers a 53:1 compression ratio on encoded audio files. This 53:1 compression ratio, the company claims, can squeeze 1 minute of standard speech into a file that's only 18K in size. The entire file must still be downloaded before it can be heard, but ToolVox's efficient compression technique greatly minimizes the download time.
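To see what an 18K file means in practice, the sketch below estimates idealized download times over dial-up connections. It treats 18K as 18 x 1,024 bytes and ignores protocol overhead and modem compression; both are assumptions of this example, so real transfers will be somewhat slower.

```python
def download_seconds(size_bytes, line_bps):
    """Idealized transfer time: 8 bits per byte, no protocol overhead."""
    return size_bytes * 8 / line_bps

vox_minute = 18 * 1024  # the 18K figure quoted for one minute of speech

print(round(download_seconds(vox_minute, 14_400), 1))  # 10.2
print(round(download_seconds(vox_minute, 28_800), 1))  # 5.1
```

Even over a 14.4kbps modem, a minute of speech arrives in roughly ten seconds, which is why a file that must be fully downloaded can still feel close to on-demand.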

Voxware also offers a plug-in for Netscape 2.0 users, which Voxware says allows for true audio-on-demand. The Netscape 2.0 plug-in can start playing sound within a few seconds because it buffers the audio stream as it starts receiving the compressed audio file.

Installing ToolVox on Your Web Server

Because ToolVox isn't a client/server type system, there's no software per se to install on your Web server. Because there's no Web server software to install, Voxware doesn't sell or license ToolVox. It's freely distributed from the Voxware Web site.

Like RealAudio, ToolVox does require that the Web server support MIME types. Installation consists of adding the following line to the MIME.TYPES file in the configuration directory:

audio/voxware vox

See "How MIME Became Part of the HTTP Specification" (Ch. 10) for more information on using MIME with CGI.

This configuration entry informs the server that any files with the extension .VOX are ToolVox-compressed audio files. The server in turn will pass this information to the Web browser, which in turn invokes the ToolVox helper app player (or Netscape 2.0 plug-in).
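The lookup the server performs can be sketched in a few lines. The example below parses text in the common mime.types layout (a MIME type followed by one or more extensions, with # starting a comment) and builds an extension table. It is an illustration of the idea, not the code of any particular Web server, and parse_mime_types is a hypothetical name.

```python
def parse_mime_types(text):
    """Map each file extension to its MIME type from mime.types-style lines."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            table[ext.lower()] = mime_type
    return table

config = """
# ToolVox-compressed audio
audio/voxware vox
"""

table = parse_mime_types(config)
print(table["vox"])  # audio/voxware
```

When a browser requests a file ending in .VOX, the server sends the matching type in its response headers, and the browser uses that type to pick the player.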

Encoding Audio Files into .VOX Files

Before you can place .VOX files in your Web pages, you must compress your audio files into .VOX files by using the ToolVox file encoder, which you can download from the Voxware Web site.

Installing the ToolVox Encoder

The ToolVox audio encoder (Windows version) is freely distributed from the Voxware Web site as a self-extracting, auto-installing archive file, meaning that when you decompress the archive, it automatically starts SETUP.EXE to begin installing the encoder.

Encoding an Audio File

With the ToolVox encoder installed, start the program to encode a .WAV file to a .VOX file by following these steps:

Open the .WAV file you want to encode. The ToolVox encoder displays the size of the file, its playing time, and how it was recorded.
Click the Compress button to begin the conversion (see fig. 18.9).

Fig. 18.9

You use the ToolVox encoder program to convert Waveform (.WAV) files to the ToolVox (.VOX) audio format.

After the .WAV file is converted to the .VOX format, you can see how efficiently ToolVox compresses audio files. In this example, the .WAV file was 112,198 bytes in size. The converted .VOX file was only 1,645 bytes, a compression ratio of about 68:1.
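The ratio for this example can be checked directly from the two file sizes. This is plain arithmetic on the figures quoted above; the function name is illustrative.

```python
def compression_ratio(original_bytes, compressed_bytes):
    """Original size divided by compressed size (for example, 68.2 means 68.2:1)."""
    return original_bytes / compressed_bytes

ratio = compression_ratio(112_198, 1_645)
print(f"{ratio:.1f}:1")  # 68.2:1
```

The result is higher than the 53:1 figure Voxware quotes, which suggests the ratio varies with the content and format of the source recording.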

Embedding .VOX Files in Your Web Pages

ToolVox version 2.0 was obviously designed with Netscape 2.0 in mind. When designing Web pages with Netscape 2.0 HTML extensions, and when using the Netscape 2.0 ToolVox plug-in, ToolVox offers some extremely helpful and flexible alternatives to the standard method of embedding a ToolVox audio file. For a Web browser other than Netscape 2.0, you would use the following command to embed the file sound.vox:

<A HREF="SOUND.VOX">link text</A>

However, when embedding a .VOX audio file for use with the Netscape 2.0 plug-in, you can use the EMBED tag with the following options:

<EMBED SRC=HTTP:SOUND.VOX PLAYMODE=playmode VISUALMODE=visualmode>

The parameters for PLAYMODE include the following:

USER The user controls when the embedded audio file begins to play by clicking the Vox icon or the player window's Start button.
AUTO The embedded audio file begins to play automatically when Netscape begins to load the page.
CACHE The embedded file is downloaded without playing it; the file is stored for later playback.

The parameters for VISUALMODE include the following:

ICON A Voxware face icon appears on the page. While the sound is playing, the icon is red; the user can start and stop the sound by clicking the icon (see fig. 18.10).

Fig. 18.10

This is the Voxware face icon as it appears on a Web page.

BACKGROUND This is used when PLAYMODE is set to AUTO. There's no icon or interface on-screen, and the user has no way to stop the audio file from playing.
EMBED The ToolVox player interface window appears on-screen, allowing the user to start and stop the sound and to control the playback speed (see fig. 18.11).

Fig. 18.11

This is the ToolVox player interface as it appears on a Web page.

FLOAT The ToolVox player appears as a floating window that the user can minimize or close.
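If you generate pages from a script, a small helper can keep the PLAYMODE and VISUALMODE values valid. The sketch below is a hypothetical helper, not part of ToolVox: the default values, the quoting of the SRC attribute, and the decision to reject BACKGROUND without AUTO (the text says only that BACKGROUND is used with AUTO) are all assumptions of this example.

```python
PLAY_MODES = {"USER", "AUTO", "CACHE"}
VISUAL_MODES = {"ICON", "BACKGROUND", "EMBED", "FLOAT"}

def vox_embed_tag(src, playmode="USER", visualmode="ICON"):
    """Build an EMBED tag for a .VOX file, rejecting unknown option values."""
    playmode, visualmode = playmode.upper(), visualmode.upper()
    if playmode not in PLAY_MODES:
        raise ValueError("unknown PLAYMODE: " + playmode)
    if visualmode not in VISUAL_MODES:
        raise ValueError("unknown VISUALMODE: " + visualmode)
    if visualmode == "BACKGROUND" and playmode != "AUTO":
        # With no icon or player on-screen, the user could never start the clip.
        raise ValueError("BACKGROUND is meant for PLAYMODE=AUTO")
    return f'<EMBED SRC="{src}" PLAYMODE={playmode} VISUALMODE={visualmode}>'

print(vox_embed_tag("sound.vox", "auto", "background"))
```

Because BACKGROUND gives the user no way to stop playback, it is best reserved for short clips.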

Installing the ToolVox Player

The ToolVox player is as easy to install as the ToolVox encoder because it, too, is a self-extracting, auto-installing archive file. Voxware offers Windows 95, Windows 3.1, and Macintosh versions of the ToolVox player.

When you decompress the downloaded file, SETUP.EXE automatically starts installing the ToolVox player. If you're running Netscape 2.0, the ToolVox setup program prompts you to install the Netscape plug-in instead of the standard player.

To test your newly installed ToolVox player, you need to locate a Web page with embedded .VOX audio files. The Voxware Web site contains numerous embedded audio files you can play on just about every page, such as the ones shown in fig. 18.12. The Voxware Web site also contains a listing of other Web sites using the Voxware audio system (see fig. 18.13).

Fig. 18.12

The ToolVox face icon indicates that a .VOX audio file is embedded on this page.

Fig. 18.13

The Voxware Web site keeps a listing of other Web sites using the Voxware audio system.

TrueSpeech

The TrueSpeech audio system by DSP Group, Inc. is another up-and-coming player in the Web-based audio-on-demand market (see fig. 18.14).

Fig. 18.14

The TrueSpeech home page is located at this site.

The TrueSpeech system is similar in operation to the ToolVox system in that it's not controlled by a server-based software product you install on your Web server. Like ToolVox, TrueSpeech consists of an audio file encoder used to convert .WAV files into its proprietary .TSP format, and an audio file player that works as a helper application with your Web browser.

Configuring Your Web Server for TrueSpeech

Although there's no Web server software to install, TrueSpeech does require that the Web server support MIME types and be configured to recognize TrueSpeech-encoded audio files. This allows the server to pass the correct MIME type information to Web browsers, which can then spawn the TrueSpeech player/helper app.

See "How MIME Became Part of the HTTP Specification" for more information on using MIME with CGI.

For UNIX-based Web servers, the following line needs to be placed in the configuration file MIME.TYPES:

application/dsptype      tsp

On the CERN HTTP Server, the configuration line should be

AddType .tsp      application/dsptype      binary 1.0

On Windows-based servers, the configuration file type .TSP should be registered/associated the same as any other file type, through the Registry or Control Panel.

Encoding .WAV Audio Files to the TrueSpeech Audio Format

Like ToolVox, TrueSpeech also creates audio-on-demand by creating a highly compressed, proprietary audio file. In the following example, an 80K .WAV file was converted to a 6K .TSP file.

If you're running Windows 95 or Windows NT, you already have the TrueSpeech audio file encoder. It's built into the Windows multimedia Sound Recorder. To use the Sound Recorder to convert .WAV audio files to the TrueSpeech audio format, follow these steps:

If you're recording your own .WAV files, TrueSpeech recommends that you record the PCM-encoded .WAV file using a sampling rate of 8,000 Hz with 16 bits of resolution. The recording amplitude should be held to a maximum of 14 bits to avoid clipping.
If you're converting an existing .WAV file, you can use the Sound Recorder to convert the file to 8,000 Hz, 16 bit, Mono. Start Sound Recorder and open the file. Open the File menu and choose Properties, Convert Now, and change the Attributes setting to 8,000 Hz, 16 bit, Mono, 16 KB/s, and save the changes.
To convert to a TrueSpeech file, open the File menu and choose Properties, Convert Now, and change the Format setting from PCM to DSP Group TrueSpeech(TM). Save the changes (see fig. 18.15). The converted file will now have the file extension .TSP, indicating that it's in the TrueSpeech audio format.

Fig. 18.15

Use the Windows 95/NT Sound Recorder to convert a Waveform (.WAV) file to a TrueSpeech (.TSP) file.

You can't use the Sound Recorder in Windows 3.1 to do TSP conversions. If you're still running Windows 3.1, you need to download the PCM to TrueSpeech conversion utility from the TrueSpeech Web site to encode (convert) your .WAV files to .TSP format.

Embedding TrueSpeech Audio Files in Your Web Pages

Now that you've encoded your PCM .WAV audio file to the TrueSpeech (.TSP) format, you're ready to embed the audio file on a Web page. The file encoded in the preceding section has the file name SMOKIN.TSP. To embed SMOKIN.TSP into a Web page, follow these steps:

Create a text file containing the following (case-sensitive) command:

TSIP>>url/smokin.tsp

Replace url with the directory location of SMOKIN.TSP on your Web server. Don't include the characters http://. For example, if I want to place SMOKIN.TSP on my home page, I would copy SMOKIN.TSP to the Web server www.city-net.com, in the directory /~gagrimes. The exact location would be:

Site

The command I would place in my text file would be

TSIP>>www.city-net.com/~gagrimes/smokin.tsp

Save the text file with the file name SMOKIN.TSP (the file name can be anything you like, just be sure to use the extension .TSP) in the same directory where the .wav file is stored, which in my case would be /~gagrimes.
Reference the .TSP file in your Web page by using the command:

<A HREF="location/filename">link text</A>

In my example, I would reference SMOKIN.TSP with the command

<A HREF="http://www.city-net.com/~gagrimes/smokin.tsp">link text</A>
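The pointer line in the text file follows a fixed pattern, so it can be generated from the audio file's address. The sketch below is a hypothetical helper that builds the line and strips a leading http://, because the steps above say the command must not include those characters.

```python
def tsip_pointer(url):
    """Build the one-line pointer file content for a TrueSpeech clip."""
    prefix = "http://"
    if url.lower().startswith(prefix):
        url = url[len(prefix):]  # the command must not include http://
    return "TSIP>>" + url

print(tsip_pointer("http://www.city-net.com/~gagrimes/smokin.tsp"))
# TSIP>>www.city-net.com/~gagrimes/smokin.tsp
```

The prefix is written in uppercase because the text describes the command as case-sensitive.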

Installing the TrueSpeech Audio File Player

The TrueSpeech audio file player (Windows 3.1/95/NT versions) can be freely downloaded as a compressed archive file from the TrueSpeech Web site (see fig. 18.16).

Fig. 18.16

The TrueSpeech player download page is located at this site.

TrueSpeech also offers its audio player as a Netscape plug-in for Netscape version 2.0 users.

To install the TrueSpeech audio player, follow these steps:

Copy the downloaded file into a temporary folder or directory and decompress the archive file.
Run SETUP.EXE and follow the prompts to install the audio player. If you're using Netscape 2.0 as your Web browser, you'll have the opportunity to install the audio player as a Netscape plug-in.

After you install the TrueSpeech audio player, you can test it by selecting a Web site that features TrueSpeech audio. Just like its competitors, TrueSpeech keeps a list of Web sites using its audio system (see fig. 18.17).

Fig. 18.17

The TrueSpeech Web site also maintains a list of other Web sites using TrueSpeech audio.

StreamWorks

The use of StreamWorks on the Internet is somewhat different from the previously mentioned Web-based audio systems. Even its user (client) interface suggests a different usage (see fig. 18.18).

Fig. 18.18

This StreamWorks Windows-based client interface is used to access both audio and video streams.

StreamWorks is a Windows-based audio system manufactured by Xing Technology Corp., and its major usage on the Internet appears to be for retransmission of radio- and broadcast-simulated programs.

Xing Technology Corp. also produces a Windows-based video system, On-line Video System. If you're interested in finding out more about this video system, check out Xing's Web site at this site.

StreamWorks is designed as a client/server system that delivers a streaming audio (and video) signal based on the MPEG (Moving Picture Experts Group) international standard for audio and video compression. It uses the standard Internet TCP/IP protocol and buffers the incoming signal on the client side to compensate for any retransmission of data packets. StreamWorks also allows its data signal to be scaled according to the transmission and receiving speeds of the Internet connection (from T1 down to 8.5kbps).

Setting Up a StreamWorks Server

A StreamWorks server can be configured to stream recorded audio, to encode and transmit live audio simultaneously, or to do both. StreamWorks operates over a broad range of bandwidths (from 1.5Mbps to 1600Mbps) and on a variety of platforms:

SGI-IRIX
Sparc Solaris
HP-UX
PC-Linux
Windows NT

StreamWorks is one Web-based audio system that can demand a lot from a server. Remember, too, that Xing also offers a video component as part of its server package. If you plan to present audio and video in both recorded and encoded live transmissions, make sure that your plans (and your budget) include using a high-end RISC-based server.

Xing doesn't include a lot of installation documentation for its servers. It prefers to have system administrators call tech support for its brief set of installation instructions. This allows Xing to also register each server being installed, and more or less to tailor its installation instructions to your particular hardware, OS, and communications setup. The tech support instructions also explain how to register the StreamWorks MIME type.

Installing and Using the StreamWorks Client

A copy of the StreamWorks client player (Windows version) can be downloaded from the Xing Web site. In addition to Windows, Xing also makes versions for Mac, SGI, Sun, and Linux. To install the Windows client software, follow these steps:

Download and copy the file streamwk.exe (a self-extracting archive file) into a temporary directory or folder on your PC.
Run setup.exe to install the client player on your PC.
Start the StreamWorks player and click the Setup button (located in the upper right corner of the StreamWorks interface) to open the StreamWorks Setup dialog box (see fig. 18.19).

Fig. 18.19

You need to configure StreamWorks by using its Setup dialog box before you can receive its audio streams.

Select the Maximum Connection Speed for your Internet connection. If you connect to the Internet through a network, ask your systems administrator. This setting also determines which broadcast streams you can listen to, because some are encoded for transmission at certain speeds. Click OK to save your configuration.
Choose one of the station buttons (they're marked Xing, KPIG, KMPS/KZOK, CFRA, ICRT Taiwan ROC, VT (Vortex Technology), WXYC, or IP (Interactive Planet)). These buttons are preconfigured to existing Xing Technology broadcast servers located around the world. For example, if you click the KPIG station button, the screen shown in figure 18.20 appears, and you'll see the listing of available audio streams. Double-click one of the listed audio streams to play that selection.

Fig. 18.20

The KPIG server offers a variety of broadcast audio streams.
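The effect of the Maximum Connection Speed setting can be pictured as a simple filter over the available streams. The Python sketch below is my own illustration of that idea, not Xing's implementation; the stream names and bit rates are invented sample data.

```python
from typing import List, Tuple

# (stream name, encoded rate in kbps); hypothetical sample data
STREAMS: List[Tuple[str, float]] = [
    ("low-rate audio", 8.5),
    ("mid-rate audio", 24.0),
    ("high-rate audio", 112.0),
]


def playable_streams(max_kbps: float,
                     streams: List[Tuple[str, float]] = STREAMS) -> List[str]:
    """Return the names of streams encoded at or below the connection speed."""
    return [name for name, rate in streams if rate <= max_kbps]


if __name__ == "__main__":
    # A 28.8kbps modem connection can handle the first two sample streams.
    print(playable_streams(28.8))
```

A listener who picks a lower speed than the connection really supports simply sees fewer streams, which is why it's worth getting this setting right.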

Web-Based Telephony

All the Web-based audio systems described so far in this chapter have one thing in common: they're all one-way systems. The audio system is configured to transmit or broadcast audio to an eagerly awaiting audience of listeners who have configured their PCs with the appropriate software to receive and play back the audio message. But none of these systems gives the listener any way to communicate back to the sender.

The Web-based audio systems described in this section not only allow but, in fact, are designed for two-way communication. They permit you to carry on a conversation with another Internet user much the same as you would if you picked up your phone and placed a call.

Although these two-way communications systems have nothing to do with CGI, they do comprise a category of Web-based audio that I felt I should at least mention in this chapter.

At their current stage of development, none of the products here are poised to replace your current telephone for the following reasons:

Audio quality. The audio quality is still below the quality of even the cheapest telephones on the market.
Cost. Although the call itself incurs no additional expense, to place a phone call you still need an Internet connection and a multimedia-based PC (486 or Pentium, with sound card, speakers, and microphone).
Ease of use. Placing a call is still not as easy as picking up a receiver and dialing a 7- or 11-digit number; you still have to boot your computer and log on through your service provider.
Coordinating conversations. A standard telephone is always on and ready to ring, letting you know that someone is trying to call you; none of the products discussed here will turn on your computer and start your Internet phone software, but they will allow you to leave an e-mail message in the receiver's mailbox.
Proprietary incompatibilities. Each Internet telephone system described in this chapter will allow conversations only with users using the same software. Since there is no "Internet telephone protocol," each vendor is free to implement net telephony in any manner it chooses.

Despite these obstacles, there's still a tremendous clamoring by Internet users to be able to add two-way voice communication to their existing Internet functionality, especially by those who regularly communicate via the conventional Internet chat service (IRC).

Despite their proprietary nature, the products do share a few common characteristics. Each vendor has set up phone directory servers. Although these servers don't actually carry the conversation between two users (the products all communicate directly), they provide a directory of users so you can see who's connected and using a particular product.
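The division of labor between a directory server and the phone clients can be sketched in a few lines of Python. This is a toy model under my own assumptions, not any vendor's protocol; the class and method names are hypothetical.

```python
from typing import Dict, List, Optional


class PhoneDirectory:
    """Toy directory: it tracks who is online but never carries audio."""

    def __init__(self) -> None:
        self._online: Dict[str, str] = {}  # nickname -> network address

    def connect(self, nickname: str, address: str) -> None:
        """Record that a user has started the phone software."""
        self._online[nickname] = address

    def disconnect(self, nickname: str) -> None:
        """Remove a user who has logged off (no error if already gone)."""
        self._online.pop(nickname, None)

    def who_is_online(self) -> List[str]:
        """Return the connected nicknames in alphabetical order."""
        return sorted(self._online)

    def lookup(self, nickname: str) -> Optional[str]:
        """Return the address a caller would contact directly, if online."""
        return self._online.get(nickname)
```

Once a caller has the address from lookup, the conversation itself would flow between the two PCs, which is why the directory server's load depends on the number of users rather than on the amount of audio.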

Each product will also allow you to start the program and leave it in a sort of "waiting-for-a-call" mode. And when someone tries to "call" you, each product produces a simulated ringing phone sound to alert you to an incoming call.

All the products also offer a full-duplex mode, meaning you can listen and talk at the same time just as you can on a regular telephone. Full-duplex mode requires either a single full-duplex sound card or two half-duplex sound cards.

Internet Phone

Internet Phone by VocalTec, Inc. is the first of the five Internet telephony products I will discuss (see fig. 18.21).

Fig. 18.21

The VocalTec, Inc. home page is located at this site.

Internet Phone works much like Internet Relay Chat (IRC). Internet Phone requires that each user have an installed copy of the program (the client software) on his PC. VocalTec has installed numerous "phone servers" on the Internet to which users connect using standard IRC port designations. At last count, Internet Phone had servers in the following locations:

iphone.aloha.net (Honolulu)
iphone.fast.net (Allentown, Pennsylvania)
iphone.iaccess.com.au (Australia)
iphone.interramp.com (Herndon, Virginia)
iphone.interserv.net (San Francisco)
iphone.pulver.com (Long Island, New York)
iphone.smartnet.net (St. Joseph, Missouri)
iphone.vocaltec.com (Washington, D.C.)
iphone.wau.nl (the Netherlands)

As with standard IRC, when you connect to a server you get a list of the other users who are also connected to the VocalTec Internet Phone system. You then select the user you want to converse with. When that user answers, you simply begin talking just as though you were using a regular telephone. If the user doesn't answer, at present there's no way to indicate that you tried to place a call; Internet Phone has no equivalent of voice mail.

In addition to a sound card in your PC, you need a microphone connected to your sound card to use Internet Phone. To test your microphone connection, start Sound Recorder and try to record something with the microphone.

Installing and Using Internet Phone

VocalTec distributes a Windows demo version of Internet Phone that you can download from its Web site and try before you buy the fully functional product. The demo version limits you to 60 seconds of conversation. To install Internet Phone, follow these steps:

Download the archived demo version from the VocalTec Web site into a directory or folder on your PC and decompress the archive.
Run ADDICONS.EXE to create the Windows icons you'll use to start Internet Phone.
Start Internet Phone. Open the Options menu and choose User Info to enter your name and nickname into the User Info dialog box. Just like IRC, your nickname is limited to nine characters (see fig. 18.22). Click OK to save your user information.

Fig. 18.22

You use the User Info dialog box to enter your name and nickname for Internet Phone.

To connect to the Internet Phone network, open the Phone menu, choose IRC Connect, and select a VocalTec phone server in the Connect to IRC dialog box. To minimize network traffic, try to select a server that's near you geographically (see fig. 18.23). Click OK to close the dialog box and log on to the server you selected.

Fig. 18.23

Before you can place a call, you need to select and connect to one of VocalTec's IRC phone servers.

Open the Phone menu and choose Call to scan through the list of currently connected users (see fig. 18.24). Select a user and click OK. When the user you selected responds, begin your conversation.

Fig. 18.24

To make a call on Internet Phone, you need to select another logged-on user.

One problem you may encounter when using Internet Phone is that you can't talk and listen at the same time as you can on a regular phone. This is only a problem if you are using a half-duplex sound card. To solve this problem, use VocalTec's full-duplex version of Internet Phone. To use the full-duplex version, though, you'll need either two sound cards or a full-duplex sound card.

When you're ready to hang up, open the Phone menu and choose Disconnect to break your connection to the phone server.

WebPhone

If you decide to evaluate the Internet telephone products listed here, by all means try WebPhone. Its audio quality was among the best I tried, and it offers a broad range of additional features:

Voice mail messaging
Directory assistance
Context-sensitive help
Multiple phone lines
A "Do Not Disturb" feature, to block incoming calls and route them to voice mail
Speed dialing
Last number redial capability

Downloading, Installing, and Configuring WebPhone

You can download a copy of WebPhone from Netspeak Corporation's Web site at this site. Installation is relatively simple. Decompress the downloaded archive and follow the prompts to install WebPhone.

When you have it installed, you need to configure it before you can place calls. WebPhone uses either your e-mail or IP address as your phone number, and you need to supply this information along with personal data. Follow these steps to configure WebPhone:

Click the CFG button to open the Configure dialog box in WebPhone if it doesn't open automatically the first time you start the program (see fig. 18.25).

Fig. 18.25

Use the Configure dialog box in WebPhone to set user information and configuration parameters.

You need to enter information only under the User Information and Network Parameters sections to begin placing calls.

If your Internet service provider dynamically assigns you an IP address, the address most likely changes every time you log on. In that case, don't enter an IP address; leave this field blank.

Placing a Call Using WebPhone

Once WebPhone is configured, you're ready to use it to place a call to another WebPhone user. If you know the IP address of another WebPhone user, simply enter the address using the number pad on the WebPhone interface, followed by a click on the SND (send) button. If you don't know the IP address of the person you want to call, you'll have to use WebPhone's directory assistance:

Click the DIR button to open WebPhone's directory dialog box.
Click the Information button to open the directory information dialog box.
Enter enough information for the directory assistance feature to locate your party. If you want to get information on all users in Pennsylvania, for example, enter PA (not case-sensitive) next to the State/Province prompt. If you want to eliminate WebPhone users who aren't online, click the Only Parties Online button.
Click the word Information in the Information dialog box to begin your search. In a few seconds, the results of your query appear in the Information dialog box (see fig. 18.26).

Fig. 18.26

You activate WebPhone's directory assistance feature to locate other WebPhone users.

Double-click a user's name to place your call.
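The directory assistance search described in the steps above amounts to filtering a list of parties on the fields you fill in. The following Python sketch shows the idea for the State/Province field and the Only Parties Online option; it's an illustration under my own assumptions, not NetSpeak's code, and the Party record and sample entries are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Party:
    name: str
    state: str
    online: bool


def search(parties: List[Party], state: str,
           only_online: bool = False) -> List[Party]:
    """Return parties in `state`, optionally dropping those who are offline."""
    wanted = state.strip().lower()  # the state entry is not case-sensitive
    return [
        p for p in parties
        if p.state.lower() == wanted and (p.online or not only_online)
    ]


if __name__ == "__main__":
    sample = [
        Party("First Caller", "PA", True),
        Party("Second Caller", "pa", False),
        Party("Third Caller", "NY", True),
    ]
    for party in search(sample, "pa", only_online=True):
        print(party.name)
```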

TeleVox

If the name seems vaguely familiar, there's good reason. TeleVox is manufactured by Voxware, Inc., who also makes ToolVox.

There's nothing especially different or outstanding about TeleVox to distinguish it from other Internet telephone products. It's a rather plain vanilla telephone product, lacking many of the bells and whistles found in some of the so-called "high-end" products, such as voice mail, multiple lines, and caller ID.

Downloading, Installing, and Configuring TeleVox

You can download TeleVox from the Voxware home page, this site. After you download the file, which is a compressed, self-installing archive, run it to begin installing and follow the prompts. The only configuration information you will need to enter is user information. To configure TeleVox, follow these steps:

Start TeleVox. Open the Option menu and choose Local User Info to open the Local User Information dialog box.
Enter First Name, Last Name, City/Town, State/Province, Country, and Email address (see fig. 18.27).
Click the Close button to save user information.

Fig. 18.27

You need to enter local user information in TeleVox before you can place calls.

Placing a Call with TeleVox

To place a call with TeleVox, click the Call icon (or open the Phone menu and choose Call) to open the Phone Book dialog box. Double-click the user you want to call (see fig. 18.28).

Fig. 18.28

You select the person you want to call in TeleVox from the Phone Book dialog box.

Other Internet Telephone Programs

The remaining two Internet telephone programs are WebTalk by Quarterdeck and Digiphone by Third Planet Publishing. Both are adequate telephone products that you may want to try if you're interested in doing a head-to-head comparison of all available phone programs. The only caveat is that WebTalk is definitely the most difficult to install of all the current batch of Internet phone programs. WebTalk is bundled with Quarterdeck's version of Mosaic and tries to install its own Winsock, which could cause problems with your existing Winsock if you aren't careful.

Server Performance Considerations

Obviously, if you decide to install any of the Web-based audio products described in this chapter on your Web site, you need to take certain performance considerations into account. A Web server that's dedicated to nothing more than distributing a few HTML pages (albeit with an occasional GIF or JPEG image of moderate size) isn't going to require the same horsepower as a server distributing streaming audio files or live broadcast.

There are no hard-and-fast rules on what type of server you need. The decision depends on what you plan to distribute from your Web site and the level of demand for your distributed wares. In many cases, the manufacturers of the products mentioned in this chapter have already determined guidelines for server requirements based on the number of simultaneous connections you plan to establish. To determine which platform (for example, UNIX, NT, or Mac) offers the most robust performance, you should seek the advice of the manufacturer based on the company's experience with its product on various platforms (provided there's even a choice of platforms).

Because software has continued to lag behind hardware, if you anticipate extremely heavy use on your server, you would be wise to invest in a RISC-based hardware platform based on either the DEC Alpha processor or the MIPS Technologies R4600 or R4400 series processors. Although RISC-based processors have taken the lead in the server hardware field, companies producing dual-Pentium-based servers, such as Intergraph, are starting to close the performance gap.

As for OS platforms, UNIX continues to lead, but as Windows NT becomes available and optimized on a wider range of platforms, this gap will close further.

Performance considerations don't end with the decision of which platform to use. Even the most robust hardware and OS platform will be bottlenecked if it's constrained by insufficient bandwidth. A T1 line is a must for a moderately busy Web server distributing streaming audio feeds. And while a T3 line (T1*28) might be overkill, it's possible to multiplex T1 lines to provide increased bandwidth.
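A little arithmetic gives a feel for how quickly streaming audio uses up a line. The Python sketch below divides the raw capacity of one or more T1 lines (1.544Mbps each) by the bit rate of a single stream. It's a rough upper bound under my own simplifying assumptions: it ignores protocol overhead and any other traffic on the line, and the function name is hypothetical.

```python
import math

T1_KBPS = 1544.0  # raw capacity of one T1 line in kbps


def max_streams(stream_kbps: float, t1_lines: int = 1) -> int:
    """Upper bound on simultaneous streams that fit on multiplexed T1 lines.

    Ignores protocol overhead and other traffic, so real capacity is lower.
    """
    if stream_kbps <= 0 or t1_lines < 1:
        raise ValueError("stream rate and line count must be positive")
    return math.floor(T1_KBPS * t1_lines / stream_kbps)


if __name__ == "__main__":
    for rate in (8.5, 24.0):
        print(f"{rate}kbps streams on one T1: {max_streams(rate)}")
    print(f"8.5kbps streams on two T1s: {max_streams(8.5, t1_lines=2)}")
```

Even at the lowest stream rate, a single T1 tops out at fewer than 200 simultaneous listeners by this estimate, so a popular site will want to budget for multiplexed lines.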

The final choice of hardware, OS, and bandwidth might just come down to trial and error. You may have to experiment with certain system configurations before you settle on the right combination.

