09 - Understanding the Internet
by Azam A. Mirza
The enormous growth of the Internet during the early 1990s has fundamentally changed the computer software and hardware industries. Today, organizations and individuals alike consider the Information Superhighway a critical piece of the infrastructure that will dominate information technology well into the twenty-first century.

Microsoft's Internet strategy is a key component of the Microsoft BackOffice suite of server-based applications. Microsoft Internet Information Server, together with Windows NT, SQL Server, and Exchange Server, provides a comprehensive set of tools for connecting to the Internet and leveraging its technologies and services for business purposes.

This chapter provides a brief introduction to the Internet and how it has changed the way the world thinks of information technology. Although the main focus of this chapter is to introduce the Internet and provide a basic understanding of what it means for your organization to embrace the Internet, it also provides a road map of what BackOffice has to offer to make your organization's foray into the Internet world a success.
ARPANet initially used the UNIX operating system, the Network Control Protocol (NCP), and a 50 kilobits per second (kbps) line to connect four computers located at the University of California at Los Angeles, the Stanford Research Institute, the University of California at Santa Barbara, and the University of Utah at Salt Lake City. By the early 1980s, a new protocol for network connectivity and communications - the Transmission Control Protocol/Internet Protocol (TCP/IP) - was proposed and adopted for ARPANet use. As a public domain protocol, TCP/IP was widely accepted by the computing community for connecting to the fledgling ARPANet. By the mid 1980s, ARPANet had grown from its humble beginning as a four-computer network to more than 200 linked networks with thousands of computers.

Recognizing the potential of the ARPANet as a major network for research, education, and communications, the National Science Foundation (NSF) developed the NSFNet in the mid 1980s to provide a high-speed backbone for connecting to the Internet. From the mid 1980s to the early 1990s, the NSFNet backbone was consistently upgraded, from a 56 kbps line to a 1.544 Mbps line to a 45 Mbps line. The NSFNet played a major role in funding the advancement of the Internet as a viable tool for research and development. Shortly after the NSFNet backbone was put into place, other government agencies in the United States and organizations abroad got into the act with the creation of backbone networks by the National Aeronautics and Space Administration (NSINet), the Department of Energy (ESNet), and numerous European organizations. Over a short time period, the Internet grew to become a conglomeration of more than 5,000 networks in more than 60 countries with more than 5 million computers (see fig. 9.1).

Fig. 9.1 The Internet now provides connectivity to a major percentage of the world. (Illustration courtesy of the Internet Society.)

With the eventual decommissioning of the original ARPANet and the NSFNet backbone network as it was outgrown, other Internet backbone network providers emerged. Currently, the backbone for the Internet is supplied by a group of national commercial providers such as AT&T, MCI, and Sprint, as well as several smaller regional providers. Internationally, the backbone is supported by government organizations and private sector corporations.
The initial Internet was nothing more than a collection of networks connected together that facilitated the flow of information between computer users. Because the Internet was largely based on computers that ran various flavors of the early UNIX operating system, it was mainly a text-based, command-line environment. Also, the slow transmission lines connecting Internet users necessitated the use of techniques that required the least amount of bandwidth for transmission of data. Most early tools and applications used cryptic commands and minimal user interfaces to save transmission overhead.

However, as the NSFNet backbone was upgraded to higher speeds and the network became capable of handling higher volumes of information flow, the Internet became a more user-friendly and flexible environment. Efforts got underway to develop methods of accessing the information databases available on the Internet. Early efforts led to the development of tools such as Archie, Veronica, Jughead, and Gopher. All four of these tools serve the sole purpose of searching for and retrieving information. Archie was the first such tool. It uses a simple method to catalog the information and files available on remote machines and makes the list available to users. Subsequent tools became more and more sophisticated in their approach, leading to Gopher, which requires the use of special Gopher servers for collecting, storing, and displaying information content for use by Internet users. Gopher is widely used for providing catalogs of books in libraries and phone book listings. Veronica and Jughead each provide additional indexing and searching capabilities for use with Gopher servers. Even though all these search and retrieval tools are very sophisticated, they have one thing in common: they all use text-based user interfaces.
In the early 1990s, with the arrival of the Microsoft Windows graphical user interface, the continuing popularity of the Apple Macintosh user interface, and the X-Windows environment on the UNIX operating system, the graphical user interface became the norm on the user desktop rather than the exception. However, the Internet was still largely a text-based environment in a world becoming predominantly graphical. When it became apparent that it was possible to publish information on the Internet for access by the mass population, efforts got underway to develop tools for graphical display of the information.

The key factors responsible for the Internet's exponential growth are the development of the World Wide Web (WWW), covered later in this chapter, and a user-friendly way to browse through the information available on it. The development of the Mosaic graphical Internet browsing tool (in 1993) at the University of Illinois' National Center for Supercomputing Applications made the Internet accessible and much easier to use. Mosaic allowed graphical point-and-click navigation of the vast Internet expanse and permitted people to experience the Internet without having to learn arcane and difficult UNIX utilities and commands. Naturally, this proliferation of users led to creative approaches to sharing information on the Internet, and as the amount of quality information from an expanding variety of sources increased, the Internet phenomenon became known as the Information Superhighway, a topic covered in the next section of this chapter.

Through its first 20 years of existence, the Internet simply facilitated communications between researchers, scientists, and university students. Its primary value was in providing users with the capability to exchange electronic mail (e-mail) messages, participate in discussion groups, exchange ideas, and work with each other. The Internet was strictly a nonprofit domain that resented and shunned anyone who tried to make a dollar from its use. However, in the last three to four years, the Internet has gone through a tremendous transformation. The Information Superhighway is now a place for communicating, advertising, conducting business, and providing information to the masses or the individual.
These capabilities are available through the many services on the Internet, such as:

- The World Wide Web (WWW)
- Usenet newsgroups
- Electronic mail (e-mail)
- Telnet
- The File Transfer Protocol (FTP)
- Gopher
Each of these services is described in the following sections.
Fig. 9.2 Organizations, such as G. A. Sullivan, can convey information through the WWW.

The following languages and interfaces are used by WWW servers and browsers to facilitate communications between them:

- The Uniform Resource Locator (URL)
- The HyperText Markup Language (HTML)
- The Common Gateway Interface (CGI)
- The Virtual Reality Modeling Language (VRML)
Each of these is described in the following sections.

The WWW uses a standard called the Uniform Resource Locator (URL) for identifying services and machines available across the Internet. The URL is used for identifying the kind of service being used to access a resource, such as FTP, WWW, Gopher, and so on. A URL uniquely identifies a machine, service, or product over the Internet. A URL has three parts:

- The scheme, which identifies the service being used
- The address of the server providing the resource
- The path to the resource on that server
For example, the URL for accessing the Microsoft FTP server to download a file called readme.txt is ftp://ftp.microsoft.com/readme.txt. This means that the service being used is FTP (the scheme); the server address is ftp.microsoft.com (the address); and the file to download is readme.txt (the path). To further explain the URL format, every URL scheme is followed by a colon (:), followed by two slashes to identify that an address follows. Simply put, URLs are a consistent way of identifying resources on the Internet.

The HyperText Markup Language (HTML) is the scripting language used to define WWW server information content. HTML is a plain text ASCII scripting language that uses embedded control codes, like the word processors of old, to achieve formatting of text as well as graphics, images, audio, and video. The information is stored as files on a WWW server. When a Web browser is used to access a file, the file is first interpreted by the browser; the control codes are decoded, and the formatted information is presented to the user in a graphical format referred to as a Web page. The WWW and HTML were both first developed at CERN in 1990. HTML1 was the version used by initial Web browsers such as Mosaic. The current standard is HTML3, which incorporates tables, figures, and other advanced features into WWW document creation. Figure 9.3 presents a sample HTML document for a WWW site home page, and figure 9.4 presents the page as it looks when viewed using the Internet Explorer 2.0 Web browser.
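The following short Python sketch, assuming Python 3 and a live network connection (the address shown is purely illustrative, not from the book), makes the URL and HTML ideas concrete: it splits an address into the scheme, server address, and path described above, then retrieves the raw HTML text that a browser would interpret into a formatted Web page.

```python
# A minimal sketch: split a URL into its three parts and fetch the
# plain-text HTML behind it. The address used here is illustrative.

from urllib.parse import urlsplit
from urllib.request import urlopen

url = "http://www.microsoft.com/"
parts = urlsplit(url)
print("scheme :", parts.scheme)        # the service being used (http)
print("address:", parts.netloc)        # the server providing the resource
print("path   :", parts.path or "/")   # the resource on that server

with urlopen(url) as page:             # request the document itself
    html = page.read(500).decode("latin-1", errors="replace")
print(html)                            # ASCII text with embedded HTML control codes
```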
Mosaic is the graphical Web browser developed by the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign.
Fig. 9.3 HTML source documents are used to create WWW home pages.

Fig. 9.4 WWW home pages can be viewed using the Microsoft Internet Explorer Web browser.

The HT in HTML stands for HyperText, an important concept in WWW browsing. Hypertext, or hyperlinks, refers to links defined within normal textual documents that allow a user to jump to another part of a document. The Windows help system is an example of a document-based system that uses hypertext links. By clicking highlighted or underlined words, users can navigate easily throughout the help system, even between different help files. The WWW takes the same concept to the next level by allowing hypertext links between Web pages and even between WWW sites. By clicking hypertext links defined on a Web page, users can not only navigate within the same WWW site and view different pages, but can also jump to links pointing to sites on other WWW servers in remote locations. This powerful feature allows navigation of the Internet in a manner never possible before the advent of the WWW.

The HTML standard is platform independent because it does not incorporate any codes that specify platform-unique parameters. For example, the codes might specify what size font to use, but not the specific font face; that is left up to the browser to determine based on the platform on which it is running and the fonts available on that machine.

The Common Gateway Interface (CGI) is a standard for extending the functionality of the WWW by allowing WWW servers to execute programs. Current implementations of WWW servers allow users to retrieve static HTML Web pages that can be viewed using a Web browser. CGI extends this idea by allowing users to execute programs in real time through a Web browser to obtain dynamic information from the WWW server. For example, a WWW site may allow users to obtain up-to-the-minute stock quotes by executing a program that retrieves the stock prices from a database.
The CGI interface basically serves as a gateway between the WWW server and an external executable program. It receives requests from users, passes them along to an external program, and then displays the results to the user through a dynamically created Web page. The most common use of the CGI standard is querying information from a database server. Users enter queries into a Web page; the WWW server accepts the data, sends it to the application or processing engine that will process it, accepts the results back from the processing engine, and displays the results to the user. The CGI mechanism is fully platform independent and can transfer data from any browser that supports CGI to any WWW server that also supports CGI. Because a CGI program is basically an executable file, there are no constraints on what kind of program can be executed through a CGI script. A CGI program can be written in any language that can create executable programs, such as C/C++, FORTRAN, Pascal, Visual Basic, or PowerBuilder. CGI programs can also be written using scripting languages, such as Perl, UNIX shell scripts, or MS-DOS batch files. A minimal example follows.
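As an illustration only (not code from the book), the following Python sketch shows the shape of a simple CGI program: the WWW server runs the executable, passes the user's query through the QUERY_STRING environment variable, and returns whatever the program writes to standard output to the browser as a dynamically created Web page. The "symbol" parameter and the quote lookup are hypothetical.

```python
#!/usr/bin/env python3
# Minimal CGI program sketch: read the query, emit an HTTP header,
# then emit an HTML page built on the fly.

import os
from urllib.parse import parse_qs

query = parse_qs(os.environ.get("QUERY_STRING", ""))
symbol = query.get("symbol", ["MSFT"])[0]

# A real site would query a database or pricing feed here;
# a placeholder value stands in for the result.
price = "123.45"

print("Content-Type: text/html")   # header describing the reply
print()                            # blank line ends the header section
print(f"<html><body><h1>Quote for {symbol}</h1>")
print(f"<p>Last price: {price}</p></body></html>")
```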
The Virtual Reality Modeling Language (VRML) is a scripting language for displaying 3D objects on the WWW. The VRML addition to the WWW allows the display of interactive 3D worlds (for example, a virtual computer-generated model of a university campus) that can be traversed by the users accessing them. The capabilities and opportunities afforded by VRML are limited only by the imagination of the Web page author and the available bandwidth. VRML promises to provide the capability to visit virtual worlds on the WWW, walk through them, and experience the multimedia power of the WWW. Microsoft maintains a sample Web page to demonstrate the capabilities of the VRML technology, as shown in figure 9.5. If you would like to view such a site, use a search engine (like http://www.yahoo.com or http://www.webcrawler.com) and execute a search on VRML.

Fig. 9.5 Microsoft Internet Explorer can be used to view VRML 3D objects.
Usenet uses the TCP/IP-based Internet backbone as its transport mechanism. The standard used by Usenet news (or netnews) for propagation of Usenet traffic is called the Network News Transport Protocol (NNTP). NNTP is a higher level protocol that runs on top of the TCP/IP protocol to facilitate communications between various servers running the Usenet server software.
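As a rough sketch only (not from the book, and using a placeholder server name), the Python fragment below shows the kind of NNTP conversation that takes place over TCP port 119: the client reads the server's greeting, asks about a newsgroup, and says goodbye.

```python
# Minimal NNTP conversation over a raw TCP socket.
# Replace news.example.com with a reachable NNTP server to run it.

import socket

HOST, PORT = "news.example.com", 119         # NNTP's well-known port

with socket.create_connection((HOST, PORT), timeout=10) as s:
    print(s.recv(1024).decode())             # greeting, e.g. "200 ..."
    s.sendall(b"GROUP comp.infosystems.www.misc\r\n")
    print(s.recv(1024).decode())             # "211 <count> <first> <last> <group>"
    s.sendall(b"QUIT\r\n")
```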
The newsgroups allow users with the appropriate news reading software to view articles posted to these groups, post their own articles, and reply to articles posted by other users. After an article is posted to a Usenet newsgroup, the article is broadcast using the NNTP service to other computers connected to the Internet and running the NNTP service. Usenet groups differ from mailing lists in that they require central storage of articles at an NNTP server computer for viewing by all members of the network connected to that computer. At last count, there were more than 14,000 Usenet newsgroups covering topics ranging from distributed computer systems to daily soap opera discussions.

A multitude of Usenet news reading software programs are available on the Internet as shareware and also commercially. Internet Web browsers such as Microsoft Internet Explorer and Netscape Navigator have news reading capabilities built in. Figure 9.6 shows the user interface of a shareware product called Free Agent used for reading newsgroups. The same company also offers a more complete version of the program called Agent version 1.0.

Fig. 9.6 Free Agent is a shareware Usenet news reading program available for reading Internet news.

The latest news reading programs, such as Free Agent, provide sophisticated features such as:
E-mail provides a fast and cost-effective method of communication that is remarkably useful. E-mail messages can travel across the world in a matter of minutes to reach their destination. Even though the WWW has been instrumental in bringing the Internet to the masses and transforming it into the Information Superhighway, the speed, effectiveness, and simplicity of the e-mail concept has made it the most widely used service over the Internet.
Numerous commercial and shareware software packages are available for receiving and sending e-mail messages using the SMTP service. Popular proprietary e-mail programs such as Microsoft Mail, Microsoft Exchange, and Lotus cc:Mail have special interfaces for receiving and sending SMTP-based e-mail. Microsoft Exchange Server is a part of BackOffice and is covered in Chapters 12 through 16.

An offshoot of individual user-to-user e-mail connectivity has been the invention of mailing lists. As the name suggests, mailing lists are similar in concept to the mass mailings you receive through the postal service. However, on the Internet, you must subscribe to a mailing list. Users simply send a message to the mailing list administrator asking to be included in the list, and shortly thereafter they start receiving messages originating from the list as normal e-mail messages.
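Returning to the SMTP service itself, a minimal sketch of handing a message to a mail server from Python looks roughly like the following (not from the book; the host name and addresses are placeholders, so point them at a server you are actually allowed to use):

```python
# Minimal SMTP send: compose a message and hand it to a mail server.

import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "user@example.com"
msg["To"] = "friend@example.org"
msg["Subject"] = "Hello over the Internet"
msg.set_content("E-mail can cross the world in minutes.")

with smtplib.SMTP("mail.example.com") as server:   # connects on TCP port 25
    server.send_message(msg)
```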
Fig. 9.7 Telnet can be used to connect to remote Internet host machines.

Telnet is inherently a command-line application interface that uses the popular VT-100 terminal emulation for displaying information to the user. When users log in to a remote machine using telnet, they are presented with a command-line prompt. Users may execute any command-line programs using telnet. The TCP/IP protocol suite included with Windows NT includes a telnet client as part of the TCP/IP utility programs.
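Under the hood, telnet is simply a TCP connection to port 23 plus terminal-option negotiation and VT-100 emulation. The rough Python sketch below (not from the book; the host name is a placeholder) only opens the connection and prints the first bytes the server sends, which typically include option-negotiation codes and a login banner.

```python
# Peek at the start of a telnet session over a raw TCP socket.
# A real client would negotiate options and emulate a VT-100 terminal.

import socket

with socket.create_connection(("host.example.com", 23), timeout=10) as s:
    print(s.recv(512))   # raw bytes: negotiation codes and/or a login banner
```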
The TCP/IP protocol suite included with Windows NT includes an FTP client and an FTP server as part of the TCP/IP utility programs. The FTP system is platform independent and facilitates file transfers between disparate systems, such as a UNIX workstation and a DOS PC. The FTP protocol allows for the transfer of both plain text ASCII files and binary files. Figure 9.8 presents a sample FTP session with the ftp.microsoft.com server site.

Fig. 9.8 Use FTP to transfer files between a local computer and a host computer (both on the Internet), such as Microsoft's FTP server.

FTP uses a command-line interface that requires users to know and understand FTP keywords for transferring files. Many graphical FTP programs also are available that facilitate point-and-click use of FTP services. The FTP standard defines a basic set of commands that must be supported by all implementations of the FTP service.

Because FTP uses clear text for the transfer of information between client and server, it is not a very secure service, and it should not be used for transferring sensitive files or information. For example, when connecting to a host computer, users are required to enter a logon ID and password. The logon ID and password are passed from the client to the server in clear text, so there is a potential for the information to be intercepted and viewed by a third party.

A powerful feature of the FTP service is its capability to allow anonymous logons. An anonymous logon is similar in concept to the guest account on Windows NT machines. It allows users to log on to a machine and have viewing and reading rights on predetermined directories and files on the system. Users can download files using anonymous FTP from any FTP server that allows anonymous logons. Anonymous FTP actually makes the FTP service more secure than normal logons because the capabilities of the anonymous client can be strictly limited.
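As a rough sketch of an anonymous session like the one in figure 9.8 (not code from the book; ftp.microsoft.com is the text's example server and may no longer accept connections), Python's standard ftplib module can log on anonymously, list a directory, and download a file:

```python
# Minimal anonymous FTP session: log on, list the directory,
# and download readme.txt in binary mode.

from ftplib import FTP

with FTP("ftp.microsoft.com") as ftp:
    ftp.login()                        # anonymous (guest-style) logon
    print(ftp.getwelcome())            # server banner
    ftp.retrlines("LIST")              # directory listing as plain text
    with open("readme.txt", "wb") as out:
        ftp.retrbinary("RETR readme.txt", out.write)
```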
Gopher is similar in concept to the FTP service; however, it allows only the retrieval of information and has no provision for uploading information to the server. Gopher does provide the following significant advantages over FTP:
Figure 9.9 presents a sample session with a Gopher server.

Fig. 9.9 Use a Gopher client to connect to a Gopher server.
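The Gopher protocol itself is very simple. As a sketch only (not from the book; the host name is a placeholder), a client connects to TCP port 70, sends a selector string followed by a carriage return and line feed (an empty selector asks for the top-level menu), and reads back a plain text menu:

```python
# Minimal Gopher request: fetch the root menu from a Gopher server.

import socket

with socket.create_connection(("gopher.example.com", 70), timeout=10) as s:
    s.sendall(b"\r\n")                 # empty selector = top-level menu
    menu = b""
    while True:
        chunk = s.recv(4096)
        if not chunk:                  # server closes the connection when done
            break
        menu += chunk

print(menu.decode("latin-1"))
```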
This is only a partial list of what the Internet offers, and as time goes by and technology advances, you are sure to see other applications added.
These concepts are supported by the predominant applications being implemented that utilize the Internet, and the WWW in particular:

- Marketing and advertising products and services
- Buying and selling products (electronic commerce)
- Keeping customers informed through mailing lists
- Electronic publishing
The most common method of marketing information about your business is to develop a WWW site for introducing your products. Many organizations have created WWW sites to introduce customers to their products in the hope of enticing them to buy. Organizations use unique and innovative ideas to attract potential customers to their WWW sites. Figure 9.10 shows the WWW site maintained by Toyota for providing WWW users with information about its automobiles.

Fig. 9.10 Toyota maintains a WWW site for introducing its automobile products to potential Internet-based customers.

Another common method of advertising over the WWW is to buy advertising space on WWW sites that Internet users visit at regular intervals. Organizations providing services to Internet users, such as the Yahoo WWW search database site, sell advertising space on their WWW sites to finance their operations. Businesses can buy advertising spots on such sites to grab users' attention and introduce them to products being offered by organizations around the world. These advertising spots usually also include hyperlinks to the WWW site maintained by the advertiser so that interested users can immediately navigate to that site to check out the products. Figure 9.11 depicts the Yahoo Search Database WWW site with some advertisements by other businesses.

Fig. 9.11 The Yahoo WWW Search Database sells advertising spots on its site.
Before commercial activity was acceptable over the Internet, the only means of buying or selling products over the Internet was through special Usenet newsgroups. These newsgroups were set up to allow users to trade with one another and are still heavily used by Internet users. For example, a newsgroup called misc.forsale.computers.monitors exists for allowing users to buy or sell computer monitors. The newsgroup rec.photo.marketplace is used exclusively to buy and sell photography equipment.
However, after the NSFNet backbone was decommissioned and the Internet became more commercially oriented, changes were made to allow commercial activity over the Internet. The WWW became the medium of choice for carrying out electronic commerce over the Internet. One of the most famous and popular WWW sites for sales of commercial products over the WWW is The Internet Shopping Network. The ISN, as it is more commonly called, is one of the first WWW sites developed exclusively to sell commercial products over the Internet. Figure 9.12 displays the home page for the Internet Shopping Network WWW site.

Fig. 9.12 The Internet Shopping Network sells a multitude of products to WWW users.

Over the last couple of years, hundreds of WWW sites have sprung up for selling products to the Internet community. They are commonly referred to as online shopping malls. The Internet can be used to buy products in any imaginable category, from clothes to skiing gear to computers to boats. The traditional businesses that have moved fastest to embrace the Internet have usually been in the apparel retail business and the mail-order catalog business. Companies such as Lands' End, The Nature Company, The Limited, the Damark mail-order catalog, and others have moved quickly to adapt their businesses to embrace the Internet.
Mailing lists are maintained by organizations around the world to inform their customers of new happenings, product announcements, and other important events. One example of a mailing list is the Microsoft Windows NT Server mailing list available to registered users of Windows NT Server. It informs them of product updates, bug fixes, and upcoming events relating to the Windows NT Server product.
Fig. 9.13 WWW content creators also have their own Web pages. The WWW is also being used as a medium for electronic publishing. Following are examples of popular uses of the WWW:
Fig. 9.14 ESPN has a WWW site available over the Internet called ESPNet SportsZone.
Today, the Internet comprises a backbone of networks maintained by companies such as AT&T, MCI, and Sprint. This backbone network typically runs at speeds of 45-100 Mbps. It also provides connectivity to the mid-level and regional networks operated throughout the world. For example, one of these is the Canadian Network (CA.net), which provides connectivity to most of Canada. These mid-level and regional networks in turn provide connectivity to local organizations, universities, and Internet Service Providers (ISPs), who provide or sell Internet connectivity to the commercial and private sectors.
The enormous growth of the Internet has resulted in a shortage of available IP addresses. An inappropriate allocation of addresses in the Class B range has resulted in inefficient address usage. Many organizations have been assigned relatively large Class B addresses when a smaller Class C address range would have sufficed. Efforts are underway to rework the addressing scheme to alleviate the problems.
Dividing the IP address space into classes makes it easier to distribute addresses using a top-down hierarchy. Backbone providers are usually assigned the largest address blocks, with lower-level organizations being assigned Class B or Class C ranges under them. For example, Macmillan Computer Publishing USA is assigned the Class C network 199.177.202.X, giving Macmillan Computer Publishing 256 host addresses to assign as it sees fit within its domain. Macmillan's network provider holds the larger 199.177.X.X block, allowing it to assign more than 65,000 addresses to its customers. The IP addressing scheme purposely distributes the administration of address assignment to lower levels of the hierarchy for autonomy of operations.
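In the classful scheme, the class of an address (and therefore how much of it names the network versus the individual host) can be read straight from its first octet. The short Python sketch below (an illustration, not from the book) applies that rule to the 199.177.202.10 address used in the next section:

```python
# Determine the classful category of a dotted-quad IPv4 address
# from its first octet.

def address_class(ip: str) -> str:
    first = int(ip.split(".")[0])
    if 1 <= first <= 126:
        return "Class A"   # first octet is the network; millions of hosts
    if 128 <= first <= 191:
        return "Class B"   # first two octets are the network; ~65,000 hosts
    if 192 <= first <= 223:
        return "Class C"   # first three octets are the network; 254 usable hosts
    if 224 <= first <= 239:
        return "Class D"   # multicast; not assigned to individual hosts
    return "Class E / reserved"

print(address_class("199.177.202.10"))   # -> Class C
```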
Because it is difficult to remember cryptic numbers, the Internet uses a naming convention called the Domain Name System (DNS), which maps easy-to-remember names onto numeric IP addresses. For example, the address of the Macmillan Computer Publishing USA Internet server is (currently) 199.177.202.10. Because it is difficult to remember a number like this, the server has been assigned the name www.mcp.com. Just like the numeric IP addresses, domain names are separated by dots for the purpose of creating a name-based domain hierarchy. Domain names are read backwards to identify the top-level domain and so on. With www.mcp.com, the top-level domain is the com domain. The mcp domain is a mid-level domain that is part of the com domain, and www is one of the names of a computer in the mcp domain (it may have other names as well). Six top-level domains are defined for the United States:

- com: commercial organizations
- edu: educational institutions
- gov: government agencies
- mil: military organizations
- net: network providers
- org: other organizations, typically nonprofit
In addition, countries around the world have each been assigned a two-character top-level domain name. For example, "uk" is for the United Kingdom; "ca" is for Canada; "au" is for Australia; and "fi" is for Finland. In practice, most DNS names use anywhere from two to four levels of nested domains, a choice driven by ease of use and simplicity.
Dividing computers and networks into domains distributes the administration of the naming system to lower levels of the hierarchy. Because every computer name on the Internet must be unique, it is easier to handle the administration by placing the responsibility on network administrators to maintain uniqueness within their own domains. For example, you can have two computers, each named "server," and still be legal as long as one is in the gasullivan.com domain (server.gasullivan.com) and the other is in the hamilton.gasullivan.com domain (server.hamilton.gasullivan.com). You cannot have both computers in the gasullivan.com domain because their name strings would be identical (server.gasullivan.com). However, because "hamilton" is a subdomain of the gasullivan.com domain, it is possible to have a second computer named "server" within the hamilton.gasullivan.com subdomain. As long as you can append the computer name to a domain name and have the entire name string be unique, you have satisfied the naming convention.
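As a quick illustration (not from the book), Python's standard socket module can ask DNS to translate a name such as the www.mcp.com example into its numeric IP address; the address returned today may differ from the 199.177.202.10 value quoted in the text.

```python
# Resolve a DNS name to its numeric IP address.

import socket

name = "www.mcp.com"
try:
    print(name, "->", socket.gethostbyname(name))
except socket.gaierror as err:
    print("lookup failed:", err)
```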
On the client access side, the advances have been even more dramatic. The Internet has evolved from a network of supercomputers accessible via 300 bps lines to a network of networks accessible from millions of locations at speeds of up to 45 Mbps. Today, the lowest acceptable access speed for the average user connecting to the Internet is 28.8 kbps, and 64-128 kbps ISDN lines are quickly increasing in popularity. In the near future you can expect these access speeds to increase by an order of magnitude. Cable modems with claimed transmission rates of 5-10 Mbps are already on the horizon. Advances in ATM (Asynchronous Transfer Mode - a new standard for network connectivity) technology promise to put 25-100 Mbps network connections on corporate desktops, and eventually this technology will trickle down to the private sector. Because ATM technology is scalable, transmission rates will only go up from here. Additionally, satellite connections will provide increased access speeds for the Internet community.

Advances on the software side will be just as interesting. The current push is to develop standards for securing financial transactions on the Internet. When the standards are in place, you can expect to see a multitude of software applications ranging from secure online shopping to online banking and online trading of financial instruments. Electronic commerce, a means for conducting financial transactions such as credit card purchases, stock purchases, and automatic fund transfers from bank accounts, will become a common occurrence as users conduct their day-to-day business using the Internet. Tools such as the InternetPhone, which allows users to carry on a real-time, audio-based conversation with each other over the Internet, have already broken new ground toward a new class of multimedia applications for the Internet. With the increase in available access speeds, you can expect to conduct extensive audio- and video-based interactive sessions on the Internet. The Internet will become a place where people can interactively communicate with one another.

If the Internet maintains its current rate of growth, and there is every indication that it will, most of the world's population could have Internet access by the end of the century. Internet users will also have a host of professional and personal productivity applications and tools available to them on the Information Superhighway. The Internet is sure to provide the following:
The possibilities on the Information Superhighway are limitless. The rest of this decade will allow us to witness technologies and applications well beyond the dreams of the original Internet creators when they conceived the idea some 20 years ago.