Veronica

by Billy Barron

Fred Barrie and Steven Foster

--by Tod Foley

Access Tricks
Different or Invalid Results
The Stoplist
Relevance Ranking
Word Stemming
Multiple Word Searching
Boolean Searching
Word Delimiters
Using Synonyms
Seeing More Responses
Searching on Types
Link Files
Helping Veronica
Competing Systems
Future Directions
Summary

16

Fred Barrie and Steven Foster are the creators of Veronica, a powerful Gopher-based program that continually gathers and maintains a list of menu items on all Gopher servers worldwide—enabling you to search Gopherspace by topic rather than location. By finding your keywords and providing you with a simple menu-driven interface to navigate Gopherspace by, Veronica certainly lives up to her name: Very Easy, Rodent-Oriented, Netwide Index of Computerized Archives. She was the main topic of discussion recently on IRC channel #veronica, where Barrie and Foster met with me to talk about their work.

<tod> Tell me something about the origins of Veronica.

<barrie> Well, in the beginning we started by compiling the code for Gopher.

<foster> Steve and I were trying every little program and if it did not compile the first time, we gave up on it.

<barrie> Well, to Paul Lindner's credit, it compiled the first time and we were in the Gopher world.

<tod> How long have you worked together?

<foster> We started together in the fall of '91. We had both worked for NV computer services for a while before, but not together.

<foster> It got exciting together when we found we both thought Gopher was interesting. At that time we were the only 2 here who thought so.

<barrie> After the Gopher server was compiled at UNR, we started to think how to improve Gopher. Originally we were to index the Gopher world similar to Archie; that is, find out where an item is and then give the route to the user to find it.

<barrie> The idea was to give them the address of the Gopher server that matched their query and then the directory—they would have to go down to find that item.

<barrie> Steve came up with the great idea of actually connecting them direct, and then we hit the big time.

<tod> Ahaa! Like the idea that Archie and FTP should really be one tool....

<barrie> Yes, Archie and FTP should be one tool....

<foster> Actually, the dumb way of doing it would make some sense. At first, people were very interested in "Where is this resource?" and what institutional domain it belonged in; that is, they really felt they needed a lot of context.

<tod> The realspace was really that important to people?

<foster> Well, yes. I talked to Peterd [Peter Deutsch] shortly after our work began, and Peter did emphasize that...no, he emphasized that the "failure" of archie to directly connect is really a client issue, not a server issue. He said someone should do that client. But mainly, it was timing. The interactive protocols weren't there until '91...late '91.

<barrie> Actually, Gopher is the FTP and Archie client that made that all work.

<tod> So what happened next for you and V?

<barrie> Well, we found some code that did a simple search of a Gopher site for administrative purposes, by Boone at Michigan State.

<barrie> Well, we took that code and modified, really took a machete to it :) and started to do an index of about 250 Gopher sites.

<barrie> The first index we used was on a NeXT computer, using the Digital Librarian with really really small data files to speed up the index process.

<barrie> After we brought the NeXT computer to its knees by the third day of the release...

<tod>!!!

<barrie>...we started to create new index schemes.

<tod> How many Gopher sites are there now?

<foster> June 1994: about 5800 Gopher sites.... It is a bit inexact because some gateways are halfway Gophers, maybe. From analysis of the last Veronica data collection, it looks like Gopher is being used more as "duct tape" than it was at first: many more gateways to FTP, Usenet, etc. At first we just had files, of course.

<tod> Okay—we're back in '92....

<foster> Okay, in 1992...

<barrie> Well, back in '92 we gave the guys at University of Minnesota a sneak peek at our little server.

<barrie> Farhad from U Minn came back and said we had to go to the Internet with this.

<barrie> So on November 17, we made an announcement that Veronica had started and asked people to comment on our great experiment in networking.

<tod> You originally saw V as just a personal tool?

<foster> NO NO—not just as a personal tool. But we needed the encouragement from the Minnesota team to announce it before we thought it was ready.

<foster> We wanted to work on it some more. They essentially said that the world needs this now, go for it guys, or some such Farhad-ism!

<tod> Well, Gopherspace sure the hell needed it....

<foster> That it did. Most interesting response.

<tod>"Most interesting?"

<foster> We posted to a few lists on the evening of Nov 17, and by the next morning our little NeXT box had taken about 4700 search connections!!!!

<tod>!!!

<foster> We were pleased.

<barrie> Oh by the way—the code was written in Perl.

<tod> Thanx.

<barrie> Let us say that it was the first thing we had ever written in Perl.

<tod> You guys are quick learners....

<barrie> We were learning Perl and Veronica at the same time. We made a lot of improvements in the first three months.

<tod> What have some of the significant improvements been?

<barrie> At first, Veronica was a search that took quite a bit of time to run.

<barrie> On the order of an Archie search, somewhere about 20 minutes or longer per search.

<barrie> Most people were happy with this since they were used to Archie.

<tod> How did you cut that down?

<barrie> I read a book on how to index...after I tried to hack the WAIS code into Veronica.

<barrie> The WAIS code was faster but it still lacked something...about 10 minutes.

<barrie> I experimented a lot with different codes and finally got the search time down to a reasonable time—20 to 30 seconds. This was all in April 93 (right before GOPHERCON '93). We had to show off a little :)

[Here Barrie gets offline for a short time; the interview continues.]

<tod> Let's talk about your individual contributions—how's the work divided between you and Fred? Who does what?

<foster> It has varied at different parts of the project. Fred was definitely faster at learning Perl than I was. On the other hand, I saw some of how the architecture should work, first. The first cut was really very evenly split as to server code. I did the first writeups on the WAIS cut, Fred started it, and I continued with it. I was still trying to write the server in C, using pieces of WAIS code, when Fred redid it in Perl with his nice no-overhead algorithm. Last summer I went off and consulted on some other stuff for a while; Fred did the data harvests alone for a couple of months then.

Anyway, I took over managing the harvests last August. Have done them since. Fred wrote some new stuff into the server and harvesting code in the fall. Since Feb, I have been doing it without Fred. [He] has actually started a new project with a different firm, and I think he will talk about that when he gets back on.

I am working on some new things for Veronica; improved interfaces, mostly. Seems every month there is a different pressing aspect to server bugs, or better control of no-index options, or something. In the next few months I want to work on interface issues.

<tod> There are plenty of people doing GUIs of one type or another.

<foster> YUP. I am definitely interested in the server end of it.

<tod> how do they fit with you, commercially/professionally? Has anyone approached you to package "Veronica in a Box?"

<foster> Interesting question. Yes, there have been some suggestions, but nothing real firm.

<tod> Is the code public domain?

<foster> No, the code is copyrighted. Shareware with no fee.

<tod> Heh!

<foster> Yes, that is the way we think it should be. I do think it is better to avoid proliferation of many indexers for the Net's sake. I am working with Peterd (for instance) to keep synchronized about options for controlling no-index flags, update frequency, and so forth. And of course we are working with the MN team to add things to Gopher servers which can simplify the indexing problem....

<foster> Tell me Tod, do you have any ideas about how to package "Veronica in a box?"

<tod> Heh. I'm thinking of a personalized Veronica. She could limit her harvests to categories determined by questions you answer.

<foster> What would that be like ( for you :^) ?

<tod> For me? Well, "interactive," "fiction," "games," "magick," and she could infer from that, and go do harvests for me.

<foster> Ah, yes. But not, of course, if each one actually does a harvest on-demand. Oddly, there is much more interest in many topics like that, but not so many resources on the Gophers. People do ask for lots of things that are not in the index yet. I hope I can do something with that information, to promote services to supply that need.

<tod> I guess it would really be sort of an AI-esque shell over a Veronica client....

<foster> Yes, the AI Veronica. I am actually interested in telescript and so forth.

[At this point, Fred Barrie rejoins the conversation. Because of his quick return, he is now known as "Fast."]

*** Fast (barrie@shadow.scs.unr.edu) has joined channel #veronica

<fast> Hello again.

<tod> Hey, FastFred. <g>

<foster> These guys can't stand alone and do their own harvests; they need an infrastructure. At least with foreseeable channel capacity.... I certainly sound geeky here don't I ? <g>

<fast> Say, where are we?

<tod> We were just talking about AI Veronicas.

<foster> Right, we had just been thinking about how a personalized Veronica should work.

<fast> Whoa, went off the deep end didn't we?<g>

<tod><g>

<foster> I summarized past work division, and told Tod that you had been doing something else since Feb, but other than that have not talked about the future.

<tod> Even though the AI Veronica is a ways off, I was wondering what's the next step now for Veronica?

<fast> Well, in my opinion we need to bring V fully up to Gopher+, with both an index for Gopher+ fields and a neat ask block interface.

<foster> Yes, we will make V handle URLs soon, for URL-savvy clients....

<fast> I think the URLs will be very important. We've talked about a V that returns html for those clients....

<tod> co0lco0l. URLs are starting to be listed in mags & directories in print....

<foster> Yes, but more important, V needs to provide Gopher addresses in URL form when asked, because the client software can do stuff with those URLs.

<fast> I have envisioned a BIG V that could handle WWW, FTP, and Gopher.

<tod> Yaas—The other side of the tool...the supertool....

<foster> YES. That side. Big server side. I want to do it. Actually, I am thinking a lot about that.

<fast> Well, the supertool is not that far away. Veronica can do both FTP and Gopher now but we turned off the FTP.

<foster> Yes, exactly. It is not so far away. And it will need the AI to do it correctly, or something a little smarter on the client side. Need to sort and cross-link and so forth.

<fast> I think if Mosaic could do Gopher+ we would be there.

<tod> How far away, realistically, is the required AI?

<foster> I think the required AI is available. But I don't know how to really use it until we have URNs.

<fast> Yes, URNs would make the job easier.

<tod> URNs—Names?

<foster> Yup. Uniform Resource Names...

<fast> ...so we have one resource for any service.

<tod> Fff... who does *that* indexing job?

<foster> He hee.

<fast> Say anyone...we all could do it....

<foster> That is a good question. We are asking whether the AI will do it or librarians will do it. But I think Fred is getting at a potential for us EACH to register our domain name space of resources, as we are each providers...so we could have URNs within a year? 2?

<fast> Yeah 2...the Internet community is sometimes slow in committee.

<foster> Lots of far-out ideas here. One very interesting thing to me is a sort of filter that could impose various editorial organizations on the realm of resources—a plug-in editor module. If I like Tod's view of the Magick on the Net, I can use his view. Otherwise, use McKenna's :^}

<tod> Uh-huh. Different filters for different industries/whatever....

<foster> Yes.

<fast> I think that all of the next-generation stuff can be done if we only had more time to experiment. The difficulty is two years ago we could waste bandwidth, but now the Internet is growing so big that it is unwise to do that. So we need to get our ducks in a row before we announce the next tool.

<tod> How long do you plan on being "in charge" of Veronica? Do you see "passing her along" some day, or continually refining and developing her?

<fast> Actually, I have already passed the Veronica stuff over to Steve.

<foster> I want to keep developing her. Continual refinement does not look boring to me. Lots of interesting stuff keeps coming along.

<fast> I got a *real* job and am no longer employed at UNR. My efforts are being put into NorthStar for now.

<tod> What are you doing there?

<fast> NorthStar is an attempt at indexing the Web. So far so good. (Plus it will fall into Veronica in the future.)

<fast> What I did was take the existing code for Veronica and modify (note I am not a programmer but a major hacker) it so that it could understand URLs, then I went out and indexed the WWW with this piece of code. Same idea, gather the information, retain what I need for the index, index that information, and present it in proper WWW format for clients.

<tod> It seems that conservation of time is always a key issue with you—is this a life thread ;-)?

<fast> Well yeah...so many things to do and so little time.

<foster> Interesting...time conservation is a major thread for us both. We wonder if we are normal.

<fast> <g>

<foster> We are Net guys. Our view exceeds our sleep requirement ...

<fast> The major problem, however, to the great tools of the future besides time is the computing resources. It is hard to experiment on a working Veronica server.

<tod> You need a (what?) a pseudoserver for testing?

<fast> Yes, a fast workstation for testing purposes that is not being bogged down.

<tod> Um-hmm.

<fast> Something we can reboot often :)

<fast> Not that I have ever brought a computer to its knees, mind you :)

<tod> Heh. Are there any things you'd like to be sure to say to the future gurus who read this interview?

<fast> My suggestion to them is: On the Internet, go ahead and just do it. No one else will do it until you do, so if you get an idea....go for it.

<tod> Like the god Nike sez!

<fast> Oh yes, no committee work either....

<foster> I think I also would make a plug for public interest. We are all shaping the way the Net will be, and what its potentials will be...so keep thinking about how it "should" be.

On the surface, Veronica may seem very simple. You enter a search word, and it returns a menu of items. Then you browse through the items looking for what you really want. However, Veronica is surprisingly complex and has power that only the Internet guru can exploit.

The Internet guru can use Veronica more efficiently than a novice and therefore will have more time for other tasks. In this chapter, it is assumed that you can use Gopher and know the basics of using Veronica.

Access Tricks

One of the most difficult things about Veronica often is just gaining access to it. Each Veronica server only allows a limited number of simultaneous searches (usually two to six). If these slots are busy, you will get a "connection refused" message.

An Internet novice just keeps trying repeatedly to get through manually. Statistics I generated from the Veronica servers at the University of Nevada, Reno and the University of Manchester showed that this method is not good. When one server is busy, they usually are all busy. It is much better to wait and try at a different time.

From my research, I determined that the worst time of day to access a particular Veronica server is between 10 a.m. and 6 p.m. server time (see Table 16.1).

On the other hand, the best times appear to be between 9 p.m. and 12 a.m., and then from 1 a.m. to 9 a.m. For some odd reason, between 12 a.m. and 1 a.m. is a busy time.

Table 16.1. Veronica usage based on time of day.

*Hour*	*Percentage of Queries*
12 a.m.	4.19
1 a.m.	2.26
2 a.m.	1.96
3 a.m.	1.97
4 a.m.	2.02
5 a.m.	2.25
6 a.m.	2.53
7 a.m.	2.97
8 a.m.	3.63
9 a.m.	4.26
10 a.m.	5.15
11 a.m.	5.74
12 p.m.	6.29
1 p.m.	6.52
2 p.m.	6.33
3 p.m.	6.32
4 p.m.	6.06
5 p.m.	5.58
6 p.m.	4.98
7 p.m.	4.76
8 p.m.	4.34
9 p.m.	3.77
10 p.m.	3.29
11 p.m.	2.81
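
The figures in Table 16.1 can be summed to see how concentrated the load is. The following is a minimal sketch, assuming each row covers one full clock hour (so the "10 a.m." row means 10:00 through 10:59); the function name is illustrative only.

```python
# Percentage of queries per hour, copied from Table 16.1 (index 0 = 12 a.m.).
HOURLY_SHARE = [
    4.19, 2.26, 1.96, 1.97, 2.02, 2.25, 2.53, 2.97, 3.63, 4.26, 5.15, 5.74,
    6.29, 6.52, 6.33, 6.32, 6.06, 5.58, 4.98, 4.76, 4.34, 3.77, 3.29, 2.81,
]


def share_between(start_hour, end_hour):
    """Sum the share for hours start_hour up to, but not including, end_hour."""
    return round(sum(HOURLY_SHARE[start_hour:end_hour]), 2)


# 10 a.m. up to 6 p.m. is hours 10..17 on a 24-hour clock.
peak_share = share_between(10, 18)

# The hour with the smallest share of queries (as a 24-hour clock value).
quietest_hour = min(range(24), key=lambda hour: HOURLY_SHARE[hour])

print(peak_share, quietest_hour)
```

Under that assumption, the eight hours from 10 a.m. to 6 p.m. carry just under half of all queries, and the quietest single hour is 2 a.m.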

The intermediate Internet user would probably come up with the idea of a war-dialer program that would keep trying until it got through. However, the real Internet guru would realize that all this does is slow down Veronica service for all other people. Therefore, the Internet guru would not use this method.

A real Internet guru with quite a bit of spare disk space (1.3GB for the April 1994 dataset) can run his/her own Veronica server to guarantee access. If the security tool TCP Wrapper is installed also, it is easy to control who can access the Veronica server. More advanced Veronica server administrators can even set up two Veronicas: One for public Internet access with a limited number of simultaneous connections, and one for local access with a high number of simultaneous connections.

Related to this discussion is a program called Maltshop. Some Veronica servers have a bad habit of crashing frequently. Maltshop is installed by a Gopher server administrator, and it automatically tests the servers to see if they are up. Unfortunately, some server administrators have set up their Maltshops to check too frequently, which has put some strain on the Gopher servers out there. In fact, the author of Maltshop has called it a virus.

Hopefully, one of the improvements mentioned at the end of the chapter will largely eliminate the need for Maltshop. If you set up Maltshop in the meantime, please set it to check only about once or twice a day.

Different or Invalid Results

Though, in theory, all Veronica servers should give you the same results, they do not. Some Veronicas are running an old version of the database. Others may remove some parts of the data that are of little value to conserve disk space. Some Veronica servers may even be running an old version of the Veronica code that may have some bugs.

Even when Veronica works normally, you probably will notice that many of the items returned do not work. Several events cause this to occur.

First, sometimes Veronica ends up indexing a collection of files that expire quickly. If you run such a collection, check the section "Helping Veronica," later in this chapter. Second, Gopher administrators love to move items around their Gopher servers to make their Gopher servers look better. This is okay because Veronica will find the items the next time it indexes. Other Gopher administrators will restrict items from public view, making them inaccessible. Finally, sometimes when you are looking for an item, the Gopher server that it's on is just down.

The Stoplist

Most indexing systems, including Veronica as well as WAIS, contain a list of words that are not indexed. This list is known as a stoplist. The words in the stoplist are commonly occurring words that are useful in almost no searches. For example, words like the, a, and an can be found in the stoplist. The advantages of the stoplist are that it makes the index much smaller and saves CPU time when a query is being processed because the word can be safely ignored if it is part of the search string.
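
The idea of a stoplist can be sketched in a few lines. This is an illustration only, not Veronica's code: the three-word stoplist holds just the examples named above (the real list ships with the server package), and the function name is hypothetical.

```python
# Hypothetical stoplist containing only the example words from the text.
STOPLIST = {"the", "a", "an"}


def useful_terms(text):
    """Return the lowercased words of a title or query that are not stopwords."""
    return [word for word in text.lower().split() if word not in STOPLIST]


# The same filter applies when indexing a title and when reading a query,
# so a stopword costs nothing in either the index or the search.
print(useful_terms("The History of an Island"))
```

Because stopwords never reach the index, a query that contains one simply behaves as if the word had not been typed.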

An Internet guru should probably learn what words occur in the stoplist. The Veronica stoplist can be found in the Veronica server package, which can be found via anonymous FTP on futique.scs.unr.edu.

Relevance Ranking

As many Internet gurus are also WAIS users, they are used to a WAIS feature known as relevance ranking. Relevance ranking assigns a relevance number to each document. The more times the search string appears in the document, the higher its ranking. Veronica has no similar feature, so the order in which Veronica returns items has no meaning.

Word Stemming

Occasionally, you may want to search on both the singular and plural forms of a word. You may also want to search on the noun, adjective, verb, and adverb forms of the word. Word stemming, also known as wildcarding, is the solution to this need. Word stemming finds all words that start with the specified characters.

While doing a Veronica search, end the word with a * (asterisk) to specify stemming. For example, you may want to look for book and books. In this case, you enter book* as your search string. This will return all words starting with book followed by any letters. In addition to what you are looking for, you will find that the search returns bookkeeper too, so you need to be careful when analyzing the results.

Some systems allow you to control the number of characters that can be added to the stem. For example, book? returns book and books, but not bookkeeper. Veronica unfortunately does not support this feature.

It should also be noted that a stem by itself (for example, *) is not a valid search string. Also, it does not make much sense to use common stems such as a* or under*. Thousands of items will be returned, and the result will be so big that it's meaningless.
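
Stemming as described here amounts to a prefix match. The sketch below is an assumption about the behavior, not Veronica's implementation; the word list and function name are made up for illustration.

```python
def stem_search(pattern, indexed_words):
    """Match a search word against indexed words, honoring a trailing *."""
    if pattern.endswith("*"):
        stem = pattern[:-1]
        if not stem:
            # A bare * has no stem to match against.
            raise ValueError("a stem by itself is not a valid search string")
        return sorted(w for w in indexed_words if w.startswith(stem))
    return sorted(w for w in indexed_words if w == pattern)


words = {"book", "books", "bookkeeper", "boat"}
print(stem_search("book*", words))
```

Note that bookkeeper comes back along with book and books, which is exactly the over-matching the text warns about.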

Multiple Word Searching

Multiple words can be used in a Veronica search. For example, if you are looking for the book Tom Sawyer, you can just specify tom sawyer as your search string (remember that Veronica is case-insensitive). First, Veronica will find all items with the word tom. Then, Veronica will find all items with the word sawyer. Finally, it will compare each list and return items that are in both lists.

It should be noted that a multiple-word search is not order dependent. If a search is done on tom sawyer, the results will also include items of the form sawyer tom if they occur in the Veronica database. Currently, there is no way to specify that you want only hits in a specific word order.

In multiple-word searching, there is no performance advantage in reordering the terms as there is with some database systems. Searching for tom sawyer and sawyer tom will take the exact same amount of time and give you the same results.
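
The find-each-word-then-compare procedure described above is a set intersection over an inverted index. The following is a minimal sketch under that assumption; the titles, item numbers, and function names are hypothetical.

```python
import re

# Hypothetical menu titles standing in for indexed Gopher items.
TITLES = {
    1: "Tom Sawyer",
    2: "Sawyer, Tom: collected works",
    3: "Tom Jones",
    4: "Diane Sawyer interview",
}


def build_index(titles):
    """Map each lowercased word to the set of item ids whose title contains it."""
    index = {}
    for item_id, title in titles.items():
        for word in re.findall(r"[a-z0-9]+", title.lower()):
            index.setdefault(word, set()).add(item_id)
    return index


def search_all(query, index):
    """Return the ids of items that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


index = build_index(TITLES)
print(search_all("tom sawyer", index))
print(search_all("sawyer tom", index))
```

Both queries return the same items, because intersection does not depend on the order of its operands, and matching is done on lowercased words.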

Boolean Searching

Frequently, Veronica returns hundreds if not thousands of entries when a query is entered. Every once in a while, that is useful. However, most of the time, it is not. Boolean searching allows you to focus your query more closely on your target. Another benefit of Boolean searching is the ability to include more words in your search, so that the results are larger but still contain exactly what you need.

Boolean searching uses three keywords: AND, OR, and NOT. In addition, parentheses allow words to be grouped together and processed first. The use of AND is unnecessary as any two adjacent words excluding OR and NOT will be ANDed together automatically, as we saw previously.

OR works by evaluating each of the words and then returning all of the items that match either word. For example, we are interested in finding all Gopher items about Bach or Mozart. We can do the search bach or mozart. Veronica will first find all items with the word bach in the title and then it will find all the items with the word mozart. Then it will present all of these items to the user.

NOT finds items where the first word is in the item and the second word is not. You cannot start a query with the word not. For example, saying not baseball is an invalid query. It is not really a useful query anyway, as it would return hundreds of thousands of items.

Parentheses are useful for more complex searches to make sure that what you want is what you actually are going to get back. The easiest way to see this is to run through an example. You want information on either left or right field. To begin with, you enter left or right field. Veronica returns you over 2000 items. When you start looking through them, you realize that Veronica has returned a lot of items that have the word left but not the word field in them. The reason is that Veronica interpreted the query as left or (right field). After figuring this out, you enter (left or right) field, which returns you just six items that all meet your requirements.

You may be wondering why Veronica grouped the original query as left or (right field). This is because Veronica parses the query from right to left instead of left to right as most systems do. Instead of remembering this and wasting your valuable time figuring out how to build your query so it parses correctly right to left, it is much easier and highly recommended to just use parentheses.
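
To see why the grouping matters, here is a small evaluator that mimics the behavior described above. It is a sketch, not Veronica's parser: it assumes all operators have equal weight and that everything to the right of a word is grouped first, which reproduces the left or (right field) reading. The titles and function names are hypothetical.

```python
import re

# Hypothetical menu titles standing in for indexed Gopher items.
TITLES = {
    1: "Left field bleachers",
    2: "Right field seating chart",
    3: "What is left of the budget",
    4: "Right to privacy",
}

# Inverted index: word -> set of item ids whose title contains it.
INDEX = {}
for item_id, title in TITLES.items():
    for word in re.findall(r"[a-z0-9]+", title.lower()):
        INDEX.setdefault(word, set()).add(item_id)

OPERATORS = ("and", "or", "not")


def parse(tokens):
    """Evaluate `operand [operator] rest`, grouping the rest of the query first."""
    left = parse_operand(tokens)
    if not tokens or tokens[0] == ")":
        return left
    # Two adjacent words with no operator between them are ANDed.
    operator = tokens.pop(0) if tokens[0] in OPERATORS else "and"
    right = parse(tokens)
    if operator == "or":
        return left | right
    if operator == "not":
        return left - right
    return left & right


def parse_operand(tokens):
    """Evaluate a single word or a parenthesized sub-query."""
    if not tokens or tokens[0] in OPERATORS or tokens[0] == ")":
        raise ValueError("expected a search word")
    token = tokens.pop(0)
    if token == "(":
        result = parse(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise ValueError("missing closing parenthesis")
        return result
    return INDEX.get(token, set())


def search(query):
    """Return the sorted ids of items matching a Boolean query."""
    tokens = re.findall(r"[()]|[a-z0-9]+", query.lower())
    result = parse(tokens)
    if tokens:
        raise ValueError("unbalanced parentheses")
    return sorted(result)


print(search("left or right field"))    # grouped as left or (right field)
print(search("(left or right) field"))  # parentheses force the intended grouping
```

With these sample titles, the first query also returns the budget item, which contains left but not field, while the parenthesized query returns only the two field items. A query that starts with not is rejected, as in the text.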

Word Delimiters

Veronica has an unusual concept of what a word is that it applies when it is indexing a title. This is very important to know when you are querying Veronica. Veronica treats any special characters as a delimiter. Therefore, a newsgroup name such as alt.bbs.internet would be indexed under alt, bbs, and internet separately. When you are performing searches, it does not make any sense to use special characters either. If you want to search and find items about the newsgroup alt.bbs.internet, you can just search for alt bbs internet or alt and bbs and internet.
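
The delimiter rule can be illustrated with a short splitter. This sketch assumes that "special character" means anything other than an ASCII letter or digit, which the text does not spell out; the function name is hypothetical.

```python
import re


def split_words(text):
    """Split text into lowercased words, treating every non-alphanumeric character as a delimiter."""
    return [word for word in re.split(r"[^a-z0-9]+", text.lower()) if word]


print(split_words("alt.bbs.internet"))
print(split_words("alt bbs internet"))
```

Both calls yield the same three words, which is why typing the dots in a query gains nothing.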

Using Synonyms

At the current time, Veronica does not support a thesaurus feature. If it did, then when you searched on a word, you would also find items with other words that have the same meaning. At times, it is important to do searches of this nature to find all the items you are looking for.

The solution for now is to pull out your own thesaurus and build your own list of related words, and do a query on all of them. Let's say we are interested in the 1960s and hippies in particular. I looked up the word hippie and decided to enter the query hippie or hipster or beatnik or bohemian.
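
If you build such queries often, the joining step is easy to script. The helper below is a hypothetical convenience, not part of any Gopher client; it wraps the synonyms in parentheses so the group stays intact if you later add another search word next to it.

```python
def any_of(words):
    """Join synonyms with `or` and wrap them in parentheses for safe grouping."""
    if not words:
        raise ValueError("need at least one word")
    return "(" + " or ".join(words) + ")"


query = any_of(["hippie", "hipster", "beatnik", "bohemian"])
print(query)
```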

If you do not have a thesaurus handy, several thesauri exist on the Internet that you can use. If you do not know where they are, just query Veronica to find out. One caveat, however, is that most of the thesauri accessible by the general Internet community are the 1905 version of Roget's Thesaurus. Needless to say, it is not current, especially on topics like computing.

The danger of any thesaurus feature is that, in English at least, no two words have exactly the same meaning. Therefore, some of the results may not be exactly what you are looking for, but you can always ignore the items you don't want. Some of the Gopher clients even support a key to remove items from your display, so as you go through your Veronica results, you can make the items you are not interested in disappear. For example, on the University of Minnesota UNIX Gopher client, you can invoke this function by hitting D when your arrow is pointing at the item you no longer want to see.

Seeing More Responses

Very frequently, you will enter a Veronica query, 200 items will be returned to you, and at the end you will see an item that says ** There are 1958 more items matching the query 'golf', or some similar message. This is often a good time to do a Boolean search, as was previously discussed. However, at other times, you'll want to see all of these items.

To see more items, you can add the option -m to the end of your queries (for example, golf -m). If you specify it as -m, it will return all items no matter how many there are. Alternatively, you can specify a number after the -m to specify exactly how many items you want to see. For example, golf -m500 returns you the first 500 items matching golf.

It is important to use -m only when you need it. Some clients have trouble with long menus being returned. Also, using -m adds an additional load to the Veronica servers. Most of the time when a large number of items are returned, the user will not use them, so the extra values are unneeded. The server has to do some extra processing to return all of the items to the user, so not using -m saves a little bit of time.

Searching on Types

Every item in Gopherspace has an associated type (see Table 16.2). By default, Veronica returns all items no matter what their type is. By using the -t option, you can specify what types you want to get back. The -t is followed by the one-character type that you want to get. For example, if you only want to see directories (menus), you can specify -t1.

Table 16.2. Popular Gopher types.

*Type*	*Description*
0	File
1	Directory/Menu
2	CSO PhoneBook
3	Error
4	Binhex file
5	DOS binary archive file
6	Uuencoded file
7	Index, such as Veronica, WAIS, and so on
8	Telnet session
9	Binary file
T	TN3270 session
I	Image
h	HTML file
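
The -m and -t options are just text appended to the search words, so a query string can be assembled mechanically. The helper below is a hypothetical sketch; it assumes that -t may be placed at the end of the query in the same way as -m and that the two options can be combined, neither of which is stated explicitly above.

```python
def build_query(words, max_items=None, all_items=False, item_type=None):
    """Assemble a Veronica query string from search words and optional flags."""
    parts = [words]
    if all_items:
        parts.append("-m")              # return every matching item
    elif max_items is not None:
        parts.append(f"-m{max_items}")  # return at most this many items
    if item_type is not None:
        if len(item_type) != 1:
            raise ValueError("a Gopher type is a single character")
        parts.append(f"-t{item_type}")  # restrict results to one type
    return " ".join(parts)


print(build_query("golf", max_items=500))
print(build_query("golf", item_type="1"))
```

The first call produces golf -m500, and the second produces a query limited to directories.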

Link Files

Internet gurus can use Veronica to add items to either their own Gopher server or their bookmarks file. If an -l option is added to the search, the first item returned is a link file that contains links to all the items that were returned in the search. This item can be saved into a file and then included into a UNIX Gopher server to build a new menu, or in the .gopherrc file on UNIX to add bookmarks. This is also useful when you need more than one session to go through the results of a Veronica search.

Helping Veronica

The Veronica database is ever growing. In April 1993, Veronica only consumed 300MB of disk space. By April 1994, Veronica had grown to 1.3GB. If the current growth rate continues, Veronica will be up to 5.6GB by April 1995.
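
The 5.6GB projection follows from assuming that the database grows by the same factor each year. A quick check of the arithmetic, with the two sizes taken from the paragraph above:

```python
size_1993_gb = 0.3   # April 1993: 300MB
size_1994_gb = 1.3   # April 1994: 1.3GB

# Growth factor over one year, assumed to repeat in the following year.
growth_factor = size_1994_gb / size_1993_gb

projected_1995_gb = size_1994_gb * growth_factor
print(round(growth_factor, 2), round(projected_1995_gb, 1))
```

The database grew a little more than fourfold in a year, and another year at that rate gives roughly 5.6GB.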

The Veronica team has been finding ways to keep the database as small as possible without losing valuable data. Unfortunately, these methods alone are not sufficient to deal with the growth of Gopherspace. If the growth continues, fewer sites will be able to run Veronica databases. Those remaining sites will have to absorb more of the load, meaning they will be busier than ever.

If you are a Gopher server administrator, as are many Internet gurus, then you also can help Veronica work better. Most importantly, it is critical that items of little or no value to the Internet community at large not be included in the Veronica database. An item that falls in one of the following classes generally should not be included in Veronica:

Experimental items that are not ready for production use.
Items that expire quickly. This is especially the case for Usenet News articles.
Restricted directories that should be accessible only by a limited number of users and not the whole Internet.
Items that are not of general interest.
Items that are useful to the Internet community at large, but that the server administrator does not want included in the Veronica database for fear that the server cannot handle the load of the entire Internet.
Items that are useful to the Internet community that exist in several locations on the network, but your copy of the items is older and more out-of-date than the ones existing elsewhere. A classic example of this is the hundreds of sites that have partial collections of RFCs.

The instructions in this section are written around the University of Minnesota UNIX Gopher server, which is the most popular Gopher server in existence. It is expected that if you are reading this book, you should be able to modify these into what is necessary for your particular server—or know how to ask on the comp.infosystems.gopher newsgroup for further assistance.

If you want to remove your whole server from the Veronica database and you are running version 2.x or higher, it is easy to do. Just add the line veronica index: no to your gopherd.conf file. For version 1, or to exclude parts of your Gopher tree, you need to ask for the current standard on comp.infosystems.gopher or by e-mailing Steve Foster at foster@scs.unr.edu.

Also, it is possible to generate a file with all of the items that you want indexed by Veronica. Then Veronica will use this file instead of traversing your Gopher tree and looking at each individual item. As this feature is undergoing some changes even while I am writing this, I will not specify the details for now.

Competing Systems

Though Veronica was the first system of its type to index Gopherspace, other systems are starting to do so as well. Internet gurus should keep track of these developments because each will probably be useful in its own way. The other systems mentioned next should support the options in the previous section to allow people to make sure their Gopher server is not indexed. If you notice that one of them does not, you should inform the system's author.

Most likely before this book hits the streets, Archie will also support Gopher (it is in beta test as I am writing). The major difference between Archie and Veronica is that while Veronica indexes everything unless the Gopher administrator specifically requests it not be indexed, Archie will index only those sites that request to be indexed. In April 1994, Veronica indexed around 6900 Gopher servers, but the Archie people had requests that only 600 servers be indexed.

Rhett "Jonzy" Jones, author of Jughead, proposed a scheme by which Jughead will allow searches of large parts of Gopherspace. The plan is to have Jughead servers have knowledge of each other in a DNS-like scheme. Other sites that run Jughead will be the only ones indexed by Jughead, so the number of sites listed in Jughead will probably be smaller than in either Veronica or Archie. Some people doubt that this scheme will work. However, the unofficial motto of the Internet community is that almost anything is worth a try, and we can learn as much from our mistakes as our successes.

Future Directions

In the middle of writing this chapter, I attended the GopherCon 94 conference. During the conference, many discussions were held about the future of Veronica. I will present the features that are of interest to the Internet guru and are planned to be added to the code. However, I make no guarantees that these additions will be made. Of course, if a particular modification is important to you, you are always welcome to take the source code, modify it, and send the modification back to the Veronica Team.

As I write, an alpha-test version of Veronica is being worked on. In this version, Veronica will group the items it returns by hostname. For example, if you search on the word dog, it will give you back a list of menus. The first menu is a list of all items with dog in the title; selecting it gives you the standard Veronica results. The rest of the menus are named after hosts. If you select a hostname, you see all the items on that host that contain the word dog. The benefit of this feature is that you can get an idea of which sites contain materials of interest to you.
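The grouping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the alpha-test code (Veronica itself is written in Perl). The function names, the (title, host) pair layout, and the example hostnames are all assumptions made for this sketch.

```python
def group_by_host(matches):
    """Group (title, host) pairs by host, keeping hosts in first-seen order."""
    groups = {}
    for title, host in matches:
        groups.setdefault(host, []).append(title)
    return groups


def build_menus(query, items):
    """Return a list of (menu_name, titles) pairs for a one-word query.

    The first menu holds every matching item (the standard result list);
    each later menu holds the matches found on one host.
    Matching here is a simple case-insensitive whole-word test, which is
    an assumption of this sketch rather than Veronica's real matching rule.
    """
    word = query.lower()
    matches = [(t, h) for t, h in items if word in t.lower().split()]
    menus = [("All items matching " + query, [t for t, _ in matches])]
    for host, titles in group_by_host(matches).items():
        menus.append((host, titles))
    return menus


# Hypothetical sample data: (item title, hostname)
SAMPLE_ITEMS = [
    ("Dog breeds", "gopher.example.edu"),
    ("Dogma in history", "gopher.example.edu"),
    ("Hot dog recipes", "food.example.org"),
    ("Cat care", "food.example.org"),
]

if __name__ == "__main__":
    for name, titles in build_menus("dog", SAMPLE_ITEMS):
        print(name, titles)
```

The per-host menus are what let a user see at a glance which sites hold the most material on a topic.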

A couple of ideas for improving Veronica's performance have been discussed recently. Veronica is written in Perl and runs under INETD. This means that the Veronica source code is compiled every time someone does a query. Heinz Stoewe is working on a way to keep Veronica running so that it does not recompile for every query, which should make a minor improvement in performance. My estimate is that it will save a quarter of a second per query. In the longer term, it is hoped that Veronica will be rewritten in C. Once that happens, it is hoped that Veronica's bottleneck will be purely disk I/O.

Mark McCahill of the Gopher Team came up with an excellent idea. The idea was to do a search with a special flag and end up in a random spot in Gopherspace. This would be a good method for you to find new, unexpected resources in Gopher.

Veronica's database will be undergoing some changes. First of all, the stoplist will be increased. Gopher gateways to FTP, X.500, and other services will no longer be included in the database. Some discussions were held about removing all filenames from the database.

Several of us came up with a method to minimize the Veronica access problems I discussed earlier. A Gopher-to-Veronica gateway is planned. The gateway will accept a query and contact a preferred Veronica server if the system administrator has defined one. If the preferred server does not resolve the query due to load, the gateway will try another server at random. It will keep trying servers until the query is successfully completed or too many retries occur.
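The gateway's retry behavior might look roughly like the following Python sketch. Nothing here comes from the planned gateway itself: the ServerBusy exception, the send callback, the max_tries limit of 5, and the function name are all hypothetical, chosen only to make the try-preferred-then-random logic concrete. The server list is assumed to be non-empty.

```python
import random


class ServerBusy(Exception):
    """Raised by the (hypothetical) send callback when a server is too loaded."""


def gateway_query(query, servers, send, preferred=None, max_tries=5, rng=None):
    """Send a query to Veronica servers until one of them answers.

    Starts with the preferred server if one is defined, otherwise with a
    random one. After each refusal, picks a different server at random.
    Gives up once max_tries attempts have failed.
    """
    rng = rng or random.Random()
    server = preferred if preferred is not None else rng.choice(servers)
    for _ in range(max_tries):
        try:
            return send(server, query)
        except ServerBusy:
            # Choose any server other than the one that just refused us.
            others = [s for s in servers if s != server]
            if not others:
                break
            server = rng.choice(others)
    raise ServerBusy("no Veronica server completed the query")
```

The caller supplies send, so the same loop works whether the real transport is a Gopher connection or a test stub. Capping the number of attempts keeps one query from hammering servers that are already overloaded.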

In the longer term, Veronica may support synonyms. When this feature is implemented, a user who searches for dishes would also see the occurrences of plates in the Veronica database. This is a common feature in other searching systems.
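As a rough illustration of what synonym support could mean for a query, the Python sketch below expands the search word into a set of equivalent words before matching titles. The synonym table, function names, and sample titles are invented for this example. How Veronica would actually store or apply synonyms had not been decided.

```python
# Hypothetical synonym table; a real one would be far larger.
SYNONYMS = {
    "dishes": {"plates"},
    "plates": {"dishes"},
}


def expand(word):
    """Return the word together with any synonyms listed for it."""
    return {word} | SYNONYMS.get(word, set())


def search(word, titles):
    """Return the titles containing the word or one of its synonyms."""
    terms = expand(word.lower())
    return [t for t in titles if terms & set(t.lower().split())]


if __name__ == "__main__":
    sample = ["Dinner plates catalog", "Satellite dishes", "Cups and saucers"]
    print(search("dishes", sample))
```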

Another interesting possible addition is facets. Facets give access to an idea by some of its other features instead of just its title. For example, searches could be done by geography, discipline, or time period. These additions will allow queries to be narrowed down more easily.

Summary

Veronica is useful to all Internet users, including the Internet guru. The rapid growth of Gopherspace is starting to strain Veronica. It is up to us—the Internet gurus—to find ways to improve Veronica and keep it the useful service it is over the next few years.