Figure 6-1 A Collection Of WWW Search Engines.
A collection of WWW search engines is available at the URL http://cui.unige.ch/meta-index.html. Some of the main searching tools are listed below:
There are a number of problems with this approach to global indexing:
A list of robots is kept at the URL http://web.nexor.co.uk/mak/doc/robots/active.html
Aliweb (Archie Like Indexing In The Web) provides another approach to the indexing of WWW resources. With Aliweb each site is responsible for indexing files. The server administrator is responsible for choosing and describing the services to be indexed.
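The idea of a site describing its own services can be illustrated with a small sketch. It assumes the descriptions are kept as simple "Field: value" records separated by blank lines. The field names, sample entries and function names below are illustrative assumptions and are not taken from the Aliweb documentation.

```python
# Hypothetical site description file: one record per service, blank line between records.
SAMPLE_INDEX = """\
Template-Type: DOCUMENT
Title: Computing Service Newsletter
URI: /ucs/newsletter/index.html
Description: Newsletters published by the Computing Service
Keywords: newsletter, computing

Template-Type: SERVICE
Title: Newsletter search
URI: /cgi-bin/search
Description: Keyword search of the newsletter archive
Keywords: search, newsletter
"""


def parse_records(text):
    """Return a list of dicts, one per blank-line-separated record."""
    records = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the record being built, if any.
            if current:
                records.append(current)
                current = {}
            continue
        field, _, value = line.partition(":")
        current[field.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def find_by_keyword(records, keyword):
    """Return the records whose Keywords field contains the given keyword."""
    keyword = keyword.lower()
    matches = []
    for record in records:
        words = [w.strip().lower() for w in record.get("Keywords", "").split(",")]
        if keyword in words:
            matches.append(record)
    return matches


records = parse_records(SAMPLE_INDEX)
for record in find_by_keyword(records, "search"):
    print(record["Title"], record["URI"])
```

A central index built this way only has to fetch and merge one small file per site, which is the main contrast with a robot that retrieves every document.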
Further information about Aliweb is available at the URL
http://web.nexor.co.uk/aliweb/doc/aliweb.html
The paper ALIWEB - Archie-Like Indexing In the Web, which was presented at the WWW 94 conference at CERN, is available at the URL
http://web.nexor.co.uk/mak/doc/aliweb-paper/paper.html
SWISH
SWISH, which stands for Simple Web Indexing System for Humans, was
announced on 16 November 1994. It is a program that allows you to index your
Web site and search for files using keywords in a fast and easy manner.
Documentation is available at the URL
http://www.eit.com/software/swish/swish.html
The software is available at the URL
ftp://ftp.eit.com/pub/web.software/swish/
WAIS
WAIS (Wide Area Information Server) is another mechanism for indexing resources. WAIS is used by the Computing Service, University of Leeds to index its documents and newsletters. An example of how the WAIS server and WAIS indexing software are used is given below.
The command:
waisserver -p 210 -d /apps/info/WWW/WAIS
is used to start the WAIS server software. The -p 210 argument specifies the number of the port on which the server runs, while the -d argument gives the name of the directory which will contain the WAIS databases. Since the WAIS server normally runs continuously, it will usually be started by the system administrator.
Newsletters are indexed by giving the command
waisindex -export -d /apps/info/WWW/ucs/newsletter/wais-sources/computing-service-newsletter -T HTML *.html
The name of the WAIS database is computing-service-newsletter. This long name is used because a single directory holds all WAIS databases; it will avoid confusion if other departments wish to index their own departmental newsletters.
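If several departments index their newsletters in the same way, the indexing command could be assembled by a small helper. The following is a minimal sketch which reuses only the flags shown in the command above; the function name, the second database name and the file names are hypothetical.

```python
import shlex

# Directory holding all WAIS databases, taken from the command shown above.
WAIS_SOURCES_DIR = "/apps/info/WWW/ucs/newsletter/wais-sources"


def waisindex_command(database, files, doc_type="HTML", sources_dir=WAIS_SOURCES_DIR):
    """Build the argument list for indexing the given files into a named database."""
    if not database or "/" in database:
        raise ValueError("database must be a plain name, not a path")
    if not files:
        raise ValueError("at least one file is needed")
    return ["waisindex", "-export", "-d", f"{sources_dir}/{database}",
            "-T", doc_type, *files]


# Hypothetical file names; the original command uses the shell pattern *.html.
cmd = waisindex_command("computing-service-newsletter", ["jan.html", "feb.html"])
print(shlex.join(cmd))

# A second department would only change the database name (hypothetical example).
print(shlex.join(waisindex_command("physics-newsletter", ["issue1.html"])))
```

The argument list can be passed to subprocess.run on a machine where waisindex is installed. Unlike the shell, Python does not expand *.html, so the file names have to be listed explicitly or collected with the glob module.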
The WAIS database can be accessed by a dedicated WAIS client or by a WWW browser which contains support for the WAIS protocol. In either case the database is addressed by the URL wais://www.leeds.ac.uk/computing-service-newsletter
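The parts of such a URL can be separated with a few lines of code. This is a sketch only: the function name is hypothetical, and the fallback to port 210 is an assumption based on the -p 210 argument used when starting the server.

```python
from urllib.parse import urlsplit

# Assumed default, matching the port given to waisserver in the example above.
DEFAULT_WAIS_PORT = 210


def parse_wais_url(url):
    """Split a wais:// URL into (host, port, database name)."""
    parts = urlsplit(url)
    if parts.scheme != "wais":
        raise ValueError(f"not a WAIS URL: {url!r}")
    database = parts.path.lstrip("/")
    if not parts.hostname or not database:
        raise ValueError(f"host or database missing in {url!r}")
    return parts.hostname, parts.port or DEFAULT_WAIS_PORT, database


print(parse_wais_url("wais://www.leeds.ac.uk/computing-service-newsletter"))
# prints ('www.leeds.ac.uk', 210, 'computing-service-newsletter')
```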
A number of utilities are available which can post-process the output from WAIS.
wais.pl is a CGI script which is distributed with the NCSA httpd server.
Son of wais.pl is a CGI script which is based on the wais.pl script.
SFGate is a CGI script which interfaces to WAIS servers. SFGate provides a forms interface which can be used to access a number of WAIS databases. It is available at the URL http://ls6-www.informatik.uni-dortmund.de/SFgate/SFgate.html. A demonstration is available at the URL http://ls6-www.informatik.uni-dortmund.de/SFgate/multiple.html
wwwwais is a small ANSI C program that acts as a gateway between waisq or waissearch (programs that search WAIS indexes) and a forms-capable World-Wide Web browser. With the freely distributable freeWAIS package, this program, and your local Web site, you can:
You can FTP the source and related files from the URL ftp://ftp.eit.com/pub/web.software/wwwwais/
You can see how it looks at the URL
http://www.eit.com/cgi-bin/wwwwais
A WAIS Application
One interesting application of WAIS is the multimedia archive prototype developed by Andy Walker, formerly of the CBL/Multimedia Unit, University of Leeds. The prototype was developed to investigate the feasibility of providing an archive of multimedia objects for use in CBL applications by members of the University of Leeds.
A directory is created for each multimedia object. The directory contains the multimedia object itself (e.g. a graphical file, video clip or sound file) together with a keyword file which describes the object. The keyword files are indexed using WAIS. A WWW browser which supports forms is used to run a CGI script. The CGI script invokes the waisq command to search the WAIS database. The output from waisq is then used to create an HTML file which contains pointers to thumbnail images of matching multimedia objects.
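The final step, turning a list of matches into an HTML page of thumbnails, might look something like the sketch below. It is not the prototype's code: the waisq call and the parsing of its output are left out, and the match structure, the thumbnail file name thumb.gif and the directory paths are all assumptions made for illustration.

```python
from html import escape


def render_results(query, matches, thumbnail_name="thumb.gif"):
    """Return an HTML page listing each match with a link and a thumbnail.

    Each match is a dict with "directory" (the object's directory as a URL
    path) and "title" keys. This structure is hypothetical; in the prototype
    the matches would be derived from the output of waisq.
    """
    items = []
    for match in matches:
        directory = escape(match["directory"].rstrip("/"))
        title = escape(match["title"])
        items.append(
            f'<li><a href="{directory}/">'
            f'<img src="{directory}/{thumbnail_name}" alt="{title}"> {title}</a></li>'
        )
    body = "\n".join(items) if items else "<li>No matching objects.</li>"
    heading = escape(query)
    return (
        f"<html><head><title>Results for {heading}</title></head>\n"
        f"<body><h1>Results for {heading}</h1>\n<ul>\n{body}\n</ul>\n</body></html>"
    )


# Hypothetical matches for a search on "heart".
page = render_results("heart", [
    {"directory": "/archive/heart-01", "title": "Heart cross-section"},
    {"directory": "/archive/heart-02/", "title": "Heart sounds <audio clip>"},
])
print(page)
```

Escaping the titles and the query matters because both come from outside the script: the keyword files are written by contributors and the query is typed by the user.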
Figure 6-2 Multimedia Archive.
Which WAIS?
A number of WAIS servers are available. The freeWAIS software is
currently used at the University of Leeds. This software is maintained by
CNIDR, the Clearinghouse For Networked Information Discovery and Retrieval.
The freeWAIS software, however, is based on the 1988 version of the Z39.50
protocol. An implementation of WAIS based on the 1992 version of Z39.50 is
also believed to be available from CNIDR. freeWAIS is available at the URL
ftp://ftp.cnidr.org/pub/NIDR.tools/freewais
freeWAIS-sf is an implementation of WAIS developed at Dortmund University. It is available at the URL ftp://ls6-www.informatik.uni-dortmund.de/pub/wais/freeWAIS-0.2-sf-beta.tar.gz
CNIDR Isite
CNIDR Isite is an integrated software package including a text indexer,
search engine and Z39.50 communication tools to access databases. Isite
includes the CNIDR ZDist, Isearch and Search API distributions.
A mailing list has been established to discuss Isite. To join, send an e-mail message to listserv@vinca.cnidr.org with the body of the message as subscribe ISITE-L your name. To post messages to the list, send to isite-l@vinca.cnidr.org.
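For readers who script their mail, the subscription request could be composed as in the following sketch. The sender address and the name are placeholders, and the message is only built and printed here, not sent.

```python
from email.message import EmailMessage

LISTSERV_ADDRESS = "listserv@vinca.cnidr.org"


def subscription_message(sender, full_name):
    """Build the subscription request; the command goes in the message body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = LISTSERV_ADDRESS
    msg.set_content(f"subscribe ISITE-L {full_name}")
    return msg


# Placeholder sender and name, for illustration only.
message = subscription_message("user@example.org", "Jane Smith")
print(message)
```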
Further information is available at the URL http://vinca.cnidr.org/software/Isite/Isite.html
Further Information
A tutorial on Mosaic and WAIS is available at the URL
http://wintermute.ncsa.uiuc.edu:8080/wais-tutorial/wais.html
A WAIS overview is available at the URL http://info.cern.ch/hypertext/Products/wais/sources/Overview.html
A list of resources about the Z39.50 information discovery protocol is available at the URL http://ds.internic.net/z3950/z3950.html