
7 WWW Servers


If you wish to make information available you will need to run a WWW server. The server software is known as httpd, the Hypertext Transfer Protocol daemon. Just as there are many WWW browsers available, there are also many servers, including ones for Unix, MS Windows, Windows NT and the Apple Macintosh.

This section gives an example of how to install and run a server for the Microsoft Windows environment. The section then goes on to illustrate a number of server management issues which are based on the CERN server for the Unix platform.

Example of Installing A Server On A PC

An example illustrating how easy it is to install a WWW server is given below. The example assumes that you have access to a networked PC.

Connect to the anonymous FTP server at ftp.ncsa.uiuc.edu, which holds the NCSA server software. Then change directory to /Web/httpd/Uni/ncsa_httpd/contrib/winhttpd and retrieve the file whtp13p1.zip. An example of how to do this using FTP software is illustrated below. The example uses the archive at src.doc.ic.ac.uk, where the directory path is different.

ftp src.doc.ic.ac.uk
image
cd /Web/ncsa/httpd/Windows/winhttpd
get whtp13p1.zip

Create a directory called C:\HTTPD on the C: drive of your PC and then move to the directory using the CD \HTTPD command. Then uncompress the file by giving the command:

PKUNZIP -D WHTP13P1.ZIP

The -D option will preserve the directory structure from the compressed file.

Run Microsoft Windows and create a program icon using the New option on the File menu. The icon should point to the file C:\HTTPD\HTTPD.EXE

Set the time zone by adding the line SET TZ=GMT to the AUTOEXEC.BAT file.

Run the server program. The window shown below should be displayed.


Figure 7-1 Running The Windows HTTPD Server.

Run a World-Wide Web browser and then enter a URL containing the IP address of your PC. For example if your PC has an IP address of 192.11.1.1 you should enter the address:

http://192.11.1.1/

The following diagram illustrates NCSA Mosaic for X accessing a server running on a PC.


Figure 7-2 Accessing The MS Windows HTTPD Server.

This example is meant to illustrate the installation of a WWW server. In practice the server software is likely to run on a more robust system than a PC running MS-DOS, such as a Unix or Windows NT system.

Server Configuration Files

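World-Wide Web server software will normally have a configuration file which is used to control, for example, which area of the filestore can be accessed by the server, where scripts and log files are located, and whether caching is performed.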

As WWW develops, additional features will be provided in the server software and the configuration files are likely to grow in complexity. An example of a simple configuration file is shown below.

map / file:/apps/WWW/homepage.html
map /* file:/apps/WWW/*
pass file:/apps/WWW/*
fail *
Figure 7-3 A Simple httpd.conf Configuration File

Figure 7-3 shows a simple configuration file for the CERN httpd server. Line 1 specifies that the file /apps/WWW/homepage.html is the default file to be displayed when the WWW server is accessed. Lines 2 and 3 specify that files located under the directory /apps/WWW should be available to the WWW server software. Line 4 rejects all other requests.
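The way in which the rules in Figure 7-3 are applied to a request can be sketched in a few lines of Python. This is an illustrative simplification and not the CERN implementation: it assumes that rules are tried from top to bottom, that a map rule rewrites the request and continues, that a pass rule accepts the request and a fail rule rejects it, and that each template contains at most one * wildcard. The names match_template and resolve are hypothetical, and details such as path normalisation are ignored.

```python
# Simplified sketch of top-to-bottom rule evaluation (assumed semantics,
# not the CERN httpd source). Each template has at most one '*' wildcard.

RULES = [
    ("map", "/", "file:/apps/WWW/homepage.html"),
    ("map", "/*", "file:/apps/WWW/*"),
    ("pass", "file:/apps/WWW/*", None),
    ("fail", "*", None),
]


def match_template(template, path):
    """Return the text matched by '*', or None if the path does not match."""
    if "*" not in template:
        return "" if template == path else None
    prefix, suffix = template.split("*", 1)
    long_enough = len(path) >= len(prefix) + len(suffix)
    if long_enough and path.startswith(prefix) and path.endswith(suffix):
        return path[len(prefix):len(path) - len(suffix)]
    return None


def resolve(path, rules=RULES):
    """Return the resource to serve, or None if the request is refused."""
    for action, template, result in rules:
        wildcard = match_template(template, path)
        if wildcard is None:
            continue  # this rule does not apply; try the next one
        if action == "map":
            path = result.replace("*", wildcard, 1)  # rewrite and carry on
        elif action == "pass":
            return path
        elif action == "fail":
            return None
    return None


print(resolve("/"))
print(resolve("/music.html"))
print(resolve("gopher://gopher.example/"))
```

Under these assumptions a request for / is mapped to the home page, a request for /music.html is mapped to a file under /apps/WWW, and any request which is not for a local file reaches the final fail rule and is refused.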


protection prot-proxy { # Part 1
serverid www.leeds.ac.uk
mask @(129.11.*.*)
}
protect http:* prot-proxy # Part 2
protect gopher:* prot-proxy
protect ftp:* prot-proxy
protect wais:* prot-proxy

pass http:* # Part 3
pass gopher:*
pass ftp:*
pass wais:*

Exec /cgi-bin/ucs/* /apps/WWW/cgi-bin/ucs/* # Part 4
Exec /cgi-bin/bionet/* /apps/WWW/cgi-bin/bionet/*
Exec /cgi-bin/bmb/* /apps/WWW/cgi-bin/bmb/*

map / file:/apps/WWW/homepage.html # Part 5
map /* file:/apps/WWW/*
pass file:/apps/WWW/*
fail *

AccessLog /var/adm/httpd.log # Part 6
LogFormat Common
LogTime LocalTime

Caching On # Part 7
CacheRoot /usr/info/WWW_cache
CacheSize 300
CacheAccessLog /var/adm/httpd_cache.log

# Delete files from cache after specified number of days
CacheClean http:* 10 Days
CacheClean gopher:* 10 Days
CacheClean wais:* 10 Days
CacheClean ftp:* 10 Days

# Don't cache local files # Part 8
NoCaching http://*.leeds.ac.uk/*

# If a file hasn't been accessed within the last specified
# number of days delete from cache
CacheUnused * 5 days
CacheUnused http://info.cern.ch/* 10 days
CacheUnused http://www.ncsa.uiuc.edu/* 10 days

# ensure dynamically changing documents are only kept for short
# periods e.g. one modified 10 days ago will only last 2 days
CacheLastModifiedFactor 0.2

# If a file was retrieved more than 5 days ago do a
# 'conditional get' request to the source server to check
# that it hasn't been updated in the meantime.
CacheRefreshInterval http://* 5 days
CacheRefreshInterval gopher://* 5 days
CacheRefreshInterval ftp://* 5 days

# CacheDefaultExpiry ensures that Gopher and FTP files are
# cached. The default is 0 which is what we want for http
# documents with neither an expiry nor a last-modified stamp.
CacheDefaultExpiry ftp://* 5 days
CacheDefaultExpiry gopher://* 5 days

# Remove unwanted cached files daily at 3 am (garbage collection).
Gc On
GcDailyGc 3:00
Figure 7-4 A httpd.conf Configuration File

Figure 7-4 shows another configuration file (this is for illustrative purposes - some options may have been superseded). The various features are summarised below:

Parts 1 and 2 provide a mechanism for ensuring that the proxy gateway cannot be accessed from outside the local domain. Without these options it would be possible for a browser on an external system to use the proxy gateway to gain access to files which are restricted to local use.

Part 3 passes requests for the http, gopher, ftp and wais protocols.

Part 4 specifies the location for CGI files.

Part 5 specifies the area of the filestore which can be accessed by the server.

Part 6 describes the location and format of the server log file.

Part 7 specifies that server caching is to be available, and gives the location of the cache and the cache log files, together with the size (in Mbytes) of the cache.

Part 8 specifies which files are not to be cached and how long files are kept in the cache before they are refreshed or purged.
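The comment on CacheLastModifiedFactor in Figure 7-4 states that a document modified 10 days ago will only last 2 days in the cache. The arithmetic can be sketched as follows. The function name cache_lifetime and the dates used are hypothetical, and the sketch assumes that the lifetime is simply the factor multiplied by the age of the document when it was retrieved.

```python
from datetime import datetime, timedelta


def cache_lifetime(last_modified, retrieved, factor=0.2):
    """Estimate how long a document may stay in the cache.

    Assumes lifetime = factor * (time since the document was last modified).
    """
    age_when_retrieved = retrieved - last_modified
    return age_when_retrieved * factor


# Illustrative dates only: a document modified 10 days before retrieval.
retrieved = datetime(1994, 11, 21)
last_modified = retrieved - timedelta(days=10)

lifetime = cache_lifetime(last_modified, retrieved)
print(lifetime.days)                        # number of days the copy is kept
print((retrieved + lifetime).isoformat())   # date on which the copy expires
```

A document which changes frequently therefore has a short lifetime in the cache, whereas one which has not changed for a long time is kept for longer.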

An example of a typical httpd.log file is shown below.

abc.cs.xyz.edu - - [21/Nov/1994:21:58:58 +0000] "GET /music.html HTTP/1.0" 200 4375
gps0 - - [21/Nov/1994:21:59:48 +0000] "GET / HTTP/1.0" 200 2782
abc_pc99.leeds.ac.uk - - [21/Nov/1994:21:59:47 +0000] "GET http://www.leeds.ac.uk/ HTTP/1.0" 200 2782
abc.nt.com - - [21/Nov/1994:22:00:03 +0000] "GET /music/NetInfo/MusicFTP/ftp_sites.html HTTP/1.0" 200 13175
Figure 7-5 Example of a httpd.log File.

Note that the names of the machines accessing files from the server have been altered in the diagram. This has been done because it could be argued that such information should be confidential.
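Log files in this format are straightforward to process with a short program. The following Python sketch splits a log line of the kind shown in Figure 7-5 into its fields. The regular expression and the function name parse_log_line are illustrative assumptions based on the lines in the figure and do not come from the server documentation.

```python
import re
from datetime import datetime

# Fields: host, ident, user, [timestamp], "request", status code, size.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>\S+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_log_line(line):
    """Return a dictionary of fields, or None if the line is not recognised."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


sample = ('abc.cs.xyz.edu - - [21/Nov/1994:21:58:58 +0000] '
          '"GET /music.html HTTP/1.0" 200 4375')
entry = parse_log_line(sample)
print(entry["host"], entry["url"], entry["status"], entry["size"])
```

Once the lines have been parsed in this way, it is a simple matter to count the requests for each file or to add up the number of bytes transferred.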

Caching

Many clients provide client-side caching. This means that if you retrieve a file and then retrieve another file, when you return to the initial file it will be retrieved from the client's cache, thus saving a subsequent network transfer.

A number of servers also support caching by the server. This is illustrated in Figure 7-6.


Figure 7-6 Caching By The Server.

Caching can improve the performance of a WWW service by ensuring that frequently requested files will tend to be stored in the local cache. There is, of course, a danger that if the file on the remote server is updated then an out-of-date file will be retrieved from the cache. In practice, however, httpd server software which supports caching can deal with this issue by, for example, looking at the date of the file on the remote server and, if the remote file is newer than the file in the cache, replacing the file in the cache with the new version of the file.
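The check described above can be illustrated with a small simulation. This is a sketch only: the class ProxyCache, the dictionary standing in for the remote server and the integer timestamps are all hypothetical, and a real server would make the comparison by sending a conditional request over the network rather than by reading the remote date directly.

```python
class ProxyCache:
    """Toy cache which replaces a stored copy when the remote file is newer."""

    def __init__(self, origin):
        # origin stands in for the remote server: url -> (modified, body)
        self.origin = origin
        self.store = {}
        self.transfers = 0  # number of full transfers from the remote server

    def get(self, url):
        remote_modified, remote_body = self.origin[url]
        cached = self.store.get(url)
        if cached is not None and cached[0] >= remote_modified:
            return cached[1]  # cached copy is still up to date
        # No copy, or the remote file is newer: fetch and replace.
        self.transfers += 1
        self.store[url] = (remote_modified, remote_body)
        return remote_body


origin = {"http://info.cern.ch/": (100, "old page")}
cache = ProxyCache(origin)

cache.get("http://info.cern.ch/")   # first request: full transfer
cache.get("http://info.cern.ch/")   # second request: served from the cache
origin["http://info.cern.ch/"] = (200, "new page")
print(cache.get("http://info.cern.ch/"), cache.transfers)
```

The second request is satisfied from the cache without a transfer, while the third request retrieves the updated file because its date is later than that of the cached copy.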

[Proxy Information]
http_proxy: www.leeds.ac.uk
gopher_proxy: www.leeds.ac.uk
wais_proxy: www.leeds.ac.uk
Figure 7-7 Client Configuration File To Support Caching.

In order for a client to make use of a cache on a server, the client configuration file (e.g. the MOSAIC.INI file) must be suitably configured. Figure 7-7 illustrates the relevant options for the MOSAIC.INI file.
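The same idea applies to clients which are written as programs. As a sketch, the following Python fragment uses the standard urllib.request module to direct HTTP requests through the proxy named in Figure 7-7. The port number 80 is an assumption, as the figure does not give one, and only the http entry is shown.

```python
import urllib.request

# Proxy taken from Figure 7-7; the port number is an assumption.
PROXIES = {"http": "http://www.leeds.ac.uk:80/"}

proxy_handler = urllib.request.ProxyHandler(PROXIES)
opener = urllib.request.build_opener(proxy_handler)

# A call such as opener.open("http://info.cern.ch/") would now be sent to
# the proxy, which would retrieve the file or supply it from its cache.
print(proxy_handler.proxies)
```

No request is made when the opener is built, so the fragment can be run without access to the proxy.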

Accesses of the cache are recorded in the cache log file. A typical log file is illustrated in Figure 7-8.

xyz_pc77.leeds.ac.uk - - [21/Nov/1994:00:43:35 +0000] "GET http://white.nosc.mil/gif_images/NM_Sunrise_s.gif HTTP/1.0" 200 18673
xyz_pc77.leeds.ac.uk - - [21/Nov/1994:00:43:38 +0000] "GET http://white.nosc.mil/gif_images/glacier_s.gif HTTP/1.0" 200 6474
xyz_pc77.leeds.ac.uk - - [21/Nov/1994:00:43:40 +0000] "GET http://white.nosc.mil/gif_images/rainier_s.gif HTTP/1.0" 200 18749
Figure 7-8 The httpd_cache Log File.

Note that the names of the machines accessing files from the cache have been altered in the diagram. This has been done because it could be argued that such information should be confidential.

Caching Strategies

As well as using a local server cache, it is also possible to use a national caching service. The Unix HENSA service at the University of Kent at Canterbury runs a national caching service. To use this service the local client should define www.hensa.ac.uk as the proxy. Another national caching service is available at sunsite.doc.ic.ac.uk. Further information is available at the URL http://src.doc.ic.ac.uk/WWW-Cache.html

An institution will need to decide whether to use a caching service and, if so, whether to have caching services running on a number of departmental systems, to have an institutional caching service, or to use the national caching service at HENSA. In the future it may be possible to chain caches. The possibility in the long term of having institutional, metropolitan, national and continental caches should be considered.

Proxy Gateways

In many academic institutions off-campus access to the Internet is restricted to authorised computers. Depending on the institution's local policy, authorisation may be restricted to computers located in offices in which there is an individual who is responsible for use of the machine. Such a policy may be enforced in order to provide some means of security against hacking remote services. However this policy would appear to prevent students from accessing remote information services from computers in open access cluster areas.

In practice there is a technique known as proxy gateways which can be used to provide access to services off-campus, without compromising local security. With a proxy gateway a trusted system (typically a Unix system, which is more resistant to hacking than a desktop machine) will have Internet access. Machines in open access clusters can point to the proxy gateway, which will then retrieve information from off-campus services.

It should be noted that with increasing usage of Internet services such as the World-Wide Web, the author believes that the provision of security mechanisms, such as proxy gateways, will be increasingly important.
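A proxy gateway should itself only accept requests from local machines, as the mask @(129.11.*.*) in Part 1 of Figure 7-4 illustrates. The test which such a mask implies can be sketched in Python using the standard ipaddress module. The function name is_local_client is hypothetical, and the sketch assumes that the mask 129.11.*.* corresponds to the network 129.11.0.0/16.

```python
import ipaddress

# Assumed equivalent of the mask 129.11.*.* used in Figure 7-4.
LOCAL_NETWORK = ipaddress.ip_network("129.11.0.0/16")


def is_local_client(address):
    """Return True if the address lies inside the local network."""
    return ipaddress.ip_address(address) in LOCAL_NETWORK


print(is_local_client("129.11.5.7"))   # a machine on the local network
print(is_local_client("192.11.1.1"))   # a machine elsewhere
```

Requests from addresses for which the test fails would be refused by the gateway.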

Further information

Further information on caching and proxies is available at the following URLs:

Security

The httpd server also handles a number of security issues. It is common practice to restrict access to a certain area of the filestore. For example, if the server configuration file contains the lines:

map /* file:/apps/WWW/*
pass file:/apps/WWW/*
fail *
Figure 7-9 Server Configuration File.

then clients will only be able to access files held under the directory /apps/WWW/.

Note: This statement refers to clients running on remote machines. If the client is running on the same machine as the server, the client will normally be able to access files on the server to which it has read access.

Additional levels of security can also be specified. The method of implementing such security tends to be server dependent, and will not be described in this document.

A WWW Security FAQ is available at the URL http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq

Information on making your NCSA httpd server more secure is available at the URL http://hoohoo.ncsa.uiuc.edu/docs/tutorials/index.html

S-HTTP

Secure NCSA httpd is a World-Wide Web (WWW) server supporting transaction privacy and authentication for Secure WWW clients over the Internet using the Secure HyperText Transfer Protocol (S-HTTP). Further information is available at the URL http://www.commerce.net/software/Shttpd/Docs/manual.html

Netscape and Security

The SSL protocol has been submitted to the IETF as an Internet Draft. Netscape is actively pursuing the standardization of SSL within the framework of the IETF standards process and is also working with industry consortium groups to ensure that open and interoperable security standards exist now and in the future. Further information is available at the URL http://home.mcom.com/newsref/std/SSL.html

Summary of Server Software

A brief summary of server software is given below. This summary is based on Thomas Boutell's WWW FAQ.

Unix Servers

CERN httpd

Information about CERN's server is available at the URL http://www.w3.org/hypertext/WWW/Daemon/Status.html

NCSA httpd

NCSA's server is available at the URL ftp://ftp.ncsa.uiuc.edu/Web/ncsa_httpd

Apache httpd

The Apache server is available at the URL http://www.apache.org/

EIT httpd

EIT have created a Webmaster Starter's Kit which installs their server using a forms interface from a WWW browser. Further information is available at the URL http://wsk.eit.com/wsk/doc/

GN Gopher/http

The GN server can serve both WWW and Gopher clients. It may be useful for sites wishing to migrate from Gopher to WWW, although it does not have the server-script capabilities of the CERN and NCSA servers. Further information is available at the URL http://hopf.math.nwu.edu/

Plexus perl server

The Plexus server is written in Perl. Further information is available at the URL http://bsdi.com/server/doc/plexus.html

WebWorks Enterprise server

This is a commercial server marketed by Quadralay Inc. Further information is available at the URL http://www.quadralay.com/products/WebWorks/Server/index.html

Netsite Communication Server and Netsite Commercial Server

These servers have been developed by Netscape Communications Corporation. Further information is available at the URL http://home.mcom.com/MCOM/products_docs/server.html

Macintosh Servers

MacHTTP

Information about the MacHTTP server for the Apple Macintosh is available at the URL http://www.biap.com

Novell Netware Servers

httpd nlm

The httpd NLM server for Novell Netware is available at the URL ftp://ftp.glaci.com/pub/netware/http/

Microsoft Windows and Windows NT Servers

https

HTTPS is a Windows NT server developed at Edinburgh University which runs on Intel, MIPS and Alpha CPUs. It is available at the URL ftp://emwac.ed.ac.uk/pub/https/

NCSA httpd For Windows

The NCSA httpd for Windows server provides most of the features of the Unix version, including scripts (which generate pages on the fly). It is available at the URL ftp://ftp.ncsa.uiuc.edu/Web/ncsa_httpd/contrib/

SerWeb

SerWeb is a Microsoft Windows server. It is available at the URL ftp://emwac.ed.ac.uk/pub/serweb/

Web4Ham

Web4Ham is a Microsoft Windows server. It is available at the URL ftp://ftp.informatik.uni-hamburg.de/pub/net/winsock/

Server Strategies

An institution needs to decide on its server hardware strategy. For example, should it support:

  1. A central server
  2. A number of departmental servers

If option 2 is chosen then how is indexing across servers to be achieved, and what caching strategy is to be adopted? What are the skills levels needed by the server administrator? An institution needs to recognise that adopting a server strategy is more than simply installing the server software.

Which Server?

The most widely used servers are probably those developed at CERN and NCSA for the Unix platform. Unix is probably the best platform for running an institutional WWW service, since it is a mature, pre-emptive multi-tasking operating system. In addition, Unix provides a wide range of tools which can be used to assist in system administration. Servers are available for the PC and Macintosh platforms but, due to the inherent deficiencies in the operating system environments which are currently used on those platforms, such servers are probably not recommended if you wish to run a large-scale, stable service.

Servers have been developed for the Windows NT environment. This may provide a robust operating system environment which can be used for providing a WWW server on an Intel platform.

Further Information

Further information about HTTP is available at the URL http://info.cern.ch/hypertext/WWW/Protocols/

Information about HTTP/NG is available at the URL http://info.cern.ch/hypertext/WWW/Protocols/HTTP-NG/http-ng-status.html

The HTTP/1.0 specification has been submitted as an Internet-Draft and is available for comment at the following URLs: http://www.ics.uci.edu/pub/ietf/http/draft-fielding-http-spec-00.txt and ftp://www.ics.uci.edu/pub/ietf/http/draft-fielding-http-spec-00.txt

The document Setting up a World-Wide Web Server, which is available at the URL http://scholar.lib.vt.edu/reports/Servers-web.html , gives advice on setting up a server.

A collection of utilities intended especially for WWW system administrators is available at the URL ftp://src.brunel.ac.uk/WWW/managers/

A list of server software is available at the URL http://www.cern.ch/hypertext/WWW/Daemon/Overview.html

A list of server software is available at the URL http://www.charm.net/~web/Vlib/Providers/Servers.html

A list of server software is available at the URL http://www.yahoo.com/Computers/World_Wide_Web/HTTP_Servers/

A hypermail archive of the HTTP-WG mailing list is available at the URL http://www.ics.uci.edu/pub/ietf/http/hypermail/

A WWW server comparison chart is available at the URL http://sunsite.unc.edu/boutell/faq/chart.html

A review of WWW servers is available at the URL http://wais.wais.com:80/techweb/iw/521/21olweb.htm

A review of MacHTTP is available at the URL http://www.ziff.com/~macweek/mw_webedge/webedge.html

