The anonymous FTP catalog

The Archie system is designed to maintain several different information catalogs of various types. Nonetheless, it was originally conceived to maintain a catalog of files available by anonymous FTP, and this is still the application for which it is most popular today.

Now that the arserver, arexchange, and arretrieve programs have been configured (according to the instructions in “Configuring the Basic System”), you can set up the system to maintain an anonymous FTP catalog.

Overview

In general, to set up the anonymous FTP catalog, anonftp, the following steps must be followed:

Choosing Data Hosts to enter

You need only enter into the Host Databases those Data Hosts for which you plan to be directly responsible. You need not (and should not) enter other sites. Once you start participating in the global inter-Archie data exchanges, those Data Hosts for which you are not responsible will be entered into your database automatically. Similarly, you need not delete those sites for which you are not responsible. The exchange subsystem will propagate this information from the “master” Archie system responsible for that site.

Support for ls-lR.gz

Archie can now retrieve pre-generated ls-lR.gz files from anonymous FTP sites. To activate this, you will need to set up the file in the following way.

Where is where the program is located on your system.

You also need to fix the file by replacing the line

by

Hence when using the option in retrieve mode

Archie will first try to locate the ls-lR.gz file. If it cannot, it will look for ls-lR.Z and then ls-lR, in that order, and as a last resort dynamically create a new listing.
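The fallback order described above can be sketched as follows. This is only an illustration in Python, not Archie's actual retrieval code: the function name pick_listing and the idea of passing in the file names seen on the remote site are assumptions made for the example.

```python
from typing import Iterable, Optional

# Candidate listing names, in the order the text says Archie tries them.
LISTING_CANDIDATES = ("ls-lR.gz", "ls-lR.Z", "ls-lR")


def pick_listing(remote_files: Iterable[str]) -> Optional[str]:
    """Return the first pre-generated listing that is present, or None.

    None stands for the last resort described in the text: no
    pre-generated listing exists, so a new listing must be created
    dynamically. (Hypothetical helper, not part of Archie.)
    """
    available = set(remote_files)
    for name in LISTING_CANDIDATES:
        if name in available:
            return name
    return None
```

A site offering both ls-lR and ls-lR.Z would yield ls-lR.Z here, since the compressed forms are preferred over the plain listing.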

Adding sites

Parser failures

When the parsing phase of the anonftp catalog fails on a particular data host, the temporary parse file (with the parse_t suffix) is not removed from the holding directory (). In addition, the filtered file is renamed with the suffix .filtered to allow the system administrator to see both the unfiltered and filtered versions. The system administrator may, if desired, manually fix the input data.
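To find the files left behind by failed parses, an administrator could scan the holding directory for the two suffixes mentioned above. The sketch below is a hypothetical helper, not part of the Archie distribution. The holding directory path is supplied by the caller, and matching on names that end in parse_t or .filtered is an assumption about the exact file naming.

```python
import os
from typing import Dict, List


def find_failed_parses(holding_dir: str) -> Dict[str, List[str]]:
    """Group leftover files in holding_dir by the suffix they carry.

    Assumption: unfiltered leftovers end in "parse_t" and their
    filtered counterparts end in ".filtered".
    """
    leftovers: Dict[str, List[str]] = {"parse_t": [], "filtered": []}
    for name in sorted(os.listdir(holding_dir)):
        # Skip subdirectories and anything else that is not a regular file.
        if not os.path.isfile(os.path.join(holding_dir, name)):
            continue
        if name.endswith(".filtered"):
            leftovers["filtered"].append(name)
        elif name.endswith("parse_t"):
            leftovers["parse_t"].append(name)
    return leftovers
```

Comparing the two lists side by side shows which hosts have both the unfiltered and the filtered version available for inspection.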

The system gives the administrator the approximate location of the parsing error and displays the line that caused the problem. This error can be viewed with the host_manage program after the update phase of the cycle has completed. Alternatively, the Archie log file contains a more detailed explanation of the error. Note, however, that as illustrated in Figure 5, the parse_t file is not the one actually parsed, since the filter program runs on the input first. As a result, the error line reported comes from the output of the filter.

The parsing filters

By default, the distribution is configured to use perl scripts for the filter_anonftp_unix_bsd filter (which is a soft link to the file ). The perl interpreter is available on many anonymous FTP archive sites. If you do not have perl installed at your site, you can change this soft link to point instead to the file , which is an alternative filter based on the standard UNIX sed(1) program. This second filter is less efficient than the perl filter, so we recommend that you install perl and use the perl filter in preference.
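Changing the soft link can be done with ln(1), or scripted. The following Python sketch shows one way to repoint a symbolic link on a POSIX system. The helper name repoint_symlink is an assumption for illustration. The link path and the target file are supplied by the caller and must be the real names from your distribution.

```python
import os


def repoint_symlink(link_path: str, new_target: str) -> None:
    """Make link_path a symbolic link to new_target.

    The new link is created under a temporary name and then renamed
    over the old one, so link_path never disappears in between.
    (Hypothetical helper, not part of Archie.)
    """
    tmp_path = link_path + ".tmp"
    # Clear a stale temporary link left by an earlier interrupted run.
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(new_target, tmp_path)
    # rename(2) acts on the link itself, not on the file it points to.
    os.replace(tmp_path, link_path)
```

After the call, os.readlink(link_path) returns the new target, which is a quick way to confirm that the filter link now points at the intended script.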

Testing things out

To ensure that the system is properly set up, you can test the following programs by running them from the command line. The results are written to stdout. Recall that, in normal operation, each of these steps would be run by the cron(8) daemon at predetermined times (see “Configuration”). Almost all programs in the Archie system accept a -v (verbose) command line option, and you may want to invoke the programs with this flag when testing the system.
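When testing several programs by hand, a small wrapper can add the -v flag and capture what each program writes to stdout. The sketch below is a hypothetical convenience in Python, not part of the Archie distribution. It assumes the program being run accepts -v, which the text says is true of almost all, but not necessarily all, Archie programs.

```python
import subprocess
from typing import List, Sequence


def verbose_command(program: Sequence[str]) -> List[str]:
    """Return the command line with -v appended unless already present."""
    cmd = list(program)
    if "-v" not in cmd:
        cmd.append("-v")
    return cmd


def run_verbose(program: Sequence[str]) -> str:
    """Run the program with -v and return what it wrote to stdout."""
    result = subprocess.run(
        verbose_command(program),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=False,  # a failing program is reported by its output, not an exception
    )
    return result.stdout
```

The program name and any further arguments are passed as a list, for example a list holding the path to one of the Archie programs on your system followed by whatever arguments that program requires.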