File Serving and Information Distribution in the Isaac Newton Group, La Palma

Gary F Mitchell, Computing Facilities Group, 19-DEC-1996

About this document

This document describes how file serving facilities are intended to be implemented within the ING on La Palma. The file serving of shared data areas is expected to become an important part of the distribution of information within the ING.

This document takes as a starting point PMF's document "ING Management Information Systems 02-AUG-1996". Ideas presented in that document are adopted, modified, developed or substituted. In particular, much more importance is given to the role of HyperText Markup Language (HTML) files composed to provide information via Internet and Intranet.

Here is a local copy of the "Netscape Gold White Paper of 1996" (quite possibly a little biased) which describes the advantages of web-based information systems compared to alternatives such as Lotus Notes.

Note:

Examples are included but the information present in them is fictitious or simplified. Similarly, links within the example documents may be for demonstration purposes only and may not connect to extant documents.

As time goes by this document is expected to evolve into a description of current file serving facilities (with items described as implemented or in progress), and finally into a definitive guide to File Serving Facilities.

Who should read it

Anyone who wants to access ING information, whether they be employees on La Palma or at RGO, or observers who might want to consult information. Also anyone who wants to distribute information (notes, manuals, schedules, minutes) within the ING.

Server locations and classes of data

At the Sea Level Office (SLO) there are 2 servers

At the observatory site on the Roque we have 1 server

As I see it there are 3 classes of information which are stored

Current File Serving Facilities

Up at the Roque we have at the moment one server, with an area popularly known as "pcstore1". That area is expected to become obsolete as properly structured alternative areas become available.

Future File Serving Facilities

The servers and the information they provide will be of this structure.

The confidential data areas

The admin server is described in "xxx - link or reference to be set here by Tony".

The unshared or single-user areas

Users are at liberty to create a directory within their home directory tree and do what they like. The directory could even be in a scratch area if it is not intended to be retained and it is acceptable to put the data at risk of deletion. A network disk connected to a directory in the home directory of the user's Unix account is most useful to those who never want to have to back up their hard disk drive, because network drives are backed up by the Computing Facilities Group. Using a network drive may also be preferable to a local hard disk drive of very low capacity. How users use or abuse unshared single-user areas is their business, but some notes on using a network drive on your PC will be composed, giving starting points for those unfamiliar with PC-NFS. It will also contain at least the following sections.

The shared data areas

PMF's document "ING Management Information Systems 02-AUG-1996" goes some way to defining the location of servers and classes of information and a directory tree. It is proposed that this tree structure be adopted. The directory tree for the shared data area is therefore presented again below

/slb-info/managem

/slb-info/forms

However some proposals are modified in this document. For example, PMF presented the idea of the same document being present simultaneously in several formats (e.g. text and Microsoft Word). I believe that HTML format should take precedence over all others for information which is expected to be read. If people want a hard copy or a PostScript file the conversion process is trivial. Obviously information which is designed to be picked up and incorporated into users' spreadsheets or databases will need to be retained in a PC (probably proprietary) exchange format. Put simply, what we often distribute on paper goes into HTML, and Access databases stay as Access databases. Here is a local copy of a Netscape White Paper (quite possibly a little biased) which describes the advantages of web-based information systems compared, say, to Lotus Notes.

Example

It is my proposal that La Palma Observatory Notices (LPONs) be present as HTML files only. Similarly for telephone directories, notes to all users, etc.

Mirroring of Shared Data Areas

The descriptions above all talk about Sea-level Base (slb) directories. If the information is important to the everyday running of the observatory, it would be inconvenient if the Santa Cruz - Roque link were to become unavailable. It is proposed that a mirror of the shared data areas be maintained automatically. Of course users would be free to consult either copy, but only the original would be writable. This avoids any doubt about which is the definitive copy.

Example

See the table below. The objective of the naming scheme is that there is always a simple relationship between the reference to the original and the reference to the mirror. Because the mirror is read-only to all users, it is also important that users can easily distinguish between the two and, for example, do not waste time attempting to update a mirror. The scheme also has to be symmetrical: if the mirror copy on the Roque of the slb original is called /slb-mirror, then the mirror copy at the slb of the Roque original is called /roque-mirror.

Directory reference (as might be used in a PC-NFS mount of a network disk drive)

    original reference: ing-slo.iac.es /slb-info/forms
    mirror reference:   ing.iac.es /slb-mirror/slb-info/forms

    original reference: ing.iac.es /roque/visitors
    mirror reference:   ing-slo.iac.es /roque-mirror/roque/visitors

Web-page reference (as would be used in a web browser)

    original reference: http://ing-slo.iac.es/slb-info/forms/fax.htm
    mirror reference:   http://www.ing.iac.es/slb-mirror/slb-info/forms/fax.htm

At first sight it might seem there is a redundancy in the directory names. If all the slb-xxx directories are on ing-slo.iac.es, why not call them xxx instead of slb-xxx? The answer is that, in quoting references, we then only need to quote the directory name and not the machine on which it is held. The user can deduce the machine, and the mirror copy if he finds that more convenient.
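The deduction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host names and the /slb-mirror and /roque-mirror prefixes come from the table, but the rule that a top-level name starting with "slb-" lives at the SLB and one starting with "roque" lives on the Roque is inferred from the examples, and the function names are hypothetical.

```python
from typing import NamedTuple

# Hosts as they appear in the table; the prefix rule below is an assumption
# inferred from the naming scheme, not a confirmed configuration.
SLB_HOST = "ing-slo.iac.es"
ROQUE_HOST = "ing.iac.es"
MIRROR_NAMES = ("slb-mirror", "roque-mirror")


class Reference(NamedTuple):
    host: str
    directory: str


def original_reference(directory: str) -> Reference:
    """Deduce the machine holding the original from the directory name."""
    top = directory.strip("/").split("/")[0]
    if top in MIRROR_NAMES:
        raise ValueError(f"{directory!r} is already a mirror reference")
    if top.startswith("slb-"):
        return Reference(SLB_HOST, directory)
    if top.startswith("roque"):
        return Reference(ROQUE_HOST, directory)
    raise ValueError(f"cannot deduce the server for {directory!r}")


def mirror_reference(directory: str) -> Reference:
    """Return the read-only mirror reference for an original directory."""
    original = original_reference(directory)
    if original.host == SLB_HOST:
        # slb originals are mirrored on the Roque under /slb-mirror
        return Reference(ROQUE_HOST, "/slb-mirror" + directory)
    # Roque originals are mirrored at the slb under /roque-mirror
    return Reference(SLB_HOST, "/roque-mirror" + directory)
```

Calling mirror_reference("/slb-info/forms") gives the host and directory of the Roque copy, so a quoted directory name alone is enough to find either copy. Web-page references are left out because the table uses a different host name (www.ing.iac.es) for the mirror URL.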

Note: I worry more about long file name elements like "roque-mirror" (more than 8 characters: could all PCs and their PC-NFS cope?). We might need to create links to support short expressions like smirror and rmirror.
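A quick way to see which names would be affected is sketched below. It assumes the limit in question is the DOS 8.3 convention (at most 8 characters before the dot and at most 3 after it); whether every PC-NFS installation actually imposes that limit is exactly the open question in the note. The function names are hypothetical.

```python
def fits_8_3(name: str) -> bool:
    """True if a single path element fits the DOS 8.3 convention."""
    base, _dot, extension = name.partition(".")
    return 0 < len(base) <= 8 and len(extension) <= 3 and "." not in extension


def long_elements(path: str) -> list[str]:
    """Return the elements of a path that do not fit the 8.3 convention."""
    return [part for part in path.split("/") if part and not fits_8_3(part)]
```

Under that assumption, long_elements("/roque-mirror/roque/visitors") picks out only "roque-mirror", while a path using a short link such as /smirror would pass unchanged.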

Detailed directory structure

For each directory I expect there to be a file in that directory called AAREADME.txt or some such name. This text file will have only brief information about the content, or even none, but it will have the important reference to an HTML file which has the following format. The sole purpose of the plain text file is that anyone who for any reason looks into the data area using a plain terminal emulator, or connects that directory as a network drive, can be told that he really should explore the directories using a web browser such as Netscape or Microsoft Explorer. However, if people want to guess the contents of files from the file name they can.

The proper and fully supported access route will be through two HTML files

aareadme.htm which will have the following headings.

index.htm which will be a list of files and sub-directories each with a short paragraph saying what the file (or directory) is about.

These two files will be guaranteed to exist: a shell script will check each directory for the presence of these files and report to helpdesk any directory which does not have both. In the case of a missing aareadme.htm file this will usually be restored from a backup tape or generated by the CFG, possibly by copying from the parent directory if this is appropriate. In the case of a missing index.htm file the CFG will identify who was the author of the documents and who could therefore provide the descriptive paragraph.
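The check described above might look like the following minimal sketch, written here in Python rather than as the shell script the text mentions. The two file names are those given above; the function names and the report layout are hypothetical, and how the report reaches helpdesk (mail, log file) is left open.

```python
import os

# The two files every directory in the shared data area must contain.
REQUIRED_FILES = ("aareadme.htm", "index.htm")


def directories_missing_files(root: str) -> dict[str, list[str]]:
    """Map each directory under root to the required files it lacks."""
    missing = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        present = set(filenames)
        absent = [name for name in REQUIRED_FILES if name not in present]
        if absent:
            missing[dirpath] = absent
    return missing


def format_report(missing: dict[str, list[str]]) -> str:
    """Build a plain-text report, one line per offending directory."""
    return "\n".join(
        f"{directory}: missing {', '.join(names)}"
        for directory, names in sorted(missing.items())
    )
```

Run periodically against a shared area such as /slb-info, an empty report would mean every directory has both files. The comparison is case-sensitive, so a file named INDEX.HTM would be reported as missing; that may or may not be the desired behaviour for files written from PCs.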

Other files will usually be present - in particular templates useful in the construction of documents similar to those extant in the directory.

Example

In this example we use the /slb-info/managem directory.

Ownership of files

There is a problem when users modify or replace files via PC-NFS. A more lengthy discussion of how it occurs and what can be done to fix it is available, but in summary it can be solved by an automatic procedure running periodically on the Unix machine. This will avoid a repeat of some of the problems with the current pcstore1.

Templates for commonly used forms or publications

These would initially be available as HTML files of one (or more) of the following types

A form with blanks in all the answers

A form with help where the answers go and links to help: each blank space where the answer would go would be occupied by help text

A true HTML form

If the templates had to be changed in the future, this would require asking CFG to change the read-only status and/or ownership temporarily, after which the user would modify the template. The automatic means for correcting the ownership and protection of the file would revert the file to read-only in a few hours if the author did not do so himself. If the volume of such changes is small the CFG might offer a composition and/or editing service.
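The protection half of that automatic correction could be sketched as follows. This is an assumption-laden illustration, not the actual procedure: the four-hour grace period is a hypothetical stand-in for "a few hours", the function name is invented, and correcting ownership (which needs administrator privileges) is deliberately left out.

```python
import os
import stat
import time

GRACE_HOURS = 4  # hypothetical value standing in for "a few hours"
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def revert_to_read_only(root, grace_hours=GRACE_HOURS, now=None):
    """Remove write permission from files left writable past the grace period.

    Returns the list of paths whose protection was changed.
    """
    now = time.time() if now is None else now
    cutoff = now - grace_hours * 3600
    reverted = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            writable = bool(info.st_mode & WRITE_BITS)
            # Only touch files whose last modification is older than the cutoff,
            # so an author who is still editing is not interrupted.
            if writable and info.st_mtime <= cutoff:
                os.chmod(path, stat.S_IMODE(info.st_mode) & ~WRITE_BITS)
                reverted.append(path)
    return reverted
```

Using the modification time as the start of the grace period is a design choice made for the sketch: each save by the author restarts the clock, and a template that has been left alone for the whole period goes back to read-only on the next run.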