OBS-TALK-4

User's Guide to the Message Log and Message Display

Guy Rixon
Issue 1.4 made on 21st August 1998

Introduction

Purpose of this document

Engineers will find below advice on how to install and configure the message logging system when building an observing system. Observers and other end-users will find a section describing the use of the message display or Talker.

Scope of the software

The message-logging system collects progress and error messages from all over the observing system and stores them in a central location on the system computer. The messages are displayed to the users in real time, and the display is sensitive to the priority of messages.

The logging sub-system consists of:

The talker package, which is the message display.
The daemon syslogd on the system computer (as supplied with Solaris).
The daemon syslogd on the data-acquisition computer (as supplied with Solaris).
A third-party version of syslogd for the telescope computer.
Support code compiled into clients and servers throughout the observing system.

Logging of observations is done by the ING Data Manager, which is a separate system.

References

[1] Advanced Programming in the Unix Environment (pp421-423)
    by W. R. Stevens, pub. Addison-Wesley 1992, ISBN 0-201-56317-7
[2] Configuration Management Standards for ING Systems
    ING document SOF-STD-3 by Guy Rixon.
[3] Design for a message display and logging system
    ING document OBS-TALK-3 by Guy Rixon.
[4] Generic Talker: User Guide and Design notes
    ING document OBS-TALK-2 by Guy Rixon.

Revisions to this document

1998-07-16: issue 1.1: original document.
1998-07-17: issue 1.2: newtalkerlogs.csh was corrected to newtalkerlogs.sh.
1998-07-17: issue 1.3: the description of the Talker is updated to match v9.1.
1998-08-21: issue 1.4: the instructions for managing log files were improved.

Overview of the sub-system

This is a precis of the system design laid out in OBS-TALK-3 [3].

Messages come from applications throughout the observing systems. On the INT and JKT, the senders of messages are server programs on the telescope, data-acquisition, autoguider or system computers, and client programs on the system computer. When the logging sub-system extends to the WHT, an even wider range of computers will be involved.

The messages are dispatched, by library functions compiled into the applications, to an instance of syslogd, the logging daemon [1]. For the Unix and VMS computers, a copy of syslogd is available locally. For VxWorks computers, the boot host has syslogd. Whether syslogd is local or remote, the library functions send the messages as datagrams to UDP port 514, which is the syslog standard [1].
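
As background to the datagram format: in the BSD syslog convention (which this guide does not spell out), each datagram begins with a priority value in angle brackets, computed as the facility code times eight plus the severity code. The following is a minimal sketch in POSIX shell of that calculation for a few facilities, including the two (user and local1) that appear in the configuration examples in this guide. The function name syslog_pri is hypothetical and the code is not part of the observing system.

```shell
#!/bin/sh
# Sketch only: compute the <PRI> prefix of a BSD-syslog datagram.
# syslog_pri is a hypothetical helper, not part of the logging sub-system.
# Usage: syslog_pri FACILITY SEVERITY
syslog_pri() {
    # Facility codes from the BSD syslog convention (a small subset).
    case $1 in
        kern)   fac=0 ;;
        user)   fac=1 ;;
        daemon) fac=3 ;;
        local0) fac=16 ;;
        local1) fac=17 ;;
        *) echo "unknown facility: $1" >&2; return 1 ;;
    esac
    # Severity codes: a smaller number means a more urgent message.
    case $2 in
        emerg)   sev=0 ;;
        alert)   sev=1 ;;
        crit)    sev=2 ;;
        err)     sev=3 ;;
        warning) sev=4 ;;
        notice)  sev=5 ;;
        info)    sev=6 ;;
        debug)   sev=7 ;;
        *) echo "unknown severity: $2" >&2; return 1 ;;
    esac
    # PRI = facility * 8 + severity, wrapped in angle brackets.
    printf '<%d>\n' $((fac * 8 + sev))
}
```

For example, syslog_pri user notice prints <13> and syslog_pri local1 info prints <142>. In day-to-day use there is no need to build this value by hand: logger(1), e.g. logger -p local1.info "text", submits a message at the stated facility and priority.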

Most of the syslogd instances forward their messages to syslogd on the system computer. They may also keep local copies. On the system computer, syslogd writes the messages to a set of three log files, filtered by priority.

A new set of log files is created each day at noon. The old logs are kept for eight days and are then discarded. The current log may be viewed by the Talker program (a descendant of the DRAMA-based Talkers in earlier systems [4]) which is sensitive to the details of the messages and can display them with a little intelligence.

Installation and configuration

syslogd for the system computer

This program comes with Solaris; you can't avoid it, and you don't need to install it specially for the observing system. However, the daemon syslogd uses a configuration file /etc/syslog.conf which must be adapted to suit the logging sub-system.

Refer to the manual pages syslogd(1M), syslog.conf(4) and logger(1) for details of the facility. Stevens [1] gives some more information on syslog. What these pages don't tell you is how to start and stop syslog:

   sh /etc/rc2.d/S74syslog start
   sh /etc/rc2.d/S74syslog stop

In general, it is better not to restart syslogd to force it to reconfigure. Instead, send it a HUP signal:

   # Find out the process number of syslogd:
      cat /etc/syslog.pid

   # Reconfigure syslogd
      kill -HUP syslog-pid

This forces syslogd to act on changes to the configuration file.

The system computer for the telescope must log the observing system's messages locally. The general loghost for the control computers may be the cluster server and not the system computer. This is acceptable, but a local copy of the messages must be made. In syslog.conf, these lines should appear:

   *.notice			/var/log/talker.notice
   *.info			/var/log/talker.info
   *.debug			/var/log/talker.debug

or, alternatively:

   user.notice;local1.notice	/var/log/talker.notice
   user.info;local1.info	/var/log/talker.info
   user.debug;local1.debug	/var/log/talker.debug

The first form logs messages from all sources; the second logs only those messages from facility-codes used by DRAMA programs and is the absolute minimum required for the logging sub-system to operate. Levels of coverage between these extremes are possible, and the necessary syntax is described in the manual pages. Arguably, it's better to log everything, as messages from the kernel and system daemons may be important in debugging.

There are three files because the messages are filtered by priority. /var/log/talker.debug contains all messages, /var/log/talker.info contains those at info priority and above and /var/log/talker.notice contains those at notice priority and above.
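
The filtering rule can be sketched as a small POSIX shell function: a selector at a given level matches that level and everything more urgent, so the three files nest inside one another. This is an illustration only, assuming the first form of syslog.conf shown above; the function name talker_logs_for is hypothetical.

```shell
#!/bin/sh
# Sketch only: list the Talker log files that receive a message of a
# given syslog priority under the selectors *.notice, *.info and *.debug.
# talker_logs_for is a hypothetical helper, not part of the talker package.
talker_logs_for() {
    case $1 in
        emerg|alert|crit|err|warning|notice)
            # notice priority and above: matched by all three selectors
            echo "/var/log/talker.notice /var/log/talker.info /var/log/talker.debug" ;;
        info)
            # matched by *.info and *.debug only
            echo "/var/log/talker.info /var/log/talker.debug" ;;
        debug)
            # matched by *.debug only
            echo "/var/log/talker.debug" ;;
        *)
            echo "unknown priority: $1" >&2
            return 1 ;;
    esac
}
```

Calling talker_logs_for info, for instance, lists talker.info and talker.debug but not talker.notice.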

syslogd for the data-acquisition computer

The syslogd for the data-acquisition computer is configured in the same way as for the system computer; this applies to both the Data-Cell and UltraDAS versions. However, the required contents of syslog.conf are different.

The DAS computer must pass on its messages to the system computer. This is the syntax for the DAS computer (lpss15 currently) at the INT forwarding to the system computer (lpss13 currently):

   *.debug			@lpss13

There is no filtering by priority at this stage.

syslogd for the telescope computer

Syslogd isn't supplied with VMS. For the TCS computers, a third-party implementation of syslogd was obtained from the FTP site at Western Kentucky University (http://www2.wku.edu/www/fileserv/fileserv-software.html). This turns out to be a port of v8.3 of syslogd from BSD Unix, and the Regents of the University of California hold the copyright. It's free software with no warranty, and the version on the FTP site has a bug that degrades our logging sub-system. The bug has been mended locally and the required version has this prologue comment:

**      14-Jul-1998 The logic in logmsg() for storing and printing "new"
**                  lines was corrected.  Specifically, it no longer
**                  corrupts the priority.  From_len is declared as
**                  a size_t throughout.  MAX_LINE is undefined before
**                  redefinition.  gtr@ast.cam.ac.uk

I reproduce here the installation instructions that came with the VMS syslogd.

-----------------------------------------------------------------------------
1. Create a VMS account under which the syslogd service will run.
   The privileges required are: TMPMBX, SYSPRV and OPER. You could also
   install syslogd.exe with the privileges.  SYSPRV is needed to create
   a port < 1024 and OPER is needed to send broadcast messages.
   Make sure you create the directory for the account.  Try logging in
   as the account you have created to make sure you have everything set up
   correctly. If you make the account captive or restricted make sure you
   create a login.com.

   The account I have been using looks like this:

Username: UCX_SYSLOGD                      Owner:  SYSLOGD
Account:  SYSLOGD                          UIC:    [375,30] ([UCX$AUX,UCX_SYSLOGD])
CLI:      DCL                              Tables: DCLTABLES
Default:  SYS$SPECIFIC:[UCX_SYSLOGD]
LGICMD:   LOGIN
Flags:  Restricted
Primary days:   Mon Tue Wed Thu Fri
Secondary days:                     Sat Sun
Primary   000000000011111111112222  Secondary 000000000011111111112222
Day Hours 012345678901234567890123  Day Hours 012345678901234567890123
Network:  ##### Full access ######            ##### Full access ######
Batch:    -----  No access  ------            -----  No access  ------
Local:    -----  No access  ------            -----  No access  ------
Dialup:   -----  No access  ------            -----  No access  ------
Remote:   -----  No access  ------            -----  No access  ------
Expiration:            (none)    Pwdminimum:  6   Login Fails:     0
Pwdlifetime:         90 00:00    Pwdchange:   7-MAY-1995 14:00
Last Login:            (none) (interactive),  4-JUN-1995 15:42 (non-interactive)
Maxjobs:         0  Fillm:        50  Bytlm:        52200
Maxacctjobs:     0  Shrfillm:      0  Pbytlm:           0
Maxdetach:       0  BIOlm:        18  JTquota:       4096
Prclm:           8  DIOlm:        18  WSdef:          350
Prio:            8  ASTlm:       100  WSquo:          512
Queprio:         0  TQElm:        15  WSextent:      2048
CPU:        (none)  Enqlm:       100  Pgflquo:      10240
Authorized Privileges:
  NETMBX    OPER      SYSPRV    TMPMBX
Default Privileges:
  NETMBX    OPER      SYSPRV    TMPMBX



-----------------------------------------------------------------------------
2. Define the SYSLOG service under UCX.  This is done with a command
   similar to:
 
$!
$!  DEFINE_SYSLOG_SERVICE - Defines the SYSLOG Service in the UCX service
$!                          database.
$!
$ UCX SET SERVICE SYSLOG -
    /PROTO=UDP/PORT=514/FLAGS=NOLISTEN -
    /USERNAME=UCX_SYSLOGD/PROCESS=SYSLOGD -
    /ACCEPT=(NETW:127.0.0.0,NETW:205.133.96.0) -
    /FILE=SYS$SYSROOT:[UCX_SYSLOGD]UCX_SYSLOGD_STARTUP.COM -
    /LOG=(ALL,FILE:SYS$SYSROOT:[UCX_SYSLOGD]UCX_SYSLOGD.LOG)

   You will need to change (or omit) the /ACCEPT= line to specify your domain.
   You may also need to change the username, device/directory etc.



-----------------------------------------------------------------------------
3. You have to create the command procedure which the SYSLOG service
   executes.  This command procedure should look like this:
$!
$!  UCX_SYSLOGD_STARTUP - Start the SYSLOGD Service
$!
$  DEFINE SYSLOGD_CONFIG SYS$LOGIN:SYSLOGD.CFG
$!
$  RUN SYS$SYSROOT:[UCX_SYSLOGD]SYSLOGD.EXE
$!




-----------------------------------------------------------------------------
4. Enable the service on the nodes which should/could run syslogd.
   This is done with the command:

$ UCX ENABLE SERVICE SYSLOG

   You will have to put this command in your system startup so that it
   is executed when you boot.




-----------------------------------------------------------------------------
5. Create a syslog configuration file.  This is very similar to a Unix
   syslog configuration file.  Each line consists of one or more facility/
   severity combinations of the form facility.severity (you can use
   an * to signify all facilities or all severities).  If you specify more
   than one facility/severity separate them with commas.

   After the facility/severity is AT LEAST ONE TAB and then the destination
   for messages which match that facility/severity.  The first character of
   the destination defines what type of destination it is.

   / = Log to a file
   @ = Forward to another node
   % = Send OPCOM message, % should be followed by a comma separated list
       of OPCOM classes.
   Anything else is assumed to be a comma separated list of usernames



#
#  This is a SYSLOGD configuration file
#
*.*			/SYS$LOGIN:SYSLOGD.LOG
local0.*,local1.*	/SYS$LOGIN:LOCAL.LOG
*.err			JOHN,JOE
local1.*		@othernode.mydomain.com
local1.*		%CENTRAL,TAPE



-----------------------------------------------------------------------------
6. Use LOGGER.EXE to send a message to syslogd and see if it works.
   For example:

$ LOGGER:==$your_exe:logger.exe
$ LOGGER "This is a test message"

   You should see the SYSLOGD process start.  If you have problems, look
   in the file specified in the /LOG= qualifier when you defined the service.


-----------------------------------------------------------------------------
7. You also use LOGGER to control the SYSLOGD process.  The -c option
   is used to send a command to syslogd.  The current commands are s
   and r, s means shutdown and r means reopen the logs.

   For example:

   $ logger -c s
-----------------------------------------------------------------------------

This syslogd should forward messages to the system computer in the same way as the DAS computer:

   *.debug		@lpss13

It may also be desirable to log the TCS' messages locally.

The talker package

The message displays - Talkers - are instances of the program talker, which is built from the package of the same name. V9 or later of the package is part of this logging sub-system; v8 and earlier are different.

The talker package is provided as a normal part of the observing system and can be built in the standard way using bom [2]. Before building the package you must have Tcl 7.6 and Tk 4.2 installed in the observing system of choice.

There should be one Talker for each user of the telescope; that is, one for the observer and one for the TO. An assisting engineer can start his or her own if so wished. The CIA start-up scripts for each instrument should start these Talkers and the shut-down scripts should stop them. (Stopping Talkers doesn't stop the logging of messages; it only stops the display.)

To start a talker, use this command line:

  talker -display d -geometry g &

where d should be one of the symbols defined in, e.g., nodes.INT and g is a normal, X-windows geometry-specification.

Stopping Talkers from a script is harder. The cleanup utility doesn't affect them as they are not DRAMA programs. The Tk send command can be used. Here is Tk code to terminate all Tk applications on the display on which it is run:

   set apps [winfo interps]
   foreach app $apps {
      if { $app != [winfo name .] } { send $app exit }
   }
   exit

This fragment, if named close_tk.tcl, can be run as:

  wish4.2 close_tk.tcl

Managing the log files

If no special precautions are taken, the system log files build up until they fill the disk and disable the system. The standard Unix way of dealing with this is a cron job that renames log files at intervals, keeping only a limited number of back issues. The current log-files - the ones being written to - have a fixed name and the old copies are given the suffix .n where n is the number of times the file has been renamed.

This scheme is recommended for the Talker logs. The shell-script newtalkerlogs.sh is a modified copy of newsyslog (the latter is part of Solaris) that operates on the Talker logs instead of /var/log/syslog. A separate script is provided instead of an extension to newsyslog so that the two sets of logs can be changed on different schedules.
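
The renaming scheme can be sketched in POSIX shell as follows. This is an illustration of the general technique only, not the contents of newtalkerlogs.sh; the function name rotate_log, its arguments and the choice of .0 as the newest back issue are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: rotate one log file, keeping a limited number of back issues.
# rotate_log is a hypothetical helper; it is NOT the real newtalkerlogs.sh.
# Usage: rotate_log LOGFILE KEEP
#   LOGFILE.0 is the newest back issue, LOGFILE.(KEEP-1) the oldest.
rotate_log() {
    log=$1
    keep=$2
    n=$((keep - 1))
    # Shift each back issue up by one, oldest first.  The move onto the
    # highest suffix overwrites, and so discards, the oldest copy.
    while [ "$n" -gt 0 ]; do
        prev=$((n - 1))
        if [ -f "$log.$prev" ]; then
            mv "$log.$prev" "$log.$n"
        fi
        n=$prev
    done
    # The current log becomes back issue 0 ...
    if [ -f "$log" ]; then
        mv "$log" "$log.0"
    fi
    # ... and a new, empty current log is created under the fixed name.
    : > "$log"
}
```

A call such as rotate_log /var/log/talker.info 8 would keep eight back issues. A real rotation script would presumably also have to signal syslogd (the HUP described earlier) so that the daemon reopens the newly created files; that step is omitted from this sketch.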

To use this feature, install newtalkerlogs alongside newsyslog:

   sccs -d /opt/INGsrc/src get newtalkerlogs.sh
   cp newtalkerlogs.sh /usr/lib/newtalkerlogs
   chmod 0755 /usr/lib/newtalkerlogs

Now schedule the operation to run at 1155hrs each day by adding a line to root's cron table:

   tcsh
   setenv EDITOR vi
   crontab -e

The line to be added is

   55 11 * * * /usr/lib/newtalkerlogs

The use of crontab instead of directly editing the cron table makes the cron daemon aware of the changes and removes the need to restart the daemon.

The file-change is timed at 1155hrs instead of 1200hrs to make a safety margin. If the talker displays are running during the day, they change to new files a few seconds after 1200hrs. If syslogd changes after the talkers, then the latter will spend the next 24hrs looking at the old files! Conversely, if syslogd changes much earlier, then the talkers won't see the new files until noon. The five-minute break in message-reporting seems a reasonable price to pay to ensure that the displays are correct for most of the day.

Using the Talker: advice for end users

Starting up

When you start the central intelligence of the observing system (the startobssys command on the system computer), the Talkers are started for you. If you want to restart a Talker, or to start an extra one, or to view the message logs when the observing system isn't running, or to monitor operations remotely, use the command

   talker &

with no arguments (note that the Talker runs as a background process). This starts the Talker on the display defined by your environment variable DISPLAY, just like any other X-windows program.

You can also specify the display on the command line, e.g.:

   talker -display lpss14:0

if you want the talker on the INT's data-reduction console. (Talker is a Tcl script interpreted by wish4.2 and understands all the usual command-line options of that version of wish; see man wish.)

Interpreting the display

Here is a Talker display with examples of messages at all priorities:

The background colour of the messages indicates the urgency. The blue messages are a technical record for support staff; you won't see any of these messages unless you turn them on using the Filtering menu (see below). Normal process messages have a white background. The orange messages (the background may be a goldish-yellow on some screens) are events that the system is trying to bring to your special attention. Red messages (actually an unpleasant shade of pink on some displays and in the example above) are alarms that you should acknowledge and react to urgently. Any alarms that you haven't acknowledged yet are displayed in a beeping, flashing dialogue box. If the dialogue box is asleep (and hence iconified), any alarms in the main display are already acknowledged and the panic is probably over. The alarm-type messages QUASH and QUIET are the evidence of alarms being acknowledged.

The left-hand part of each message displays the logging details. After the date and time of dispatch is the name of the computer it was sent from and the name of the sender. The priority of the message is shown by a keyword: DEBUG, INFO (the normal priority), NOTICE or ALARM.

The name of the log file is shown at the bottom of the talker. This is a normal Unix text-file and you can view it with more, copy it, print it, etc. The file is changed at noon every day (the system will do this automatically if it is left running).

The display is scrollable, but the oldest messages may not be viewable. The Talker discards the earliest messages if the log becomes very long. In any case, you cannot scroll back to messages from a previous night.

The Font menu

You can select the type size for the main display from this menu; you cannot change the typeface. The default on start-up (to save screen space) is 10pt type.

The Filtering menu

This menu controls which messages are displayed.

Show all messages: displays everything that is logged by syslogd.
Hide debug messages: eliminates DEBUG (blue) messages from the display (but not from the log).
Show exceptional messages only: displays only NOTICE (orange) and ALARM (red) messages (but the others are still logged).

The Hide debug messages setting is recommended for general use, and this is the default on start-up.
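
The three settings amount to a simple rule on the priority keyword of each message. The sketch below expresses that rule in POSIX shell purely as an illustration; the Talker itself is a Tcl script and this is not its code. The function name talker_shows and the setting names all, hide-debug and exceptional are hypothetical.

```shell
#!/bin/sh
# Sketch only: decide whether a message with a given priority keyword is
# displayed under a given Filtering-menu setting.
# talker_shows and the setting names are hypothetical, not Talker code.
# Usage: talker_shows SETTING KEYWORD
#   SETTING: all | hide-debug | exceptional
#   KEYWORD: DEBUG | INFO | NOTICE | ALARM
# Returns 0 if the message is displayed, 1 if hidden, 2 on a bad setting.
talker_shows() {
    case $1 in
        all)
            # "Show all messages": nothing is filtered out.
            return 0 ;;
        hide-debug)
            # "Hide debug messages": everything except DEBUG is shown.
            case $2 in
                DEBUG) return 1 ;;
                *)     return 0 ;;
            esac ;;
        exceptional)
            # "Show exceptional messages only": NOTICE and ALARM are shown.
            case $2 in
                NOTICE|ALARM) return 0 ;;
                *)            return 1 ;;
            esac ;;
        *)
            echo "unknown setting: $1" >&2
            return 2 ;;
    esac
}
```

In every case the filtering affects only the display; the messages themselves remain in the log files.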

Selecting any of the options on this menu (even reselecting the filtering already in force) causes the talker to re-open its input file. This is a way of ensuring that the current file is displayed if the automatically-scheduled change to a new file breaks down.

Alarms

Messages containing the ALARM tag in the text are urgent and are displayed forcefully in the alarm dialogue. This dialogue box is iconified when the Talker is started and springs to life whenever a new alarm comes in.

The alarm box beeps once per second and flashes its background from red to white at the same rate; it really wants your attention. You can stop the beeping on your Talker (and every other Talker that sees the same logs) by pressing the Quiet button.

Pressing the Acknowledge button clears the current alarms from the alarm box (they remain in the log and in the main Talker-display) and iconifies the box. The acknowledgement propagates to all the Talkers watching the same log files and clears their alarm boxes too. If you don't acknowledge an alarm, any subsequent alarms are added to the same dialogue. Unlike previous versions of the Talker, there is only one alarm box per display.

Closing down

You can close a Talker with the Exit command in its File menu. Standard brutalities - closing it from the window manager, attacking it with a signal, etc. - will also work. The Talkers should go away automatically if you run the shutdownobssys command.