TCS Alpha crashes
 
 Introduction
     These notes are mainly aimed at computing staff who have
to investigate why an Alpha has crashed. The section about general 
 recovery procedures may be of interest to the Duty Engineer. 
   
 
 General Recovery Procedures
 
      Try to establish whether the Alpha is responding. From a sparc:
            ping lpasn

      Check the system console (the monitor by the Alpha in the WHT, the VT220
      in the INT) for any activity. When an Alpha crashes it writes messages
      to the screen and goes on to dump memory to disk (a crash dump). After
      writing the crash dump, the Alpha reboots by itself.

      If there is no response at the console, interrupt the system by pressing
      the halt (reset) button or control-P on the console keyboard. Then follow
      the instructions to take a crash dump.

      If there is still no response, power the Alpha off and back on. Bear in
      mind that it may also be necessary to power cycle CAMAC.
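
      The order of the checks above can be summarised in a short sketch. It is
      only an illustration in Python and is not part of the TCS software; the
      function name and its two yes/no inputs are assumptions made for the
      example, standing in for what the person at the console observes.

```python
def next_recovery_step(console_active: bool, halt_responds: bool) -> str:
    """Return the next action suggested by the recovery notes.

    console_active: the console shows crash messages or a dump in progress.
    halt_responds:  the console reacts to the halt (reset) button or control-P.
    Both names are hypothetical; they are not taken from any TCS tool.
    """
    if console_active:
        # A crash dump is being written; the Alpha reboots by itself afterwards.
        return "wait for the crash dump to finish and the Alpha to reboot"
    if halt_responds:
        # The system could be interrupted from the console.
        return "follow the instructions to take a crash dump"
    # No response at all: last resort.
    return "power cycle the Alpha (and CAMAC if necessary)"


if __name__ == "__main__":
    print(next_recovery_step(console_active=False, halt_responds=True))
```

      Power cycling comes last because it prevents a crash dump from being
      written.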
       
        
 
Common causes of a Crash
 
      INVEXCEPTN 
 
 Investigating a Crash
 
       Log in as SYSTEM or other privileged account.
 
 1) SHOW SYSTEM shows the running processes and how long the system has been up.
 
 2) Look at the operator log:
             SET DEF DSA0:[SYS0.SYSMGR]
             TYPE/PAGE OPERATOR.LOG
      Previous logs can be viewed by giving a relative file version number:
             TYPE/PAGE OPERATOR.LOG;-1
 
 3) The system error log records hardware errors and crashes.
              SET DEF DSA0:[SYS0.SYSERR]
      Convert the format, then analyze the converted file:
              ANAL/ERR/ELV CONVERT ERRLOG.SYS
              ANAL/ERR/SINCE=dd-mmm-yyyy ERRLOG.CVT

              SHO ERROR
      will show device errors.
 
              
      An explanation of the messages can be found in: OPENVMS System Messages
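
      The /SINCE qualifier expects an OpenVMS-style date with a three-letter
      month, for example 12-MAR-2007. As a small illustration (Python, run on
      any workstation, not on the Alpha itself), the sketch below formats a
      date that way. The helper names vms_date and since_qualifier are
      assumptions made for this example.

```python
from datetime import date

# Month abbreviations are spelled out so the result does not depend on the locale.
_MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
           "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def vms_date(d: date) -> str:
    """Format a date as dd-mmm-yyyy, the form used with /SINCE."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year:04d}"


def since_qualifier(d: date) -> str:
    """Build the /SINCE=... text to paste into a DCL command line."""
    return f"/SINCE={vms_date(d)}"


if __name__ == "__main__":
    print(since_qualifier(date(2007, 3, 12)))  # /SINCE=12-MAR-2007
```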
     
 4) CLUE analyzes the dump when the machine is rebooted. When a crash happens,
       the machine writes memory to the dump file and then reboots. Sometimes
       the (duty) engineer power cycles the Alpha or presses its reset button
       before the dump can be completed.

              SET DEF DSA0:[SYS0.SYSCOMMON.SYSERR]
              DIR /SINCE=dd-mmm-yyyy  CLUE*.*;* /DATE

              TYPE/PAGE CLUE$LPASn_ddmmyy_hhmm.LIS
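
      The CLUE listing name encodes the node and the time of the crash
      (CLUE$LPASn_ddmmyy_hhmm.LIS). The Python sketch below shows one way to
      pull those fields out of a name copied from the directory listing. The
      function name parse_clue_name, the node name LPAS1 and the timestamp in
      the example are made up for illustration; the pattern is an assumption
      based only on the file name form shown above.

```python
import re
from datetime import datetime

# Pattern assumed from the form CLUE$<node>_<ddmmyy>_<hhmm>.LIS, with an
# optional ;version suffix as printed by DIR.
_CLUE_NAME = re.compile(
    r"^CLUE\$(?P<node>[A-Z0-9]+)_(?P<date>\d{6})_(?P<time>\d{4})\.LIS(?:;\d+)?$",
    re.IGNORECASE,
)


def parse_clue_name(name: str):
    """Return (node, crash_time) for a CLUE listing name, or None if it does not match."""
    m = _CLUE_NAME.match(name.strip())
    if m is None:
        return None
    try:
        stamp = datetime.strptime(m["date"] + m["time"], "%d%m%y%H%M")
    except ValueError:
        # Digits were present but do not form a valid date and time.
        return None
    return m["node"].upper(), stamp


if __name__ == "__main__":
    # Hypothetical file name, not taken from a real crash.
    print(parse_clue_name("CLUE$LPAS1_120307_0412.LIS;1"))
```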
 
 5)   
 
  
 
 TCS Software Manager 
 Last modified: FJG 12 Mar 2007