Using the Sun v20z at the ING

Audience

These notes are intended for Systems Administrators at the Isaac Newton Group (ING). A simple description of the hardware is given and some simple diagnostics are described.

Description of hardware.

Market positioning

The v20z was designed by Sun as an entry level server. It can no longer be ordered new; it was superseded by the Sun Fire X4140 Server. The X4140 is however listed (2009) at www.sun.com from 3200 dollars, that is to say at about 10x the price. When the v20z was current the next level up was the v40z, which can no longer be ordered new either. We might start using the v40z soon. The v40z is "two times bigger" than the v20z. It typically has twice as many CPUs, ie 4, and twice as much memory, ie 8GB. It also has more PCI slots: a total of 7, with a mixture of speeds. The v40z is also available secondhand on ebay, and prices are typically double too. One can find used v40z on sale on ebay from European suppliers at prices from 500 to 1250 euros each (2009 prices).

Why we use the Sun Fire V20z at ING

The ones we have are second hand. The v20z is available at surprisingly low cost: for example, on ebay one can find used v20z on sale at prices from 150 to 250 euros each (2009 prices).

Features

As entry level servers they have the following advantages

As entry level servers they have the following disadvantages

They are x86 based, specifically on the AMD Opteron chip. At the ING we can use them as file servers and for other services, eg LDAP, print, DNS, remote access and ZFS backup. They are not suitable for use with applications compiled only for the SPARC chip, eg an ICS or a DAS.

The v20z is available with many models of CPU. The models manufactured later have multiple cores. At the ING we use the units with dual cores and keep a pair of simpler units as spares. The CPU configuration can be seen at boot and also from the running system, for example with memconf.

root@lpss32# memconf
hostname: lpss32
Sun Microsystems, Inc. Sun Fire V20z (Solaris x86 machine) (2 X Dual Core AMD Opteron(tm) Processor 270 1993MHz)
socket cpu0.mem0: 7f40000000000000 1GB DDR400 (PC3200) ECC
socket cpu0.mem1: 7f40000000000000 1GB DDR400 (PC3200) ECC
socket cpu1.mem0: 7f40000000000000 1GB DDR400 (PC3200) ECC
socket cpu1.mem1: 7f40000000000000 1GB DDR400 (PC3200) ECC
empty memory sockets: cpu0.mem2, cpu0.mem3, cpu1.mem2, cpu1.mem3
total memory = 4096MB (4GB)
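
When checking several machines it can help to summarise memconf output automatically. The following is a minimal sketch in Python, assuming the output has the same layout as the lpss32 listing above and that module sizes are reported in whole GB. The function name summarise_memconf is made up for illustration and is not part of memconf.

```python
import re

# Sample text copied from the lpss32 listing in these notes.
SAMPLE = """\
socket cpu0.mem0: 7f40000000000000 1GB DDR400 (PC3200) ECC
socket cpu0.mem1: 7f40000000000000 1GB DDR400 (PC3200) ECC
socket cpu1.mem0: 7f40000000000000 1GB DDR400 (PC3200) ECC
socket cpu1.mem1: 7f40000000000000 1GB DDR400 (PC3200) ECC
empty memory sockets: cpu0.mem2, cpu0.mem3, cpu1.mem2, cpu1.mem3
total memory = 4096MB (4GB)
"""

# Assumption: every populated socket line looks like
# "socket cpuN.memM: <id> <size>GB ...".
SOCKET_RE = re.compile(r"^socket (cpu\d+\.mem\d+): \S+ (\d+)GB ")


def summarise_memconf(text):
    """Return (populated, empty, total_gb) from memconf-style text.

    populated maps socket name to module size in GB,
    empty is the list of empty socket names.
    """
    populated = {}
    empty = []
    for line in text.splitlines():
        match = SOCKET_RE.match(line)
        if match:
            populated[match.group(1)] = int(match.group(2))
        elif line.startswith("empty memory sockets:"):
            names = line.split(":", 1)[1].split(",")
            empty = [name.strip() for name in names if name.strip()]
    return populated, empty, sum(populated.values())


if __name__ == "__main__":
    populated, empty, total_gb = summarise_memconf(SAMPLE)
    print("populated:", sorted(populated))
    print("empty:", empty)
    print("total GB:", total_gb)
```

The number of empty sockets shows at a glance how much room is left for a memory upgrade.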

The Service Processor SP

As servers they are designed to be easily contacted and diagnosed remotely. For this reason, apart from the Opteron CPU for the OS, there is another CPU. This other CPU is called the Service Processor or SP. The SP runs a tiny Linux kernel and operates entirely independently. It has its own network socket labelled "Management" and, naturally, its own IP address. Typically at the ING the hostname for this address will be lpssNN-rsc (Remote Systems Console).

Access to the Service Processor Console

The power switch at the back of the machine should normally always be switched on. When power is switched on here the SP boots automatically. It can then be contacted by ssh on the IP of the lpssNN-rsc interface. The username is specific to ING and is displayed on a sticker on the v20z unit. The manager account name is not given in these notes. Here it is written as ing_manager_account, but of course that is not what we actually type. The same sticker has a password hint for the forgetful.

The account name is configured just once, when the v20z is installed. The initial setup is not described here. These notes describe how to access the v20z and do some basic maintenance or diagnosis, ie how to use it. If you want notes on how to start with a blank v20z, look up the "Sun Fire v20z Server Installation Guide" on the web. That will tell you how to set up the initial SP accounts.

At the ING we only use accounts in the manager group. The accounts on the SP recognise SSH keys, so ING system administrators may add their personal SSH public key. However the SP is very simple. In the initial simple setup used at ING there is only one manager account, and each account can have just one key. When I set up the machines I added my own key for ing_manager_account.

The following is taken from "Sun Fire V20z and Sun Fire V40z Servers--Server Management Guide: Appendix B"

Access Add Public Key Subcommand

Description: Installs a public key for SSH authentication, which enables SSH logins and 
remote command execution without being prompted for a password. You must first generate a 
key pair (RSA or DSA), which you can generate using the ssh-keygen command included with 
OpenSSH.

    * Only local users can install public keys (not users who gain authorization through 
      a mapping of a directory-services group).

    * Manager-level users can add keys for any local user.

    * Admin-level users can add only themselves.

    * Service-level users can not add anyone.

    * Up to 10 users can install public keys; each user can install only one key.

    * The maximum key length supported is 4096 bits.

To work around the limit of one key per user, ING System Administrators should add a user with their own username (my_username in the example below) to the manager group of users.
localhost $ access add user -g manager -p ********* -u my_username
Confirm by displaying the accounts
localhost $ access get users
Group   User
monitor
admin
manager ing_manager_account my_username
service
Now add one key to the personal account for the ING system administrator
localhost $ access add public key -k /tmp/mykey -u my_username
So now there should be two users who have public key access. Confirm by listing them.
localhost $ access get public key users
ing_manager_account
my_username
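
Once a key is installed, the Server Management Guide text quoted above says that remote command execution works without a password prompt. The following is a minimal sketch in Python of running a single SP command from an administrator's workstation. It assumes an OpenSSH client is installed there. The helper names sp_ssh_argv and run_sp_command are made up for illustration, and lpss84-rsc and my_username are only the example names used in these notes.

```python
import subprocess


def sp_ssh_argv(sp_host, user, sp_command):
    """Build the argument list for running one SP command over ssh."""
    # BatchMode=yes makes ssh fail instead of asking for a password,
    # so a missing or rejected public key shows up as an error.
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{sp_host}", sp_command]


def run_sp_command(sp_host, user, sp_command, timeout=30):
    """Run one SP command and return whatever it printed.

    Raises subprocess.CalledProcessError if ssh or the SP command fails.
    """
    result = subprocess.run(
        sp_ssh_argv(sp_host, user, sp_command),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Example only: list the users who have a public key installed.
    print(run_sp_command("lpss84-rsc", "my_username",
                         "access get public key users"))
```

A failure in batch mode is a quick sign that the public key was not installed for that user.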
The SP command interface is limited to just what you need to do a few operations on the V20z. A starting point is to type help.
localhost $ help
Available Commands: platform, access, sp, sensor, inventory, ipmi.
Each of these commands includes a help option (--help).
And so for example we can get detailed help on a single command
localhost $ platform --help
Usage: platform console {-h|--help}
       platform get console {-h|--help}
       platform get hostname {-h|--help}
       platform get mac {-h|--help}
       platform get power state {-h|--help}
       platform get product-id {-h|--help}
       platform get os state {-h|--help}
       platform set console {-h|--help}
       platform set os state boot {-h|--help}
       platform set os state reboot {-h|--help}
       platform set os state shutdown {-h|--help}
       platform set os state update-bios {-h|--help}
       platform set power state {-h|--help}

What you can do with the SP interface

From this interface we can
Power off and power on the "platform"
There are commands to power off and power on the "platform", ie the Opteron processor and the Solaris OS.
localhost $ platform set power state off
let the output go to the serial port.
To use a terminal server on the serial line:
Select "Platform COMA". To discover the setting use the command platform get console.
The setting for use with a portserver corresponds to this answer:
localhost $ platform get console
Rear Panel
Platform COMA
If this is not the setting then you can correct it by
localhost $ platform set console -s platform
capture the output which would otherwise go to the serial port.
We usually use this setting ie we use the SP. (The alternative is on the serial line via the portserver connection ie "Platform COMA" above.)
To use an ssh session to the remote system console (eg lpss84-rsc) and then "platform console" to connect to the serial line:
Select "SP Console". To discover the setting use the command platform get console.

The setting for use with the Service Processor (SP) corresponds to this answer:
localhost $ platform get console
Rear Panel Console Redirection Speed Pruning Log Trigger
SP Console Enabled             9600  No      244 KB
If this is not the setting then you can switch it to the Service Processor (SP) by
localhost $ platform set console -s sp -S 9600
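
The two possible answers from platform get console can be told apart by their second line. The following is a minimal sketch in Python, assuming the output always has one of the two layouts shown in these notes. The function names are made up for illustration. The correction commands are the ones given above.

```python
# The two answers shown in these notes.
PLATFORM_SAMPLE = """\
Rear Panel
Platform COMA
"""

SP_SAMPLE = """\
Rear Panel Console Redirection Speed Pruning Log Trigger
SP Console Enabled             9600  No      244 KB
"""

# Commands from these notes that select each setting.
CORRECTION = {
    "platform": "platform set console -s platform",
    "sp": "platform set console -s sp -S 9600",
}


def rear_panel_setting(output):
    """Return "platform" or "sp" for 'platform get console' output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if len(lines) < 2 or not lines[0].startswith("Rear Panel"):
        raise ValueError("unrecognised 'platform get console' output")
    if lines[1].startswith("Platform COMA"):
        return "platform"
    if lines[1].startswith("SP Console"):
        return "sp"
    raise ValueError("unrecognised rear panel setting: " + lines[1])


def command_to_fix(output, wanted):
    """Return the SP command needed to reach 'wanted', or None if already set."""
    if rear_panel_setting(output) == wanted:
        return None
    return CORRECTION[wanted]


if __name__ == "__main__":
    print(rear_panel_setting(SP_SAMPLE))
    print(command_to_fix(PLATFORM_SAMPLE, "sp"))
```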
For a serial connection the eeprom console value should be ttya
root@lpss84# eeprom
ata-dma-enabled=1
atapi-cd-dma-enabled=0
ttyb-rts-dtr-off=false
ttyb-ignore-cd=true
ttya-rts-dtr-off=false
ttya-ignore-cd=true
ttyb-mode=9600,8,n,1,-
ttya-mode=9600,8,n,1,-
lba-access-ok=1
prealloc-chunk-size=0x2000
bootpath=/pci@0,0/pci1022,7450@a/pci17c2,10@4/sd@0,0:a
keyboard-layout=UK-English
console=ttya
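
The eeprom listing is a simple set of name=value lines, so the two values that matter for the serial console can be checked automatically. The following is a minimal sketch in Python, assuming the listing format shown above and assuming 9600 baud is wanted, as used with the SP elsewhere in these notes. The function names are made up for illustration.

```python
# Excerpt from the lpss84 eeprom listing in these notes.
SAMPLE = """\
ttyb-mode=9600,8,n,1,-
ttya-mode=9600,8,n,1,-
keyboard-layout=UK-English
console=ttya
"""


def parse_eeprom(text):
    """Turn name=value lines into a dictionary; other lines are ignored."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


def console_problems(settings, expected_speed="9600"):
    """Return a list of human-readable problems, empty if all is well."""
    problems = []
    if settings.get("console") != "ttya":
        problems.append("console should be ttya")
    # ttya-mode starts with the speed, eg "9600,8,n,1,-".
    speed = settings.get("ttya-mode", "").split(",")[0]
    if speed != expected_speed:
        problems.append("ttya-mode speed should be " + expected_speed)
    return problems


if __name__ == "__main__":
    print(console_problems(parse_eeprom(SAMPLE)))
```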
Begin by connecting the serial port to the SP console, but at just 9600 baud. Next we open a console session and tap the keyboard a few times to get a login prompt.
localhost $ platform console


[Enter `^Ec?' for help]


lpss57 console login:
lpss57 console login:

To get out of that console session we do need that help, so type "^Ec?", that is to say "control-E" then "c" then "?"
lpss57 console login: 
help]
 .    disconnect                        ;    move to another console
 a    attach read/write                 b    send broadcast message
 c    toggle flow control               d    down a console
 e    change escape sequence            f    force attach read/write
 g    group info                        i    information dump
 L    toggle logging on/off             l?   break sequence list
 l0   send break per config file        l1-9 send specific break sequence
 m    display the message of the day    o    (re)open the tty and log file
 p    replay the last 60 lines          r    replay the last 20 lines
 s    spy read only                     u    show host status
 v    show version info                 w    who is on this console
 x    show console baud info            z    suspend the connection
 |    attach local command              ?    print this message
 <cr> ignore/abort command              ^R   replay the last line
 \ooo send character by octal code
So to close the terminal session type "^Ec.", that is to say "control-E" then "c" then "."
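
For anyone scripting a console session, it may help to see these key sequences as raw bytes. The following is a minimal sketch in Python. It assumes the default escape sequence shown in the help above has not been changed with the "e" option. The function name console_escape is made up for illustration.

```python
# control-E is ASCII code 5.
CTRL_E = bytes([0x05])


def console_escape(command_char):
    """Return the bytes for "^E", then "c", then one command character."""
    if len(command_char) != 1:
        raise ValueError("expected a single command character")
    return CTRL_E + b"c" + command_char.encode("ascii")


if __name__ == "__main__":
    print(console_escape("."))  # disconnect
    print(console_escape("?"))  # print the help message
```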
other options for console
To see the other options use --help
localhost $ platform set console --help
platform set console {--serial|-s} sp [{{--enable|-e}|{--disable|-d}}]
         [{{--prune|-p}|{--noprune|-n}}] [{--speed|-S}
         {1200|2400|4800|9600|19200|38400|57600|115200}] [{--log|-l} size] [{-h|--help}]
 or
platform set console {--serial|-s} platform [{-h|--help}]
{-S|--speed} {1200|2400|4800|9600|19200|38400|57600|115200}
                    Select the port speed on the SP to use to connect to the platform
                    console.  BIOS, the platform OS and the SP must all be configured for
                    the same speed.  This setting does not affect the platform
                    configuration
                    Cannot be used with: -s=platform
{-d|--disable}      Indicate that the platform console monitor is inactive
                    Cannot be used with: -e -s=platform
{-e|--enable}       Indicate that the platform console monitor is active
                    Cannot be used with: -d -s=platform
{-h|--help}         Print the usage message.
{-l|--log} size     Select the trigger size in KB for console log rotation. Minimum 64KB,
                    Maximum 1024KB
                    Cannot be used with: -s=platform
{-n|--noprune}      Indicate that the platform console log should be the raw console data
                    Cannot be used with: -p -s=platform
{-p|--prune}        Indicate that the platform console log is to be cleaned of ANSI
                    sequences and pruned of duplicated information
                    Cannot be used with: -n -s=platform
{-s|--serial} {sp|platform}
                    Specify whether the serial port is connected to the platform COMA port,
                    or the SP serial console
                    Cannot be used with: -e [platform] -d [platform] -p [platform] -n [platform] -S [platform] -l [platform]
software inventory
To check the BIOS software revision level, list the installed software.
At the time of writing an up to date listing looks like this:
localhost $ inventory get software
Name         Revision  Install Date             Description
BIOS-V20z    V1.35.5.1 Mon Nov  9 15:32:58 2009 Platform BIOS for V20z servers
SP Value-Add V2.4.0.20 Tue Nov 10 11:41:37 2009 SP Value-Add Software
SP Base      V2.4.0.20 Tue Nov 10 11:41:37 2009 SP Base Software
An out of date listing might look like this:
localhost $ inventory get software
Name           Revision  Install Date             Description
BIOS-V20z      V1.33.7.2 Fri Oct  7 22:25:31 2005 Platform BIOS for V20z servers
Operator Panel V1.0.1.2  Fri Oct  7 19:13:20 2005 Operator Panel Firmware
PPCBoot        V2.3.0.1  Fri Oct  7 19:13:20 2005 PPCBoot Software
SP Value-Add   V2.3.0.15 Fri Jul 29 20:06:58 2005 SP Value-Add Software
SP Base        V2.3.0.15 Fri Jul 29 20:06:58 2005 SP Base Software
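
Comparing a listing by eye against the up to date one is error prone when there are many machines. The following is a minimal sketch in Python that parses two listings and reports which packages are older than the reference. It assumes every revision has the form V followed by dotted numbers, as in the listings above. The reference is simply the 2009 listing from these notes, not necessarily the latest release from Sun. The function names are made up for illustration.

```python
import re

# Listings copied from these notes.
REFERENCE = """\
Name         Revision  Install Date             Description
BIOS-V20z    V1.35.5.1 Mon Nov  9 15:32:58 2009 Platform BIOS for V20z servers
SP Value-Add V2.4.0.20 Tue Nov 10 11:41:37 2009 SP Value-Add Software
SP Base      V2.4.0.20 Tue Nov 10 11:41:37 2009 SP Base Software
"""

OLD = """\
Name           Revision  Install Date             Description
BIOS-V20z      V1.33.7.2 Fri Oct  7 22:25:31 2005 Platform BIOS for V20z servers
Operator Panel V1.0.1.2  Fri Oct  7 19:13:20 2005 Operator Panel Firmware
PPCBoot        V2.3.0.1  Fri Oct  7 19:13:20 2005 PPCBoot Software
SP Value-Add   V2.3.0.15 Fri Jul 29 20:06:58 2005 SP Value-Add Software
SP Base        V2.3.0.15 Fri Jul 29 20:06:58 2005 SP Base Software
"""

# Name (may contain spaces), whitespace, then "V" and a dotted revision.
ROW_RE = re.compile(r"^(?P<name>.+?)\s+V(?P<rev>\d+(?:\.\d+)+)\s")


def parse_inventory(text):
    """Map package name to its revision as a tuple of integers."""
    revisions = {}
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match:  # the header line does not match and is skipped
            numbers = match.group("rev").split(".")
            revisions[match.group("name")] = tuple(int(n) for n in numbers)
    return revisions


def outdated(listing, reference):
    """Return sorted names whose revision is older than in the reference."""
    current = parse_inventory(listing)
    wanted = parse_inventory(reference)
    return sorted(name for name, rev in current.items()
                  if name in wanted and rev < wanted[name])


if __name__ == "__main__":
    print(outdated(OLD, REFERENCE))
```

Packages that appear in only one of the two listings, such as Operator Panel and PPCBoot here, are ignored by this comparison.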