Here are some things Michael Baxter wrote down. Have a look at www.perceus.org for more information.

Server & Configuration Administration

Put in details of the standard configuration and installation/partitioning procedures (such as DNS notification, compass, etc.).

# refers to a root shell
$ refers to a user shell

Please note that www.perceus.org contains pretty much all the documentation needed to get things going and keep them running.

Add Users to Cluster

# adduser -m -c "Michael Baxter" mbaxter
# passwd mbaxter
# su - mbaxter
$ ssh-keygen -t dsa -P ""
$ cp ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
$ chmod 400 ~/.ssh/authorized_keys
$ exit

Or, as a one-liner:

$ ssh-keygen -t dsa -P ""; cp ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys; chmod 400 ~/.ssh/authorized_keys

To add the nodes' host keys to the new user's known_hosts file, run the following command:


for NODE in 00 01 02 03 04 05 06 07 08 09 10 11; do ssh -o StrictHostKeyChecking=no n00$NODE hostname; done

Occasionally a new user will not be added to the /etc/passwd file on the exported file system. Add the entry manually if, after 10-15 seconds of trying to log in to a slave node, you get a password prompt (an RSA host-key check will also occur, unless a known_hosts file was added to the skel directory when the user was created):

$ ssh n0000

The authenticity of host 'n0000 (192.168.2.2)' can't be established.
RSA key fingerprint is 39:4c:6d:f8:a5:fe:54:dc:a4:6a:cc:f6:95:8f:87:7a.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'n0000,192.168.2.2' (RSA) to the list of known hosts.
meow@n0000's password:

This means the /etc/passwd file on the node is not up to date with the master node.

# cp /etc/passwd /etc/perceus/modules/passwdfile/all
# cp /etc/passwd /var/lib/perceus

Push it manually across all nodes:
# pdsh -w n00[00-whatever] "\cp /var/lib/perceus/passwd /etc/"

where "whatever" is the highest node number. Then remove the temporary copy:
# rm /var/lib/perceus/passwd

You may have to repeat the above process to propagate /etc/group as well.
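For example, a minimal sketch of the same manual push for /etc/group (mirroring the passwd steps above; whether the passwdfile module file also needs updating depends on the setup):

# cp /etc/group /var/lib/perceus
# pdsh -w n00[00-whatever] "\cp /var/lib/perceus/group /etc/"
# rm /var/lib/perceus/group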

Node is Down

First check the node. Sometimes the node is too busy, and logging in or submitting jobs can be laggy and time out:

$ wwtop

Cluster totals: 15 nodes, 240 cpus, 563136 MHz, 344.16 GB mem
Avg: 8% cputil, 7819.00 MB memutil, load 1.60, 252 procs, uptime 96 day(s)
High: 14% cputil, 15597.00 MB memutil, load 4.34, 471 procs, uptime 139 day(s)
Low: 5% cputil, 114.00 MB memutil, load 0.00, 232 procs, uptime 1 day(s)
Node status: 15 ready, 0 unavailable, 1 down, 0 unknown
Node name CPU MEM SWAP Uptime MHz Arch Procs Load Net:KB/s Stats/Util
draxx.scien 14% 8% 1% 139.89 25536 x86_64 471 4.34 879 |> |
n0011 12% 32% 0% 139.77 38400 x86_64 238 2.00 0 |> |
n0012 12% 8% 0% 21.05 38400 x86_64 238 2.00 0 |> |
n0013 12% 4% 0% 21.05 38400 x86_64 238 2.00 1 |> |
n0014 --- -- -- --- --- ---- -- --- ----- | READY|

Or just ping the node:
$ ping n0014

Solution: If it is down over ssh, power it up (remotely if it has a DRAC). Check the console screen or log files for the reason.
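If the DRAC is reachable over IPMI, a remote power-on can look something like the following (the hostname, interface, user, and password are assumptions for illustration):

# query the power state, then power the node back on via its DRAC/BMC
# ipmitool -I lanplus -H n0014-drac -U root -P <password> chassis power status
# ipmitool -I lanplus -H n0014-drac -U root -P <password> chassis power on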

Queueing System not working or Node is not on the Queue

 Say node n0014 is not on the queue. Check with:

$ wwnodes
or
$ pbsnodes

The wwnodes output is easier to read. Check the output for each node's state: down, free, interactive, or job-exclusive:

NodeName Queue State Running Jobs
n0000 free
n0001 free
n0002 free
n0003 free
n0004 job-exclusive 0/3800, 1/3800, 2/3800, 3/3800, 4/3800, 5/3800, 6/3800, 7/3800
n0005 free
n0006 down

If all states are interactive, then pbs_server isn't running. Start it on the master node:

# pbs_server

If the state is down, then the node isn't in the queue. Assuming the node is up, start pbs_mom on the slave node:

# pdsh -w n0006 "pbs_mom"

It will take a few seconds to start up. Occasionally this won't be enough (check with wwnodes or pbsnodes after 10-15 seconds); in that case restart pbs_server:

# qterm
# pbs_server

After 10-15 seconds the free or job-exclusive status is displayed.
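To confirm pbs_mom is actually running on the slaves, a quick check along these lines can help (the node range is only an example):

# list any pbs_mom processes on each slave node; nodes with nothing listed are not running pbs_mom
# pdsh -w n00[00-13] "ps -C pbs_mom -o pid,cmd"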

Add Extra Nodes

To add new nodes to a cluster, check that there is adequate space and power for the nodes and that each node is installed in the rack. Connect the first ethernet port to the intranet hub. At boot-up, change the boot sequence in the BIOS so that PXE boot is the first priority (especially if the node has a local disk, as in a stateful or partially stateful cluster configuration). Once PXE starts, the master node will provision the OS; however, before it will assign a VNFS to the slave it has to be told which one to use. So once the slave is fully booted, it will display a message that there is no image for it to load.

On the master type:

# perceus node set vnfs centos-5.6-1.x86_64 n00[00-whatever]

where centos-5.6-1.x86_64 is the VNFS image (check /root/ for centos-5.6-1.x86_64.vnfs or a similar .vnfs file and use that) and "whatever" is the new node number (it will be shown on screen, or use a high number like 100). The slave node will now be provisioned correctly. Sometimes /etc/perceus/modules/ipaddr is not updated, so add the new node there if necessary (see the syntax in the file).
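For example, assuming two new nodes were racked and came up as n0015 and n0016 (hypothetical numbers), the assignment would be:

# perceus node set vnfs centos-5.6-1.x86_64 n00[15-16]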

Install software

Root is only necessary if the software is to be globally accessible. In the following examples, root is used for ease of demonstration.

The currently installed OSes for all clusters, and for roughly 90% of HPC systems (as of September 2011), are Red Hat or Red Hat derivatives. The package manager is yum (basically rpm); if the OS is Ubuntu, SuSE, or something else, just change the package manager command (e.g. apt-get or yast2) and the procedure stays the same. Install via yum/rpm where possible, because 1) it is easier and 2) it manages software dependencies. In a cluster environment there are two OSes: 1) the master node's OS and 2) the VNFS (which contains the OS) on the slave nodes.

(Note: it is possible to run different OSes on different nodes at the same time.) First install on the master node:

# yum -y install <package_name>

To install on the slave nodes, first mount the VNFS (in this example centos-5.6-1.x86_64 is the VNFS image; see /root):

# perceus vnfs mount centos-5.6-1.x86_64

The VNFS of the slave nodes' OS is now mounted under the master node's /mnt directory (/mnt/centos-5.6-1.x86_64).

# yum -y --installroot=/mnt/centos-5.6-1.x86_64 install <package_name>

Unmount the VNFS:

# perceus vnfs umount centos-5.6-1.x86_64

This step takes 15-20 seconds. Now either reboot the nodes or just sync them (sometimes livesync does not fully work because the OS runs in RAM). To sync without rebooting:

# perceus vnfs livesync centos-5.6-1.x86_64

If the nodes are in use it will take a while to complete. Ignore warnings such as:
Use of uninitialized value in string comparison (cmp) at /usr/lib/perceus//Perceus/Nodes.pm line 336.

If livesync does not work, just reboot the nodes:

# pdsh -w n00[00-whatever_highest_node] "init 6"
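As a worked example, installing a single package on both the master and the slaves might look like this (the package name htop is just an example and may need an extra repository):

# yum -y install htop
# perceus vnfs mount centos-5.6-1.x86_64
# yum -y --installroot=/mnt/centos-5.6-1.x86_64 install htop
# perceus vnfs umount centos-5.6-1.x86_64
# perceus vnfs livesync centos-5.6-1.x86_64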

Typically /usr/local is exported from the master to the slave nodes, so to install a tarball package just add it to the /usr/local directory and it is available on the master and the slaves. Because the slave nodes' OS is a cut-down version of the master's (the OS runs entirely in RAM), check that the software's dependency packages are also present on the slaves.
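For instance (the tarball name is hypothetical), a self-contained package can simply be unpacked into the exported directory and is then visible on the master and all slaves:

# tar -xzf mytool-1.0.tar.gz -C /usr/local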

Software on Master but not Slaves

Make sure it is not just an environment path issue and that the exported filesystem from the master has not dropped off!
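A couple of quick checks along these lines can help (the node range and binary name are placeholders):

# pdsh -w n00[00-13] "which <binary>"
# pdsh -w n00[00-13] "mount | grep /usr/local"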

Where are the node filesystem and kernel stored?

On the master node, always in /root. The file name is typically centos-5.6-1.x86_64.vnfs, and it is manipulated through perceus. First mount the .vnfs file:
# perceus vnfs mount centos-5.6-1.x86_64
This will mount the file structure at /mnt/centos-5.6-1.x86_64. Then just enter that directory and add/delete/modify whatever is needed (including exported/mounted drives, kernel info, directories, ...), then unmount the system:
# perceus vnfs umount centos-5.6-1.x86_64
Then livesync the system (anything major like a kernel update will probably require a node reboot; see the software installation documentation above for the reboot command):
# perceus vnfs livesync centos-5.6-1.x86_64
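As a concrete illustration (the file and the change are hypothetical), a typical modification while the VNFS is mounted, before the umount/livesync steps, might be editing the slaves' fstab:

# vi /mnt/centos-5.6-1.x86_64/etc/fstab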

Push changes to nodes? Reboot or shut down nodes? Boot up a node?

Via pdsh (pdsh -w n00[00-13] "<whatever command>") or via a livesync after editing the VNFS (see the software installation and/or node filesystem documentation above). The command could be "init 6" for a reboot or "init 0" for a halt. To boot the slave nodes, just power them on; they will PXE boot and Perceus will provision them from there. The pdsh command can also target a single node (pdsh -w n0006 "init 6" will reboot n0006 only).
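For example (the node range is only an example):

Reboot all slave nodes:
# pdsh -w n00[00-13] "init 6"
Halt all slave nodes:
# pdsh -w n00[00-13] "init 0"
Reboot n0006 only:
# pdsh -w n0006 "init 6"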