[H]ere are some things [Michael Baxter] wrote down. Have a look at www.perceus.org for more information.
Server & Configuration Administration
Put in details of standard configuration and installation/partitioning/procedures (such as DNS notification/compass/ etc).
# refers to a root shell
$ refers to a user shell
Please note that the website www.perceus.org contains pretty much everything you will need to get things going and keep running in terms of documentation.
Adding Users to a Cluster
# adduser -m mbaxter
# passwd mbaxter
# su - mbaxter
$ ssh-keygen -t dsa -P ""
$ mv ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
$ chmod 400 ~/.ssh/authorized_keys
Occasionally a new user will not be added to the /etc/passwd file on the exported file system. If you try to log into a slave node 10-15 seconds after adding the user and get a password prompt, add the user manually as described below. (An RSA host key check will occur first, unless known_hosts was added to the skel directory when the user was added.)
$ ssh n0000
The authenticity of host 'n0000 (192.168.2.2)' can't be established.
RSA key fingerprint is 39:4c:6d:f8:a5:fe:54:dc:a4:6a:cc:f6:95:8f:87:7a.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'n0000,192.168.2.2' (RSA) to the list of known hosts.
A password prompt at this point means the /etc/passwd file on the node is not up to date with the master node.
# cp /etc/passwd /etc/perceus/modules/passwdfile/all
# cp /etc/passwd /var/lib/perceus
Push it manually across all nodes:
# pdsh -w n00[00-whatever] "\cp /var/lib/perceus/passwd /etc/"
where "whatever" is the highest node number.
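The bracket range has to be typed by hand each time. A minimal sketch of a helper that builds it, assuming node names of the form n00NN as in the examples here and fewer than 100 nodes. The name node_range is hypothetical and is not part of pdsh or Perceus. The script only prints the push command so that it can be checked before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: build a pdsh host range such as n00[00-13].
# Assumes node names n0000..n0099, as in the examples in this document.
node_range() {
    highest=$1
    case $highest in
        ''|*[!0-9]*)
            echo "usage: node_range <highest node number>" >&2
            return 1 ;;
    esac
    # Strip leading zeros so the value is always treated as decimal.
    highest=$(printf '%s\n' "$highest" | sed 's/^0*//')
    highest=${highest:-0}
    if [ "$highest" -gt 99 ]; then
        echo "node_range: only handles node numbers up to 99" >&2
        return 1
    fi
    printf 'n00[00-%02d]\n' "$highest"
}

# Print (do not run) the passwd push command for nodes n0000..n0013.
printf '%s\n' "pdsh -w $(node_range 13) \"\\cp /var/lib/perceus/passwd /etc/\""
```

The final line uses printf rather than echo because some shells' echo would interpret the backslash in \cp.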
Node is Down?
First check the node. Sometimes the node is just too busy, so logging in or submitting jobs is laggy and times out. The cluster status display (this looks like wwtop output) shows the load on each node:
Cluster totals: 15 nodes, 240 cpus, 563136 MHz, 344.16 GB mem
Avg: 8% cputil, 7819.00 MB memutil, load 1.60, 252 procs, uptime 96 day(s)
High: 14% cputil, 15597.00 MB memutil, load 4.34, 471 procs, uptime 139 day(s)
Low: 5% cputil, 114.00 MB memutil, load 0.00, 232 procs, uptime 1 day(s)
Node status: 15 ready, 0 unavailable, 1 down, 0 unknown
Node name CPU MEM SWAP Uptime MHz Arch Procs Load Net:KB/s Stats/Util
draxx.scien 14% 8% 1% 139.89 25536 x86_64 471 4.34 879 |> |
n0011 12% 32% 0% 139.77 38400 x86_64 238 2.00 0 |> |
n0012 12% 8% 0% 21.05 38400 x86_64 238 2.00 0 |> |
n0013 12% 4% 0% 21.05 38400 x86_64 238 2.00 1 |> |
n0014 --- -- -- --- --- ---- -- --- ----- | READY|
Or just ping the node:
$ ping n0014
Solution: If it is down (no response to ping or ssh), you will have to power it up (you can do this remotely if it is on a DRAC). Check the screen or log files for the reason.
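To ping several nodes in one go, something like the following sketch may help. It assumes the Linux iputils ping options -c (packet count) and -W (timeout in seconds). The names node_state and PING_CMD are made up for this example.

```shell
#!/bin/sh
# PING_CMD can be overridden (e.g. for a dry run). The default sends one
# packet and waits at most 2 seconds (Linux iputils ping options).
PING_CMD=${PING_CMD:-"ping -c 1 -W 2"}

# Hypothetical helper: print "<node> up" or "<node> down".
node_state() {
    # $PING_CMD is left unquoted on purpose so its options split into words.
    if $PING_CMD "$1" >/dev/null 2>&1; then
        echo "$1 up"
    else
        echo "$1 down"
    fi
}

# Check every node named on the command line.
for node in "$@"; do
    node_state "$node"
done
```

Saved as, say, checknodes.sh (a placeholder file name), it could be run as: sh checknodes.sh n0011 n0012 n0014. A node that answers ping can still be too busy for ssh, so "up" here only means the network stack responds.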
Queueing System not working & Node isn't on the Queue
You may get an enquiry saying node 14 (or n0014, or n014) is not on the queue. Check whether it is with wwnodes or pbsnodes.
The wwnodes output is easier to read. Its state column shows whether each node is down, free, interactive, or job-exclusive:
NodeName Queue State Running Jobs
n0004 job-exclusive 0/3800, 1/3800, 2/3800, 3/3800, 4/3800, 5/3800, 6/3800, 7/3800
If all states are interactive, then pbs_server isn't running; start it on the master node.
If the state is down, then the node isn't in the queue. Assuming the node is up, start pbs_mom on the slave node:
# pdsh -w n0006 "pbs_mom"
It will take a few seconds to start up. Occasionally this won't be enough (check with wwnodes or pbsnodes after 10-15 seconds); in that case restart pbs_server on the master node.
After 10-15 seconds you should see the free or job-exclusive status.
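The decision logic in this section can be summarised as a small lookup. This is only a sketch of the rules stated above; queue_advice is a hypothetical name, and the state has to be read off the wwnodes or pbsnodes output by hand.

```shell
#!/bin/sh
# Hypothetical helper: map a queue state (as shown by wwnodes or pbsnodes)
# to the action described in this section.
queue_advice() {
    case $1 in
        free|job-exclusive)
            echo "ok: node is in the queue" ;;
        interactive)
            echo "if every node shows this, pbs_server is not running on the master" ;;
        down)
            echo "node is not in the queue: start pbs_mom on the node" ;;
        *)
            echo "unknown state: $1" ;;
    esac
}

queue_advice down
```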
Add Extra Nodes to Cluster?
To add new nodes to a cluster, check that there is adequate space and power for them and put the nodes into the rack. Connect the node's first ethernet port to the intranet hub. At bootup, change the boot sequence in the BIOS so that PXE boot has first priority (especially if the node has a local disk for a stateful or partially stateful cluster configuration). Once PXE starts up, the master node will provision the OS; however, before it can assign a VNFS to the slave it has to be told which one to use. So once the new slave node has fully booted, it will show a message that there is no image for it to load!
On the master type:
# perceus node set vnfs centos-5.6-1.x86_64 n00[00-whatever]
where centos-5.6-1.x86_64 is the vnfs image (check /root/ for centos-5.6-1.x86_64.vnfs or something.vnfs and use that name) and "whatever" is the new node number (it will be on screen, or type a high number like 100). The slave node will now be provisioned correctly! Sometimes /etc/perceus/modules/ipaddr is not updated, so add the node to it if necessary (see the syntax in the file).
Install software onto the Cluster?
First rule about installing software:
-> You don't need to be root
Second rule about installing software:
-> You don't need to be root to install software
You only need to be root if the software is to be globally accessible. In the following examples I will be root just for ease of demonstration.
The currently installed OSes on all clusters and 90% of HPCs (as of September 2011) are Red Hat or Red Hat derivatives. The package manager is yum (basically rpm); on Ubuntu, SuSE or whatever, just change the package manager command (e.g. apt-get or yast2), the procedure stays the same. Install via yum/rpm if possible, as 1) it's easier and 2) it manages software dependencies. In cluster environments you have two OSes: 1) the master node's OS and 2) the vnfs (which contains the OS) on the slave nodes.
(Note: You can have different OSes at the same time for the nodes, however no one has asked for this.) First install on the master node:
# yum -y install <whatever the package is>
To install on the slave nodes, first mount the vnfs (in this example centos-5.6-1.x86_64 is the vnfs image; look in /root and you will see it):
# perceus vnfs mount centos-5.6-1.x86_64
The vnfs holding the slave nodes' OS is now mounted under the master node's /mnt directory (/mnt/centos-5.6-1.x86_64). Install the package into it:
# yum -y --installroot=/mnt/centos-5.6-1.x86_64 install <whatever the package is>
Now unmount the vnfs:
# perceus vnfs umount centos-5.6-1.x86_64
This step will take 15-20 seconds. Now either reboot the nodes or just sync them (sometimes livesync does not fully work because the OS is running in RAM). To sync without rebooting:
# perceus vnfs livesync centos-5.6-1.x86_64
If the nodes are in use it will take a while to complete. Ignore warnings such as: Use of uninitialized value in string comparison (cmp) at /usr/lib/perceus//Perceus/Nodes.pm line 336.
If livesync does not work, just reboot the nodes:
# pdsh -w n00[00-whatever_highest_node] "init 6"
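The whole sequence for one package can be laid out as a dry run before touching the image. The sketch below only prints the commands. It uses the perceus vnfs mount/umount/livesync form and yum's --installroot option, and assumes the image mounts under /mnt/<vnfs name> as described above. The function name vnfs_install_plan and the package gcc are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the steps for installing one
# package on the master and into a VNFS image.
vnfs_install_plan() {
    vnfs=$1
    pkg=$2
    if [ -z "$vnfs" ] || [ -z "$pkg" ]; then
        echo "usage: vnfs_install_plan <vnfs name> <package>" >&2
        return 1
    fi
    # Master first, then the mounted image, then unmount and sync.
    cat <<EOF
yum -y install $pkg
perceus vnfs mount $vnfs
yum -y --installroot=/mnt/$vnfs install $pkg
perceus vnfs umount $vnfs
perceus vnfs livesync $vnfs
EOF
}

vnfs_install_plan centos-5.6-1.x86_64 gcc
```

Printing the plan first makes it easy to spot a mistyped image name before yum writes into the wrong directory.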
Typically /usr/local is exported from the master to the slave nodes, so if you need to install a tarball package, just add it to the /usr/local directory and it is available on the master and the slaves! The slave nodes' OS is typically a cut-down version of the master node's, because the OS usually runs entirely in RAM, so you will have to check that the packages the software depends on are also present on the slaves.
Software on Master but not Slaves
Make sure it's not just an environment path issue, and that the filesystem exported from the master has not dropped off! NFS is like that.
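Both checks can be scripted on a slave node. A rough sketch, assuming a Linux /proc/mounts table and that the exported directory is /usr/local. The helper names is_nfs_mounted and in_path and the command name mytool are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is listed as an NFS mount in
# a mounts table ($2, default /proc/mounts; fields: device dir fstype ...).
is_nfs_mounted() {
    awk -v dir="$1" '
        $2 == dir && $3 ~ /^nfs/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Hypothetical helper: succeed if command $1 is on the current PATH.
in_path() {
    command -v "$1" >/dev/null 2>&1
}

# "mytool" stands in for whatever software is missing on the slave.
in_path mytool || echo "mytool is not on PATH"
is_nfs_mounted /usr/local || echo "/usr/local is not an NFS mount here"
```

If the mount check fails, the export has dropped off; if only the PATH check fails, look at the user's environment first.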
Where is all the node filesystem and kernel information actually stored?
It's all contained in a file on the master node, always in /root. The file name is typically centos-5.6-1.x86_64.vnfs, and you manipulate it through perceus. First mount the .vnfs file:
# perceus vnfs mount centos-5.6-1.x86_64
This will mount the file structure at /mnt/centos-5.6-1.x86_64.
Then enter that directory and add/delete/modify whatever you want (including exported/mounted drives, kernel info, directories ...).
Then unmount the vnfs:
# perceus vnfs umount centos-5.6-1.x86_64
Then livesync the system (however, if you do anything major like updating the kernel you will probably need to reboot the nodes; see the software installation section above for the reboot command):
# perceus vnfs livesync centos-5.6-1.x86_64
How do I push changes to the slave nodes, as well as reboot or shut them down? And how do I bring them back up?
Via pdsh if you want ( pdsh -w n00[00-13] "<whatever command>" ), or via a livesync after you have edited the vnfs (see the software installation and/or node filesystem sections above). The command could of course be "init 6" for reboot, or "init 0" for halt. To bring the slave nodes back up, just power them on; they will PXE boot and perceus will provision them from there. With pdsh you can also reboot a particular node ( pdsh -w n0006 "init 6" will reboot n0006 only).
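Because a reboot or halt sent to the wrong host list is painful, it can help to build the command and read it back before running it. A small sketch; node_power_cmd is a hypothetical name, and the function only prints the pdsh command line.

```shell
#!/bin/sh
# Hypothetical helper: print the pdsh command for rebooting or halting nodes.
node_power_cmd() {
    case $1 in
        reboot) level=6 ;;   # init 6 = reboot
        halt)   level=0 ;;   # init 0 = halt
        *)
            echo "usage: node_power_cmd reboot|halt <pdsh host list>" >&2
            return 1 ;;
    esac
    if [ -z "$2" ]; then
        echo "node_power_cmd: missing host list" >&2
        return 1
    fi
    printf 'pdsh -w %s "init %s"\n' "$2" "$level"
}

node_power_cmd reboot n0006
node_power_cmd halt 'n00[00-13]'
```

The host list is quoted in the second call so the shell does not try to expand the brackets as a filename pattern.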