
Friday, January 14, 2011

Customize LSF configuration files in OCS/PCM


Document Number: 1-1GMJYZ
Product: Platform OCS/PCM
Version: 1.2, 1.2a, 1.2b, 1.2.1, 2.0.1
OS: Linux
Category: Product Usage
Date Created: Apr 02 2009, 04:37 PM
Last Update: Oct 22 2010, 02:01 PM
Keywords: Customize LSF configuration files in OCS/PCM

Topic

Customize LSF configuration files in OCS/PCM
Problem Description

After the PCM LSF Kit is installed, the first task is usually to customize the LSF configuration to suit the needs of your site.
For example, you may want to add your own queue definition to the lsb.queues file, or define a special resource that is available only on certain nodes. In the latter case, you need to customize both the lsf.shared file and the lsf.cluster file.
This article describes the steps, with some examples, for customizing the LSF configuration in different PCM/OCS versions.

Solution Detail

Customize LSF in PCM 2.0, PCM 1.2a, PCM 1.2b, PCM 1.2.1
In the versions listed above, all the LSF configuration changes should be done through template files.
The templates for lsf.* files are under /etc/cfm/templates/lsf/.
# pwd
/etc/cfm/templates/lsf
# ll
total 32
-r--r--r-- 1 lsfadmin root  1770 Dec  1 03:11 default.lsf.cluster
-r--r--r-- 1 lsfadmin root  1479 Dec  1 03:11 default.lsf.conf
lrwxrwxrwx 1 root     root    16 Mar 12 14:53 default.lsf.conf.master -> default.lsf.conf
lrwxrwxrwx 1 root     root    16 Mar 12 14:53 default.lsf.conf.slave -> default.lsf.conf
-r--r--r-- 1 lsfadmin root 12766 Dec  1 03:11 default.lsf.shared
drwxr-xr-x 3 lsfadmin root  4096 Mar 12 14:53 ego
drwxr-xr-x 3 lsfadmin root  4096 Mar 12 14:53 lsbatch
The templates for lsb.* files are under /etc/cfm/templates/lsf/lsbatch/default/configdir/.
# pwd
/etc/cfm/templates/lsf/lsbatch/default/configdir
# ll
total 80
-r--r--r-- 1 lsfadmin root 21388 Dec  1 03:11 lsb.applications
-r--r--r-- 1 lsfadmin root  2912 Dec  1 03:11 lsb.hosts
-r--r--r-- 1 lsfadmin root  1196 Dec  1 03:11 lsb.modules
-r--r--r-- 1 lsfadmin root  1547 Dec  1 03:11 lsb.nqsmaps
-r--r--r-- 1 lsfadmin root  1454 Dec  1 03:11 lsb.params
-r--r--r-- 1 lsfadmin root 20801 Dec  1 03:11 lsb.queues
-r--r--r-- 1 lsfadmin root  6025 Dec  1 03:11 lsb.resources
-r--r--r-- 1 lsfadmin root  1279 Dec  1 03:11 lsb.serviceclasses
-r--r--r-- 1 lsfadmin root   899 Dec  1 03:11 lsb.users
To customize a file, edit the corresponding template file. After you save the changes, run addhost -u, which propagates the changes to the nodes and restarts the LSF services.
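As an illustration of the template workflow, here is a minimal sketch that appends a queue definition to the lsb.queues template. The helper name append_queue, the queue name, and the PRIORITY and DESCRIPTION values are assumptions made for this example only; the template path and addhost -u come from the article.

```shell
#!/bin/sh
# Hypothetical helper (not part of PCM or LSF): append a queue stanza
# to an lsb.queues template. Queue name, priority and description are
# illustrative values only.
append_queue() {
    template="$1"   # path to the lsb.queues template
    queue="$2"      # name of the new queue

    # Keep a backup next to the template before touching it.
    cp -p "$template" "$template.bak" || return 1
    # The templates are shipped read-only (r--r--r--), so make the file writable.
    chmod u+w "$template" || return 1

    cat >> "$template" <<EOF

Begin Queue
QUEUE_NAME   = $queue
PRIORITY     = 40
DESCRIPTION  = Site-specific queue added through the CFM template
End Queue
EOF
}

# Usage on the installer node, as root (left commented out on purpose):
# append_queue /etc/cfm/templates/lsf/lsbatch/default/configdir/lsb.queues site_normal
# addhost -u
```

The backup copy makes it easy to restore the original template if the new stanza turns out to be wrong.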
The default.lsf.cluster file needs special attention. Unlike the /opt/lsf/conf/lsf.cluster.<cluster_name> file, this template does not list the individual hosts.
To modify the host section of the lsf.cluster file, first type in the host information manually, or copy it from the /opt/lsf/conf/lsf.cluster.<cluster_name> file, and then make your changes.
By default, the default.lsf.cluster file looks like this:
Begin   Host
HOSTNAME  model    type        server r1m  mem  swp  RESOURCES    #Keywords
......
#orange   !        SUNSOL       1     3.5  1    2   (sparc bsd)   #Example
#prune    !        !            1     3.5  1    2   (convex)
XXX_lsfmc_XXX   !   !   1   3.5   ()   ()   (mg)
End     Host
For example, suppose you need to change the r1m value of the compute node compute-0 from the default value of 3.5 to 4.0. Manually add an entry for compute-0 to the host section of the file:
Begin   Host
HOSTNAME  model    type        server r1m  mem  swp  RESOURCES    #Keywords
......
#orange   !        SUNSOL       1     3.5  1    2   (sparc bsd)   #Example
#prune    !        !            1     3.5  1    2   (convex)
XXX_lsfmc_XXX   !   !   1   3.5   ()   ()   (mg)
compute-0       !   !   1   4.0    ()   ()   (mg)
End     Host
Run addhost -u after you make the change.
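The manual edit above could also be scripted. The following is a rough sketch only: the helper name add_host_entry is hypothetical, and the entry it writes simply copies the column layout of the compute-0 line shown above (model and type as !, server 1, empty mem and swp, resource mg).

```shell
#!/bin/sh
# Hypothetical helper (not part of PCM or LSF): insert a host entry
# just before the "End Host" line of a default.lsf.cluster template.
add_host_entry() {
    template="$1"   # path to default.lsf.cluster (or a copy of it)
    host="$2"       # host name, e.g. compute-0
    r1m="$3"        # r1m value for that host, e.g. 4.0

    tmp=$(mktemp) || return 1
    awk -v host="$host" -v r1m="$r1m" '
        # Print the new entry right before the line that closes the Host section.
        $1 == "End" && $2 == "Host" {
            printf "%-15s !   !   1   %s   ()   ()   (mg)\n", host, r1m
        }
        { print }
    ' "$template" > "$tmp" || { rm -f "$tmp"; return 1; }

    # Write back in place so the owner of the template stays the same.
    chmod u+w "$template" || { rm -f "$tmp"; return 1; }
    cat "$tmp" > "$template"
    rm -f "$tmp"
}

# Usage on the installer node, as root (left commented out on purpose):
# add_host_entry /etc/cfm/templates/lsf/default.lsf.cluster compute-0 4.0
# addhost -u
```

The sketch does not check whether the host is already listed, so running it twice for the same host would add a duplicate line.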

Customize LSF in OCS 5.3/PCM 1.2 and OCS 5.1
Because of a bug in the LSF Kit shipped with OCS 5.3/PCM 1.2 and OCS 5.1, the steps for customizing LSF configuration files in these versions differ from those for PCM 2.0, PCM 1.2a, PCM 1.2b and PCM 1.2.1.
Steps to customize lsf.* files such as lsf.conf and lsf.shared, except for the host section of the lsf.cluster file:
1.     Make changes to the relevant files in /etc/cfm/templates/lsf/
For example, modify default.lsf.conf if you need to customize lsf.conf file.
2.     Run addhost -u
Steps to customize lsb.* files such as lsb.hosts and lsb.users:
1.     Make changes to the files in /etc/cfm/installer-rhel-5-x86_64/opt/lsf/conf/lsbatch/<clustername>/configdir/
2.     Run cfmsync -fp
Steps to customize lsf.cluster file:
Because of the bug mentioned above, customizing the host section of the lsf.cluster file is problematic in OCS 5.3/PCM 1.2 and OCS 5.1. For the other sections, follow the same steps as for the lsf.conf file.
To customize the host section of the lsf.cluster file, edit the /etc/cfm/installer-rhel-5-x86_64/opt/lsf/conf/lsf.cluster file and run cfmsync -fp. However, the changes are overwritten whenever addhost -u is run. If you really need to customize the host section of the lsf.cluster file, contact Platform PCM Support to request a fix. Upgrading your cluster to a later version also resolves the issue and is the recommended approach.
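Because the location to edit and the command to run depend on both the release and the file type, a small lookup can help avoid mistakes. The sketch below only restates the rules given in this article; the function name lsf_edit_plan and its output format are invented for illustration, and <clustername> is left as a placeholder.

```shell
#!/bin/sh
# Hypothetical helper (not part of PCM or LSF): print where to edit a
# given LSF configuration file and which command propagates the change.
# Caveat: on OCS 5.3/PCM 1.2 and OCS 5.1 the host section of lsf.cluster
# is a special case that this lookup does not cover.
lsf_edit_plan() {
    release="$1"   # e.g. "PCM 2.0", "PCM 1.2a", "OCS 5.3", "PCM 1.2"
    file="$2"      # e.g. lsf.conf, lsf.shared, lsb.queues

    case "$release" in
        "PCM 2.0"|"PCM 1.2a"|"PCM 1.2b"|"PCM 1.2.1") affected=no ;;
        "OCS 5.3"|"PCM 1.2"|"OCS 5.1")               affected=yes ;;
        *) echo "unknown release: $release" >&2; return 1 ;;
    esac

    case "$file" in
        lsb.*)
            if [ "$affected" = yes ]; then
                echo "/etc/cfm/installer-rhel-5-x86_64/opt/lsf/conf/lsbatch/<clustername>/configdir/ : cfmsync -fp"
            else
                echo "/etc/cfm/templates/lsf/lsbatch/default/configdir/ : addhost -u"
            fi
            ;;
        lsf.*)
            echo "/etc/cfm/templates/lsf/ : addhost -u"
            ;;
        *) echo "not an LSF configuration file: $file" >&2; return 1 ;;
    esac
}

# Example calls:
# lsf_edit_plan "OCS 5.3" lsb.users
# lsf_edit_plan "PCM 2.0" lsf.shared
```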
Note:
Changing the files in /opt/lsf/conf directly works only temporarily. The changes are rolled back when cfmsync is invoked by another action, such as ngedit.
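One way to spot such temporary edits before they are rolled back is to compare a live file with its CFM-managed copy. This is a sketch under assumptions: the helper name check_drift is hypothetical, and the comparison is only meaningful where the CFM copy is a complete file (as with the lsb.* files under /etc/cfm/installer-rhel-5-x86_64/ described above), not a template containing placeholders.

```shell
#!/bin/sh
# Hypothetical check (not part of PCM or LSF): report whether a live
# configuration file differs from the copy managed by CFM.
check_drift() {
    live="$1"      # file under /opt/lsf/conf
    managed="$2"   # corresponding file under /etc/cfm/...

    if [ ! -r "$live" ] || [ ! -r "$managed" ]; then
        echo "cannot read $live or $managed" >&2
        return 2
    fi
    if cmp -s "$live" "$managed"; then
        echo "in sync: $live"
    else
        echo "DRIFT: $live differs from $managed"
        return 1
    fi
}

# Usage (paths are placeholders, adjust <clustername> to your cluster):
# check_drift /opt/lsf/conf/lsbatch/<clustername>/configdir/lsb.queues \
#     /etc/cfm/installer-rhel-5-x86_64/opt/lsf/conf/lsbatch/<clustername>/configdir/lsb.queues
```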
