Patch-ID# 118689-03

Keywords: sunwscsge
Synopsis: Sun Cluster 3.0/3.1/3.1_x86: HA N1 Grid Engine patch
Date: May/18/2005

Install Requirements: NA

Solaris Release: 8 9 9_x86 10 10_x86

SunOS Release: 5.8 5.9 5.9_x86 5.10 5.10_x86

Unbundled Product: Sun Cluster

Unbundled Release: 3.0/3.1/3.1_x86

Xref:

Topic: Sun Cluster 3.0/3.1/3.1_x86: HA N1 Grid Engine patch

Relevant Architectures: all

BugId's fixed with this patch: 6201724 6202374 6203844 6204419 6206996 6207004 6232370 6251352 6255868 6255871

Changes incorporated in this version: 6201724

Patches accumulated and obsoleted by this patch:

Patches which conflict with this patch:

Patches required with this patch:

Obsoleted by:

Files included with this patch:

/opt/SUNWscsge/bin/functions
/opt/SUNWscsge/bin/functions.common53
/opt/SUNWscsge/bin/functions.common6
/opt/SUNWscsge/bin/sge_commd/functions.sge_commd
/opt/SUNWscsge/bin/sge_commd/probe_sge_commd
/opt/SUNWscsge/bin/sge_commd/start_sge_commd
/opt/SUNWscsge/bin/sge_qmaster/functions.sge_qmaster
/opt/SUNWscsge/bin/sge_qmaster/probe_sge_qmaster
/opt/SUNWscsge/bin/sge_qmaster/start_sge_qmaster
/opt/SUNWscsge/bin/sge_qmaster/stop_sge_qmaster
/opt/SUNWscsge/bin/sge_qmaster6/functions.sge_qmaster
/opt/SUNWscsge/bin/sge_qmaster6/probe_sge_qmaster
/opt/SUNWscsge/bin/sge_qmaster6/start_sge_qmaster
/opt/SUNWscsge/bin/sge_qmaster6/stop_sge_qmaster
/opt/SUNWscsge/bin/sge_schedd/functions.sge_schedd
/opt/SUNWscsge/bin/sge_schedd/probe_sge_schedd
/opt/SUNWscsge/bin/sge_schedd/start_sge_schedd
/opt/SUNWscsge/bin/sge_schedd/stop_sge_schedd
/opt/SUNWscsge/bin/sge_schedd6/functions.sge_schedd
/opt/SUNWscsge/bin/sge_schedd6/probe_sge_schedd
/opt/SUNWscsge/bin/sge_schedd6/start_sge_schedd
/opt/SUNWscsge/bin/sge_schedd6/stop_sge_schedd
/opt/SUNWscsge/etc/SUNW.n1ge
/opt/SUNWscsge/util/sge_config.example
/opt/SUNWscsge/util/sge_register
/opt/SUNWscsge/util/sge_remove

Problem Description:

6255871 usage of pgrep should include whitespace behind ${RESOURCE} in
        searchpattern to avoid false hits
6255868 sge_register should not hardcode SUNW.gds RT version
6251352 sge agent should use pgrep restricted to the zone it is running
        in on S10 to avoid false pid

(from 118689-02)

6232370 sge_register does not detect SUNW.gds:3.1

(from 118689-01)

6201724 agent for N1 Grid Engine should support version 6.0
6202374 sge 5.3 functions contain false components in debugging messages
        for validate_{qmaster,schedd,execd}
6203844 SUNWscsge/bin/sge_schedd sources non-existent functions file
        which makes it non-working
6204419 In SUNWscsge/bin/functions stop_sge_qmaster() and
        stop_sge_schedd() return "failme" instead of $failme
6206996 SUNWscsge/bin/functions uses undefined variable $ARCH in
        SGE_init() causing wrong library path
6207004 dummy RTR file for SUNW.n1ge should contain keyword SERVICE_NAME
        to support upgrade via scinstall

Patch Installation Instructions:
--------------------------------

There are three (3) possible procedures for installing patches on Sun
Cluster 3.0 and 3.1. The proper method to use, and any additional
instructions for this patch, are specified below in the "Special Install
Instructions" section. Refer to the chapter entitled "Patching Sun
Cluster Software and Firmware" in the Sun Cluster 3.0/3.1 System
Administration Guide for a description of the different install
processes and instructions on how to install Sun Cluster 3.x patches.

For Solaris 8/9 release, refer to the man pages for instructions on
using 'patchadd' and 'patchrm' scripts provided with Solaris.

Special Install Instructions:
-----------------------------

I. Updating an existing SGE 5.3 configuration within Sun Cluster:
-----------------------------------------------------------------

If you are updating an existing Sun Cluster data service for Sun Grid
Engine 5.3 configuration, then you need to disable all SGE 5.3 resources
before applying this patch:

1. # scswitch -n -j
2. # scswitch -n -j
3. # scswitch -n -j

Then apply this patch to each cluster node.

After successful installation, enable all SGE 5.3 resources again:

1. # scswitch -e -j
2. # scswitch -e -j
3. # scswitch -e -j

The config file in /opt/SUNWscsge/util/sge_config will be automatically
converted to the new keywords introduced with this patch. Please review
it to make sure it reflects your configuration. There is no need to
reregister the data service after applying the patch; the conversion
ensures that the /opt/SUNWscsge/util/sge_remove script works with the
new keywords.

II. Configuration of a new Sun Grid Engine environment within Sun Cluster:
---------------------------------------------------------------------------

This patch introduces support for Sun Grid Engine 6.0 while still
supporting Sun Grid Engine 5.3.

The following restrictions apply to Sun Cluster HA for Sun Grid Engine
6.0:

- do not configure or use a Berkeley DB spooling server
- do not configure the daemons to start when the system is booted
- do not configure a shadow master host
- sge_execd cannot be configured as a resource within Sun Cluster
- make sure you have the following patches for Sun Grid Engine 6.0
  installed (an unpatched version will not work):

All packages of a Sun Grid Engine 6.0 distribution must have the same
patch level. Refer to the patch matrix below to find out which patch
belongs to which package. Mixing different patch levels in a single Sun
Grid Engine cluster is not supported. For example, if you install patch
118094-01, all other patches must have revision -01 as well. Also, make
sure to update the "common" and "arco" packages.

1. Patches for packages in Sun pkgadd format
--------------------------------------------

Package name*  OS*                    Architecture*  Patch-Id
----------------------------------------------------------------
SUNWsgee       Solaris, Sparc, 32bit  sol-sparc      118094
SUNWsgeex      Solaris, Sparc, 64bit  sol-sparc64    118130
SUNWsgeex      Solaris x86            sol-x86        118131
SUNWsgeec      all                    common         118132
SUNWsgeea      all                    arco           118133

*Package Name = see pkginfo(1)
*OS           = Operating system
*Architecture = N1 Grid Engine binary architecture string or
                "common" = architecture independent packages
                "arco"   = Accounting and Reporting console

2. Patches for packages in tar.gz format
----------------------------------------

OS*                        Architecture  Patch-Id
----------------------------------------------------
Solaris, Sparc, 32bit      sol-sparc     118082
Solaris, Sparc, 64bit      sol-sparc64   118083
Solaris, x86               sol-x86       118084
Linux kernel2.4/2.6, x86   lx24-x86      118085
Linux kernel2.4/2.6, AMD64 lx24-amd64    118086
IBM AIX 4.3                aix43         118087
IBM AIX 5.1                aix51         118088
Apple Mac OS X             darwin        118089
HP HP-UX 11                hp11          118090
SGI Irix 6.5               irix65        118091
all                        common        118092
all                        arco          118093

The restrictions for Sun Grid Engine 5.3 are documented in the Sun
Cluster Data Service for Sun Grid Engine Guide for Solaris OS.

Please follow the installation instructions for the Sun Grid Engine
version you plan to install. The Sun Cluster Data Service for Sun Grid
Engine Guide for Solaris OS, which describes the procedures for SGE 5.3,
can still be used for guidance. The major change for SGE 6.0 is that the
sge_commd daemon no longer exists; therefore this resource is not
configured when using SGE 6.0.

The keyword-value pairs in /opt/SUNWscsge/util/sge_config have partly
changed and are described as follows:

COMMDRS=sge-commd-rs
    Specifies the name that you are assigning to the resource for the
    Sun Grid Engine communications daemon sge_commd. This is only needed
    for SGE 5.3 and can be left empty for SGE 6.0.

QMASTERRS=sge-qmaster-rs
    Specifies the name that you are assigning to the resource for the
    Sun Grid Engine queue master daemon sge_qmaster. This must be
    defined.

SCHEDDRS=sge-schedd-rs
    Specifies the name that you are assigning to the resource for the
    Sun Grid Engine scheduling daemon sge_schedd. This must be defined.

MASTERRG=sge-rg
    Specifies the name of the resource group that contains the Sun
    Cluster HA for Sun Grid Engine resources (sge_commd (5.3),
    sge_qmaster and sge_schedd). This name must be the name you assigned
    when you created the resource group as documented in the Sun Cluster
    Data Service for Sun Grid Engine Guide for Solaris OS. This keyword
    must be defined and was formerly named "RG".

MASTERPORT=portno
    Specifies the port number that is configured for sge_qmaster in
    /etc/inet/services (normally 536). Although the data service does
    not use this value, it is good practice to document it here. It must
    be an integer and must always be defined. This keyword was formerly
    named "PORT".

MASTERLH=sge-lh-rs
    Specifies the name of the logical host name resource for Sun Grid
    Engine. This name must be the name you assigned when you created the
    resource group as documented in the Sun Cluster Data Service for Sun
    Grid Engine Guide for Solaris OS. This keyword must be defined and
    was formerly named "LH".

SGE_ROOT=sge-root-dir
    Specifies the root directory of the Sun Grid Engine file system.
    This directory must be the directory that you created for the root
    of the Sun Grid Engine file system as documented in the Sun Cluster
    Data Service for Sun Grid Engine Guide for Solaris OS. This keyword
    must be defined.

SGE_CELL=cell-name
    Specifies the cell that Sun Grid Engine references. This keyword
    must be defined.

SGE_VER=5.3|6.0
    Specifies the version of the installed Sun Grid Engine
    configuration. This keyword must be defined and can have the value
    "5.3" or "6.0".
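
The keyword rules above can be checked before the file is used. The
following is only a sketch and is not shipped with this patch: the
function name validate_sge_config is hypothetical, and it assumes that
sge_config consists of plain KEY=value lines that a POSIX shell can
source. It enforces only the rules stated above (required keywords,
integer MASTERPORT, SGE_VER of 5.3 or 6.0, COMMDRS needed for 5.3).

```shell
#!/bin/sh
# Hypothetical helper, NOT part of SUNWscsge: sanity-check an sge_config
# file against the keyword rules described in this README.
# Assumption: the file contains only KEY=value lines.
validate_sge_config() {
    cfg=$1
    [ -r "$cfg" ] || { echo "cannot read $cfg" >&2; return 1; }
    (
        # Run in a subshell so the caller's environment is untouched.
        # Clear all keywords first so stale values cannot leak in.
        COMMDRS= QMASTERRS= SCHEDDRS= MASTERRG= MASTERPORT= MASTERLH=
        SGE_ROOT= SGE_CELL= SGE_VER=
        . "$cfg" || exit 1
        rc=0
        # Keywords that the README says must always be defined.
        for key in QMASTERRS SCHEDDRS MASTERRG MASTERPORT MASTERLH \
                   SGE_ROOT SGE_CELL SGE_VER; do
            eval "val=\$$key"
            if [ -z "$val" ]; then
                echo "$key must be defined" >&2
                rc=1
            fi
        done
        # MASTERPORT must be an integer (empty is already reported above).
        case $MASTERPORT in
            '') ;;
            *[!0-9]*) echo "MASTERPORT must be an integer" >&2; rc=1 ;;
        esac
        # SGE_VER must be 5.3 or 6.0; COMMDRS is only needed for 5.3.
        case $SGE_VER in
            5.3) [ -n "$COMMDRS" ] || {
                     echo "COMMDRS is needed for SGE 5.3" >&2; rc=1; } ;;
            6.0|'') ;;
            *) echo "SGE_VER must be 5.3 or 6.0" >&2; rc=1 ;;
        esac
        exit $rc
    )
}

# Example call (path taken from this README):
#   validate_sge_config /opt/SUNWscsge/util/sge_config && echo "config OK"
```

Sourcing the file in a subshell keeps the check free of side effects, so
it can be run repeatedly while editing the configuration.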

Once you have configured the keyword-value pairs in the sge_config file,
you can proceed as documented in the Sun Cluster Data Service for Sun
Grid Engine Guide for Solaris OS.

README -- Last modified date: Wednesday, May 18, 2005