Patch-ID# 114525-04
Keywords: sun_fire firmware flashprom security update 5.18.3 scapp rtos
Synopsis: Hardware/PROM: Sun Fire E6900/E4900/E2900/6800/4800/4810/3800 and V1280 Systems Firmware Update
Date: Jun/07/2005

Install Requirements: Additional instructions may be listed below

Solaris Release: 8 9 10

SunOS Release: 5.8 5.9 5.10

Unbundled Product: Hardware/PROM

Unbundled Release: ScApp:5.18.3,RTOS:41,SC POST:41:G

Xref:

Topic: Sun Fire system controller and flashprom update 5.18.3

        NOTE: See Special Instructions: Watchdog Timer information and
              configuration instructions.

Relevant Architectures: sparc

BugId's fixed with this patch:
4500490 4640435 4650932 4657904 4667629 4683268 4690339 4738507 4793171
4824109 4828481 4832310 4834392 4853771 4866713 4880599 4882017 4911531
4915870 4924264 4939856 4953801 4953811 4955947 4957835 4964577 4965384
4966931 4968493 4969956 4981483 4982034 4982170 4984203 4984780 4985737
4987176 4987457 4987854 4988128 4992950 4993271 4993985 4994112 4994488
4994905 4996008 5000947 5001728 5003539 5004331 5005360 5005640 5005655
5006810 5006812 5007818 5007831 5009788 5009856 5009864 5010205 5010616
5010772 5011243 5011320 5012130 5012317 5014581 5015109 5015363 5018002
5019052 5020501 5020606 5020704 5020887 5021417 5022423 5022479 5023405
5025518 5027547 5028333 5028915 5028916 5028917 5029117 5029722 5029847
5029856 5030395 5031658 5031871 5034739 5034767 5034786 5034881 5035234
5035293 5035517 5035667 5036290 5036321 5037074 5039408 5039565 5039905
5040267 5040732 5041545 5041600 5041656 5042076 5042555 5042636 5043373
5044000 5045210 5049265 5050000 5050697 5050725 5050732 5051257 5051422
5053287 5053926 5055997 5057330 5057869 5058001 5058313 5060659 5060748
5061593 5062510 5062717 5062914 5065337 5066585 5067307 5068391 5068436
5068851 5068926 5070035 5070429 5072938 5074972 5076179 5077697 5080862
5081679 5083664 5088868 5089309 5089914 5090178 5090906 5091506 5091556
5093903 5099024 5099206 5101931 5106212 6175704 6183416
6189121 6190958 6193290 6202816 6217224 6217337 6225904 6269048

Changes incorporated in this version: 6269048

Patches accumulated and obsoleted by this patch: 111346-04 112127-03 112494-08 112883-07 112884-06 113751-05 114523-02 800054-01

Patches which conflict with this patch:

Patches required with this patch:

Obsoleted by:

Files included with this patch:

Install.info
README.114525-04
Sun_Fire_Entry-Level_Midrange_System_Administration_Guide.pdf
Sun_Fire_Entry-Level_Midrange_System_Controller_Command_Reference_Manual.pdf
Sun_Fire_Entry-Level_Midrange_System_Firmware_5.18.0_Release_Notes.pdf
Sun_Fire_Midrange_System_Controller_Command_Reference_Manual.pdf
Sun_Fire_Midrange_Systems_Firmware_5.18.0_Release_Notes.pdf
Sun_Fire_Midrange_Systems_Platform_Administration_Manual.pdf
copyright
lw8cpu.flash
lw8pci.flash
sgcpu.flash
sgiowci.flash
sgiowci_sp.flash
sgpci.flash
sgrtos.flash
sgsc.flash

Problem Description:

(From 114525-04)

6269048 MICRON DIMM Boot Up Failure

(From 114525-03)

4828481 Console messages "addRecord: Segment TH Insufficient space Need 35 have 25"
4964577 local-mac-address? flag seems to be ignored by qfe adapter on a V1280
5070035 Alarm 3 on Lightweight 8 needs to be user programmable for backward compatibility
5089914 RFE : need new power budget with Uniboard fully loaded 2 GB dimm.
5101931 XMITS3.0/PCIX/3.3V Slot: Data comparison failures with SunVTS iobustest
5106212 PS Failure Causes false FT Failure
6175704 2 E2900's got TO (Time Out) panics CPU513 during or very shortly after DR-ing in SB0
6183416 Certain DIMM failures cannot be isolated
6202816 add warning for incompatible dimm sizes on V1 and V2 uniboards
6217224 Copyright file needs updated for 2005.
6217337 Need to update the COBP banner to reflect the year 2005.
6225904 POST banner is not updated for 2005

(From 114525-02)

4690339 domain error isolation CM_EACK in C accompanied by ConsolePortError in D.
5010772 Jasper320 HBA not working in a Starcat/XMITS 5v slot
5058313 takes a long time to synchronize failover status after "setfailover force"
5091506 6800 System fails to boot with 6 JAG's loaded with 2g memory config.
5093903 SIGBUS error occurred during multiple showenvironment commands.
5099206 Seprom addRecord errors are not actionable
6189121 REGRESSION: inventory not showing the correct "powered on" time
6190958 Change Vcore voltage from 1.225 to 1.25 volts for Jag 3.x
6193290 V1280, 5.18.0/5.17.3 service mode contains engineering mode only commands.

(From 114525-01)

4500490 Flash Prom Binary Image utilities should be under source control
4640435 Portion of SNMP OID address space should be reserved for other projects
4650932 POST memory error messages do not provide enough information for debug.
4657904 DFRUID: updatePowerRecs() needs to write Event & Summary records independently.
4667629 RFE: "setdate" cmd should be forbidden if ntp server is configured
4683268 SC POST should reset M48T59 watchdog at boot
4738507 src/Makefile has obsolete hardcoded hostname
4793171 'setfail force' doesn't always force a failover
4824109 showboards -v command reports wrong DIMM component slot where no DIMM inserted
4832310 Failed to create, change password using eeprom to set security-mode=command
4834392 scapp "install" target is broken
4853771 main and spare SCs need to synchronize time-of-day when not using SNTP
4866713 .properties on cpu node lists incorrect clock-freq
4880599 When calculating ambient temp, an HPU w/ null sensors results in 0 degrees
4882017 To have the ability to tune sntp or change the default error reporting threshold
4911531 SC drifts time of about 1 to 2 seconds
4915870 get ip addr 0.0.0.0 when two serengeti boot diskless
4924264 misalign string in POST
4939856 Change syslog logger "level" to local0.warning for SNTP messages
4953801 sepromupdate should support 8MB Ecache
4953811 sepromupdate should accept D$ & E$ component names
4955947 SC panic
        after flashupdating the f/w on LW8
4957835 Automatically enable shell keys when in engineering/manufacturing mode
4965384 SCApp shall enforce proper FT and PS configuration rules
4966931 Enabling SNTP on Serengeti SC disrupts domain clock
4968493 "boot net:dhcp" breaks in the tftp phase requesting wrong boot file
4969956 CSTHl functionality should be integrated into SCapp
4981483 java.lang.ArrayIndexOutOfBoundsException: 2 - domain does not reboot
4982034 Ecache tag ecc err test needs to handle THCE
4982170 chmsgs reports unknown and unused message tags in 5.17-gate
4984203 Frame fan tray and RTS status are not logged
4984780 ScApp does not provide sc board revision to SunMC
4985737 error events are being reported after an automatic restoration has initiated.
4987176 panic trying to lock ISM page that isn't there
4987457 DX Safari Port Error Status register dump maps incorrect type of error
4987854 repetitive message "the error buffer is full" can overwrite persistent logs
4988128 CPU Time-out (TO) from system bus during POST is not evaluated
4992950 console input does not resume on a failed cycle keyswitch
4993271 lpost 5.17.0 build 5.0 is mishandling fast_ecc_error_trap for the UCC case
4993985 Interconnect fails but all FRUs are still included in the domain.
4994112 after bootup, "Enable Sun Fire Link?" is not enabled even when it says yes
4994488 problem when using seprom-frutype-substitution
4994905 functioning A184 PS are not detected and acknowledged as powered on
4996008 java.lang.NullPointerException after multiple key off/on, then failover
5000947 illegal option -d shows up on help for showlogs
5001728 CHS implicate processor on uniboard when faulty DIMMs is really the problem
5003539 tunable 'Persistent logging' is not working right
5004331 incorrect data used for amazon fan tray power consumption.
5005360 sgcpu and sgpci is flashed with flashupdate (E2900)
5005640 ScApps for LW8 Amazon will not support 2N power.
5005655 fsr_test not cover enough bits
5006810 Domain level POST need to handle failed CPUs(master/slave) effectively
5006812 Update Artesyn D149 2.5 voltage margin for XMIT board.
5007818 lom[service]> help testinterconncet - not very helpful on V1280
5007831 lom[service]> testinterconnect -d A should not work on a V1280/Netra 1280
5009788 RFE: POST implementation for the new memory refresh rate ( shorten in 1/2 )
5009856 Implement new Vcore setting for Jaguar 3.x
5009864 SC Failover service port needs to be private
5010205 POST support for 1200 MHz USIV processors
5010616 'setchs -s faulty ...' does not disable all related components on V3 CPU
5011243 "The error buffer is full" message is misleading
5011320 showboards displays invalid Cpu Mask for Jaguar 3.0
5012130 SC accepts packets with private IP addresses.
5012317 Error data information not displayed when FPU
5014581 F4800 domain, 5.15.3 firmware stuck in "Active - Panicking" state
5015109 COD and SSH is not enabled for lw8
5015363 SBBC Reset Reason(s): Peer Reset, SC Reset button (rebooting the SC)
5018002 LW8: cannot enter Service when in Manufacturing mode
5019052 4800 cannot poweron domain in dual partition with 2 PS
5020501 TelnetServer.run: sock.accept() failed: java.net.SocketException: S_errno_ECONNR
5020606 cod setup should be moved to setupsc
5020704 disablecomponent command run, message said cannot enable component
5020887 "marked as Failed!" is inconsistent with the AD event message
5021417 downgrade allows connection type to be set to unavailable ssh option
5022423 error message every 30 seconds on the console
5022479 main sc did not come back in 60 seconds when rebooting, main became spare
5023405 usage statement for setescape differs from same on sg
5025518 confused msg shown when 'poweron grid0' issued on lw8
5027547 OBP needs to check for domain keyswitch state before dropping to OK prompt.
5028333 Remove sgiowci_sp.flash from future releases
5028915 lpost clears the i-cache microtags on the master cpu
5028916 the second dtlb t8_1 is not initialized properly at diag-level>init
5028917 2-way e-cache associativity is not always determined correctly for jaguar tests
5029117 ERROR: Missing text resource (flash downgrade e2900)
5029722 false msg:IB6/FAN0 Faulty: replacement required
5029847 SCApp needs to support JG3.1
5029856 PS failure caused I2C problems with other FRUs.
5030395 Serengeti POST does not use new FPROM Access Timing during domain level tests
5031658 Need to handle setBytes failure when write log messages to the persistent log.
5031871 Thread deadlock when clearing bad segment in E$ on poweron
5034739 show-post-results does not recognize Xmits IO boards
5034767 regression test stopper: POST fails SB0 and excludes it
5034786 Line numbers not included in stack traces
5034881 cobp generates incorrect replies to RARP requests
5035234 SC panic during boot after failover initiated from the SpareSC
5035293 Jar's Manifest isn't treated properly by SC JVM
5035517 redundant warning msg shown on console
5035667 outgoing telnet from SC causes ClassCastException
5036290 pci_lpost routine does not clear iommu entries for pci leaf B
5036321 RFE to the SCAPP command "sepromupdate" with a new option
5037074 License key interpreted as decimal instead of hex
5039408 fail IO test but processor gets the blame and marked bad
5039565 Variables VERSION and SUNW_PRODVERS should not be the same SMS release value
5039905 Killing a repeat thread causes error to be displayed
5040267 VXWORKS_BASEDIR needs to reference the new /ws/sg-firmware/vxworks-master-2.0
5040732 shownvram CLI with wrong args causes stack trace
5041545 Setkey off turn off sequence incorrect after a failed setkey on
5041600 Lightweight 8 flashupdate file issue
5041656 POST fails on ERROR: TEST=CPU Functional,SUBTEST=D-Cache Parity Tag Test ID=41.9
5042076 setk on failed to standby mode
5042555
        ADM1023A -128C temp sensor warning needs to be reported as a temp sensor failure
5042636 Faulty system board causes NullPointerException and causes an impression that se
5043373 "ERROR: wrong value on timing register:..." should be removed
5044000 shownvram does not show contents of E$ & D$
5045210 Interconnect retest failed signals always passed.
5049265 SC hangs at Boot with virgin U2106
5050000 SC prints warning for replacing PS/FT on 4800 machine
5050697 SC should detect mixed PSs and print warning immediately after the PS inserted
5050725 confused warning message printed on console when CPU V3 inserted into 4800
5050732 PS-FT warning message printed on console does not match the real system config
5051257 mem2 /N0/SB4/P3/Cx timeout, has no heartbeat, not responding
5051422 SCApp needs to support JG3.1 and JG2.4
5053287 E2900 gives "I2c error: slave did not ACK" message after resetsc
5053926 Panic stacktrace should be logged
5055997 State bits in ecache tags not included in ecache tags test
5057330 inconsistency of dx name output for IO and SB/RP
5057869 setupnetwork - disable network should not have network option
5058001 Failed Ecache Functional and SBBC Dev Error Status 200(5.18.0)
5060659 Incorrect scan chain length for Jaguar_3.0
5060748 upgrade from 5.15.0 to 5.18.0 firmware changes connection type on spare SC
5061593 Switching network settings clobbers SNTP server setting
5062510 DomainBufferWriter thread error.
5062717 Timing requirements for D150/D105/D151 converters
5062914 sntp setting is lost during upgrade to 5.18.0
5065337 Reset of Domain causes CHS disable board
5066585 ScApp should not power on processors on unsupported boards.
5067307 Possible regression due to fix for 4985737.
5068391 postTestList may be null during startCpu/stopCpu
5068436 showchs incorrectly parses the component string.
5068851 serengeti platform obp get wrong mac address of router for off subnet tftp server
5068926 Detecting board failure during POST causes inconsistent display of board power
5070429 System controller, PANIC: Out of Memory, 5.17.1 firmware
5072938 Signal Dispatch: signal 10 in thread
5074972 Starcat post should implement rio i/o test functions
5076179 shell keys ^A ^X are not usable
5077697 "6800-SC[service]> testinterconnet" passes despite missing centreplane pin.
5080862 system does not have SC failover automatically enabled after power cycle
5081679 NVRAMRC buffer is too small
5083664 backout changes for bug#5060748 which can cause ssh disabled.
5088868 4900 SC incorrectly indicates FT as non high volume, even though they are
5089309 Add jtag to support Jaguar 3.2
5090178 Request to undo the change for bug fixed 5045210.
5090906 Workaround needed for Schizo 2.2 problem
5091556 SC panics runs out of memory
5099024 Persistent Msg Log Error count corrupted.


Patch Installation Instructions:
--------------------------------

Please refer to the Install.info file for instructions on updating the
firmware using the files included in this patch.


Special Install Instructions:
-----------------------------

Watchdog Timer - Sun Fire Entry-Level Midrange Systems 5.18.2 - 4/18/2005
=========================================================================

This text gives information on the application mode of the watchdog timer
on the Netra 1280 server. The enhancement allows users to:

o Configure the watchdog timer - User applications running on the host can
  configure and use the watchdog timer, enabling customers to detect fatal
  problems from their applications and to recover automatically.

o Program Alarm3 - This enables users to generate this alarm in case of
  critical problems in their applications.

This README text provides the following sections to help you understand how
to configure and use the watchdog timer and program Alarm3:

o Upgrading the Firmware Using the lom -G Command
o Understanding the Watchdog Timer Application Mode
o Using the ntwdt Driver
o Understanding the User API
o Setting the Time-out Period
o Enabling or Disabling the Watchdog
o Rearming, or Patting, the Watchdog
o Getting the State of the Watchdog Timer
o Finding and Defining Data Structures
o Using the Sample Watchdog Program
o Programming Alarm3
o Understanding Error Messages
o Knowing Unsupported Features and Limitations


Upgrading the Firmware Using the lom -G Command
-----------------------------------------------

1) Upgrade the firmware on the system controller (SC):

   # lom -G sgsc.flash
   # lom -G sgrtos.flash

2) Escape to lom> and reset the SC:

   lom> resetsc -y

   To get to the Lights Out Management (lom) prompt, you can telnet
   directly into the Ethernet port of the SC (this is different from the
   Solaris IP address), or you can attach a console to the serial port on
   the SC. If you are remote from the system, configure the SC's Ethernet
   port, or attach the SC serial port to a network terminal server.

3) Upgrade the firmware on the system boards:

   # lom -G lw8cpu.flash
   # lom -G lw8pci.flash

4) Shut down the Solaris(TM) Operating System (OS).

5) Power off the system:

   lom> poweroff

6) Power on the system:

   lom> poweron


Understanding the Watchdog Timer Application Mode
-------------------------------------------------

The watchdog mechanism detects a system hang, or an application hang or
crash, should they occur. The watchdog is a timer that is continually reset
by a user application as long as the operating system and user application
are running.

When the application is rearming the application watchdog, an expiration
can be caused by:

o Crash of the rearming application
o Hang or crash of the rearming thread in the application
o System hang

When the system watchdog is running, a system hang (more specifically, a
hang of the clock interrupt handler) causes an expiration.

The system watchdog mode is the default. If the application watchdog is not
initialized, then the system watchdog mode is used.

The "setupsc" command, an existing command on the SC Lights Out Management,
can be used to configure the recovery for the system watchdog ONLY:

   lom> setupsc

The system controller configuration should be as follows:

   SC POST diag Level [off]:
   Host Watchdog [enabled]:
   Rocker Switch [enabled]:
   Secure Mode [off]:

   PROC RTUs installed: 0
   PROC Headroom quantity (0 to disable, 4 MAX) [0]:

The recovery configuration for the application watchdog is set using
Input/Output Control codes (IOCTLs) that are issued to the ntwdt driver.


Using the ntwdt Driver
----------------------

To use the new application watchdog feature, you must install the ntwdt
driver. To enable and control the watchdog's application mode, you must
program the watchdog system using the LOMIOCDOGxxx IOCTLs, described in the
section "Understanding the User API".

If the ntwdt driver, as opposed to the system controller, initiates a reset
of the Solaris OS on application watchdog expiration, the value of the
following property in the ntwdt driver's configuration file (ntwdt.conf) is
used:

   ntwdt-boottimeout="600";

In case of a panic, or an expiration of the application watchdog, the ntwdt
driver reprograms the watchdog time-out to the value specified in the
property.

Assign a value representing a duration that is longer than the time it
takes to reboot and perform a crash dump. If the specified value is not
large enough, the SC resets the host if reset is enabled. Note that this
reset by the SC occurs only once.
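
The sizing advice for ntwdt-boottimeout can be sketched in C. This is only
an illustration under stated assumptions: suggest_boot_timeout(),
print_boot_timeout(), and the margin_pct parameter are hypothetical names
that are not part of the ntwdt driver or lom_io.h, and the reboot and
crash-dump times are figures you would have to measure on your own system.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the ntwdt driver: suggest a value for
 * the ntwdt-boottimeout property. The result is the measured reboot time
 * plus the measured crash-dump time, increased by a safety margin given
 * as a percentage, so that the time-out is longer than both combined.
 */
unsigned int
suggest_boot_timeout(unsigned int reboot_secs, unsigned int dump_secs,
    unsigned int margin_pct)
{
    unsigned int base = reboot_secs + dump_secs;

    return (base + (base * margin_pct) / 100);
}

/* Print the property line in the form shown for ntwdt.conf. */
void
print_boot_timeout(unsigned int secs)
{
    (void) printf("ntwdt-boottimeout=\"%u\";\n", secs);
}
```

For example, with a 300-second reboot, a 200-second crash dump, and a
20 percent margin, suggest_boot_timeout(300, 200, 20) returns 600. The
margin is a design choice of this sketch; the README only requires that the
value be longer than the reboot and crash-dump time.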


Understanding the User API
--------------------------

The ntwdt driver provides an application program interface by using IOCTLs.
You must open the /dev/ntwdt device node before issuing the watchdog IOCTLs.

--------------------------------------------------------------------------------
NOTE: Only a single concurrent instance of open() is allowed on /dev/ntwdt.
      Any subsequent open() fails with the following error:

      EAGAIN - (The driver is busy, try again.)
--------------------------------------------------------------------------------

You can use the following IOCTLs with the watchdog timer:

o LOMIOCDOGTIME  - Set time-out period for watchdog timer
o LOMIOCDOGCTL   - Enable or disable watchdog timer
o LOMIOCDOGPAT   - Rearm ("pat") watchdog timer
o LOMIOCDOGSTATE - Get state of watchdog timer
o LOMIOCALCTL    - Set value of Alarm3
o LOMIOCALSTATE  - Get state of Alarm3


Setting the Time-out Period
---------------------------

The LOMIOCDOGTIME IOCTL sets the time-out period of the watchdog. This
IOCTL programs the watchdog hardware with the time specified in this IOCTL.
You must set the time-out period (LOMIOCDOGTIME) before attempting to
enable the watchdog timer (LOMIOCDOGCTL).

The argument is a pointer to an unsigned integer. This integer holds the
new time-out period for the watchdog in multiples of 1 second. You can
specify any time-out period in the range of 1 second to 180 minutes.

If the watchdog function is enabled, the time-out period is immediately
reset so that the new value can take effect. The IOCTL fails with EINVAL if
the time-out period is less than 1 second or longer than 180 minutes.

-----------------------------------------------------------------------------
NOTE: The LOMIOCDOGTIME is not intended for general purpose use. Setting the
      watchdog time-out to too low a value might cause the system to receive
      a hardware reset if the watchdog and reset functions are enabled.
      If the time-out is set too low, the user application must be run with
      a higher priority (for example, as a real time thread) and must rearm
      the watchdog more often to avoid an unintentional expiration.
-----------------------------------------------------------------------------


Enabling or Disabling the Watchdog
----------------------------------

The LOMIOCDOGCTL IOCTL enables or disables the watchdog, and it enables or
disables the reset capability. (See the "Finding and Defining Data
Structures" section for the correct values for the watchdog timer.)

The argument is a pointer to the lom_dogctl_t structure (described in
greater detail in the "Finding and Defining Data Structures" section).

Use the reset_enable member to enable or disable the system reset function.
Use the dog_enable member to enable or disable the watchdog function. The
IOCTL fails with EINVAL if the watchdog is disabled and reset is enabled.

--------------------------------------------------------------------------------
NOTE: If LOMIOCDOGTIME has not been issued to set up the time-out period
      prior to this IOCTL, the watchdog is NOT enabled in the hardware.
--------------------------------------------------------------------------------


Rearming, or Patting, the Watchdog
----------------------------------

The LOMIOCDOGPAT IOCTL rearms, or pats, the watchdog so that the watchdog
starts ticking from the beginning; that is, from the value specified by
LOMIOCDOGTIME. This IOCTL requires no arguments. If the watchdog is
enabled, this IOCTL must be used at regular intervals that are less than
the watchdog time-out, or the watchdog expires.


Getting the State of the Watchdog Timer
---------------------------------------

The LOMIOCDOGSTATE IOCTL gets the state of the watchdog and reset functions
and retrieves the current time-out period for the watchdog. If
LOMIOCDOGTIME was never issued to set up the time-out period prior to this
IOCTL, the watchdog is not enabled in the hardware.
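
The argument rules stated so far for LOMIOCDOGTIME and LOMIOCDOGCTL, and
the condition under which the watchdog is actually enabled in the hardware,
can be summarized in a short C sketch. The functions and macros below are
hypothetical stand-ins that mirror only what this README states; they are
not the ntwdt driver's code, they do not call the driver, and the names
check_timeout(), check_dogctl(), and watchdog_armed() are assumptions made
for illustration.

```c
#include <errno.h>

#define WDT_MIN_TIMEOUT_SECS 1u
#define WDT_MAX_TIMEOUT_SECS (180u * 60u)   /* 180 minutes in seconds */

/* LOMIOCDOGTIME rule: the period must be 1 second to 180 minutes. */
int
check_timeout(unsigned int secs)
{
    if (secs < WDT_MIN_TIMEOUT_SECS || secs > WDT_MAX_TIMEOUT_SECS)
        return (EINVAL);
    return (0);
}

/* LOMIOCDOGCTL rule: reset enabled with the watchdog disabled is EINVAL. */
int
check_dogctl(int reset_enable, int dog_enable)
{
    if (!dog_enable && reset_enable)
        return (EINVAL);
    return (0);
}

/*
 * The watchdog is enabled in the hardware only when it has been enabled
 * AND a time-out period was set beforehand with LOMIOCDOGTIME.
 */
int
watchdog_armed(int timeout_was_set, int dog_enable)
{
    return (timeout_was_set && dog_enable);
}
```

An application could run such checks before issuing the IOCTLs to report a
bad configuration early. The driver remains the authority, so the return
value of each ioctl() call still has to be checked.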

The LOMIOCDOGSTATE argument is a pointer to the lom_dogstate_t structure
(described in greater detail in the "Finding and Defining Data Structures"
section). The structure members hold the current states of the watchdog and
reset functions and the current watchdog time-out period. Note that the
time-out period is not the time remaining before the watchdog is triggered.

The LOMIOCDOGSTATE IOCTL requires only that open() be successfully called.
This IOCTL can be run any number of times after open() is called, and it
does not require any other DOG IOCTLs to have been executed.


Finding and Defining Data Structures
------------------------------------

All data structures and IOCTLs are defined in lom_io.h, which is available
in the SUNWlomh package.

The data structures for the watchdog timer are shown here:

1. The watchdog/reset state data structure is as follows:

   typedef struct {
      int reset_enable;    /* reset enabled if non-zero */
      int dog_enable;      /* watchdog enabled if non-zero */
      uint_t dog_timeout;  /* Current watchdog time-out in seconds */
   } lom_dogstate_t;

2. The watchdog/reset control data structure is as follows:

   typedef struct {
      int reset_enable;    /* reset enabled if non-zero */
      int dog_enable;      /* watchdog enabled if non-zero */
   } lom_dogctl_t;


Using the Sample Watchdog Program
---------------------------------

Following is a sample program for the watchdog timer:

   #include <sys/types.h>
   #include <sys/stat.h>
   #include <fcntl.h>
   #include <unistd.h>
   #include <lom_io.h>     /* from SUNWlomh; adjust the path if needed */

   int main() {
      uint_t timeout = 30;  /* time-out period in seconds */
      lom_dogctl_t dogctl;
      int fd;

      dogctl.reset_enable = 1;
      dogctl.dog_enable = 1;

      fd = open("/dev/ntwdt", O_EXCL);
      if (fd < 0)
         return (1);        /* open failed, for example with EAGAIN */

      /* Set timeout */
      ioctl(fd, LOMIOCDOGTIME, (void *)&timeout);

      /* Enable watchdog */
      ioctl(fd, LOMIOCDOGCTL, (void *)&dogctl);

      /* Keep patting at intervals shorter than the time-out */
      while (1) {
         ioctl(fd, LOMIOCDOGPAT, NULL);
         sleep (5);
      }
      return (0);
   }


Programming Alarm3
------------------

Alarm3 is available to Solaris Operating System users irrespective of the
watchdog mode. Alarm3 or system alarm ON and OFF have been redefined (see
the table below). Set the value of Alarm3 using the LOMIOCALCTL IOCTL.
You can program Alarm3 in the same way that you set and clear Alarm1 and
Alarm2.

The following table presents the behavior of Alarm3:

                        Alarm3   Relay        System LED (Green)
---------------------------------------------------------------------
Poweroff                ON       COM -> NC    OFF
Poweron/LOM up          ON       COM -> NC    OFF
Solaris running         OFF      COM -> NO    ON
Solaris not running     ON       COM -> NC    OFF
Host WDT expires        ON       COM -> NC    OFF
User sets to ON         ON       COM -> NC    OFF
User sets to OFF        OFF      COM -> NO    ON

Alarm3 ON  = Relay(COM->NC), System LED OFF
Alarm3 OFF = Relay(COM->NO), System LED ON

After Alarm3 is programmed, you can check Alarm3 or the system alarm with
the showalarm command and the argument "system". For example:

   sc> showalarm system
   system alarm is on

The data structure used with the LOMIOCALCTL and LOMIOCALSTATE IOCTLs is as
follows:

   #include <lom_io.h>

   #define ALARM_NUM_1 1
   #define ALARM_NUM_2 2
   #define ALARM_NUM_3 3

   #define ALARM_OFF 0
   #define ALARM_ON 1

   typedef struct {
      int alarm_no;
      int alarm_state;
   } lom_aldata_t;


Understanding Error Messages
----------------------------

Following are the error codes that might be returned and what they mean:

EAGAIN   Returned if you attempt more than one concurrent open() on
         /dev/ntwdt.

EFAULT   Returned if an incorrect user-space address was specified.

EINVAL   Returned if a nonexistent control command was requested or
         invalid parameters were supplied.

EINTR    Returned if a thread awaiting a component state change is
         interrupted.

ENXIO    Returned if the driver is not installed in the system.


Knowing Unsupported Features and Limitations
--------------------------------------------

1) In the case of the watchdog timer expiration detected by the SC, the
   recovery is attempted only once; there are no further attempts of
   recovery if the first attempt fails to recover the domain.

2) If the application watchdog is enabled and you break into the
   OpenBoot(TM) PROM (OBP) by issuing the "break" command from the system
   controller's "lom" prompt, the SC automatically disables the watchdog
   timer.

--------------------------------------------------------------------------------
NOTE: The SC displays a console message as a reminder that the watchdog,
      from the SC's perspective, is disabled.
--------------------------------------------------------------------------------

   However, when you reenter the Solaris OS, the watchdog timer is still
   ENABLED from the Solaris Operating System's perspective. To have both
   the SC and the Solaris OS view the same watchdog state, you must use the
   watchdog application to either enable or disable the watchdog.

3) If you perform a dynamic reconfiguration (DR) operation in which a
   system board containing kernel (permanent) memory is deleted, then you
   must disable the watchdog timer's application mode before the DR
   operation and enable it after the DR operation. This is required because
   Solaris software quiesces all system IO and disables all interrupts
   during a memory-delete of permanent memory. As a result, system
   controller firmware and Solaris software cannot communicate during the
   DR operation. Note that this limitation affects neither the dynamic
   addition of memory nor the deletion of a board not containing permanent
   memory. In those cases, the watchdog timer's application mode can run
   concurrently with the DR implementation.

   You can execute the following command to locate the system boards that
   contain kernel (permanent) memory:

   sh> cfgadm -lav | grep -i permanent

4) If the Solaris Operating System hangs under the following conditions,
   the system controller firmware cannot detect the Solaris software hang:

   o Watchdog timer's application mode is set
   o Watchdog timer is not enabled
   o No rearming is done by the user

5) The watchdog timer provides partial boot monitoring.
   You can use the application watchdog to monitor a domain reboot.
   However, domain booting is not monitored for:

   o Bootup after a cold powerup
   o Recovery of a hung or failed domain

   In these cases, a boot failure is not detected and no recovery attempts
   are made.

6) The watchdog timer's application mode provides no monitoring for
   application startup. In application mode, if the application fails to
   start up, the failure is not detected and no recovery is provided.

--------------------------------------------------------------------------------

Copyright 2005 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.

This product or document is protected by copyright and distributed under
licenses restricting its use, copying, distribution, and decompilation. No
part of this product or related documentation may be reproduced in any form
by any means without prior written authorization of Sun and its licensers,
if any.

Third party software, including font technology, if any, is copyrighted and
licensed from Sun suppliers.

Sun, Sun Microsystems, Solaris, the Sun Logo, Sun Fire, OpenBoot, and SPARC
are trademarks or registered trademarks of Sun Microsystems, Inc. in the
U.S. and other countries. All SPARC trademarks are used under license and
are trademarks or registered trademarks of SPARC International, Inc. in the
U.S. and other countries. Products bearing SPARC trademarks are based upon
an architecture developed by Sun Microsystems, Inc.

Federal Acquisitions: Commercial Software - Government users subject to
standard license terms and conditions.

DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS,
REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE
DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE
LEGALLY INVALID.
--------------------------------------------------------------------------------

README -- Last modified date: Tuesday, June 7, 2005