Patch-ID# 116468-12

NOTE:
***********************************************************************
READ THE TERMS OF THE AGREEMENT ("AGREEMENT") IN THE LEGAL_LICENSE.TXT
FILE CAREFULLY BEFORE USING THIS SOFTWARE. BY USING THE SOFTWARE, YOU
AGREE TO THE TERMS OF THIS AGREEMENT. IF YOU DO NOT AGREE TO ALL OF THE
TERMS, PROMPTLY DESTROY THE UNUSED SOFTWARE.
***********************************************************************

Keywords: sndr disk queue rdc remote mirror
Synopsis: Availability Suite 3.2 SNDR Patch
Date: Dec/22/2005

Install Requirements: Reboot after installation, an alternative may be
                      in Special Install Instructions
                      Install in Single User Mode

Solaris Release: 8 9

SunOS Release: 5.8 5.9

Unbundled Product: Sun StorEdge Availability Suite

Unbundled Release: 3.2

Xref:

Topic:

Relevant Architectures: sparc

NOTE: After applying patch 116468-03 on both primary and secondary
      servers and rebooting, you must perform a full synchronization
      on all Availability Suite Remote Mirror asynchronous sets to
      ensure the data on the secondary volumes is consistent with the
      primary data volumes. For instructions to perform a full
      synchronization (sndradm -m), refer to the Sun StorEdge
      Availability Suite 3.2 Remote Mirror Software Administration and
      Operations Guide (817-2784-10).

      For configurations where network latency and dataset size make a
      full synchronization prohibitive, the secondary may be
      synchronized with the primary via tape-based backup/restore
      coupled with an sndradm -E.

NOTE: Problem Statement:

      In a Sun Cluster OE, when using Remote Mirror in combination
      with a Point-in-Time Copy to establish an ndr_ii pair for use
      during auto-synchronization, the Point-in-Time Copy set should be
      pre-enabled by the system administrator, rather than dynamically
      enabled by the SNDR auto-synchronization daemon. Failure to do so
      may cause the SNDR-configured Sun Cluster resource group to hang
      during failover processing.
      Please see BugId:5094206 or SRDB:77917 for a detailed
      description.

      Resolution:

      To prevent the Sun Cluster resource group hang, the Point-in-Time
      Copy set that is to be used by the SNDR synchronization daemon
      needs to be pre-enabled prior to turning on SNDR's
      auto-synchronization (sndradm -a on) and enabling an SNDR ndr_ii
      pair (sndradm -I a ....).

      Repair:

      If an existing Sun Cluster configuration containing an SNDR
      light-weight resource group with an ndr_ii pair appears to be
      hung, the Solaris process running the following script needs to
      be identified and terminated.

          /usr/opt/SUNWesm/cluster/sbin/reconfig

BugId's fixed with this patch:
    2130820 4892753 4914957 4930424 4938202 4940318 4942385 4942997
    4943413 4950370 4950802 4952176 4952178 4952920 4957445 4962068
    4967629 4970042 4974911 4976889 4977645 4981223 4993281 4995602
    4997398 5000951 5004765 5007944 5009144 5010349 5013414 5013757
    5014238 5014239 5015987 5018806 5022892 5027558 5034369 5037654
    5038271 5038552 5040685 5041365 5049952 5050438 5075457 5077630
    5086741 6173700 6173736 6204207 6218008 6222650 6223102 6242272
    6245800 6267284 6276243 6288855 6288876 6307873 6324470 6325644

Changes incorporated in this version: 2130820 6288876 6325644

Patches accumulated and obsoleted by this patch:

Patches which conflict with this patch:

Patches required with this patch: 116466-08 (or greater)

Obsoleted by:

Files included with this patch:

/etc/init.d/rdc
/etc/rc0.d/K79rdc
/etc/rc2.d/S00trdc
/usr/kernel/drv/rdc-5.8
/usr/kernel/drv/rdc-5.9
/usr/kernel/drv/sparcv9/rdc-5.8
/usr/kernel/drv/sparcv9/rdc-5.9
/usr/kernel/misc/rdcsrv-5.8
/usr/kernel/misc/rdcsrv-5.9
/usr/kernel/misc/sparcv9/rdcsrv-5.8
/usr/kernel/misc/sparcv9/rdcsrv-5.9
/usr/lib/mdb/kvm/rdc.so
/usr/lib/mdb/kvm/sparcv9/rdc.so
/usr/opt/SUNWesm/SUNWrdc/man/man1rdc/sndradm.1m
/usr/opt/SUNWesm/SUNWrdc/sbin/sndradm
/usr/opt/SUNWesm/SUNWrdc/sbin/sndrboot
/usr/opt/SUNWrdc/lib/sndrd-5.8
/usr/opt/SUNWrdc/lib/sndrd-5.9
/usr/opt/SUNWrdc/lib/sndrsyncd
/usr/opt/SUNWscm/lib/librdc.so.1-5.8
/usr/opt/SUNWscm/lib/librdc.so.1-5.9

Problem Description:

6325644 SNDR an error on disk-queue shutdown, results in recursive mutex_enter
6288876 Processing an SNDR suspend in a Sun Cluster via scswitch and the link going down (rdc_health) fails
2130820 failover of greater than 64 Remote Mirror sets causes cluster deadlock condition

(from 116468-11)

6324470 SNDR panic during the disable of many sets, while disk queue replication is active
6288855 4 node cluster with heavy i/o load on the primary vols causes reference count maxed out WARNINGS

(from 116468-10)

6307873 The change to use 'clinfo' instead of the Availability Suite 'dsclinfo' broke our SMF scripts
6242272 raw kstat display data output from Availability Suite (sndr, ii, sv, sdbc), contains non-ASCII chars

(from 116468-09)

6276243 AS3.2 + latest patches: SNDR with disk queue, scswitch -z -g hangs.

(from 116468-08)

6267284 AVS3.2 + 116466-06, 116467-07, 116468-07 - SNDR suspend code for disk queue hangs.

(from 116468-07)

6245800 nskernd ( looping in _rdc_sync ) consumes excessive cpu cycles during sndr update sync
6223102 AVS3.2 latest patches on SC31u4: sndradm -P hang, ii_boot resume failed
6222650 writes in disk queue does not get applied to the secondary when in REP state
6218008 AS3.2 + latest patches: SNDR with disk queue + ndrii, scswitch -z -g hangs.

(from 116468-06)

6173736 SNDR 3.2 - Notice of pending IOs printed at system shutdown
6204207 failed diskq disk hangs mount on boot

(from 116468-05)

5075457 synchronous writes should happen until all members of group are done syncing
5086741 corrupted dscfg configuration database panic-ed solaris
6173700 sndradm -B dumps core

(from 116468-04)

4976889 unable to delete SNDR set when logical host can't be found
5022892 enhance sndradm ds.log entries for TUNABLES and HEALTH
5027558 sndradm man page missing -R r (role reverse) usage and description
5034369 sndradm (-u) (-m) entries missing from ds.log
5037654 sndr dropped into logging with almost empty queue
5040685 deleting an ndr_ii config entry via sndradm -I d is not recorded in ds.log file
5049952 sndradm -h set: usage statement missing diskq parameter
5050438 sndradm -C not checking validity of cluster tag when adding disk queue to set
5077630 Deadlocks when {sndr/ii/sv}adm and {sndr/ii/sv}boot are invoked in Sun Cluster

(from 116468-03)

4940318 Add logic to support the use of aliases for host or logical host
5010349 sndr bitmaps in one to many not getting updated
5013414 failed enable of a sync set with a disk q not atomic
5013757 diskq block/noblock operations not reported in ds.log
5014238 sndr should dump diskq if queue is full + link down
5014239 sndradm man page needs info on queuing state
5015987 update sync of async sets can drop network writes leaving secondary out of sync
5018806 cmn_err() needed when ref count is maxed out
5038271 diskq failure causes application to hang
5038552 disk queue not getting written when queueing
5041365 SNDR 3.2 Unit tests fail (GroupOrderedWrites)

(from 116468-02)

4914957 lock contention for disk queues limit performance
4930424 enabling sndr with a diskqueue of 1TB or greater should fail
4938202 sndradm can be very slow when enabling more than 1500 RM sets
4942385 Long volume names cause warning messages to be cut off
4942997 sndr: sndradm unknown host:vol printed in ds.log
4943413 cluster failover during reverse sync makes mounted volume unusable
4950370 sndradm -A #threads[sndr-set] fails to report # to /var/opt/SUNWesm/ds.log
4950802 sndr bitmap count does not show that bits are set until sync or reboot
4952178 misleading disable message on timeout
4952176 iokstats broken
4952920 NHAS bitmap api can panic with 8k bitmaps
4957445 r_net_writeN should negative ack if secondary is logging
4967629 rdc_error_str is local, should be global
4970042 BAD TRAP: panic AVS 3.2 patch testing
4974911 sndradm help output missing a space for diskq removal
4977645 sndradm -e fails on 2'nd logical host
4981223 sndr async mode with many sets sharing a disk queue eats up cpu
4892753 flusher get stuck with diskq set to blocking mode and heavy I/O
4993281 Availability Suite 3.2 using sndr causes system hangs
4995602 double dec in _rdc_remote_flush() can access freed mem
4997398 failure removing diskq from multiple resource groups in SunCluster
5004765 writes to RM vols with full diskq causes incoming threads to be block
5000951 _rdc_async_throttle needs to print disk queue full message
5007944 Data replication on middle hop of multihop config fails due to overlapping i/o
5009144 one to many with diskq and memory queue may not queue

(from 116468-01)

4962068 disk queue upgrade results in 'WARNING: disk queue alloc failed(28)'

Patch Installation Instructions:
-----------------------------

Since this patch updates modules that live in the kernel, the system
must be booted in single user mode to apply the patch, and then
rebooted.

Availability Suite is enhanced by this patch to operate effectively in
an SNDR environment using disk queues, where the configured volume
exhibits a high rate of change as a result of a "hot-spot" on one or
more blocks of a replicated volume set. Through the use of a resized
SNDR bitmap volume, performance issues pertinent to the "hot-spot" are
avoided. Details follow.

Availability Suite's Remote Mirror Copy software uses a portion of each
configured bitmap volume to maintain a per-chunk reference counter.
This functionality is required so that multiple revisions of the same
disk block or blocks within a 32KB chunk can be queued up for
replication, an architectural requirement necessary to maintain
write-order consistency between configured SNDR replicas.

With the introduction of disk queues in Availability Suite Version 3.2,
there is now the potential for tens to hundreds of instances of the
same data block or blocks within a 32KB chunk to be queued up for
replication. A typical scenario where this is seen is during a database
reload, or complex update operation, where multiple sections of the
database experience "hot-spots", or places where large numbers of I/O
writes occur to the same disk block or blocks in a relatively short
period of time. The behavior is not limited to databases: any database,
file system, or application I/O can exhibit "hot-spots" of activity in
a configured replicated volume.

Prior to this patch, the per-chunk reference counter was maintained in
a single byte (8 bits), allowing from 1 to 255 instances of the same
block or blocks to be queued up for replication. With memory-based
queues and early usage of disk-based queues, it was presumed that the
255 limit was sufficient, and that reaching this value would only be a
momentary, infrequent occurrence. For certain customers and
installations using Availability Suite Remote Copy Version 3.2, this is
not the case. The behavior causes poor application performance, which
may be viewed as a system hang, and the following message appears in
/var/adm/messages, from tens to tens-of-thousands of times:

    WARNING: SNDR: bitmap reference count maxed out for

While this message is being seen, I/O write operations are delayed, as
long as replication is still active and there is room in the disk queue
for other I/Os needed to maintain write-order consistency. The problem
with these "hot-spot" I/Os being delayed is that database application
writes now occur at network I/O replication rates, not disk I/O rates,
causing the database, file system or application to appear to be hung
or slow. This is further problematic since these "hot-spots" are
assumed to be a "commit write", meaning that all database, file system
or application I/O is serialized and delayed while this single write
I/O waits for the per-chunk reference counter to drop below its maximum
value of 255.

This patch resolves the problem by detecting when the reference counter
can be widened from an 8-bit value to a 32-bit value, raising the
maximum value from 255 to 4,294,967,295, a value which is presently
unreachable with Remote Mirror 3.2 software.

Each time a Remote Mirror replication set transitions from logging mode
to replicating mode, if it has not already done so, it determines
whether the current bitmap volume is large enough to use 32-bit
reference counters rather than 8-bit reference counters. It uses the
larger size if possible, and only when disk queues are configured.

There are several ways to make a volume of this larger size (as
determined by the updated dsbitmap utility) available for detection by
the Remote Mirror software:

  1). The current bitmap volume is already large enough, based on prior
      rounding up or oversizing of the original bitmap volume at its
      initial configuration time.

  2). While in logging mode, the bitmap volume is replaced with a larger
      volume, followed by the use of the SNDR "reconfigure bitmap"
      command as follows:

          sndradm [opts] -R b p [set]       reconfig primary bitmap

  3). While in logging mode, the bitmap volume is resized and then
      replaced with the "reconfigure bitmap" command as follows:

          sndradm [opts] -R b p [set]       reconfig primary bitmap

  4). At enable time, the bitmap volume is sized based on the "disk
      queue" sizing as reported by the dsbitmap utility, which has been
      enhanced to indicate the two sizes of Remote Mirror bitmap volumes
      to configure, one without disk queues and one with disk queues.

Installations unaffected by the bitmap reference count maxed out issue
do not need to increase their current bitmap volume size, even if they
use disk queues.

Once the patch has been applied to the system and the affected SNDR
replicas have had their bitmap volumes resized, the bitmap reference
count maxed out problem will no longer exist for those volumes.

In view of the above changes, be aware that once the maxed reference
count limit is no longer a restriction, the next limit is an SNDR disk
queue full condition. In this case the disk queue operational modes of
blocking or non-blocking are relevant. If blocking mode is selected and
the disk queue becomes full, the same performance issues seen with the
bitmap reference count maxed out condition occur. If non-blocking disk
queues are enabled, the SNDR set will automatically drop into logging
mode once the disk queue full condition is reached.

Special Install Instructions:
------------------------------------

None.

README -- Last modified date: Thursday, December 22, 2005
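
Supplementary sketch: whether an installation is affected by the bitmap
reference count maxed out condition can be judged from how often the
warning quoted in the installation notes appears in the system log. The
following is a minimal sketch, not part of Availability Suite. The
function name count_maxed_out and the optional log-path argument are
illustrative assumptions; only the message text and the
/var/adm/messages location come from this README.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the patch): count occurrences of the
# reference-count warning in a log file.
count_maxed_out() {
    # $1: path to the log file, for example /var/adm/messages
    grep -c 'SNDR: bitmap reference count maxed out' "$1"
}

# Report the count for the given log file, defaulting to the Solaris
# messages file. Nothing is printed if the file is not readable.
LOG=${1:-/var/adm/messages}
if [ -r "$LOG" ]; then
    echo "`count_maxed_out "$LOG"` reference count warnings in $LOG"
fi
```

A count of zero over a representative period of disk queue replication
suggests the current bitmap volume size can be left unchanged.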