CON5repeat - stress disk controllers and device drivers
CON5repeat -nofsck [ options ] source_dir target_dir iterations | duration [ instances ]
CON5repeat is a test program designed to stress disk controllers and their device drivers, and to test that the system functions correctly under heavy load. The CON5repeat test runs many concurrent instances of a basic test that copies and compares directories, and reports trouble if any of the copies fails or does not compare as expected.
CON5repeat can also function as an NFS stress test if target_dir resides on an NFS mount.
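The basic copy-and-compare step can be sketched as follows. This is only an illustration under stated assumptions: CON5repeat itself uses cpio and diff, whereas this sketch substitutes Python's shutil.copytree and filecmp, and the names trees_match, run_instance, and the instanceNNN subdirectory are hypothetical.

```python
import filecmp
import os
import shutil


def trees_match(left, right):
    """Return True if both trees hold the same entries with identical contents."""
    # ignore=[] disables dircmp's default ignore list (RCS, CVS, tags, ...).
    listing = filecmp.dircmp(left, right, ignore=[])
    if listing.left_only or listing.right_only or listing.common_funny:
        return False  # an entry is missing, extra, or of a different type
    # Compare file contents byte by byte (shallow=False), not just stat data.
    _, mismatches, errors = filecmp.cmpfiles(
        left, right, listing.common_files, shallow=False)
    if mismatches or errors:
        return False
    return all(
        trees_match(os.path.join(left, name), os.path.join(right, name))
        for name in listing.common_dirs)


def run_instance(source_dir, target_dir, index):
    """Copy source_dir into its own subdirectory of target_dir, then compare."""
    # Hypothetical naming scheme; the real subdirectory names are not documented.
    copy_dir = os.path.join(target_dir, "instance%03d" % index)
    shutil.copytree(source_dir, copy_dir)  # stand-in for the cpio copy
    return trees_match(source_dir, copy_dir)  # stand-in for the diff compare
```

A False result corresponds to the "does not compare as expected" case. Adding or removing a file in the copy between the two steps, as the -fault-file-add and -fault-file-drop options do, makes trees_match return False in this sketch.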
- -n
- No commit. Perform the preliminary calculations and check the preconditions, but do not do the copying. If there is an error in any of the arguments, then exit(1). If there are no errors, then print the number of instances that would actually be started, and then exit(0).
- -v
- Verbose. Print various extra trace messages to standard error. This is used primarily for debugging. The nature of the messages produced is subject to change and is not part of the supported CON5repeat interface.
- -nofsck
- Do not run fsck. Normally, CON5repeat runs fsck -n on the file systems that contain the source directory, the target directory, and root, after every iteration. The -nofsck option prevents CON5repeat from running fsck.
Note: Always specify -nofsck. If fsck is not disabled, CON5repeat aborts after one iteration.
- -nofsck-root
- Do not run fsck on the root file system. This is useful when you want to run fsck on the source and destination directories, but you know that there is other activity on the system, such as other tests. Results from fsck are not useful if the file system is not in a quiescent state.
- -disk-reserve kbytes
- Reserve disk space.
Note: If you reserve disk space, CON5repeat behaves as if the amount of available disk space were kbytes less than it really is, and the number of instances that it calculates the available disk space can support is reduced accordingly.
- -processes-per-instance
- Override number of processes per instance. The CON5repeat test autoconfigures the number of instances based on available space and available processes. It computes a budget based on its knowledge of how many processes each instance uses. The -processes-per-instance option overrides the number of processes per instance, and so, indirectly changes the number of instances that are configured.
All options that start with -fault- are part of a mechanism for injecting artificial faults into CON5repeat. These options are used to run quick, controlled tests of how the higher layers behave in the face of failures. Such failures should be rare in normal operation, so if CON5repeat is not tested using fault injection, many code paths might not be exercised adequately.
- -fault-max-instances instances
- Inject an artificial failure of the fork() system call, with errno set to EAGAIN, for all instances above the specified number. This option is used to test behavior in the face of resource exhaustion, without really having to run tests that are large enough to exhaust system resources.
- -fault-cpio
- Cause a thread, chosen at random, to pretend that cpio failed, even if there was no error.
- -fault-diff
- Cause a thread, chosen at random, to pretend that diff failed, even if there was no error and there were no differences.
- -fault-file-add
- For a thread, chosen at random, after cpio runs but before diff runs, add an extra file to the target directory.
- -fault-file-drop
- For a thread, chosen at random, after cpio runs but before diff runs, remove an arbitrary file from the target directory.
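The idea behind -fault-max-instances can be illustrated with a minimal sketch. Nothing here is taken from the CON5repeat implementation: start_instance and start_all are hypothetical names, no process is actually forked, and the choice to stop at the first EAGAIN and carry on with the instances already started is an assumption about how the higher layers might react.

```python
import errno
import os


def start_instance(index, fault_max_instances=None):
    """Pretend to start instance `index` (1-based); no process is created."""
    if fault_max_instances is not None and index > fault_max_instances:
        # Injected fault: report the error fork() gives on resource exhaustion.
        raise OSError(errno.EAGAIN, os.strerror(errno.EAGAIN))
    return index


def start_all(requested, fault_max_instances=None):
    """Start up to `requested` instances; stop at the first EAGAIN (assumed policy)."""
    started = []
    for index in range(1, requested + 1):
        try:
            started.append(start_instance(index, fault_max_instances))
        except OSError as exc:
            if exc.errno != errno.EAGAIN:
                raise
            break  # treat EAGAIN as "no more instances can be started"
    return started
```

With start_all(12, fault_max_instances=5), the sixth start fails with EAGAIN and only five instances are recorded, so the exhaustion path is exercised without exhausting any real resource.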
- source_dir
- The source_dir is the path name of the directory from which CON5repeat reads. The CON5repeat test does not modify anything in this directory, so it can reside in a file system that is mounted read-only.
A good directory to use for source_dir is a copy of /kernel. Although CON5repeat does not modify /kernel explicitly, reading /kernel directly still causes the root file system to be modified, because every file that is read has its access time updated. If an error occurs while a block of inodes is being modified, the root file system can be lost.
The source_dir directory should reside on the disk connected to the controller being tested.
- target_dir
- Target copies are written to subdirectories of target_dir. Do not put anything you want to keep in this directory. Data in this directory might be lost or overwritten.
The target_dir directory should reside on the disk connected to the controller that is being tested. If CON5repeat is being used as a network stress test, the target_dir directory should be NFS mounted from a remote system.
- iterations
- The number of times to repeat the whole process, sequentially. The length of time an iteration lasts depends on many characteristics of the configuration of the system and its load. You can usually specify 100 iterations, and then tell CON5repeat to terminate later.
- duration
- Limit the test to the specified duration. The format of duration is hh:mm:ss. This duration might be exceeded to allow completion of the last iteration started.
- instances
- If instances is specified, then CON5repeat does not exceed instances number of concurrent instances. The CON5repeat test calculates the largest number of instances that the available disk space allows. If instances is specified on the command line, then CON5repeat uses the smaller of the two numbers. If instances is not specified, then CON5repeat uses the calculated value.
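The way the instances operand interacts with -disk-reserve and -processes-per-instance can be sketched as below. The exact formula CON5repeat uses is not documented here, so this is an assumed model: instance_count and all of its parameters are hypothetical, and the per-instance disk and process costs are treated as simple constants.

```python
def instance_count(free_kbytes, kbytes_per_instance, free_processes,
                   processes_per_instance, disk_reserve_kbytes=0,
                   requested=None):
    """Return the number of concurrent instances under an assumed budget model."""
    # -disk-reserve: act as if this much less disk space were available.
    usable_kbytes = max(free_kbytes - disk_reserve_kbytes, 0)
    by_disk = usable_kbytes // kbytes_per_instance
    # -processes-per-instance: changes the process budget per instance.
    by_processes = free_processes // processes_per_instance
    calculated = min(by_disk, by_processes)
    # The instances operand can only lower the calculated value.
    if requested is not None:
        return min(requested, calculated)
    return calculated
```

For example, with 1,000,000 KB free, 50,000 KB per copy, and 200 free process slots at 4 processes per instance, the sketch yields 20 instances; reserving 400,000 KB lowers that to 12, and requesting 8 instances lowers it to 8.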
The test first prints summary information about the duration, the directories to be used, and the number of concurrent instances. It then prints results, line by line, showing the progress of the test.
Each progress line has four fields. The first field is the time in seconds since the UNIX epoch, and the second field is the same time in the format YYYYMMDDhhmmss. The third field is the time elapsed since the test started, and the last field is a text entry that describes the status of the test.
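A script that post-processes the output could split a progress line into these four fields roughly as follows. This is a sketch, not part of CON5repeat: ProgressLine and parse_progress are hypothetical names, and the parsing assumes the exact layout of the sample output on this page, with the elapsed time followed by a colon and a space.

```python
from datetime import datetime
from typing import NamedTuple


class ProgressLine(NamedTuple):
    epoch: int            # field 1: seconds since the UNIX epoch
    stamp: datetime       # field 2: YYYYMMDDhhmmss, parsed without a time zone
    elapsed_seconds: int  # field 3: h:mm:ss since the start, in seconds
    status: str           # field 4: free-form status text


def parse_progress(line):
    """Split one progress line into its four fields."""
    epoch, stamp, rest = line.split(None, 2)
    # The elapsed time ends at the first ": "; the status may contain colons.
    elapsed, status = rest.split(": ", 1)
    hours, minutes, seconds = (int(part) for part in elapsed.split(":"))
    return ProgressLine(
        epoch=int(epoch),
        stamp=datetime.strptime(stamp, "%Y%m%d%H%M%S"),
        elapsed_seconds=hours * 3600 + minutes * 60 + seconds,
        status=status.strip())
```

Applied to the line "950099636 20000209123356 0:17:20: Iteration 001 complete.", the elapsed field parses to 1040 seconds, which equals the difference between this line's first field and the first field of the Start line (950098596).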
About 24 hours
This test is designed to use most of the resources of the machine, so it does not run well with other tests.
Run CON5repeat with the -nofsck option to avoid problems running fsck after each iteration.
Create a source directory. It is safer to use a copy of /kernel than /kernel itself, because of the possibility of file system corruption. For example:
# cp -r /kernel /tmp/kernelcopy
If CON5repeat fails, it creates a file called CON5.failed. You must remove this file for CON5repeat to start normally.
This example runs CON5repeat using /tmp/CON5src as the source directory and /tmp/CON5dest as the target directory, with a time limit of 1 hour. Because each iteration of the test takes 17 to 18 minutes in this case, and the last iteration started is allowed to complete, the test actually lasts 1 hour, 11 minutes.
# ./CON5repeat -nofsck /tmp/CON5src /tmp/CON5dest 1:00:00
Source directory: /tmp/CON5src
Destination directory: /tmp/CON5dest
Time limit : 1:00:00 (3600 sec)
Concurrent instances: 12
950098596 20000209121636 0:00:00: Start.
950099636 20000209123356 0:17:20: Iteration 001 complete.
950100754 20000209125234 0:35:58: Iteration 002 complete.
950101811 20000209131011 0:53:35: Iteration 003 complete.
950102866 20000209132746 1:11:10: Iteration 004 complete.
950102866 20000209132746 1:11:10: !PASS: Done.
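The overrun in the example above can be reproduced with a short calculation. The stopping rule used here is an assumption inferred from the description of the duration operand: a new iteration starts only while the elapsed time is still below the limit, and an iteration that has started always runs to completion. The function names are hypothetical, and the iteration lengths are the differences between consecutive timestamps in the sample output.

```python
def parse_duration(text):
    """Convert a duration in hh:mm:ss form to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_elapsed(total_seconds):
    """Format seconds as h:mm:ss, matching the third progress field."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "%d:%02d:%02d" % (hours, minutes, seconds)


def completion_times(limit_seconds, iteration_lengths):
    """Return the elapsed time at the end of each iteration that gets to run."""
    elapsed = 0
    finished = []
    for length in iteration_lengths:
        if elapsed >= limit_seconds:
            break  # limit reached: do not start another iteration
        elapsed += length  # a started iteration always completes
        finished.append(elapsed)
    return finished


limit = parse_duration("1:00:00")
# Iteration lengths in seconds, derived from the sample output above.
times = completion_times(limit, [1040, 1118, 1057, 1055])
print([format_elapsed(t) for t in times])
```

After the third iteration the elapsed time is 0:53:35, still under the limit, so a fourth iteration starts and finishes at 1:11:10. No fifth iteration would start, because the limit has been passed by then.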
cpio(1), diff(1), fsck(1M)
Copyright 2005 Sun Microsystems, Inc. All rights reserved.