             Pong3
   (the still un-named utility)

     Written by Derek Balling
       (dredd@megacity.org)


NOTE: This version is still very much in an alpha state,
and neither I nor my employer accepts any
liability for any damage this suite may cause your
system. 

You have been warned. 


GENERAL NOTES
-------------

Pong3 is a program, based partly on [and cannibalizing portions of
the source code from] Big Brother and SPong (Son of Pong). While
both of these programs were excellent in their own right, they
tended to rely very heavily on listening to network sockets, which
forced overhead in both network and CPU resources if you were
monitoring a large number of machines.

One other feature shared by both was a history of all changes, and the
concept of "outdated" data. (Purple alarms for users of either of these
suites).

So what we have is a system in which a server uses finger to execute a
custom script on the client machine, compares the returned data to
"acceptable norms", and responds accordingly.

The client script is ALSO intelligent enough to be able to use
OTHER arguments to finger so that it will generate output that is 
in a format compatible with the very popular MRTG package. This suite
has been tested with versions 2.4 through 2.5.3 of MRTG and works fine
with all such versions. There is a new version 3.0 of MRTG in development,
but it is currently not in a state where I can confirm that this does or
does not work with it.


INSTALLATION
------------


SERVER INSTALLATION
-------------------

(NOTE: These installation instructions completely ignore the whole concept
of CGI and such. It's assumed if you're a sysadmin you know that much
already.)

(FURTHER NOTE: It has been noted that a number of Linux and other
distributions tend to include incomplete Perl installations, leaving
out important Perl modules and such. The modules in "required_installs"
are the ones which are NOT part of the normal Perl installation. Any
other modules you need can be retrieved from 
ftp://ftp.perl.org/pub/CPAN/modules )

Create a web directory "pongiii" and place all of the Pong3 files in that
directory. Also copy the images directory to pongiii/images.

Copy pong3.conf.dist to pong3.conf and edit it to suit your system.
	
The @notify_addresses line contains a list of all e-mail addresses you wish
to receive updates whenever a service changes to a worse color (green to
yellow or red, yellow to red). If you wish to disable this, use a line:

@notify_addresses = ();

A sample list of multiple addresses might look like the following (the
addresses are single-quoted, so the @ needs no backslash):

@notify_addresses = ('user1@foobar.com', 'user2@nowhere.com');

A host is defined in the conf file with two lines:

$services{'hostname'} = 'ping smtp dns client ssh';
$groups{'hostname'} = 'NOC';

and an optional one:

$odd_http{'hostname'} = '401'; # or some other code

The services line contains a space-delimited list of services which
should be checked on this host.

The groups line contains a space-delimited list of "machine groups"
which this machine belongs to. You must have at least one group defined
for each machine. If you are a single-site, you will probably want to
either put all the machines in the same group, or break them out like
"Routers", "Servers", "Workstations", etc.

The odd_http line contains a code, used on the http service, that pong3
will ALSO accept as valid, in addition to a 200-series code (e.g. if you
are querying a web server which is completely password protected, a 401
would indicate to you that "yes, your web server is up and running").

The @groups array contains a list of all the groups used in the
$groups{} lines above it.

$DISPLAY_HEADER_IMAGE is the filename for the image you want centered at
the top of your pong3display.pl page. This can either be absolute, or
relative to the pong3display.pl CGI script's location.

Once you have edited the pong3.conf file, it is time to set up your server
to execute the server file on startup. This is easy to do with a line in
your rc.local (or wherever) doing

/path/to/pong3server.pl &

If at any time you CHANGE the pong3.conf file, you can do:

# ps -auxw | grep pong3server.pl

and send a HUP signal to its PID to have it reload the configuration file.

e.g.

# kill -HUP 213
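
The two steps above can be rolled into one helper. This is only a sketch:
the function names are made up for illustration, and it assumes the PID is
the second column of your "ps -auxw" output, which is common but not
universal.

```shell
# Print the PIDs of processes whose command line mentions pong3server.pl.
# Reads "ps -auxw"-style output on stdin; column 2 is assumed to be the PID.
pong3_server_pids() {
    # The [p] keeps grep from matching its own command line.
    grep '[p]ong3server\.pl' | awk '{ print $2 }'
}

# Send SIGHUP to every running pong3server.pl so it rereads pong3.conf.
reload_pong3_server() {
    for pid in $(ps -auxw | pong3_server_pids); do
        kill -HUP "$pid"
    done
}
```

After editing pong3.conf, running "reload_pong3_server" as root (or as
whatever user owns the server process) should have the same effect as the
manual kill shown above.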

The server also (as of 0.85) has the ability to send any changes in status
to an external command, defined in pong3.conf as LOG_EXTERNAL_CMD.
Passed as arguments to this command are:

# command <HOSTNAME> <SERVICE> <COLOR> "<SUMMARY>"

You can use this to write the changes to a file, a database, or whatever
system you prefer for recording a history of changes.

This can be turned on or off by setting LOG_CHANGES to 1 or 0.
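
As an illustration, here is a minimal sketch of a script that could serve
as the external command. The script itself, the log_change function and
the /tmp/pong3-history.log path are all assumptions made for this example;
the only thing taken from above is the order of the four arguments.

```shell
#!/bin/sh
# Hypothetical handler for LOG_EXTERNAL_CMD; the log path is an assumption.
PONG3_HISTORY_LOG="${PONG3_HISTORY_LOG:-/tmp/pong3-history.log}"

# Append one tab-separated record per status change.
# Arguments arrive in the order described above:
#   $1 = hostname, $2 = service, $3 = color, $4 = summary
log_change() {
    printf '%s\t%s\t%s\t%s\t%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" "$3" "$4" \
        >> "$PONG3_HISTORY_LOG"
}

# When run as the external command, log the arguments we were given.
if [ "$#" -ge 4 ]; then
    log_change "$1" "$2" "$3" "$4"
fi
```

Each status change then becomes one line in the log, which is easy to
grep through or load into a database later.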


CLIENT INSTALLATION
-------------------

Step 1: Set up the finger daemon

In your /etc/inetd.conf file, there is a line similar to the following:

finger  stream  tcp     nowait  root    /usr/sbin/tcpd  in.fingerd

Replace that line with

finger  stream  tcp nowait root  /usr/sbin/tcpd  /usr/local/bin/pong3client.pl

You will then need to send a SIGHUP to inetd to tell it to reload its .conf
file.

Step 2: Install Perl 5

Install Perl 5. This software has been tested with v5.001 and above and
works with them fine.

Step 3: Install the client

Install pong3client.pl as /usr/local/bin/pong3client.pl. Make sure the mode
of the file is 0555.

Step 4: Configure the client

For a basic install the only lines you need to worry about are the 
lines where $df, $free, $uptime, and $bsddf are defined. The first three
variables should contain the paths to the commands in question. The fourth
($bsddf) should be uncommented (setting the value to 1) if your df gives
output in the BSD-style format.

Since the client replaces your in.fingerd daemon, and by default would
allow fingering of normal users, there are two other options you need to
configure:

$FINGER_ALLOWED = 0

This is a boolean. If set to any non-zero value, failure to find the
selected Pong3 service in the client will allow the script to pass
the argument to the local finger agent for Fingering (and display to 
remote users).

$BOUND_CHECK = 16

This is a security check to ensure that it is nearly impossible to
accidentally overwrite the stack and do bad things. Any argument passed
longer than this value will be substr'ed to this length. You should only
need to change this if (a) you wish to allow fingering of local users
using the FINGER_ALLOWED flag, and (b) you have users with login IDs 
longer than 16 characters. 

Step 5: Test the client

Doing

# /usr/local/bin/pong3client.pl pong3
<or>
# /usr/local/bin/pong3client.pl<cr>
pong3<cr>

should return output similar to the following:
88
4752
0.31
97.4

where 88 is the "percentage used" of the "most full drive",
4752 is the amount reported by vmstat (see the BUGS/TO-DO list
  regarding this figure),
0.31 is the HIGHEST of the 1, 5 and 15 minute load averages,
and 97.4 is the current CPU idle percentage.

If you do NOT get output that resembles this, check to see that your 
paths are correct to the commands. Also try setting/unsetting bsddf if
your percentage comes out wrong.
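
If you are rolling the client out to many machines, a quick sanity check
on the output can save some squinting. The function below is a sketch
only (its name is invented here): it accepts a reply only if it is exactly
four lines long and every line is a bare number, which is what the sample
output above looks like.

```shell
# Succeeds if stdin looks like a pong3 reply:
# exactly four lines, each one a plain integer or decimal number.
check_pong3_output() {
    awk '
        !/^[0-9]+(\.[0-9]+)?$/ { bad = 1 }
        END { exit (bad || NR != 4) ? 1 : 0 }
    '
}
```

For example:

/usr/local/bin/pong3client.pl pong3 | check_pong3_output && echo "client OK"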

If you still have problems... Hey, it's an alpha release! :) Try to figure
it out and if it's a problem in my code, let me know what you had to do to
make it work so that I can update the client.


SPECIAL FEATURES
----------------

Use the Pong3 client as a source of MRTG data.

In addition to the system monitoring capabilities of the Pong3 client, it
is also capable of being used to generate statistics about the client system
by fingering DIFFERENT fake users. 

Example:

# finger df@host.domain.com
[host.domain.com]
86
0
3 days, 22:23
This system

This reports the "percent used" of the fullest drive in a format that MRTG 
can use. You only need to do a "grep -v" on the domain portion (so that the
[host.domain.com] line will get stripped before MRTG sees it and tries to
interpret that as the first line).
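
One way to do the stripping is sketched below. Instead of matching on the
domain name it drops any line that starts with "[", which is a small
variation on the grep described above; the function name is made up for
this example.

```shell
# Drop the "[host.domain.com]" banner that finger prints before the data,
# leaving only the four lines MRTG is meant to read.
strip_finger_banner() {
    grep -v '^\['
}
```

The command you hand to MRTG would then look something like:

finger df@host.domain.com | strip_finger_banner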

The client script recognizes four basic services without any additional
configuration: "df", "free", "idle" and "load". It is important to note 
that because MRTG does not handle floating point values, load is
multiplied by 100 before it is reported to MRTG (otherwise MRTG would simply 
drop the fractional portions and your graphs would be very very boring).
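
To make the scaling concrete, here is a small sketch of the arithmetic.
It is not lifted from the client; in particular, rounding to the nearest
whole number is an assumption, and the client may well truncate instead.

```shell
# Scale a load average by 100 and round to a whole number,
# so that a load of 0.31 is reported to MRTG as 31.
scale_load() {
    awk -v load="$1" 'BEGIN { printf "%.0f\n", load * 100 }'
}
```

When reading the resulting graphs, remember to divide by 100 again: a
plotted value of 125 means a load average of 1.25.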

The system is also capable, with additional configuration, of monitoring:

	o 	Web Server Usage, in hits, total bytes sent and KB/sec
	o	Mail Server Usage, in both messages and bytes
	o	Mail Spool Sizes
	o	FTP Throughput
	o	Name Server Usage
	o	What version of the client is installed 

The Web Server Usage requires the following at the present time:
	o	Apache to be the web server of choice
	o	Changing "$APACHE_CONF" to point to your Apache
		configuration file. The client will use this file
		to find all of your web logs and process them.
	o	Setting WEB_CHECK_INTERVAL to however often you have
		MRTG polling the client, so that the client knows how
		many minutes "back in time" to count.
	o	The GNU textutils package (specifically the "tac"
		command) needs to be installed so that the client
		can quickly look at an "inverted" version of your 
		web logs.

	Fingering "http5min" will yield the total bytes sent over
	the interval specified by WEB_CHECK_INTERVAL.

	Fingering "httpkps" will yield the average bytes per second
	sent over the interval specified by WEB_CHECK_INTERVAL 
	(essentially this is (http5min/(WEB_CHECK_INTERVAL*60)) )
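
	As a worked example of that formula (a sketch only; the function
	name is invented, and the use of integer division is an assumption
	about how the client rounds):

```shell
# $1 = total bytes sent during the interval (what "http5min" reports)
# $2 = WEB_CHECK_INTERVAL, in minutes
http_bytes_per_sec() {
    echo $(( $1 / ($2 * 60) ))
}
```

	With 1500000 bytes sent over a 5 minute interval,
	"http_bytes_per_sec 1500000 5" prints 5000.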

	Fingering "httphits" will yield the number of hits over
	the interval defined by WEB_CHECK_INTERVAL.

The Mail Server Usage requires the following at the present time:
	o	Sendmail to be the mailer of choice
	o	MAIL_LOG_FILE needs to be set to where syslog
		dumps all statistics about mail usage. You should
		confirm ahead of time that you are seeing actual
		data about the message sizes, etc. in this log.
		It is not uncommon for mail data to be logged in
		several different syslog files based on differing
		levels of "severity" being reported to syslog.
	o	MAIL_CHECK_INTERVAL needs to be set to the interval
		with which you use MRTG to poll the client.
	o	The GNU textutils package (specifically "tac") needs
		to be installed.

	Fingering "mailmsgs" will yield the total number of mail messages
	processed by the mail server over the specified interval.

	Fingering "mailbytes" will yield the total bytes of mail
	processed by the mail server over the specified interval.

The Mail Spool Usage requires the following...

	o	MAIL_SPOOL_DIR should be defined as the directory your
		user mail spool resides in.

	Fingering "mailspl" will return as the "input" the maximum value
	of an individual mail spool file, with the "output" value being
	the average usage.
	
The Name Server Usage requires the following...

	o	NAMED_NDC should be set to the 'ndc stats' command
		possibly including the full path to the file.
	o	NAMED_STAT_FILE should be set to the value defined in
		named.(boot|conf) as "statistics-file"
	o	statistics-interval should be set to 0 in your 
		named configuration.

	Fingering "namedreq" will produce the total number of 
	requests made since the counter was last reset. (Not sure
	precisely when that is).

The FTP Throughput Usage requires a little bit more work...

	o	Create a .netrc file on the client machine that resembles
		the following:

machine remote.notclient.domain.com
login remoteuser
password remotepass
macdef init
bin
get /path/to/remote/file/cp32e404.exe /dev/null
put /path/to/local/file/cp32e404.exe /dev/null
bye
    
		Test this out by issuing the command
		"ftp -v remote.notclient.domain.com" 
		and seeing if it retrieves the remote file (and dumps
		it on your local /dev/null) and sends your local file
		(the same file for best results) to the remote /dev/null.
		The proper mode for the .netrc file is 600, if ftp
		complains.

	o	Edit the pong3ftpfetch.sh, replacing the hostname
		in it with the remote hostname you are retrieving 
		stats from. 

	o	Put a cron job in the client's system such that 
		pong3ftpfetch.sh is run as desired, with output
		redirected to /tmp/ftp-output, e.g.:

10,40 * * * * /usr/local/bin/pong3ftpfetch.sh > /tmp/ftp-output

		IMPORTANT NOTE: the file that is OUTPUT needs to be
		readable by the finger daemon when it runs. The easiest
		way to achieve this by all accounts is to run the 
		finger daemon as root (configurable in your /etc/inetd.conf).
		You are free to peruse the included source code for the 
		finger daemon and see that there are no security holes in
		it that are being opened up by being run as root.

		ANOTHER IMPORTANT NOTE: Do the FTP by hand a couple times
		and take note of how long it takes to complete. Double that,
		to account for bad days, and use THAT as your "interval".
		Example: Above we poll the client on the even half hours
		(0 and 30), with the transfer taking about 7 or 8 minutes to 
		complete. We start the FTP about 10 minutes from the 
		poll to ensure that no problems are caused by the variance
		in the clocks (e.g. polling the client while the fetch
		script is overwriting the tmp file the client needs) but
		far enough away from the NEXT poll that we know the fetch will
		be completed before it is requested.

		Unless you are using very small files to test your throughput,
		it is very difficult to achieve "fine-grain" throughput 
		stats. (And small files yield deceptive throughput data 
		anyway)


KNOWN BUGS and "TO-DO"
----------------------

These are just the ones I'm aware of. :)

	o I have not come up with a CLEAN and ACCURATE 
	  cross-platform method of determining "free memory".
	  The *NIX 'vmstat' command would APPEAR to be it, but
	  the output from it doesn't seem to reflect reality
	  very well, and in some cases seemed to simply 
	  increment continually until it hit some rollover point
	  at which point it wrapped back to zero and started over.
	  Twisted? Yes. Useful? No.

	o If a mail message is being processed at the exact time that
	  the client is being polled for mail information, it is possible,
	  due to only one syslog line having been outputted, that the 
	  message will be skipped by both the current and the subsequent
	  client poll, because of the method the client uses for "stepping
	  back" through the logs.

	o Dependencies - The concept that if "router X" goes down, then
	  hosts Y and Z, behind that router, get flagged a different color
	  because there's no way they can be reached ANYway.


HISTORY
-------

Release Version 1.00
--------------------
# Changes from Beta Version 0.91
# 
# Added code to monitor CPU Idle state (retrieved via the top command)
# I don't have any Sun machines to test on any more so you may need to 
# modify the "default" commands I have included in the config to work 
# properly for you.
#
# Added default values for FreeBSD (thanks to Jake Dias for the work on 
# that front).
#
# Added a subject to the email sent to the notification list which has
# the host and color in the subject. (Thanks to Michael Riedel for the
# suggestion)
#

Beta Version 0.91
-----------------
# Changes from Beta Version 0.90
#
# Optimized some code in the pong3server.pl script and eliminated a bug which 
# could cause the number of red items to get screwed up and cause false 
# alarms. Thanks to Josh McMinn (jmcminn@speedchoice.com) for the assist.
#
# Josh also created a "scaled down" pong3display.pl script called 
# pong3simple_disp.pl which basically creates a summary page. The summary 
# contains all the "header" information that pong3display.pl has (color 
# summary, data collection dates, etc.), but instead of the "christmas tree"
# display, it contains EITHER the string "All nodes are green", or
# a list of ONLY what services/hosts are down, along with the usual links
# to the pong3detail.pl script. That script also takes all the standard
# 'exclude' and 'only' parameters that pong3display.pl accepts.

Beta Version 0.90
-----------------
# Changes from Alpha 0.89
#
# Fixed a bug in pong3server.pl that would report green on drive space
# usage and CPU load if the client returned no data. (Hint: Null *IS*
# less than the threshold value, but is NOT a valid response *grin*)
#
# Fixed a bug in pong3display.pl that would cause problems with any
# hostnames containing a hyphen in the hostname.
#
# Allowed standard TCP-based requests to show banners (SMTP, POP-3, etc.)
# as part of the pong3detail.pl script. 
#
# Fixed a bug where if a finger process (client or clientv service) hung,
# pong3server.pl would just silently sit there. It now times out and reports
# status of red after a 60 second timeout. Special thanks to Tim Sailer 
# (tps@major.pita.org) for the insight on the quick-and-easy way out on that
# one.
#
# Fixed a bug in the proxy server code where it would always report green
# status, despite the fact that your proxy server had fallen off the face
# of the earth. In a word - oops.
#
# Fixed (again) the bug that plagued hosts with hyphens in their names. It is
# now really fixed, I can assure you through personal experience since I now
# have hosts with hyphens in my network. :)
#
# Added a feature so that the "Data Collected" line in pong3display.pl will
# feature a red background if the data collected is older than 15 minutes old
# (an indication that the pong3server.pl task may have died and no longer be
# monitoring your services)

Alpha Version 0.89
------------------
# Changes from 0.88
#
# Fixed a bug in email notification where it would never notify anyone.
#
# Made the image name in pong3display.pl (top center) a variable in the
# pong3.conf so that it is more dynamic (and doesn't require people to
# manually edit each version of the display script for upgrades).
#
# Set the column headers in pong3display.pl so that they were regenerated
# each iteration beneath the group. That way if you had a very large
# multi-screen display session, you wouldn't have to scroll back up so 
# far to find out what service was offcolor.
#
# Created a summary at the top of the pong3display.pl page detailing the 
# number of host-services in each color state. Useful if you know 
# "telnet on machine X is red" and are waiting for it to come back up,
# since that would otherwise keep you from noticing that "df on machine Y
# is yellow" because the background color shows only the worst color.
#
# Added the "mailtotal" function, similar to mailspl, which reports total
# mail spool usage, and the largest mail spool file. (Thanks to
# Michael Riedel for this suggestion)
#
# Added the "proxy" service to the server. This will check a proxy server
# to see if it is functional.
# this also requires the line:
# $proxy_port{'hostname.com'} = 'port_num';
# to be entered in pong3.conf. Like http, this also will accept the 
# $odd_http{$host} to look for a response code of other than 200. Currently,
# this test generates a request to www.yahoo.com for its test. I figure
# most people with proxies are using them to cache things of this nature, 
# and yahoo is almost sure to be in your proxy's cache, eliminating any
# excess traffic. :)
#
# This change requires the installation of the libwww package included in
# required_installs. This is especially important, because I'll be rewriting 
# the http service to use that package as well. I am hoping to be able to
# implement an "https" service shortly, (e.g. as soon as I have one on-site
# to test with) and it will require that package as well.

Alpha Version 0.88
------------------
# Changes from 0.87
#
# Finalized support for IMAP service (it was already in the server, the
# configuration file just didn't know about it). Also added FTP service
# checking. (Thanks to Chris Liljenstolpe for pointing out that it was
# missing!)  
#
# Fixed problem with the finger routine where "finger @hostname" would
# return pong3 data instead of falling through to finger routine.
#
# Fixed bug in pong3display.pl where dead/excluded/non-configured
# items would appear in graph or otherwise affect background color.
#
# Made sure that the directory wasn't hard coded anywhere except in
# pong3.conf, the way it should be. 
#
# Made minor changes to documentation, as requested by alpha users.
#
# Added the ability to check for the Telnet service on a remote machine.
# (by request).
#

Alpha Version 0.87
------------------
# Changes from 0.86
#
# Fixed the "df" on linux to ignore iso9660 filesystems (which would
# always report 100% full), thanks to Daniel Lafraia for pointing it
# out. 
#
# Fixed a problem in the server where it wasn't waiting the sleep
# time because I was using two different variables. (Duh!) Thanks again
# to Daniel.
#
# Removed the need for a library proprietary to my employer in order for
# pong3detail.pl to work. Now uses the CGI package that is part of the 
# standard Perl5 distribution (CGI.pm)
#
# Pong3display.pl is now able to be fed a list of exclusions and 
# "only" params, e.g.
#   http://hostname/pongiii/pong3display.pl?exclude=Routers
#  would exclude routers, while
#   http://hostname/pongiii/pong3display.pl?exclude=Routers+Servers
#  would exclude routers and servers. Conversely
#   http://hostname/pongiii/pong3display.pl?only=Routers
#  would ONLY display the routers omitted from the previous graph.

Alpha Version 0.86
------------------
# Changes from 0.85

# Removed the necessity for butt-ugly finger daemon installation. The
# Pong3 client now acts as the finger daemon itself, passing non-pong3
# fingers to the local finger routines if configured to do so.

# Fixed a few more bugs in date-checking.

Alpha Version 0.85
------------------
# Changes from 0.84

# Added the "namedreq" function. This will parse a named.stats file, and
# return the total of all inquiries received. This is a "running" total, and
# thus should NOT be graphed using "gauge" or "absolute".
#
# Added the "mailspl" function. This will report (in) the maximum size of a
# user's mail spool file, and (out) the average size of a mailspool file.

# Added the LOG_CHANGES option to the server, which will pass any status
# changes to an external script for processing.

# Added the "odd_http" config file option to the server which allows it to
# accept another response code in addition to any 200-series code the 
# Web Server may generate.


Alpha Version 0.84
------------------
# Changes from 0.83
#
# Added 'httphits' keyword which allows the script to generate an MRTG file
# containing (in) good filled requests and (out) requests that failed with 400
# or 500 series errors.
#
# Added 'clientv' which allows someone to hit the client and have it
# report its version. Useful for figuring out what versions of the software
# you have running on different clients as you upgrade certain machines to
# take advantage of new features, but DON'T upgrade others as they don't
# need new features. :) 
#
# Fixed a bug in the mail and web server log parsing which crapped out during
# the reverse searches and wouldn't stop til it reached December (oops)

Alpha Version 0.83
------------------
First Public Alpha - Released 1 January 1998

