  ------------------------------------------------------------------------

sbox: Put CGI Scripts in a Box

Abstract

sbox is a CGI wrapper script that allows Web site hosting services to safely
grant CGI authoring privileges to untrusted clients. In addition to changing
the process privileges of client scripts to match their owners, it goes
beyond other wrappers by placing configurable ceilings on script resource
usage, avoiding unintentional (as well as intentional) denial of service
attacks. It also optionally allows the Webmaster to place clients' CGI
scripts in a chroot'ed shell restricted to each author's home directory.

sbox is compatible with all Web servers running under BSD-derived flavors of
Unix. You can use and redistribute it freely.

The current release is 0.98. Download it from the Web at
http://stein.cshl.org/~lstein/sbox/sbox.tar.gz.

Introduction

Poorly-written CGI scripts are the single major source of server security
holes on the World Wide Web. Every CGI script should be scrutinized and
extensively tested before installing it on a server, and subject to periodic
review thereafter.

For Web hosting services, however, this advice is impractical. Hosting
services must sponsor multiple Web authors of different levels of competence
and reliability. Web authors do not trust each other, and the Web hosting
service does not trust the authors. In such a situation, CGI scripts are
even more problematic than usual. Because all CGI scripts run under the Web
server's user ID, one author's scripts can interfere with another's. For
example a malicious author could create a script that deletes files created
by another author's script, or even cause another author's script to crash
by sending it a kill signal. A poorly written script that contains a
security hole can compromise the entire site's security by, for example,
transmitting the contents of the system password file to a malicious remote
user. The same problems are faced by large academic sites which provide Web
pages for students.

For most Web hosting services it would be impossible to subject each and
every author's CGI scripts to code review. Nor is it practical to cut off
CGI scripting privileges entirely. In the competitive world of ISPs,
customers will just move elsewhere.

The most popular solution to this problem is the use of "wrapper" scripts.
In this system, untrusted authors' CGI scripts are never invoked directly.
Instead a small wrapper script is called on to execute the author's script,
the target. The wrapper is SUID to root. When the wrapper runs, it subjects
the target to certain safety checks (for example, checking that the script
is not world-writable). The wrapper then changes its user ID to match the
owner of the target and executes it. The result is that the author's script
is executed with his own identity and privileges, preventing it from
interfering with other authors' scripts. The system also leads to increased
accountability. Any files that a misbehaving script creates or modifies
will bear the fingerprints of its creator. Without a wrapper, it can be
impossible to determine which author's script is causing problems.
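
The identity switch at the heart of a wrapper can be sketched in C as
follows. This is a minimal illustration, not code taken from sbox or any
other wrapper: the function name become_owner is hypothetical, and a real
wrapper would also have to clear the supplementary group list before
calling it. The group is changed before the user, because a process that
has already given up root can no longer change its group.

```c
#include <sys/types.h>
#include <unistd.h>

/* Switch the calling process to the given user and group.
 * Returns 0 on success, -1 on failure. */
int become_owner(uid_t uid, gid_t gid)
{
    /* Group first: after setuid() the process is no longer privileged
     * enough to change its group ID. */
    if (setgid(gid) != 0)
        return -1;
    if (setuid(uid) != 0)
        return -1;

    /* Never trust the return codes alone; confirm the new identity. */
    if (getuid() != uid || geteuid() != uid ||
        getgid() != gid || getegid() != gid)
        return -1;
    return 0;
}
```

A wrapper that receives -1 from such a function should refuse to run the
target at all rather than fall back to its original identity.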

The limitations of wrapper scripts are three-fold:

  1. Wrappers provide little protection against attacks that involve reading
     confidential information on the site, for example sensitive system
     files or protected documents.
  2. Wrappers expose the author to increased risk from buggy scripts. By
     running the author's script with his own permissions, the wrapper
     grants it the ability to read, write or delete any file in the author's
     home directory.
  3. There is no protection against denial-of-service attacks. A buggy
     script can go into an endless loop, write a huge file into /usr/tmp, or
     allocate an array as large as virtual memory, adversely affecting
     system responsiveness.

A better solution is to box authors' CGI scripts. In this solution, the CGI
script is executed in a restricted environment in which its access to the
file system and to other system resources is limited. This is what sbox
(Secure Box) accomplishes. When run, it does several things:

  1. It checks the environment for sanity. For example, the script must be
     run by the Web user and group, and not by anyone else.
  2. It checks the target script for vulnerabilities, such as being world
     writable or being located in a world writable directory.
  3. It performs a chroot to a directory that contains both the script and
     the author's HTML files, sealing the script off from the rest of the
     system.
  4. It changes its user ID and/or group ID to that of the target script.
  5. It sets ceilings on the target script's CPU, disk, memory and process
     usage.
  6. It lowers the priority of its process.
  7. It cleanses the environment so that only variables which are part of
     the CGI protocol are available to the script.
  8. It invokes the target script in this restricted context.

sbox is highly configurable. It can be configured to chroot without changing
its user ID, to change its user ID without performing the chroot, to
change its group ID without changing its user ID, to establish resource
ceilings without doing anything else, or any other combination that suits
you.

System Requirements

sbox is designed to run with any Unix-based Web server. However version 0.90
has only been tested with Apache 1.2 and Linux 2.0.30 (more testing is
pending). The package should compile correctly on any standard Unix system;
however the resource limits use the BSD-specific setrlimit() and
setpriority() calls. If you do not know whether your system supports these
calls, check for the existence of the file /usr/include/sys/resource.h.
If this file does not exist, then chances are slim that you can use the
resource limits. You can run sbox without the limits by setting the
preprocessor define SET_LIMITS to FALSE (see below).
  ------------------------------------------------------------------------

Installation

After unpacking the package, you should have the following files:

Makefile
README.html (this file)
README.txt  (this file as text)
sbox.h
sbox.c
env.c

You will first examine and edit the Makefile, then change sbox.h to suit
your site configuration and preferences. It is suggested that you keep
copies of the unaltered files for future reference.

Adjusting the Makefile

Using your favorite text editor, examine and change the value of the
INSTALL_DIRECTORY variable. This is the location in which sbox will be
installed, and should correspond to your site-wide CGI directory.

You may also need to fiddle with the options for the install program. The
default is to make sbox owned by user "root" and group "bin", and installed
with permissions -rws--x--x. This configuration is SUID to root, necessary
in order for the chroot and user ID changing functions to work.

If you wish to adjust the C compiler and its flags, change the CC and CFLAGS
variables as needed.

Adjusting sbox.h

This is the fun part. sbox.h contains several dozen flags that affect the
script's features. These flags are implemented as compile-time defines
rather than as run-time configuration variables for security reasons. There
is less chance that the behavior of sbox can be maliciously altered if it
has no dependencies on external configuration files.

You should review sbox.h with a text editor and change the settings as
needed.

General Settings

These variables correspond to general sbox settings such as logging and
environment consistency checking.

WEB_USER (default "nobody")
     This defines the name of the user that the Web server runs under,
     "nobody" by default. If your Web server uses a different user ID, you
     must change this define to match.

WEB_GROUP (default "nobody")
     This defines the name of the group that the Web server runs under,
     "nobody" by default. If your Web server uses a different group ID, you
     must change this define to match.

UID_MIN, GID_MIN (defaults 100,100)
     These define the lowest UID and GID that the script will run a target
     CGI script as. On most systems, low-numbered user and group IDs
     correspond to users with special privileges. Change these values to be
     the lowest valid unprivileged user and group ID. Under no circumstances
     will sbox run a target script as root (UID 0).

SAFE_PATH (default "/bin:/usr/bin:/usr/local/bin")
     This defines the search path that will be passed to the author's CGI
     scripts, overriding whatever was there before.
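
As an illustration of how a safe search path and a cleansed environment
fit together, here is a minimal sketch. It is not taken from sbox.c: the
function names are hypothetical, and the list of variable names is the
standard CGI/1.1 set plus the HTTP_* request headers, which may differ
from the list sbox actually uses.

```c
#include <stddef.h>
#include <string.h>

#define SAFE_PATH "/bin:/usr/bin:/usr/local/bin"

/* Standard CGI/1.1 variable names (an assumption about what to keep). */
static const char *const cgi_names[] = {
    "AUTH_TYPE", "CONTENT_LENGTH", "CONTENT_TYPE", "GATEWAY_INTERFACE",
    "PATH_INFO", "PATH_TRANSLATED", "QUERY_STRING", "REMOTE_ADDR",
    "REMOTE_HOST", "REMOTE_IDENT", "REMOTE_USER", "REQUEST_METHOD",
    "SCRIPT_NAME", "SERVER_NAME", "SERVER_PORT", "SERVER_PROTOCOL",
    "SERVER_SOFTWARE"
};

/* Return 1 if entry ("NAME=value") names a CGI variable, else 0. */
static int is_cgi_variable(const char *entry)
{
    const char *eq = strchr(entry, '=');
    size_t len, i;

    if (eq == NULL)
        return 0;
    len = (size_t)(eq - entry);
    if (len > 5 && strncmp(entry, "HTTP_", 5) == 0)
        return 1;                       /* request header */
    for (i = 0; i < sizeof cgi_names / sizeof cgi_names[0]; i++)
        if (strlen(cgi_names[i]) == len &&
            strncmp(entry, cgi_names[i], len) == 0)
            return 1;
    return 0;
}

/* Fill out[] with the safe PATH followed by the CGI entries of env[],
 * then a terminating NULL. out[] has room for max pointers.
 * Returns the number of entries stored, not counting the NULL. */
size_t clean_environment(char *const env[], const char *out[], size_t max)
{
    size_t n = 0, i;

    if (max < 2) {
        if (max == 1)
            out[0] = NULL;
        return 0;
    }
    out[n++] = "PATH=" SAFE_PATH;       /* replaces any inherited PATH */
    for (i = 0; env[i] != NULL && n < max - 1; i++)
        if (is_cgi_variable(env[i]))
            out[n++] = env[i];
    out[n] = NULL;
    return n;
}
```

The resulting array is in the form expected by execve(), so a wrapper
could hand it straight to the target script.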

Logging Settings

sbox can be set to log all its actions, including both failures and
successful launches of author's scripts. Log entries are time stamped and
labeled with the numeric IDs of the user and group that the target script
was launched under.

LOG_FILE (default none)
     This specifies a file to which sbox will log its successes and
     failures. Set this to the full path name of the file to log to. An
     empty string ("") will make sbox log to standard error, which will
     cause its log messages to be directed to the ordinary server error log.
     Leaving LOG_FILE undefined will cause sbox not to log any messages.

ECHO_FAILURES (default TRUE)
     If this define is set to a true value, any fatal errors encountered
     during sbox's execution will be turned into a properly-formatted HTML
     message that is displayed for the remote user's benefit. Otherwise, the
     standard "An Internal Error occurred" message is displayed.
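
To give a feel for what a time-stamped log entry involves, here is a small
sketch. The function name log_entry and the exact line layout are
assumptions made for this example; sbox's real log format may differ.

```c
#include <stdio.h>
#include <sys/types.h>
#include <time.h>

/* Append one time-stamped entry, labeled with the numeric user and
 * group IDs, to an already opened log stream.
 * Returns 0 on success, -1 on failure. */
int log_entry(FILE *log, uid_t uid, gid_t gid, const char *message)
{
    char stamp[32];
    time_t now = time(NULL);
    struct tm *tm = localtime(&now);

    if (tm == NULL ||
        strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", tm) == 0)
        return -1;
    if (fprintf(log, "[%s] uid=%lu gid=%lu %s\n", stamp,
                (unsigned long)uid, (unsigned long)gid, message) < 0)
        return -1;
    /* Flush at once so the entry survives even if the target crashes. */
    return fflush(log) == 0 ? 0 : -1;
}
```

Passing stderr as the stream gives the behavior described for an empty
LOG_FILE, where messages end up in the server error log.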

Chroot Settings

These variables control sbox's chroot functionality. The path names are
relative to the document root. In the case of virtual hosts, this will be
whatever is specified by the DocumentRoot directive in the server's
configuration file. In the case of user-supported directories, it will be
the user's public_html directory.

DO_CHROOT (default TRUE)
     If set to a true value, sbox will perform a chroot to a restricted
     directory prior to executing the CGI script. Otherwise no chroot will
     be performed.

ROOT (default "..")
     This tells sbox where to chroot to relative to the document root. This
     directory should ordinarily be a level or two above the document tree
     so that the script can get access to the author's HTML documents for
     processing.

CGI_BIN (default "../cgi-bin")
     This define tells sbox where to look for the author's scripts
     directory, relative to his site's document tree. This directory should
     be contained within the directory specified by ROOT. For best security,
     you should specify a directory that is outside the document tree. The
     default is a directory named "cgi-bin" located at the same level as the
     document root.
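
The chroot step itself is short, but the order of the calls matters. The
sketch below shows one conventional sequence; the function name enter_jail
is hypothetical and the code is not copied from sbox. It must run while
the process is still root, that is, before any change of user ID.

```c
#include <unistd.h>

/* Confine the calling process to new_root.
 * Returns 0 on success, -1 on failure. */
int enter_jail(const char *new_root)
{
    /* Move inside the directory first, so that the working directory
     * can never be left pointing outside the new root. */
    if (chdir(new_root) != 0)
        return -1;
    if (chroot(".") != 0)
        return -1;
    /* Make the working directory the top of the new file system. */
    if (chdir("/") != 0)
        return -1;
    return 0;
}
```

With the defaults above, new_root would be the directory one level above
the document root, for example /home/fred/pub for a document root of
/home/fred/pub/html.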

SUID/SGID Settings

DO_SUID, DO_SGID (defaults TRUE, TRUE)
     These defines control whether the script will perform an SUID and/or an
     SGID to the user and group of the target CGI script. From the author's
     point of view it's safer to perform an SGID than an SUID, and usually
     is more than adequate. If no SUID or SGID is performed, the author's
     script will be run with the Web server's privileges.

SID_MODE (default DIRECTORY)
     This define controls whether sbox should use the ownership of the
     target script or the directory containing the target script to
     determine whose user ID and/or group ID to run under. Use directory
     mode if several users have authoring privileges for a single virtual
     host.

Resource Limitation Settings

SET_LIMITS (default TRUE)
     If set to a true value, sbox will set resource usage ceilings before
     running the target CGI script. You may need to set this to FALSE if you
     are using a system that does not implement the setpriority() and/or
     setrlimit() calls.

PRIORITY (default 10)
     This controls the priority with which target scripts are run. Values
     can range from -20 to 20. Higher numbers have less priority.

LIMIT_CPU_HARD, LIMIT_CPU_SOFT, LIMIT_FSIZE_HARD, LIMIT_FSIZE_SOFT...
     These and similar defines control the resource ceilings. The
     definitions set caps on CPU usage, the number of processes the script
     can spawn, the amount of memory it can use, the size of the largest
     file it can create, and other attributes. For each resource there are
     two caps, one hard, the other soft. Soft limits can be raised, up to
     the hard limit, by any program that desires to do so by making the
     appropriate calls to setrlimit(). Hard limits are ceilings that an
     unprivileged process cannot raise once they are established (only a
     privileged process can raise them). The hard limits should be
     rather liberal, the soft limits more strict. See the setrlimit() man
     page for details on each of these resources.
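
The two system calls behind these settings can be sketched as follows.
The helper names lower_priority and set_ceiling are hypothetical, and the
code is a minimal illustration rather than an excerpt from sbox.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Run the current process at the given nice value (higher numbers mean
 * lower priority). Returns 0 on success, -1 on failure. */
int lower_priority(int nice_value)
{
    return setpriority(PRIO_PROCESS, 0, nice_value);
}

/* Establish a soft and a hard ceiling for one resource.
 * This helper only ever tightens limits: a requested hard limit above
 * the current one is clamped to the current one, and the soft limit is
 * clamped to the hard limit. Returns 0 on success, -1 on failure. */
int set_ceiling(int resource, rlim_t soft, rlim_t hard)
{
    struct rlimit current, wanted;

    if (getrlimit(resource, &current) != 0)
        return -1;
    if (hard > current.rlim_max)
        hard = current.rlim_max;
    if (soft > hard)
        soft = hard;
    wanted.rlim_cur = soft;
    wanted.rlim_max = hard;
    return setrlimit(resource, &wanted);
}
```

A wrapper might, for instance, call lower_priority(10) followed by
set_ceiling(RLIMIT_CPU, 10, 30) just before executing the target, giving
it a soft limit of 10 CPU seconds and a hard limit of 30. These figures
are examples only, not sbox's defaults.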

Making and Installing the Binary

Compile the sbox binary by typing make. If it compiles successfully, become
root and type make install to install it in your site's cgi-bin directory
(at the location specified in the Makefile).

You can also install sbox manually by copying it into your cgi-bin directory
and setting its permissions to ---s--x--x.

Configuring the Server and Author Directories

In order for sbox to be effective, CGI scripts should be turned off in all
user-supported directories and document directories. All CGI scripts should
be placed in the main cgi-bin directory. No one but authorized site
administrators should have write or listing privileges for this directory.
If you are using the Apache server, a typical entry for a virtual host will
look like this:

<VirtualHost www.fred.com>
ServerAdmin  fred@fred.com
ServerName   www.fred.com
DocumentRoot /home/fred/pub/html
TransferLog  /home/fred/logs/access_log
ErrorLog     /home/fred/logs/error_log

<Location />
Options Indexes FollowSymLinks
order allow,deny
allow from all
</Location>

</VirtualHost>

sbox enforces a directory-based CGI scripting scheme. Web authors' scripts
must be located in a single directory tree whose position relative to their
document tree is hard-coded in the CGI_BIN define. To avoid the possibility
that an author's scripts can be downloaded by a remote user, I suggest that
the scripts directory be placed outside the author's document root, for
example in "../cgi-bin".

With the virtual host definition given above, the author's HTML documents
will now reside in /home/fred/pub/html, while his scripts will reside in
/home/fred/pub/cgi-bin, entirely outside his virtual site's document root.

When sbox runs, it will chroot() to the directory specified by the ROOT
define, cutting the target script off from most system resources.
Dynamically linked programs (including interpreters and the like) will not
be happy unless they can find the shared libraries they rely on. Therefore,
this directory should be set up like a miniature root directory, containing
whatever is necessary for programs to run. This list is different from
system to system. See the Tips section below for some advice on setting it
up.

Below is the structure of an author's directory, assuming that the virtual
host uses ~fred/pub/html as its document root.

% ls -l ~fred/pub
total 10
drwxr-xr-x   2 fred   users  1024 Oct 23 06:27 bin/     system binaries
drwxr-xr-x   3 fred   users  1024 Oct 19 20:44 cgi-bin/ CGI scripts
drwxr-xr-x   2 fred   users  1024 Oct 12 16:59 dev/     device special files
drwxr-xr-x   2 fred   users  1024 Oct 19 17:57 etc/     configuration files
drwxr-xr-x   2 fred   users  1024 Oct 22 19:14 html/    HTML document root
drwxr-xr-x   3 fred   users  1024 Oct 19 20:35 lib/     shared libraries
drwxrwxrwt   2 fred   users  1024 Oct 23 05:48 tmp/     temporary files

The same type of directory structure should be used for user-supported
directories. Generally you will want to set it up in the directory that
contains public_html.

You do not have to do any special directory configuration if you do not take
advantage of sbox's chroot feature.

Calling sbox

To use sbox create URLs like this one:

     http://www.virtual.host.com/cgi-bin/sbox/script_name
            ^^^^^^^^^^^^^^^^^^^^              ^^^^^^^^^^^
             virtual host name              author's script

The first part of the URL is the path to the sbox script. The second part is
the path to the author's script, relative to the cgi-bin directory in his
home directory. If the author's script needs access to additional path
information, you can append it in the natural way:

     http://www.virtual.host.com/cgi-bin/sbox/script_name/additional/path/info

For user-supported directories, use this format:

     http://www.virtual.host.com/cgi-bin/sbox/~fred/script_name

Authors are free to organize their script directories into a hierarchy. They
need only modify script URLs to reflect the correct path:

     http://www.virtual.host.com/cgi-bin/sbox/foo/bar/script_name
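
Everything after /cgi-bin/sbox in these URLs reaches the wrapper as
additional path information. The sketch below shows one way the optional
~user prefix could be separated from the rest of the path. It is purely
illustrative: the function name split_user is hypothetical and sbox's own
parsing may work differently.

```c
#include <stddef.h>
#include <string.h>

/* Split path information of the form "/~user/rest" or "/rest".
 * On success the user name (or an empty string if there is none) is
 * stored in user, and a pointer to the remainder, starting at its
 * leading '/', is returned. Returns NULL if the input is malformed or
 * the user name does not fit. */
const char *split_user(const char *path_info, char *user, size_t user_size)
{
    const char *name, *slash;
    size_t len;

    if (user_size == 0 || path_info == NULL || path_info[0] != '/')
        return NULL;
    user[0] = '\0';
    if (path_info[1] != '~')
        return path_info;               /* virtual host form, no user */

    name = path_info + 2;
    slash = strchr(name, '/');
    if (slash == NULL)
        return NULL;                    /* "/~fred" names no script */
    len = (size_t)(slash - name);
    if (len == 0 || len >= user_size)
        return NULL;
    memcpy(user, name, len);
    user[len] = '\0';
    return slash;
}
```

For "/~fred/script_name" this yields the user "fred" and the remainder
"/script_name"; for "/foo/bar/script_name" the user is empty and the whole
string is returned.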

  ------------------------------------------------------------------------

Tips

Here are a few pieces of advice and tips on making best use of sbox.

Setting up the Chroot directory

Many CGI scripts will not run correctly in a chroot environment unless they
can find the resources they need. Compiled C programs often need access to
shared libraries and/or device special files. Interpreted scripts need
access to their interpreters, for example Perl. Feature-rich programs like
sendmail depend on their configuration files being present in /etc.

As described above, you will need to turn the chroot directory into a
miniature root file system, complete with /etc, /lib, /bin, /tmp and /dev
directories. I recommend that you create and test a chroot directory for one
virtual host, then use it as a master copy for creating new virtual hosts
every time you add a new author account. Both the cpio and the tar commands
can be used to copy shared libraries and device special files safely.

Programs that check file ownerships may need access to password and/or group
files in order for them to translate from numeric UIDs and GIDs to text
names. In order to support CGI scripts that perform this type of action, you
should place dummy copies of /etc/passwd and /etc/group in the author's /etc
directory. These files should not contain real passwords, and should only
contain standard system user accounts (e.g. "bin" and "mail"), plus any
needed by the script. You probably don't want to make the complete list of
user account names available to authors' CGI scripts!

If CGI scripts require access to the DNS system in order to resolve host
names and IP addresses, you should place a copy of /etc/resolv.conf into the
chroot directory. You may need to copy other configuration files to use
certain feature-rich programs. For example, if scripts send e-mail using the
sendmail program, you will need to install its configuration file,
sendmail.cf.

Many programs redirect their output to the device special file /dev/null.
Other programs need access to /dev/zero or other special files. You can copy
these files from the real /dev directory using either cpio or tar.
Alternatively you can create the files from scratch using mknod, but only if
you know what you're doing. You'll need to have superuser privileges to
accomplish either of these tasks.

The Unix time system expects to find information about the local timezone in
a compiled file named /usr/lib/zoneinfo/localtime. You may need to copy this
into your chroot directory in order for the timezone to be correctly
displayed. You can confirm that the correct timezone is being found by
examining the output of the "env" executable.

There are two ways to finesse the problem of shared libraries. For compiled
C scripts, one option is to link the program statically (by providing the
-static flag to the linker). A less laborious solution is to place copies of
the required shared libraries in the new root's /lib directory (or /slib,
for systems that use that directory for shared libraries). Many systems have
a utility that lists the shared libraries required by a binary. Use this
program to determine which shared libraries are required, and copy them over
into each author's /lib directory. In addition to the shared libraries, you
may need to copy the dynamic linker itself into the /lib directory. On my
Linux system, this file is "ld-linux.so".

If an executable cannot find its shared libraries at run time, it will
usually fail with a specific error message that will lead you to the problem
-- look in the server error log. If you get silent failures, it's probably
the dynamic linker itself that can't be found.

Linux, and possibly some other systems, uses a cache file named
/etc/ld.so.cache to resolve the location of library files. If this file
isn't found at run time, the system will generate a warning but find the
correct shared libraries nevertheless. The quick and dirty way to get rid of
this warning is to copy the current cache file from the real /etc directory
to the chroot one. However, this may have bad side effects (I haven't
actually encountered any, but I worry about it.) It's better to make this
cache file from scratch in the chroot environment itself. To do this, run
the ldconfig program with the command-line version of chroot. You'll need to
be root to do this:

     # cd /sbin
     # chroot ~fred/pub ./ldconfig

Perl scripts, in addition to requiring the Perl interpreter, will often need
access to the Perl lib directory in order to get at useful modules (such as
CGI.pm). It's easiest to copy the whole Perl library tree to the correct
location in the chroot directory, being careful to get the path right. For
example, if the real Perl library is located in /usr/local/lib/perl5, you'll
need to create a parallel /usr hierarchy in the chroot directory. On my
system, I recompiled Perl to use /lib/perl5 and dumped the modules into that
directory. If things get bolluxed up, you can always tell Perl where to look
for its libraries by adding something like this at the top of CGI
scripts:

     #!/bin/perl
     BEGIN { push(@INC,'/lib/perl5','/lib/perl5/i586-linux/5.004'); }

The Document Root and the chroot() directory

Some CGI scripts act as filters on static HTML documents. Examples include
PHP and various guestbook scripts. Such scripts often include the path to
the static document appended to the end of the script's URL as "additional
path information." For example:

     http://your.site/~fred/guestbook.cgi/~fred/guestbook/data.txt

The script will be passed two environment variables, PATH_INFO, containing
the additional path information, and PATH_TRANSLATED, containing the path
information translated into an absolute filename. In the example above, the
values of these variables might be:

     PATH_INFO        /~fred/guestbook/data.txt
     PATH_TRANSLATED  /home/fred/public_html/guestbook/data.txt

When sbox is running it interprets the additional path information as
relative to the user's document root. This means that a document located in
Fred's public_html directory can be referred to this way:

     http://your.site/cgi-bin/sbox/~fred/guestbook.cgi/guestbook/data.txt

After performing the chroot(), sbox attempts to adjust PATH_TRANSLATED so
that it continues to point to a valid file. If the user's document root is
located within the chroot directory, then PATH_TRANSLATED is trimmed so that
it is relative to the new root directory:

     PATH_INFO        /guestbook/data.txt
     PATH_TRANSLATED  /public_html/guestbook/data.txt

However, if the document root is entirely outside the new root directory,
then sbox will simply use the same value for PATH_INFO and PATH_TRANSLATED:

     PATH_INFO        /guestbook/data.txt
     PATH_TRANSLATED  /guestbook/data.txt

Users and Webmasters should be aware of this behavior, as it can cause some
confusion.
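
The adjustment rule described above can be expressed compactly in C. This
is a sketch of the rule as stated in this section, not sbox's actual code;
the function name adjust_path_translated is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Compute the value PATH_TRANSLATED should have after a chroot to
 * new_root. If translated lies inside new_root, the leading new_root is
 * trimmed off; otherwise path_info is used unchanged.
 * Returns 0 on success, -1 if the result does not fit in out. */
int adjust_path_translated(const char *new_root, const char *translated,
                           const char *path_info,
                           char *out, size_t out_size)
{
    size_t root_len = strlen(new_root);
    const char *result = path_info;
    int n;

    /* Ignore trailing slashes on the root, but keep a lone "/". */
    while (root_len > 1 && new_root[root_len - 1] == '/')
        root_len--;

    if (root_len == 1 && new_root[0] == '/')
        result = translated;            /* chroot to "/" changes nothing */
    else if (strncmp(translated, new_root, root_len) == 0 &&
             translated[root_len] == '/')
        result = translated + root_len; /* match ends on a path boundary */

    n = snprintf(out, out_size, "%s", result);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The test for a following '/' keeps a root of /home/fred from being
mistaken for a prefix of /home/frederick/public_html.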

The Resource Limitations

The default resource limits are reasonable. Most authors won't have problems
with them unless they need to do number crunching or manipulate many files
simultaneously. If need be, authors can raise the soft resource limits up to
the levels imposed by the hard limit ceilings, which are very liberal. C
programmers can do this directly by making calls to setrlimit(). Perl
scripters should download and install Jarkko Hietaniemi's BSD::Resource
module from CPAN.
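
For C programmers, raising a soft limit looks roughly like this. The
helper name raise_soft_limit is hypothetical; the sketch assumes only the
standard getrlimit() and setrlimit() calls.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Raise the soft limit for a resource to wanted, or to the hard limit
 * if wanted is larger than that. The soft limit is never lowered.
 * Returns 0 on success, -1 on failure. */
int raise_soft_limit(int resource, rlim_t wanted)
{
    struct rlimit lim;

    if (getrlimit(resource, &lim) != 0)
        return -1;
    if (wanted > lim.rlim_max)
        wanted = lim.rlim_max;          /* the hard limit is the ceiling */
    if (wanted <= lim.rlim_cur)
        return 0;                       /* already high enough */
    lim.rlim_cur = wanted;
    return setrlimit(resource, &lim);
}
```

A script that needs many open files could call, for example,
raise_soft_limit(RLIMIT_NOFILE, 256) early in its main() function. The
figure 256 is only an example.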

Rewrite-Rule Tricks

If you are running Apache 1.2 or higher, you can take advantage of the
rewrite rule module to make sbox transparent. For virtual hosts, you can add
something like the following to the <VirtualHost> section:

     RewriteEngine on
     RewriteRule ^/cgi/(.*) /cgi-bin/sbox/$1 [PT]

This rewrites all URLs that start with "/cgi/" to start with
"/cgi-bin/sbox/" instead. This
lets authors refer to their scripts with:

     http://www.virtual.host.com/cgi/script_name

and to main Web server scripts with:

     http://www.virtual.host.com/cgi-bin/guestbook

For user-supported directories, this rewrite rule will allow users to refer
to their scripts using http://www.host.com/~username/cgi/script_name:

     RewriteEngine on
     RewriteRule ^/(~.*)/cgi/(.*) /cgi-bin/sbox/$1/cgi-bin/$2 [PT]

The env Script

This distribution comes with a small statically linked binary called "env"
that you can call as a CGI script. It prints out some information about the
current environment, including the user and group IDs, the current working
directory, and the environment variables, to help you determine whether sbox
is configured correctly and working as expected.
  ------------------------------------------------------------------------

Author Information

This utility is copyright 1997-1998 Lincoln D. Stein. It can be used freely and
redistributed in source code and binary form. I request that this
documentation, including the copyright statement, remain attached to the
utility if you redistribute it. You are free to make modifications, but
please attach a note stating the changes you made.
  ------------------------------------------------------------------------

Change History

Version 0.97
     Fixed bugs relating to automounter confusion.

Version 0.95
     Fixes to compile and run on Solaris systems. Still not extensively
     tested, but no bug reports yet.

Version 0.90
     Beta release. Use with caution.

  ------------------------------------------------------------------------
Lincoln D. Stein, lstein@cshl.org
Cold Spring Harbor Laboratory
Last modified: Wed Oct 7 11:39:06 EDT 1998
