Apache module: bwshare
bandwidth throttling and balancing by client IP address

for Apache 1.3.x and 2.0.44

/*-----------------------------------------------------------------------------
Copyright (C) 2000, 2001, 2002, 2003, Alan Kennington.
You may distribute this software under the terms of Alan Kennington's
modified Artistic Licence, as specified in the accompanying LICENCE file.
-----------------------------------------------------------------------------*/
This software package is open and free.
Please read the LICENCE file for precise details.

DISCLAIMER.
The author of this software disclaims any express or implied guarantee of the
fitness of this software for any purpose. In no event shall the author of this
software be held liable for any direct, indirect, incidental, special,
exemplary, or consequential damages (including, but not limited to, procurement
of substitute services; loss of use, data, or profits; or business interruption)
however caused and on any theory of liability, whether in contract, strict
liability, or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.

The bwshare module has been developed and tested under SuSE 6.2, SuSE 7.1 and Redhat 5.2.
Some people have used it with FreeBSD 4.1 and Solaris 2.6.
The current work-in-progress version of bwshare is 0.1.3.
(This means that version 0.1.3 may change every day until it is frozen.)
This should work with Apache versions 1.2.13+, 1.3.x and 2.0.44.

Source files:
mod_bwshare.c bandwidth throttling module.


Other files:
README.html             the file you are now reading
doc.html                some minimal documentation of this module
Makefile.tmpl           the Makefile template for an Apache 1.3.x module
Makefile.in             the Makefile template for an Apache 2.0.16 module
config.m4               the m4 macro file for an Apache 2.0.16 module
LICENCE                 user licence/license
bwshare-0.1.3.zip       zip file
bwshare-0.1.3.zip.asc   PGP signature

Here is my PGP public key.
Here is my currently running bwshare module on www.topology.org.

The Apache shaping module ``bwshare'' uses a form of ``statistical shaping''.
This means that the software measures the past statistical behaviour of each client, and uses this as a basis for controlling that client's current access to resources.
The principal algorithm used in mod_bwshare is the ``token bucket''.
This module is quite reliable (with Apache 1.3.26 anyway).
I ran it for 7 months from July 2002 to February 2003 without problems.
The amount of data handled during those 7 months was about 7 GBytes.
The only thing that stops it is an electricity failure or a syntax error in the httpd.conf file when I modify it and restart httpd.

2000-12-20:
It seems that my bwshare Apache module is now superseded by mod_throttle by Anthony Howe.
He's done a more professional job, has many more features and offers better control of the module parameters.
You can't even take a tea-break in this industry without someone overtaking you!

2002-9-7:
I've been told that mod_throttle can be unstable (i.e. it sometimes crashes) and lacks some minor features that mod_bwshare has. One of the advantages of mod_bwshare might be that it has been written very carefully for correctness and efficiency.

Installation for Apache 1.2.13+ and 1.3.x:
To install mod_bwshare with static linkage, create a directory apache/src/modules/bwshare in your apache source tree.
Then copy Makefile.tmpl and mod_bwshare.c to the apache/src/modules/bwshare directory.

The parameter

--activate-module=src/modules/bwshare/mod_bwshare.o
should be used with the "configure" command in the top-level Apache source directory (in addition to any other command line arguments you normally use).

If there are any problems with the Makefile.tmpl file, compare it with the Makefile.tmpl file in other modules, e.g. in apache/src/modules/example.
The file apache/src/modules/example/README may also contain useful hints for installing this module.
Please let me know of any changes you made to build the bwshare module on your system so that I can incorporate them.

The following line may be suitable in the httpd.conf file.
This should go after the other AddModule commands.

AddModule mod_bwshare.c
The following lines may be included in httpd.conf so that the paths /bwshare-info and /bwshare-trace will invoke the corresponding handlers.
Then you can click on http://www.yourdomain.com/bwshare-trace to monitor the progress of the module.
<IfModule mod_bwshare.c>
# Some bandwidth control parameters.
BW_tx1cred_rate         0.067
BW_tx1debt_max          30
BW_tx2cred_rate         1000
BW_tx2debt_max          1000000

<Location /bwshare-info>
    SetHandler bwshare-info
</Location>

<Location /bwshare-trace>
    SetHandler bwshare-trace
</Location>
</IfModule>
The following parameters are defined for control of the mod_bwshare module.
These parameters are applied to each client IP address independently.
  • BW_tx1cred_rate: sets the maximum long-term average rate of serving files (files/second).
  • BW_tx1debt_max: sets the maximum burst of files that may be served in excess of BW_tx1cred_rate (files).
  • BW_tx2cred_rate: sets the maximum long-term average rate of serving bytes (bytes/second).
  • BW_tx2debt_max: sets the maximum burst of bytes that may be served in excess of BW_tx2cred_rate (bytes).
Be careful when setting these parameters. The module makes very little attempt to check that your choice of parameters is sane.
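
As a rough aid for choosing values, the following sketch turns a credit rate and a maximum debt into two easier numbers: the sustained spacing between files (or bytes), and the most that one client can receive in a given time window. It uses the limit debt_max + T * cred_rate which is stated in the feature list of this document. The function names bw_seconds_per_unit and bw_window_limit are hypothetical. They are not taken from mod_bwshare.c.

```c
#include <assert.h>

/* Hypothetical helper, not part of mod_bwshare.c.
   Seconds between units (files or bytes) at the sustained credit rate. */
static double bw_seconds_per_unit(double cred_rate)
{
    assert(cred_rate > 0.0);
    return 1.0 / cred_rate;
}

/* Hypothetical helper, not part of mod_bwshare.c.
   Upper bound on the units served to one client in any window of
   `seconds` seconds: debt_max + seconds * cred_rate. */
static double bw_window_limit(double debt_max, double cred_rate, double seconds)
{
    assert(debt_max >= 0.0);
    assert(cred_rate >= 0.0);
    assert(seconds >= 0.0);
    return debt_max + seconds * cred_rate;
}
```

With the sample values above, bw_seconds_per_unit(0.067) is roughly 14.9 seconds per file, and bw_window_limit(1000000, 1000, 3600) gives 4,600,000 bytes for one client in an hour.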

See also my notes on installing Apache 1.3.23 with PHP 4.1.2 and bwshare 0.1.2.

Installation of bwshare version 0.1.2 for Apache 2.0.16:

These instructions seem to work with Apache 2.0.44 for bwshare 0.1.3.
Alternatively, these instructions work for Apache 2.0.16 with bwshare 0.1.2.

To install this software, create a directory apache/modules/bwshare in your apache source tree.
Then copy Makefile.in, config.m4 and mod_bwshare.c to the apache/modules/bwshare directory.
It seems like you might have to run "autoconf" in the Apache top directory to remake "configure" from "configure.in".
But I found that this created an erroneous line as follows:

LTFLAGS="$LTFLAGS -export-dynamic"
If "autoconf" causes this line to appear in your "configure" file, you will have to remove it or comment it out.
I have no idea how this bug gets in there.

The parameter

--enable-bwshare
must be used with the "configure" command in the top-level Apache source directory (in addition to any other command line arguments you normally use) if you choose the `no' option in the config.m4 file.

If there are any problems with the Makefile.in or config.m4 files, compare them with the corresponding files in other modules, e.g. in apache/modules/experimental.
The file apache/modules/experimental/README may also contain useful hints for installing this module.
Please let me know of any changes you made to build the bwshare module on your system so that I can incorporate them.

The following line, which is used for Apache 1.3.x, is not suitable in the httpd.conf file for Apache 2.0.x.
(Somehow things have changed in Apache 2.0.16, and I can't find any documentation to tell me what on earth they've done to it.)

AddModule mod_bwshare.c
The following lines may be included in httpd.conf so that the paths /bwshare-info and /bwshare-trace will invoke the corresponding handlers.
Then you can click on http://www.yourdomain.com/bwshare-trace to monitor the progress of the module.
<IfModule mod_bwshare.c>
<Location /bwshare-info>
    SetHandler bwshare-info
</Location>

<Location /bwshare-trace>
    SetHandler bwshare-trace
</Location>
</IfModule>
See also my notes on porting an Apache module to version 2.0.16.

Bugs.
  1. It seems that with dynamic linkage, the httpd.conf parameters do not take effect.
  2. No others as far as I know... Please find more for me.

Purpose, principles, philosophy.
The purpose of this module is to give the web site operator some control over bandwidth utilization by individual client hosts.
The `bwshare' module temporarily blocks access by excessive users.
This is aimed especially at users who download whole websites at great speed.
Excessive speed is considered bad etiquette for search engine robots.

The guiding principle in this module is that two categories of clients should be allowed unhindered access: human users and well-behaved search engines.
Non-human impolite clients should be throttled as soon as possible.
On an Internet where almost everyone is desperately trying to increase their hit rate, it might seem crazy to try to dampen the enthusiasm of visitors to a site.
But in Australia, we pay by the megabyte for our traffic, and anywhere in the world, you don't make much profit out of visitors who download your whole website if it's really big.
(E.g. if you're a mirror for a few huge mega-sites.)

There are many actions which a throttling module can take when an excessive user is identified.
You can slow them down or you can discard their requests.
You can do this with or without a visual indication to the user.
And the throttling can be permanent or temporary.
I have chosen to implement temporary request discard with an indication to the user.
Discard is much easier to implement than delay (slowing them down), and I wanted a real human user to be able to continue access if they were just temporarily over-using the server.
So that's why the discard is temporary and there is an explicit indication to the user.

Currently, the module bwshare has the following features.

  • Records the number of files requested by each client host according to IP address.
  • Records the number of bytes downloaded by each client host according to IP address.
  • Permits the operator to view the recorded data in an HTML table through a handler called bwshare-trace.
    The table shows the number of bytes downloaded by each host, the number of requests made, and various leaky bucket based measures of mean bandwidth.
    In particular, clients who exceed their bandwidth bounds will set off a red light on the management screen, and they get the 503 status code for every download after the first 35 requests or so.
    [Update: I changed 503 to 200.]

    503a.html a typical 503 message for the too demanding client
    screenshot32.html screen shot 32, 2002-3-9
    screenshot33.html screen shot 33, 2002-4-21
    screenshot34.html screen shot 34, 2002-6-21
    screenshot36.html screen shot 36, 2002-10-3
    screenshot37.html screen shot 37, 2003-2-4

  • Calculates the current ``file TX debt'' tx1debt of each client host according to the following rules.
    • If the client requests a file, then tx1debt is incremented by 1.
    • If the client is idle for T seconds, then tx1debt is reduced by T * tx1cred_rate, where tx1cred_rate is a specified permitted long term average file request rate for each host.
      (Example: tx1cred_rate = 0.05 files/second means that one file can be requested every 20 seconds.)
    • If the tx1debt falls to zero, it does not go negative. I.e. credit cannot be accumulated.
    • If the tx1debt exceeds a specified maximum debt tx1debt_max when a new request is made, the client is sent a 503 status code (HTTP_SERVICE_UNAVAILABLE).
      (A typical value of tx1debt_max for a 33.6k modem might be 30 files.)
      [Update: The 503 code seems to make some downloading software just go into an infinite download loop.
      So now I think it's better to use the 200 code so that the infinite-downloading software will be happy and stop.]
    • This means that the limit for file download is tx1debt_max + T * tx1cred_rate files within any time interval of T seconds.
    • As an example, if tx1debt_max = 30 files and tx1cred_rate = 50/1000 files/second, this means that the client can
      • download 30 files as quickly as they like, and
      • download a further file every 20 seconds. (Note: 1000/50 = 20.)
  • Calculates the current ``byte TX debt'' tx2debt of each client host according to the following rules.
    • If the client downloads a file, then tx2debt is incremented by the number of bytes in the file.
    • If the client is idle for T seconds, then tx2debt is reduced by T * tx2cred_rate, where tx2cred_rate is a specified permitted long term average byte download rate for each host.
      (Example: tx2cred_rate = 1000 bytes/second means that 1000 bytes can be downloaded every second (obviously).)
    • If the tx2debt falls to zero, it does not go negative.
      I.e. credit cannot be accumulated.
    • If the tx2debt exceeds a specified maximum debt tx2debt_max when a new request is made, then the client is sent a 503 status code (HTTP_SERVICE_UNAVAILABLE).
      (A typical value of tx2debt_max for a 33.6k modem might be 300,000 bytes.) [Update: 503 changed to 200.]
    • This means that the limit for downloads is tx2debt_max + T * tx2cred_rate bytes within any time interval of T seconds.
    • As an example, if tx2debt_max = 500,000 bytes and tx2cred_rate = 8000/8 bytes/second, this means that the client can
      • download 500,000 bytes as quickly as they like, and
      • download a further 1000 bytes every second.
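
The tx1debt and tx2debt rules above can be sketched in C roughly as follows. This is only an illustration of the rules as written here, not the code in mod_bwshare.c. The names bw_client, bw_params, bw_decay, bw_admit and bw_charge_bytes are hypothetical. The sketch assumes that times are given in seconds as a double, and that a refused request does not add to tx1debt (the rules above do not say either way).

```c
#include <assert.h>

/* Hypothetical per-client record. Field names follow the text above. */
typedef struct {
    double tx1debt;    /* file debt (files) */
    double tx2debt;    /* byte debt (bytes) */
    double last_time;  /* time of the last update (seconds) */
} bw_client;

/* Hypothetical copy of the four httpd.conf parameters. */
typedef struct {
    double tx1cred_rate;  /* files/second */
    double tx1debt_max;   /* files */
    double tx2cred_rate;  /* bytes/second */
    double tx2debt_max;   /* bytes */
} bw_params;

/* Reduce a debt by idle * rate. The debt stops at zero,
   so credit cannot be accumulated. */
static double bw_decay(double debt, double idle, double rate)
{
    double reduced = debt - idle * rate;
    return reduced > 0.0 ? reduced : 0.0;
}

/* Call when a request arrives at time `now`.
   Returns 1 if the request may be served, 0 if it should be refused. */
static int bw_admit(bw_client *c, const bw_params *p, double now)
{
    double idle = now - c->last_time;
    assert(idle >= 0.0);

    c->tx1debt = bw_decay(c->tx1debt, idle, p->tx1cred_rate);
    c->tx2debt = bw_decay(c->tx2debt, idle, p->tx2cred_rate);
    c->last_time = now;

    /* Refuse if either debt exceeds its maximum at request time. */
    if (c->tx1debt > p->tx1debt_max || c->tx2debt > p->tx2debt_max)
        return 0;

    c->tx1debt += 1.0;  /* one more file requested (assumed: served only) */
    return 1;
}

/* Call after a file of `bytes` bytes has been sent to the client. */
static void bw_charge_bytes(bw_client *c, double bytes)
{
    assert(bytes >= 0.0);
    c->tx2debt += bytes;
}
```

A caller would keep one bw_client per client IP address, call bw_admit before serving each request, and call bw_charge_bytes when the size of the response is known. Because the test is "debt exceeds maximum", a client starting from zero debt gets tx1debt_max + 1 files in a burst before the first refusal.
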
In a future version, it may be possible to add a variety of modules to determine whether usage is excessive, as suggested by the numbering, tx1, tx2, tx3 etc.
Just as in IP ``differential service'', a general bandwidth sharing module must (1) classify the traffic (e.g. according to client IP address), (2) measure the traffic (e.g. with leaky buckets tx1 and tx2), and (3) take appropriate actions (e.g. shape/police/arbitrate the traffic).

Current things to do.
  • Write a script to do shared-object linkage to Apache with apxs.
    Must also fix the apparent problem that the httpd.conf parameters are ignored with shared-object linkage.
  • The planned bandwidth balancing features have not yet been added.
    This is planned to determine whether a serious congestion condition exists, and if so, the module will reject requests from the most greedy clients so as to favour the least greedy clients.
    This should help to favour interactive users as against bulk file downloads.
  • ``Network management'' features should be added so that the operator (and only the operator) can control the various parameters used by bwshare.
    These parameters should be settable both in the httpd.conf file, and via a request handler.
  • Many new control parameters should be added.
    In particular, it would not be unreasonable to expect to be able to specify a list of IP hosts which should receive better or worse treatment than other IP hosts.
    Similarly, some sort of IP host grouping would be desirable, so as to treat a group of hosts as having an aggregate set of bandwidth parameters.
  • The ability to throttle dually by client IP host and server virtual host must be added soon.
    This would require separate traffic history records to be kept for each virtual host, and all accesses would then be checked against the particular virtual host and also against the global parameters for the server.
    (The current bandwidth throttling is global only.)
    Similarly, each Location and Directory context could be treated separately.
    And perhaps the sizes of host record tables for each context should be settable in the httpd.conf file, so as to economize on RAM if desired.
  • Should add third and fourth criteria for limiting client hosts by simple total number of files and simple total number of bytes.
    In my case, I don't want this, because I am happy to let search engines read everything over several days.
    But there should also be a facility to specify a set of search engines which have permission to download as much as they want, or at least to give them special treatment.
  • To specifically check for `wget' access, could look for a header of the form
     User-Agent: Wget/1.5.3
    and react accordingly.
    But "bwshare" is supposed to stop any such access even if they don't play by the robots rules.
    (See A Standard for Robot Exclusion.)
    And besides, wget for subdirectories is fine, as long as that subdirectory is not excessively big.
    So it should be possible to specify in httpd.conf that bwshare should allow wget into subdirectories below a given level, but not across multiple such subdirectories.
  • Should add a Last-Modified field to the header for the network management screen so that it doesn't reload unnecessarily.
  • Should make the garbage collection more sophisticated, and also should write the removed information to a log file.
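
The `wget' check in the list above could look something like the following sketch. It only tests whether a User-Agent value begins with "Wget/". The function names ua_is_wget and ua_self_check are hypothetical, and the sketch does not show how the header value would be obtained from Apache. The match is case-sensitive, which is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of mod_bwshare.c.
   Returns 1 if the User-Agent value starts with "Wget/", otherwise 0.
   A missing header (NULL) is treated as "not wget". */
static int ua_is_wget(const char *user_agent)
{
    static const char prefix[] = "Wget/";

    if (user_agent == NULL)
        return 0;
    return strncmp(user_agent, prefix, sizeof(prefix) - 1) == 0;
}

/* Small self-check showing the intended behaviour. */
static void ua_self_check(void)
{
    assert(ua_is_wget("Wget/1.5.3") == 1);
    assert(ua_is_wget("Mozilla/4.0") == 0);
    assert(ua_is_wget(NULL) == 0);
}
```

Since the User-Agent header is supplied by the client, a check like this can only catch clients which identify themselves honestly.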

Some notes on development:
The main objective of this software is to provide the means to balance and/or limit bandwidth demands by web clients.
As a by-product, this module also has the ability to inform an operator of the current usage of an Apache web server.
In other words, it has a monitoring function.

A manual may be provided in the form of a TeX document some day.

The tx1debt and tx2debt measures in the bwshare-trace window are measures of `greed'.
They give some idea of how much bandwidth each host is consuming, relative to a given baseline.
If any client host is too greedy, then their requests are refused until their greed subsides.
The main immediate aim of this is to stop people from trying to download my entire 100 Mbyte web site through the 33k modem.
(In Australia, the web server operator pays by the Megabyte for users' downloads - about US$0.04 per Megabyte.)

Previous versions of this package (historical interest only).

bwshare-0.1.2.zip bwshare-0.1.2.zip.asc zipped source + PGP signature
bwshare-0.1.1.zip bwshare-0.1.1.zip.asc zipped source + PGP signature
bwshare-0.1.0.zip bwshare-0.1.0.zip.asc zipped source + PGP signature

Version 0.1.1 works for Apache 1.3.x, but not for Apache 2.0.x.
Version 0.1.2 should work for Apache 1.2.13+, 1.3.x and 2.0.16.
Version 0.1.3 should work for Apache 1.2.13+, 1.3.x and 2.0.44.

Do not use version 0.1.0 for long run-times.
Version 0.1.0 bug: The module does erroneous things after about 3 days of httpd uptime.
When the run-time is very large, there are integer overflow errors.
Integer arithmetic (actually fixed-quotient arithmetic) was used for greater speed.
Version 0.1.1 now uses double-float arithmetic for operations which are liable to integer overflow.
This decision was made after using 64-bit ``long long'' for a while.
The double-float type is preferred because of improved portability and the following speed test results for multiply-add operations (with default optimization).

type        AMD 486DX2-66   AMD K6-2/400
long        658 ns          18 ns
long long   1419 ns         70 ns
double      811 ns          48 ns

Clearly the double-arithmetic is faster than long-long.
So there's no space or time advantage in using long-long.
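
A timing test of this kind could be repeated on other hardware with something like the following sketch. This is not the program which produced the figures above. The macro DEFINE_MAC_BENCH and the function names bench_long, bench_long_long and bench_double are hypothetical. The integer versions use unsigned types, because signed overflow is undefined in C, and the double version uses a multiplier of 0.5 so that the value stays bounded.

```c
#include <assert.h>
#include <time.h>

/* Defines NAME(iterations), which times `iterations` multiply-add steps
   on TYPE and returns the mean CPU time of one step in nanoseconds.
   The accumulator is volatile so that the loop is not optimized away. */
#define DEFINE_MAC_BENCH(NAME, TYPE, MUL, ADD)                  \
    static double NAME(long iterations)                         \
    {                                                           \
        volatile TYPE acc = (TYPE)1;                            \
        clock_t start, end;                                     \
        long i;                                                 \
        assert(iterations > 0);                                 \
        start = clock();                                        \
        for (i = 0; i < iterations; i++)                        \
            acc = acc * (TYPE)(MUL) + (TYPE)(ADD);              \
        end = clock();                                          \
        return 1e9 * (double)(end - start)                      \
               / (double)CLOCKS_PER_SEC / (double)iterations;   \
    }

DEFINE_MAC_BENCH(bench_long, unsigned long, 3, 1)
DEFINE_MAC_BENCH(bench_long_long, unsigned long long, 3, 1)
DEFINE_MAC_BENCH(bench_double, double, 0.5, 1.0)
```

Calling each function with a large iteration count (for example 10000000) and printing the results gives one row per type. The volatile accumulator adds a load and a store to every step, so the numbers are upper bounds on the cost of the arithmetic itself.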

Note that there is a software library called MM which is designed for portable shared memory usage between httpd child processes.
The MM software might be a good way to ensure portability across operating systems.

For more Apache modules, see the Apache module registry.

For other bandwidth management modules, see the Apache Overview HOWTO.

