The Linux kernel offers us a number of queueing disciplines. By far the most widely used is the pfifo_fast queue - this is the default. This also explains why the more advanced qdiscs are so robust: they are nothing more than 'just another queue'.
Each of these queues has specific strengths and weaknesses, and not all of them are equally well tested.
This queue is, as the name says, First In, First Out, which means that no packet receives special treatment. Or at least, almost none: this queue has 3 so called 'bands', and within each band FIFO rules apply. However, as long as there are packets waiting in band 0, band 1 won't be processed, and the same goes for band 1 and band 2. The kernel assigns packets to a band based on their Type of Service bits.
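Since pfifo_fast is the default, there is normally nothing to configure. To inspect it, together with its statistics, on an interface (assuming here that your device is called eth0), run:

  # tc -s qdisc ls dev eth0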
SFQ, as said earlier, is not quite deterministic, but works (on average). Its main benefits are that it requires little CPU and memory. 'Real' fair queueing requires that the kernel keep track of all running sessions.
This is far too much work, so SFQ keeps track of only a limited number of sessions by hashing traffic into buckets. Two different sessions might end up in the same bucket, which isn't very bad but should not be a permanent situation. Therefore the kernel perturbs the hash at a certain interval, which can be specified on the tc command line.
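A minimal sketch of attaching SFQ, assuming eth0 is the interface where the queue actually builds up (SFQ only has an effect if the link is genuinely full):

  # tc qdisc add dev eth0 root sfq perturb 10

Here 'perturb 10' tells the kernel to reconfigure the hash every 10 seconds, so two sessions that collide in the same bucket don't stay stuck together.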
This queue is very straightforward. Imagine a bucket which holds a number of tokens. Tokens are added at a certain rate until the bucket fills up; at that point the bucket contains 'b' tokens.
Whenever packets arrive, they are stored. If there are more tokens than packets, these packets are sent out ('dequeued') immediately in a burst transfer.
If there are more packets than tokens, all packets for which there is a token are sent off; the rest have to wait for new tokens to arrive. So, if the size of a token is, say, 1000 octets, and we add 8 tokens per second, our eventual data rate is 64 kilobits per second (8 tokens x 1000 octets x 8 bits), excluding a certain 'burstiness' that we allow.
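As a sketch, the 64kbit/s example above could be configured like this, assuming eth0; the burst and latency values are illustrative, not prescriptive:

  # tc qdisc add dev eth0 root tbf rate 64kbit burst 8kb latency 70ms

'rate' is the token refill speed, 'burst' is the size of the bucket in bytes, and 'latency' limits how long a packet may wait for tokens before it is dropped.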
The Linux kernel seems to go beyond this specification, and also allows us to limit the speed of the burst transmission (the 'peakrate' parameter; see the example below). However, Alexey warns us:
Note that the peak rate TBF is much more tough: with MTU 1500
P_crit = 150Kbytes/sec. So, if you need greater peak
rates, use alpha with HZ=1000 :-)
FIXME: is this still true with TSC (pentium+)? Well sort of
FIXME: if not, add section on raising HZ
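A hedged sketch of such a peak rate limit, again assuming eth0 and illustrative values: shape to 64kbit/s on average, but never let a burst go out faster than 128kbit/s:

  # tc qdisc add dev eth0 root tbf rate 64kbit burst 8kb latency 70ms \
      peakrate 128kbit mtu 1500

The 'mtu' parameter is mandatory when 'peakrate' is given: it sizes the second, smaller bucket that enforces the peak rate.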
RED has some extra smartness built in. When a TCP/IP session starts, neither end knows the amount of bandwidth available, so TCP/IP starts transmitting slowly and then goes faster and faster, limited by the latency at which ACKs return.
Once a link is filling up, RED starts dropping packets, which indicates to TCP/IP that the link is congested and that it should slow down. The smart bit is that RED simulates real congestion, and starts dropping some packets some time before the link is entirely filled up. Once the link is completely saturated, it behaves like a normal policer.
For more information on this, see the Backbone chapter.
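A configuration sketch, assuming eth0 and a 10mbit link; the thresholds here are illustrative, and the Backbone chapter discusses how to choose them:

  # tc qdisc add dev eth0 root red limit 60KB min 15KB max 45KB \
      avpkt 1000 burst 20 probability 0.02 bandwidth 10mbit

Between the 'min' and 'max' average queue sizes, packets are dropped with a probability that rises towards 'probability'; above 'max', everything is dropped, which is the policer-like behaviour described above.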
The Ingress qdisc comes in handy if you need to ratelimit a host without help from routers or other Linux boxes. You can police incoming bandwidth and drop packets when this bandwidth exceeds your desired rate. This can save your host from a SYN flood, for example, and also works to slow down TCP/IP, which responds to dropped packets by reducing speed.
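A sketch of policing all incoming IP traffic on eth0 to 1mbit/s, dropping the excess (the device name and rate are assumptions, substitute your own):

  # tc qdisc add dev eth0 handle ffff: ingress
  # tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 \
      match ip src 0.0.0.0/0 police rate 1mbit burst 10k drop flowid :1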
FIXME: instead of dropping, can we also assign it to a real queue?
FIXME: shaping by dropping packets seems less desirable than using, for example, a token bucket filter. I'm not sure though; Cisco CAR works this way, and people appear happy with it.
See the reference to IOS Committed Access Rate at the end of this document.
In short: you can use this to limit how fast your computer downloads files, thus leaving more of the available bandwidth for others.
See the section on protecting your host from SYN floods for an example on how this works.