A lot of people use the Cisco IOS rate-limit feature. They have a real need. What they don’t have is real documentation. Cisco itself hasn’t produced a decent explanation of how rate-limit works. Yeah, there are some books that pretend to explain what the ‘normal burst’ and ‘extended burst’ settings do. But they STILL don’t tell you how to figure them out in real life.
I’m going to tell you. And after I tell you, you’re going to feel better. The network will be faster.
Why do the burst settings matter?
A normal router interface is limited by the inherent speed of the wire. When incoming packets arrive faster than they can head out the interface, they are placed in the outgoing packet queue. Each interface has one, typically 40 packets in size. As packet flow reaches and surpasses the wire speed, this queue begins filling up. Those packets belong to someone: a particular TCP connection. (We are ignoring other packet types, but that is okay.) While those packets sit in the queue, they are being delayed, which shows up the same way rising ping times do. TCP notices this delay (because the receiving computer takes slightly longer to send back an ACKnowledgement packet), and the flow slows slightly. If too many packets arrive and the queue fills, the router starts dropping them. When TCP sees dropped packets, it slows more significantly. This is how TCP works, and it works well.
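To put a number on that delay, here is a rough sketch of how long a full outgoing queue takes to drain. The 40-packet queue comes from the text; the 1500-byte packet size and the 20Mbps wire speed are assumptions picked for illustration, and the function name is made up.

```python
def full_queue_delay_ms(rate_bps: int, queue_packets: int = 40,
                        packet_bytes: int = 1500) -> float:
    """Worst-case wait, in milliseconds, for a packet at the back of a full queue."""
    # Assumption: every queued packet is a full-size 1500-byte packet.
    queued_bits = queue_packets * packet_bytes * 8
    return queued_bits * 1000 / rate_bps


# Assumed example: a 20Mbps wire with a full 40-packet queue.
print(full_queue_delay_ms(20_000_000))  # 24.0
```

Under these assumptions, a saturated queue adds a couple dozen milliseconds of delay before anything gets dropped. That delay is the gentle warning TCP gets from a normal interface.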
What happens when you rate-limit instead of letting an interface reach line-speed?
When you rate-limit, the outgoing packet queue never comes into play. When the rate-limit sees a flow exceeding the limit, it simply drops a packet. There is no outgoing queue to delay the packet momentarily, so a drop is the only signal TCP gets. This is why there are two settings: the normal burst size and the extended burst size. The burst sizes essentially tell the router how tightly you want to apply the rate-limit. Too loose, and flows may momentarily exceed the limit by an undesirable amount. (What constitutes undesirable is completely up to you; some folks are using rate-limit to sell bandwidth, and others are using it to keep a line from ever reaching saturation and hence keep latency down.) Once the normal burst is exceeded, the router drops a packet, and this can happen many times as the flow oscillates around the rate-limit. The smaller the normal burst is, the more often those drops happen, which is good in moderation and bad when overdone. When the extended burst is exceeded, ALL packets are dropped until the flow comes back under the limit. That will happen from time to time, but it is bad for consistent performance. A good rate-limit can actually improve performance under heavy load versus no rate-limit. But a bad rate-limit screws up everything.
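Here is a toy model of the behavior just described, as a sketch only. It follows this article’s description of the two thresholds and is not Cisco’s actual algorithm. The function name, the idea of tracking an “excess bytes” counter, and the example burst values are all assumptions made for illustration.

```python
def policer_action(excess_bytes: int, normal_burst: int, extended_burst: int) -> str:
    """Classify a packet by how far the flow has run ahead of the rate-limit.

    excess_bytes is a hypothetical counter: bytes sent beyond what the
    configured rate alone would have allowed so far.
    """
    if excess_bytes <= normal_burst:
        return "transmit"         # within the normal burst: nothing is dropped
    if excess_bytes <= extended_burst:
        return "occasional drop"  # the hint: a drop for TCP to notice
    return "drop everything"      # past the extended burst: all packets dropped


# Example thresholds (assumed): 125000-byte normal burst, 500000-byte extended burst.
for excess in (50_000, 200_000, 600_000):
    print(excess, policer_action(excess, 125_000, 500_000))
```

The gap between the two thresholds is what decides whether TCP has a chance to react to the hint before everything gets dropped.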
Usually, the burst settings end up too tight. When they are set too tight, three things may happen: 1) The combined traffic flows will have trouble reaching the limit. 2) TCP flows will be jerky, resulting in choppy web page loading or wildly fluctuating download speeds. 3) A single TCP flow (e.g. a speed test) will NEVER get close to the limit. (An insufficient TCP window size on the server or client can also cause this.)
Burst settings end up set too low because people are concerned that setting them higher will make the rate-limit too loose. This isn’t true. A burst is borrowed, not free: for every byte sent above the rate-limit, a byte must later go unsent (or worse, get dropped!) while the flow runs below the rate-limit. The long-run average still does not exceed the limit. So don’t be too tight with burst!
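Cisco IOS reference manuals (certain editions, at least) say that normal burst should be (RATE-SPEED/8 * 1.5), and that extended burst should be double the normal burst. RATE-SPEED is the limit in bits per second, so dividing by 8 gives bytes per second, and the normal burst works out to 1.5 seconds’ worth of traffic in bytes. These suggested settings are great for smooth TCP, but they are sometimes too loose.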
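As a quick sanity check on how loose those suggestions are, here is a small sketch that works the manual’s rule of thumb through for a given rate. The function name is made up, and the 20Mbps rate is just an example figure.

```python
def cisco_suggested_bursts(rate_bps: int) -> tuple[int, int]:
    """Return (normal_burst, extended_burst) in bytes per the manual's rule of thumb."""
    bytes_per_second = rate_bps // 8
    normal_burst = bytes_per_second * 3 // 2  # 1.5 seconds' worth of traffic
    extended_burst = 2 * normal_burst         # double the normal burst
    return normal_burst, extended_burst


normal, extended = cisco_suggested_bursts(20_000_000)
print(normal, extended)  # 3750000 7500000
```

At 20Mbps that is a normal burst of 3.75 million bytes, which shows why a tightwad might want something smaller.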
I’m a tightwad; what’s the lowest burst I should use?
There is one more variable you need in order to figure this out: the typical ping time, or better, the highest ping time you typically see. If you are in the US, accessing primarily US sites, a ping time of 150ms is probably a safe tightwad number.
So here is your formula for normal burst, in bytes: (RATE-SPEED/8 / 20)
The formula for extended burst, also in bytes: NormalBurst + (RATE-SPEED/8 * 0.150), where 0.150 is the 150ms ping time expressed in seconds.
Using an example of 20Mbps, your rate limit would be:
rate-limit out 20000000 125000 500000 conform-action transmit exceed-action drop
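If you would rather not do the arithmetic by hand, the two formulas can be sketched as a small calculator that also prints the command line in the same form as the example above. The function names are made up for illustration, and the 150ms default is the tightwad ping time from the text; swap in your own.

```python
def tightwad_bursts(rate_bps: int, rtt_ms: int = 150) -> tuple[int, int]:
    """Return (normal_burst, extended_burst) in bytes using the tightwad formulas."""
    bytes_per_second = rate_bps // 8
    normal_burst = bytes_per_second // 20  # one twentieth of a second of traffic
    extended_burst = normal_burst + bytes_per_second * rtt_ms // 1000  # plus one RTT
    return normal_burst, extended_burst


def rate_limit_command(rate_bps: int, rtt_ms: int = 150) -> str:
    """Format the result as a rate-limit line like the one in the text."""
    normal, extended = tightwad_bursts(rate_bps, rtt_ms)
    return (f"rate-limit out {rate_bps} {normal} {extended} "
            "conform-action transmit exceed-action drop")


print(rate_limit_command(20_000_000))
# rate-limit out 20000000 125000 500000 conform-action transmit exceed-action drop
```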
Cisco says extended burst should be double the normal burst, but that assumes you are also using their formula for normal burst. I don’t assume that you are, which is why I specify extended burst as an amount that should be added to normal burst. In this way, you can use very small normal burst values, and still know what your extended burst should be in order to ensure smooth TCP performance.
Why is that the lowest?
150ms is your round-trip time (RTT). That is how long it takes a packet to be sent, and the acknowledgement to be received. The normal burst is one twentieth of a second of traffic at the rate-limit speed: once a flow reaches the limit and overshoots it by that many bytes, a packet will be dropped. The extended burst is the normal burst, plus the total bytes that can be transferred in one RTT, because one RTT is how long it takes TCP to notice that a packet was dropped and never acknowledged. So, when you set the parameters thus, you are giving TCP enough time to “take the hint” of a single dropped packet before it gets the major punishment of all packets being dropped.
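One way to check a pair of burst values against this reasoning is to turn the bytes back into time at the rate-limit speed. This is a minimal sketch; the function name is made up, and the numbers are the 20Mbps example values.

```python
def bytes_to_ms(burst_bytes: int, rate_bps: int) -> float:
    """How long it takes to send burst_bytes at the rate-limit speed, in milliseconds."""
    return burst_bytes * 8 * 1000 / rate_bps


rate_bps = 20_000_000
normal_burst, extended_burst = 125_000, 500_000

# Overshoot allowed before the first drop: one twentieth of a second.
print(bytes_to_ms(normal_burst, rate_bps))                   # 50.0
# Room between the first drop and dropping everything: one RTT.
print(bytes_to_ms(extended_burst - normal_burst, rate_bps))  # 150.0
```

If the second number comes out smaller than your typical ping time, TCP will not have time to take the hint.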
If you don’t like these burst settings, and they are causing the flow to exceed the rate-limit too much or too often, I would suggest that it is NOT the burst sizes that you want to reduce, but the rate-limit speed itself. Knock it down a percent or two and see if that helps you out.
Oh, so you want to push the envelope, and want to have even lower settings? Well, in that case, you could use 24000 for the normal, and 399000 for the extended. The important thing is that you’ve still got one RTT (375000 bytes at 20Mbps) between the first packet drop and the complete packet drop. You may not get full speed at this level, but it will be smooth. The potential harm is that with only a 24000 burst, the router is looking too closely at the flow. In our 20Mbps example, 24000 bytes is only about one-hundredth of the per-second speed of the flow. When you measure the flow this tightly, the rate limit will end up being applied too often, even when the overall speed is not actually in excess of the rate-limit. For instance, a web browser that suddenly opens a bunch of connections as a page is loaded could (although it is statistically unlikely) cause a 24,000 byte burst as that handful of connections is opened and their first 2 or 3 packets are sent without waiting for acknowledgement. Use your judgment. If your judgment sucks, use bigger numbers for the burst, and smaller numbers for the speed.
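If you do pick your own normal burst, this sketch derives the matching extended burst and shows how few packets the normal burst really covers. The function names are made up, and the 1500-byte packet size is an assumption; the 150ms RTT and the 20Mbps rate are the example values from the text.

```python
def extended_for(normal_burst: int, rate_bps: int, rtt_ms: int = 150) -> int:
    """Extended burst in bytes: your chosen normal burst plus one RTT of traffic."""
    return normal_burst + (rate_bps // 8) * rtt_ms // 1000


def full_size_packets(burst_bytes: int, packet_bytes: int = 1500) -> int:
    """How many full-size packets fit in the burst (assumes 1500-byte packets)."""
    return burst_bytes // packet_bytes


print(extended_for(24_000, 20_000_000))  # 399000
print(full_size_packets(24_000))         # 16
```

Sixteen full-size packets is about what five or six fresh connections would produce if each sent three packets at once, which is why a burst this small can trip on an ordinary page load.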
An exception: These rules don’t apply on high speed links with hundreds of TCP (or other) connections. If the router starts to drop all packets, the randomness works in your favor. It is likely that every packet dropped is part of a different TCP connection than every other packet dropped. For this to apply, no single download through the link should expect to use more than 2% of the link capacity. In this case, the extended burst value can approach the normal burst value without impeding TCP performance.
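The 2% condition is easy to check with a couple of lines. This is only a sketch of that one test: the function name is made up, the example speeds are assumptions, and it does not check the “hundreds of connections” part, which you will have to judge for yourself.

```python
def extended_can_approach_normal(link_bps: int, largest_flow_bps: int) -> bool:
    """True if the biggest single flow uses no more than 2% of the link capacity."""
    return largest_flow_bps * 100 <= link_bps * 2


# Assumed examples: a 10Mbps download on a 1Gbps link, and a 5Mbps one on a 20Mbps link.
print(extended_can_approach_normal(1_000_000_000, 10_000_000))  # True
print(extended_can_approach_normal(20_000_000, 5_000_000))      # False
```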