Post a Comment On: Ben Strong's Blog

"Google and Microsoft Cheat on Slow-Start. Should You?"

64 Comments -

1 – 64 of 64
Comment deleted

This comment has been removed by the author.

November 26, 2010 at 9:42 AM

Blogger Dave said...

Any kernel-tuning tricks to play with slow start options for a linux-based web server?

November 26, 2010 at 10:57 AM

Blogger Ben Strong said...

@Dave - Check out this patch:

http://www.amailbox.org/mailarchive/linux-netdev/2010/5/26/6278007

November 26, 2010 at 11:05 AM

Blogger Dutchmaster said...

That's pretty funny. Makes you wonder how many hacks these companies have going on behind the scenes.

November 26, 2010 at 11:33 AM

Blogger angsuman said...

I ran the same test on our site from a different server over the internet and got the following result:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 55529 100 55529 0 0 16.2M 0 --:--:-- --:--:-- --:--:-- 50.4M

real 0m0.006s


That appears to be much faster than Google, and reproducible. Am I missing something here?

November 26, 2010 at 12:00 PM

Blogger Ben Strong said...

@angsuman - I should have been more clear about it, but the interesting part is that these tests were run over a home internet connection. My guess is that your test was from a server with a very short RTT to your site.

November 26, 2010 at 12:21 PM

Blogger Ben Smith said...

Angsuman: compare/contrast your network latency before comparing yourself to the google!

Google's ping times average somewhere around 30-60ms for most Internet connections in the USA. Your intranet will probably have 5-10 ms ping times since it's all local LAN in the same building. So you get another 20-50ms to respond before Google can even begin. And because of the short round trip, the "slow start" algorithm picks up the pace much more quickly, as it was designed to do.
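A rough back-of-the-envelope sketch of why the round-trip time dominates here (illustrative Python; the MSS and initial window are assumed values, and the model ignores the handshake, delayed ACKs, and the receiver window):

```python
def rounds_to_send(total_bytes, mss=1430, iw=3):
    """Slow-start rounds (RTTs) needed to deliver total_bytes,
    assuming no loss and cwnd doubling every round."""
    cwnd, sent, rounds = iw, 0, 0
    while sent < total_bytes:
        sent += cwnd * mss  # send a full window this round
        cwnd *= 2           # exponential growth during slow start
        rounds += 1
    return rounds

page = 55529  # bytes, matching the curl transfer size quoted above
for rtt_ms in (5, 45):  # LAN-ish intranet vs. a typical US home connection
    r = rounds_to_send(page)
    print(f"RTT {rtt_ms} ms: {r} rounds, roughly {r * rtt_ms} ms in round trips")
```

The round count is the same in both cases; only the per-round cost changes, which is why the same page looks near-instant from a nearby server.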

November 26, 2010 at 12:23 PM

Blogger John McLear said...

This may be useful

http://www.isi.edu/lsam/publications/phttp_tcp_interactions/node2.html

November 26, 2010 at 12:30 PM

Blogger Chris Riley said...

great article. thanks for posting.


I think on #1 you mean TCP? though.

November 26, 2010 at 12:36 PM

Blogger Adam Rosien said...

Serendipity: http://www.mnot.net/blog/2010/11/27/htracr

November 26, 2010 at 12:49 PM

Blogger Absolute-Z said...

Good post! Thanks!

November 26, 2010 at 1:13 PM

Blogger Adam said...

FWIW, this is somewhat anecdotal, but I've always noticed that large downloads from MS (e.g. service packs) will totally dominate all other traffic on my connection. Everything else will slow to a crawl rather than sharing somewhat nicely as most other connections seem to. That would seem to make sense as that is partly what the slow start is intended to prevent.

November 26, 2010 at 1:21 PM

Blogger Jim said...

Not implementing an optional variant is not cheating.

November 26, 2010 at 1:29 PM

Blogger Simon said...

I'm not sure if this would apply to Google's case, but Linux caches the cwnd per src/dst flow in the route cache (see "ip route show cache"), and so the cwnd for a new connection can be inflated beyond the "initial cwnd", assuming the src:dst pair is in the route cache. This may affect your testing and explain the results you are seeing.

November 26, 2010 at 1:37 PM

Blogger Zygo said...

I think Google's net neutrality position is that people should be free to do things like this at the network edges, without having to worry about unexpected interactions with some traffic mangling system in between.

In other words, it's OK for Google to play with their own TCP flows (where they're an end point), but it's not OK for Comcast to play with Google's TCP flows as they go by. This sort of implementation variation is the only way that we can get any kind of forward progress on protocols used on live networks.

3*MSS is a number back from the days when an Ethernet packet was a non-trivial chunk of host device driver memory and could take seconds to transmit over slow links. These days, network interface hardware drivers don't want to bother with host CPU interrupt service overhead until there's a few dozen packets sitting in the receive buffer, because the precious microseconds required to process data one packet at a time are a cost too expensive to bear. DSL and cable modems can buffer hundreds of packets (even though they shouldn't, and dammit, almost all of them do now).

If you intend to have users who are still using ancient network hardware, you might want to consider the effects of this kind of tuning carefully. If I throw a naive TBF QoS filter (simulating the behavior of my 1990s-era cable modem: no bursting or queueing, it simply drops packets that arrive too quickly) between a user and Google, I notice that Google's pages load extremely slowly compared to a standards-compliant web server. Sometimes they don't load at all, and every time I open a Google page, any other traffic through the filter stalls. No problems if the TBF allows bursting or queueing, though.
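The effect of that kind of naive filter on a large initial burst can be sketched in a few lines (hypothetical numbers; a real TBF also refills tokens over time, this only models one back-to-back burst):

```python
def tbf_drops(burst_segments, bucket_size):
    """Naive token-bucket filter with no queue: any segment arriving
    while the bucket is empty is dropped on the spot."""
    tokens, dropped = bucket_size, 0
    for _ in range(burst_segments):
        if tokens > 0:
            tokens -= 1   # segment passes and consumes a token
        else:
            dropped += 1  # no token, no queue: segment is lost
    return dropped

# A filter sized like old modem hardware, with room for 4 back-to-back segments:
print(tbf_drops(3, 4))   # standards-compliant IW=3 burst: 0 drops
print(tbf_drops(10, 4))  # IW=10 burst: 6 segments lost before the first ACK
```

Those early losses are exactly what knocks the sender into recovery and makes the page crawl.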

November 26, 2010 at 1:46 PM

Blogger exploderator said...

In Google's case it's a reliably small page. All 8K of that data is going to have to get transferred sooner or later. As long as an 8K burst isn't breaking buffers along the pipes, they are saving several rounds of back-and-forth protocol overhead by skipping the whole slow start. That means not only does their page load faster, but it is also less costly to everyone involved, in the long run.

As for Microsoft, I go with bug.

November 26, 2010 at 1:54 PM

Blogger Dan Jacobs said...

The limiting protocol is HTTP, you say? Then it's no wonder Google is working on their own replacement for the HTTP protocol. Perhaps this gives us some insight as to why they may be doing this...

November 26, 2010 at 1:56 PM

Blogger Ben Strong said...

@Chris - I actually did mean http, but should have provided some more explanation. The main reason that people are doing this is to work around the fact that browsers create large numbers of short-lived http connections, a problem that SPDY is designed to alleviate. At that point, slow-start will be less of an issue.

@Simon - Interesting. Something to look into for sure. Had I known how much interest this post was going to receive, I would have done a lot more research up front.

November 26, 2010 at 1:56 PM

Blogger Sumit said...

Nice observation, Ben.
The other cases do not raise as much of an issue as Microsoft's. It would be interesting to see how Microsoft responds.

November 26, 2010 at 2:39 PM

Blogger David Rodecker said...

Great discovery.

Curious how long these sites have been doing this... is this perhaps the result of hiring the inventor of TCP/IP?


I wouldn't necessarily say that they "cheat", but that they've optimized for their page sizes and users. There are many cases where the RFC standard isn't optimal in specific cases. Nevertheless, I'd sure be happy if our webserver could take advantage of this easter-egg.

November 26, 2010 at 2:55 PM

Blogger plugsukah said...

Hm, interesting... could make a tangible difference in high latency situations like most wireless connections.

November 26, 2010 at 2:56 PM

Blogger parena said...

As Jim stated: Not implementing an optional variant is not cheating.

November 26, 2010 at 3:00 PM

Blogger Jacob Taylor said...

Great article Ben. Thanks for taking the time to track it down and share it with us.

November 26, 2010 at 3:20 PM

Blogger Justin Grant said...

It's very likely that all the web sites in question have an application delivery controller appliance (such as an F5 Big-IP), terminating all traffic and load balancing it to a server farm.

Given this, you can use two differently tweaked TCP stacks, one WAN side and one LAN side. On the WAN side you would tweak the slow start algorithm, so that short HTML pages could be served within 1-2 windows. Latency (not bandwidth) is the real killer of web applications, so serving a page with minimal round-trips is key.

On the LAN side, you can pretty much abandon slow start. You have <1ms latency, 1Gb or more of bandwidth, and no packet loss. A correctly configured application delivery controller will aggressively grow the TCP window on the server side and cache/buffer the response for the WAN-based client.

I would agree that Microsoft have got it wrong; it seems they have applied a LAN-type TCP profile to the WAN client. This is bad in that it wastes bandwidth for large TCP sessions, where the client is likely to request retransmission for the bulk of the TCP segments (even if selective ACKs are enabled).

November 26, 2010 at 3:41 PM

Blogger Andy said...

I was interested in your results, and tried running from my linux box.

I get different results, but I notice in the original handshake, my box is advertising an initial window of 5840 (4 packets), while yours is 64K. So it appears the buffer window is being observed, but not the slow start algorithm.

I would think that this isn't a major problem any more. Slow start is about gaining confidence in the network infrastructure and to me looks aimed at a world at 9600 baud modems. It's been around for a lot longer than RFC 3390:
http://tools.ietf.org/html/rfc2001

I'm not sure what the stats would be, but I think it's got to be incredibly rare to encounter connections below 56 Kbps today, and I suspect 512 Kbps ADSL is now a bare minimum connection for people trying to access modern websites.

Google's approach of an IW of 10 seems a decent balance. 15K might take an appreciable amount of time to transfer over a modem but it shouldn't cause a problem, and could improve performance for the other 95% of the internet.
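Both halves of that trade-off can be sketched quickly (illustrative MSS, ideal no-loss slow start, no handshake or delayed-ACK modeling):

```python
MSS = 1460
PAGE = 15 * 1024  # a 15K page

# Serialization time over a 56 kbit/s modem: slow, but not pathological.
modem_seconds = PAGE * 8 / 56_000
print(f"{modem_seconds:.1f} s over a 56K modem")

def rounds(page_bytes, iw):
    """RTT rounds to deliver page_bytes with cwnd doubling each round."""
    cwnd, sent, n = iw, 0, 0
    while sent < page_bytes:
        sent += cwnd * MSS
        cwnd *= 2
        n += 1
    return n

print(rounds(PAGE, 3))   # 3 round trips with the RFC initial window
print(rounds(PAGE, 10))  # 2 round trips with IW=10
```

For broadband users the saved round trip is pure win; for the modem user the burst just arrives as fast as the link allows anyway.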

November 26, 2010 at 4:02 PM

Blogger GWBasic said...

Take a look at Nagle's algorithm: http://en.wikipedia.org/wiki/Nagle's_algorithm. If you want very high performance, make sure it's off, and make sure that you send everything to the socket at once so you don't have tiny packets floating around. (I.e., be careful with streaming/buffering APIs.) Also, make sure your library supports HTTP keep-alive.
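For example, in Python disabling Nagle is a one-liner on the socket (the same TCP_NODELAY option exists in essentially every language's socket API):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Disable Nagle: small writes go out immediately instead of being
# coalesced while an ACK is still outstanding.
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)  # True
s.close()
```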

November 26, 2010 at 4:28 PM

Blogger costan said...

Hey,

please remember that the Linux kernel (and probably other OSes) caches the observed performance of IP addresses it already "knows".

So it's quite likely that your IP address was already known from the perspective of the Google web server (or, to be more precise, the load balancer in front of the web server farm). It may belong to multiple users if NAT is in place at your home or ISP, or you may simply have already requested something from the very same server (load balancer), so it got to "know" you and the performance of the network in between.

If you issue an "ip route list cache" on a Linux box you can see what is already known in terms of MTU and MSS, but the kernel also keeps other values (like the window).


Ciao,
Andrea

November 26, 2010 at 6:30 PM

OpenID s9 said...

Um, RFC 3390 is an update to RFC 2581, which was obsoleted by RFC 5681. The maximum initial window specified by RFC 5681 is four.

November 26, 2010 at 6:44 PM

Blogger arvid said...

The slow start algorithm doubles the window size each RTT, until it hits congestion or ssthresh. The algorithm you're describing is the normal congestion avoidance regime the TCP connection is put into after slow-start (AIMD).

The purpose of slow start (whose name is slightly misleading) is to find an appropriate window size as fast as possible, and it's done by exponentially increasing the window size.

It makes perfect sense to start with a larger window than the RFCs specify, since connections today have a lot more bandwidth, and starting with 2 or 3 MSS is quite a pessimistic assumption about the available bandwidth.
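The two regimes side by side, as a toy trajectory (cwnd in segments; the ssthresh value is an arbitrary illustration):

```python
def cwnd_trajectory(iw=3, ssthresh=32, rtts=8):
    """cwnd per RTT: doubling below ssthresh (slow start), then
    +1 segment per RTT (congestion avoidance's additive increase)."""
    cwnd, out = iw, []
    for _ in range(rtts):
        out.append(cwnd)
        cwnd = min(cwnd * 2, ssthresh) if cwnd < ssthresh else cwnd + 1
    return out

print(cwnd_trajectory())  # [3, 6, 12, 24, 32, 33, 34, 35]
```

The exponential phase is over in a handful of RTTs; the slow part is the linear crawl afterwards, which is why the name is misleading.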

November 26, 2010 at 7:50 PM

Blogger heirofsalazar said...

"This begs the question: just how far will they go?"

You mean "raises the question". Begging the question is a type of logical fallacy, and using "begs the question" instead of "raises the question" is a quick way to display your ignorance.

November 26, 2010 at 10:10 PM

Blogger Ma said...

Slightly confused why it's a violation of the RFC. I mean, how is it a violation if it improves response time?

Ma Diga

November 26, 2010 at 11:31 PM

Blogger Garp said...

@Ma:

I'm not sure why you equate performance with RFC specifications. The RFC sets out the standard for communication. Ben clearly states the relevant RFC and explains why it's a violation (you shouldn't send more than 3 segments before waiting for an ACK).

The danger of violating RFCs is that device manufacturers and programmers rely on them to provide the standard to which they code. If you break RFC, you can't be sure how devices at the other end might handle it.
Microsoft is quite notorious for breaking RFCs, as are a number of major companies; for example Microsoft Exchange's version of SMTP is almost, but not quite, RFC compliant (though they may have fixed that now).
99.9% of the time that's not a problem, but the other 0.1% of the time it causes problems for non-Exchange mail servers, like those most service providers use.

Slightly Ugly Analogy Time:
It's like speed limits on roads: you can break them and get to your destination faster, but there are generally reasons why the limits are in place. They might be outdated, but they are still there, and everyone else on the road generally expects you to follow them and might not react well to your speeding.

November 27, 2010 at 1:29 AM

Blogger Bageshwar P Narain said...

Very informative post. Thanks.

November 27, 2010 at 4:58 AM

Blogger Michael said...

"This may well be common knowledge in web development circles"

Depends on the circle. :-). It's pretty well-known among the top websites' architects. Microsoft and Google have led the way on this, but Amazon, Yahoo, Facebook, etc. are aware of it, have had discussions with the Microsoft and Google engineers, and have run tests. Look up "SPDY" which is related. (Chrome even has a way to detect whether the site used SPDY from JavaScript.)

The W3C and IETF are having trouble keeping up with the pace of innovation. WebTimings, for example, is long-overdue, addressing something the top sites have been doing for 5+ years (to the degree that the data can be calculated in JavaScript) and only addresses half the problem (gathering timers) -- there's still the challenge of reliably uploading that performance data and tying it to the original page.

There are some really exciting optimizations going on, but the space is so competitive that few will talk openly about it. Keep tracing, view source, and watch the patent applications to get some sense of it. :-)

November 27, 2010 at 1:18 PM

Blogger Ben Strong said...

@Adam - Cool. I was thinking about writing a tool that would sniff the IW. Maybe I can add it to htracr.

@s9 - Not sure how you came to this conclusion. As best I can tell, RFC 5681 obsoletes 2581 and references 3390 as the authority on slow start. The IW algorithm is the same in 5681 as in 3390. An IW of 4 is only allowed if the segment size is no more than 1095 bytes, which it pretty much never is these days.
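For the curious, the IW formula shared by RFC 3390 and RFC 5681, and the 1095-byte threshold, fit in a couple of lines:

```python
def iw_bytes(mss):
    """Upper bound on the initial window per RFC 3390 / RFC 5681:
    min(4*MSS, max(2*MSS, 4380 bytes))."""
    return min(4 * mss, max(2 * mss, 4380))

print(iw_bytes(1460) // 1460)  # 3 segments at a typical MSS of 1460
print(iw_bytes(1095) // 1095)  # 4 segments once MSS <= 1095
print(iw_bytes(1096) // 1096)  # back to 3 segments just above the threshold
```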

@Michael - That is very interesting info on what's going on.

November 27, 2010 at 9:26 PM

Blogger Michael said...

Thanks, yeah it's a fun time to be alive. I love my job! Xbox performance was hard and fun, but in many ways website latency is even harder.

Anyone here who finds this stuff exciting should come help us make websites faster. My employer (Amazon) is hiring, and so are all the other companies I mentioned.

Some additional resources on SPDY:
http://www.chromium.org/spdy/spdy-whitepaper
http://en.oreilly.com/velocityfall09/public/schedule/detail/11477
and Velocity has many good resources about website latency more generally.

November 28, 2010 at 1:24 AM

Blogger gpshead said...

This work is not being done in the dark. The standard is up to be changed based on the data Google has gathered: http://www.google.com/research/pubs/pub36640.html

November 28, 2010 at 3:04 PM

Blogger Sean said...

If you have not read the Vegas TCP stack spec, I highly suggest it, as it deals with some of the assumptions you folks are making.

These patches to let the application control TCP seem unwise, as folks normally call that protocol UDP... :)

November 29, 2010 at 3:02 PM

Comment deleted

This comment has been removed by the author.

November 29, 2010 at 3:04 PM

Blogger Kris Hofmans said...

It will surely take you a long long time before you can release that app if you stop and look at everything on the way :)

November 30, 2010 at 4:36 AM

Blogger jeffery said...

Seems that they want to make it a standard.....

http://tools.ietf.org/html/draft-ietf-tcpm-initcwnd-00

November 30, 2010 at 3:50 PM

Blogger Glowing Face Man said...

I've always been thinking about how inefficient HTTP is. The standard LAMP setup is horribly wasteful: you wait for the entire http request before apache even considers the possibility of sending the first byte of a response... even though, with the uniformity of most sites, you could send half the damn html before even reading which url they're requesting...

December 4, 2010 at 12:28 AM

OpenID Sheldon Hearn said...

@Glowing Face Man

Huh? Do you have a defined minimum set of headers that you can safely wait for before starting to reply, without waiting for the rest?

I'd be very surprised to hear of such a set.

December 14, 2010 at 2:24 AM

Blogger Daryl said...

As an FYI for this article, there are tcp/ip stack implementation tweaks you can play with for both Windows and Linux to get similar behavior.

In Win 2k8 (or 2k3 with sp3 & KB949316) you can change to the CTCP TCP congestion provider via 'netsh interface tcp set global congestionprovider=ctcp'

In Linux since 2.6.19, there has been a sysctl setting 'net.ipv4.tcp_congestion_control' that lets you choose the TCP congestion control algorithm (e.g., cubic or htcp). By default it's 'cubic', afaik.

More info here: (Windows server)
http://smallvoid.com/article/winnt-ctcp-support.html

and Here: (Linux)
http://fasterdata.es.net/fasterdata/host-tuning/linux/

February 22, 2011 at 1:35 PM

Blogger Matt said...

Looks like IW10 is now default on Linux

March 17, 2011 at 1:21 AM

Blogger McNate said...

On Linux, the following command should set the server's IW on the default route to 10*MSS, according to the man page for "ip".

ip route change default via 192.168.1.1 initcwnd 10

Similar for local networks:

ip route change 192.168.1.0/24 dev wlan0 proto kernel scope link src 192.168.1.42 metric 2 initcwnd 10

I've tested this using Apache on Linux as the server, Google Chrome on Windows 7 as the client, and a 12KiB text file as the requested page.

With the default setting for initcwnd, slow start looks normal.

With initcwnd set to 10, the entire 12KiB is sent in one burst.

March 17, 2011 at 8:34 PM

Blogger neilhendry said...

Page loading speed is critical in achieving an overall quality page score / page rank. Websites will be judged on desktop loading speeds, but even more so on mobile loading speeds, given that around 40% of searches are now made on mobile devices.
Neil

February 8, 2012 at 11:20 AM

Blogger Joe Dev said...

There is no violation. You missed the part of the standard that says your quoted equation is optional: it expressly states that "a TCP MAY start with a larger initial window".

February 25, 2012 at 12:05 PM

Blogger Scumola said...

I doubt that you were hitting Google at all. DNS was probably cached by your OS, your local DNS server, or your ISP's DNS server. The Google homepage was probably in a transparent cache/proxy at your ISP or in a CDN like Akamai. I seriously doubt that you were hitting Google directly for a static page like the Google homepage. A better test would be to hit a dynamic site with "no-cache" headers set, from AWS or some other site that doesn't cache web traffic. I'll bet that what you detected was the absence of slow start at your ISP's proxy or Akamai, not at Google.

February 25, 2012 at 6:17 PM

Blogger Bernard Mckeever said...

Looks like this post from Google confirms your suspicions :) http://googlecode.blogspot.ie/2012/01/lets-make-tcp-faster.html

January 23, 2013 at 4:54 PM

Comment deleted

This comment has been removed by the author.

May 26, 2013 at 9:39 PM

Comment deleted

This comment has been removed by the author.

August 9, 2013 at 4:13 AM

Blogger cRonEk said...

I guess you can refer to it as cheating but not a surprise either way. Great post!

October 21, 2013 at 6:02 PM

Blogger cRonEk said...

Great post with examples and proof to backup your claims. Google is a dictator sad to say.

October 29, 2013 at 9:43 AM

Comment deleted

This comment has been removed by the author.

August 12, 2014 at 8:48 PM
