Bandwidth – reloaded


I’ve posted about bandwidth before, and I knew from the beginning that it’s one of those topics you can never exhaust. As it happens, I recently came across another interesting aspect that’s worth sharing: cross-browser delivery. In brief, it means delivering to each browser only the content it can “understand” and, more to the point, display.

Sure, it’s easy to deliver the same content to each and every browser – it means you don’t have to do any browser “targeting”, and as such your code is easier to write and maintain. It does mean, however, that quite often you are pumping out a lot of content that browsers will have to ignore because they cannot display it. For instance, some of you might not know this, but there are a few text-only browsers in the Unix world (mainly lynx and elinks) designed to work in a text console – they have no capability of displaying images, Flash movies, QuickTime etc., as the container they get launched in typically offers nothing but text (think old-style green terminals!). You’d think there is no point in anyone using these browsers, and in most cases you’d be correct, but imagine this: you’ve ssh’d into a server and figured out you need to download a patch for some component. You have two options:

  1. browse the net on your desktop using a “proper” browser, download the patch locally and then upload it onto the server to apply it
  2. use the likes of lynx on the server directly, google and find the patch, download it right onto your server and apply it

The latter is surely not the most user-friendly experience, but it has the advantage of typically being fast: if your server is in a data centre it will quite likely have a fast connection, and being a server it is more than likely more powerful than your desktop machine, so the download will probably complete in an instant. The former, on the other hand, guarantees a slower initial download onto your desktop, as it goes through your office internet connection (I don’t know of many companies that have a gigabit connection from their office!), and on top of that you then take the hit again uploading the file to the server through the same (slow) office connection. And if you have to jump through another machine to reach your server, you’ve doubled the hassle for the sole purpose of avoiding a text-based browser!
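
Server side, a crude User-Agent check is usually enough to spot these text-mode browsers and skip the assets they will never render. Here’s a minimal sketch in Python – the token list, the `render_page()` helper and its markup are my own illustration, not any particular framework’s API:

```python
# A minimal sketch: spot text-mode browsers from the User-Agent and serve
# them a stripped-down page. Token list and markup are illustrative only.
TEXT_ONLY_TOKENS = ("lynx", "elinks", "w3m")  # w3m is my addition; the
                                              # post mentions lynx/elinks

def is_text_only(user_agent: str) -> bool:
    """Crude check for browsers that cannot render images, Flash etc."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in TEXT_ONLY_TOKENS)

def render_page(user_agent: str) -> str:
    if is_text_only(user_agent):
        # No <script>, <img> or <object> blocks: the browser would only
        # download them to throw them away.
        return "<html><body><a href='/patch'>Download patch</a></body></html>"
    return ("<html><body><script src='/ads.js'></script>"
            "<img src='/banner.png' alt='ad'></body></html>")

if __name__ == "__main__":
    print(is_text_only("Lynx/2.8.9rel.1 libwww-FM/2.14"))        # True
    print(is_text_only("Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"))  # False
```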

Don’t get me wrong, I don’t think anyone in their right mind would use one of these browsers all the time – but they do exist and are occasionally used, so you’ll probably find you get the occasional hit from them. And chances are that each time you reply with the same JavaScript block (which gets downloaded but never executed), Flash content, image etc., only for the browser to ignore it. Based on previous observations I’d say these browsers account for roughly 0.5 to 1% of hits, with variations of course. That’s not a big percentage, but it can add up to big numbers depending on the size of the site your solution is running on: on a site with an average of 10,000 page views a day (by no means a big site – 300k page views a month isn’t a big number at all!), about 100 requests a day will come from these sorts of browsers. With an average response of, say, 5 KB (though I’m guessing the reality is higher than that – with graphics etc. it will probably sum up to about 30 KB or so), you are wasting 0.5 MB a day on this site alone. Doesn’t sound like much, but if you’re dealing with, say, 20 sites (though again, in reality, if you’re an average advertising solution you’re more likely to deal with at least 100 of them), that means 10 MB a day wasted on these browsers alone.
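
A quick back-of-the-envelope check of those figures – every input below is an assumption from the paragraph above, not a measurement:

```python
# Back-of-the-envelope check of the text-browser waste figures above.
page_views_per_day = 10_000
text_browser_share = 0.01   # ~1% of hits from text-only browsers
avg_response_kb = 5         # conservative; nearer 30 KB with graphics
sites = 20

per_site_mb = page_views_per_day * text_browser_share * avg_response_kb / 1024
print(f"per site:  {per_site_mb:.2f} MB/day")           # ~0.49 MB
print(f"all sites: {per_site_mb * sites:.1f} MB/day")   # ~9.8 MB
```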

I might have mentioned the next issue before, but it’s worth reiterating: web spiders! (This only applies if you’re a company providing online services for sites, not if you’re a website yourself, in which case spider traffic is useful for SEO reasons.) Ask yourself: do you really need to return any content to web spiders? Does it bring you any benefit? Would simply returning HTTP 200 (OK) with a zero-length body not suffice? If the response you provide to the spiders doesn’t benefit the site (in terms of search ranking etc.), then returning that content doesn’t benefit you in any way either! And bearing in mind that a typical spidering happens about once a week but covers the whole website, you might be saving yourself some bandwidth. If we take the same average of 5 KB per response and a site with about 500 pages (again, I’m being very conservative here!), each visit sees you waste around 2.5 MB per spider per week. Counting just the four major spiders (Google, Bing/Microsoft, Yahoo and Ask), that’s about 10 MB a week per site. Times 20 sites and you’ve got roughly 200 MB a week wasted – around 800 MB a month. Add the 300 MB a month from the text-only browsers above and you’re throwing away over a gigabyte a month. And if you’re paying for outgoing traffic from your data centre, you’ll probably find you’re wasting some money as well – not to mention the hidden cost of your servers actually processing the requests, entries probably being stored in the database, and so on.
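
In code, the short-circuit is trivial. A sketch assuming a plain WSGI stack – the bot token list is indicative rather than exhaustive, and matching well-behaved crawlers by User-Agent is as deep as this needs to go:

```python
# A sketch of short-circuiting crawler requests with an empty 200 OK.
# The token list is indicative only; real deployments match more bots.
from wsgiref.simple_server import make_server

KNOWN_BOTS = ("googlebot", "bingbot", "msnbot", "slurp", "teoma")

def app(environ, start_response):
    ua = environ.get("HTTP_USER_AGENT", "").lower()
    if any(bot in ua for bot in KNOWN_BOTS):
        # Crawler: acknowledge with a zero-length body. Nothing we could
        # return helps the site's ranking, so don't generate or send it.
        start_response("200 OK", [("Content-Length", "0")])
        return [b""]
    body = b"<html><body>...full response...</body></html>"
    start_response("200 OK", [("Content-Type", "text/html"),
                              ("Content-Length", str(len(body)))])
    return [body]

if __name__ == "__main__":
    make_server("", 8000, app).serve_forever()
```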

Another thing to take into consideration is Flash support. With the demand for user interactivity in today’s advertising space, it’s pretty rare to find non-Flash ads on websites. If your solution is involved in this advertising delivery chain, then you have to start looking at whether the content you are pushing is compatible with the user’s browser – in this case, does it support Flash, and does it support the right version of Flash? Normally you can’t determine Flash support from the browser headers, so in most cases you would employ some JavaScript – typically the swfobject library – to find out, then issue a second hit requesting the right creative for what the browser supports; that is the standard mechanism for delivering Flash. But what about mobile phone traffic? You can get a pretty good idea of the type of handset from the user agent alone, and most handsets don’t support Flash; even though they do support JavaScript, it makes no sense to use the same mechanism when you know upfront that Flash isn’t supported. Why not just deliver the non-Flash creative (i.e. a static graphic and a link) straight away? With the average Flash skyscraper creative at around 30+ KB and an image of the same dimensions at around 15 KB (half the size), you’d be halving your bandwidth consumption for mobile traffic! With some sites getting up to 20% of their traffic from mobiles, taking the earlier example again you’re looking at 2,000 requests a day being mobile traffic. Based on the above figures, that means these requests would otherwise waste 2,000 × 15 KB = 30,000 KB (30 MB) a day – close to another gigabyte a month.
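
Server side, the decision is a one-liner once you trust the User-Agent. A hedged sketch – the token list is rough and the creative paths are hypothetical placeholders:

```python
# Pick the creative up front when the User-Agent already rules Flash out.
# Token list is rough; the paths are hypothetical placeholders.
MOBILE_TOKENS = ("iphone", "ipod", "android", "blackberry", "symbian", "nokia")

def pick_creative(user_agent: str) -> str:
    ua = (user_agent or "").lower()
    if any(token in ua for token in MOBILE_TOKENS):
        return "/creatives/skyscraper.gif"  # ~15 KB static image + click URL
    return "/creatives/skyscraper.swf"      # ~30 KB Flash creative

if __name__ == "__main__":
    print(pick_creative("Mozilla/5.0 (iPhone; CPU iPhone OS 3_0 like Mac OS X)"))
    print(pick_creative("Mozilla/5.0 (Windows NT 6.1) Firefox/3.6"))
```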

Come to think of it, you might also want to check whether your solution actually needs to deliver to mobile handsets at all. If it doesn’t, simply blocking those requests and returning nothing saves you the other half as well – roughly another 900 MB a month! Oh, and by the way, that would be per site – multiply it by the 20 sites you’re working with and the figures start getting substantial!
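
That’s the same short-circuit pattern as the spider sketch, keyed on the mobile tokens instead – again an assumption-laden sketch, not production-grade device detection:

```python
# If there's nothing to deliver to handsets at all, answer them with an
# empty 200 OK, exactly like the spider short-circuit above.
MOBILE_TOKENS = ("iphone", "ipod", "android", "blackberry", "symbian", "nokia")

def app(environ, start_response):
    ua = environ.get("HTTP_USER_AGENT", "").lower()
    if any(token in ua for token in MOBILE_TOKENS):
        start_response("200 OK", [("Content-Length", "0")])  # zero bytes out
        return [b""]
    body = b"<html><body>...desktop response...</body></html>"
    start_response("200 OK", [("Content-Type", "text/html"),
                              ("Content-Length", str(len(body)))])
    return [body]
```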

This is unlikely to be the end of the bandwidth saga – there’s surely more to it than this – but I’ll take a break for now. Till the next time I write about it, watch your bytes! 😀