Few things are worse than a successful website that crashes every time it comes under load. Worse still, you invest in traffic, then pay for emergency services when your site goes down in the middle of a marketing campaign. How can your business website better handle success?
As your site gets more popular, business website capacity management becomes more important (especially if you're making any money with your site). Capacity management is a design process that identifies the upper limits at which your systems can handle requested transactions.
I'm periodically asked by clients, “Could my website handle 100,000 new visitors tomorrow, or over a short period?” A smart client will ask this before a marketing campaign or television coverage, but most often this kind of traffic arrives unexpectedly.
Imagine you just got picked up as a guest speaker at your customers' next big trade event. Or your product is being covered on a popular review site. These things aren't always planned, but they can be prepared for with the tests that come with capacity management.
Website capacity is a tricky question …
You can artificially create the traffic with a testing tool; however, if you're unsure how to do that safely, it's better not to try. Handling a high volume of traffic is more about design than about testing an existing site.
Can you do this yourself? Many webmasters and programmers want to attempt it. However, a lot can go wrong when testing a live site. Here are a few examples:
- Taking down other websites. If you are in a shared hosting or virtual private environment, your tests can be affected by others on the same box. Worse, you might take down their websites and face a lawsuit.
- Getting banned by your provider. Even proper testing looks like a denial of service attack. Every step needs to be coordinated with a willing service provider, or done in a controlled environment.
- Overloading the DNS server. Your requests may slow down name resolution for thousands of other users. An overloaded DNS server can take down a hundred thousand sites at a large provider.
- False positives from hosting caches. Many quality hosting providers cache front-end content to prevent failures under legitimate traffic, but this means you'll be testing their equipment, not your own.
- Upsetting bandwidth controls. In testing you could use up your allocated bandwidth and get your site shut down for the rest of the month. In other cases you'll be billed for the overage.
- Taking down your provider's infrastructure. Your box may be able to handle the traffic, but your Internet or hosting provider may not. Accidentally taking down their network will get a whole lot of people mad at you.
- Exceeding your hosting provider's allocation. Drive traffic up too much and you might get a big bill for running over their bandwidth allocation. Worse, on a non-bursting network you could bring down their connections.
This list could go on for days with the horror stories I've seen. Fortunately, most hosting providers will simply shut off your test and send you a bill; they do this to protect their network and other customers.
Testing a static site is different from testing a dynamic one. To know for sure on a dynamic site, you'll need to profile the database server and local disks too. At that volume you'll want access to the box AND the local network.
If you have several servers behind a load balancer, you'll test differently than you would a site hosted with a provider. If you are using a hosting provider, contact them; they may be able to tell you whether your plan can handle that kind of load.
Not understanding the many dynamics of testing could lead to false assumptions …
In some cases you'll be able to handle a short burst of 100k page impressions, but not 100k unique visitors, because of how your server does session handling. You may be allowed 100 concurrent sessions, each able to handle 1,000 requests, so page views won't be a problem, but unique sessions will be.
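The arithmetic behind that distinction can be sketched out. The numbers below are the illustrative figures from the paragraph above, not any real server's limits:

```python
# Illustrative figures only: a server allowing 100 concurrent sessions,
# each session able to serve 1,000 requests over the burst.
concurrent_sessions = 100
requests_per_session = 1000

# Page impressions the session pool can serve across the burst:
max_page_views = concurrent_sessions * requests_per_session
print(max_page_views)  # 100000 -- a 100k page-view burst fits

# But 100k *unique* visitors each demand their own session, and only
# 100 sessions can exist at once, so uniques are the bottleneck.
unique_visitors = 100_000
print(unique_visitors <= concurrent_sessions)  # False
```

The same total traffic passes or fails depending on whether it arrives as repeat views through existing sessions or as fresh visitors each opening a new one.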
Unfortunately, there is no way of knowing whether your provider will throttle anything beyond, say, 50 concurrent connections in a network device to protect their network if the burst is too short (or too sustained). A quality provider will have several layers of protection.
For your safety (and the most accurate results), if the site is making any money, get a professional involved who can help you load test without breaking the site. I've seen some very big sites killed (more than just a crashed server) by poorly implemented load testing.
Involving a professional protects you and your customers, and a real professional will work with your team as needed. Expect them to determine a “monetary figure for failure” before testing. Knowing how much a crash costs makes you take a more sensible approach to testing and lets you gauge the return on proposed solutions.
Trying to determine the upper-end capacity of your website on your own puts your job at risk, and you risk upsetting customers. Worse, the business will be mad if the site comes down hard during testing, or mad later when it comes down under real load.
I've seen good, well-meaning people fired on the spot when testing went wrong. You'll need a game plan before you start. Everyone must understand what factors influence accurate results; here's a brief outline:
- Where testing is conducted. You'll get different results testing from home, from the office, at the hosting provider, on a local network, or across the Internet.
- How to interpret results. Read your test results wrong and the site crashes under load, or you spend too much on unnecessary upgrades.
- Hardware configuration and resources. A load-balanced system tests differently than a loose cluster or a shared hosting environment.
- Baseline latency and bandwidth usage. Testing at any time of day brings unexpected variables. Testing off-peak doesn't account for other users, bandwidth availability, or baseline noise.
- Types of caching or fail-safes. Your proper planning may go south if an Internet provider shuts down or throttles your tests early; you may not even know you're being blocked. They can even black-hole your traffic, which can make a bad test look good.
- Sessions versus impressions. What you are testing depends on how your site is used. Application servers are tested differently than static websites or database-driven content management.
- Network thresholds along the path. Can your test even reach your server over the available connection between you and your target? Use a factor of page size, number of concurrent users, and threads to calculate network stress.
This is just a short list of factors that influence website capacity tests. Miss any one of them and you can get wildly different results, any of which can lead you to the wrong conclusion.
The method of testing you use matters too. Depending on your traffic volume, trends, and type of web server, you may be able to extrapolate capacity from server performance and log files. However, at 50k+ volume you have many more variables to consider.
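One way to start that extrapolation is to bucket your access-log timestamps and find the busiest second you've actually served. The log lines below are made up for illustration (the common Apache/Nginx format; real logs vary by server):

```python
import re
from collections import Counter

# Made-up lines in the common log format; only the timestamps matter here.
sample_log = [
    '10.0.0.1 - - [12/Mar/2008:09:15:01 -0500] "GET / HTTP/1.1" 200 5120',
    '10.0.0.2 - - [12/Mar/2008:09:15:01 -0500] "GET /about HTTP/1.1" 200 4096',
    '10.0.0.3 - - [12/Mar/2008:09:15:02 -0500] "GET / HTTP/1.1" 200 5120',
]

stamp = re.compile(r'\[([^\]]+)\]')  # pull the bracketed timestamp
hits = Counter(stamp.search(line).group(1) for line in sample_log)
peak_rps = max(hits.values())  # busiest one-second bucket observed
print(peak_rps)  # 2
```

Comparing that observed peak against the requests per second your server can sustain gives a first estimate of headroom; it is a starting point, not a substitute for a controlled test.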
With all that said, if your site is making any money, proper capacity management is insurance against losing money. Great websites have lost an estimated $40 million a year to systems crashes, losses that could have been avoided with a few thousand dollars a year in capacity management.
© 2008 B2B Website Profits, All rights reserved.
Justin Hitt and Hitt Publishing Direct have more than a decade of experience with capacity management for high-traffic business systems serving from a hundred thousand to two million unique visitors a day. If you rely on your business website for financial success, visit https://www.jwhco.com/