For those of you who cannot go to freeproxies because “proxy” is in the url, we’ve pointed freetunnels.com, .net, and .org to load this site as well. Enjoy!
Every now and then, someone asks me “Hey, how do I ‘tune’ my webserver to better run these proxies? I’ve heard tuning is really important but what does it mean?” or they’ll ask “why is it important to run a dedicated server just for a proxy site? Can’t I run my sites on (some cheaper hosting method)?”
Hopefully, once you’re done reading this article, you’ll better understand the answers to these common questions, and come away with a knowledge of how, specifically, to tune Apache to run J. Marshall’s CGIProxy script. You should also be able to take the concepts herein and apply them to performance-tuning Apache for any website. So without further ado, I give you:
The Freeproxies Guide to Tuning Apache To run CGI-Proxy
Optimization is a matter of using the right tool for the job, and of reducing dependence on whichever resource is your current bottleneck. I find that with cgi proxy, having plenty of ram is a necessity, and that tuning your apache maxclients, read timeout, keepalive timeout, requests per child, and max keepalive requests is essential to a properly functioning server. First of all, to edit these and all the other apache configuration settings, you need to edit your httpd.conf file (with the text editor of your choice). All of these modifications balance end-user perceived speed against ram and cpu usage.
The default settings of apache are pretty good for lightly trafficked servers to improve the response time the end user perceives, as well as supporting all the options and features a webmaster might want to use. This is at the expense of greater ram use, and occasionally, increased cpu usage. Therefore, using a more specialized apache configuration can get you a lot more performance for the task at hand.
The settings we’ll be going for, since cgi proxy tends to be ram limited (once you’ve installed mod_perl), are designed to balance a user’s perceived site loading speed against the need to minimize the number of client slots being used up. So the first thing we’ll mention is the “maxclients” parameter, since it’s the simplest one to understand and also the hardest one to just give you a good value for. Moses didn’t come down from on high to deliver the proper maxclients value, so let’s explore what goes into that decision.
Maxclients, in case you’re wondering, is the maximum number of people that can be connected to your webserver at one time. We put a limit on max clients because each connection takes a certain amount of cpu and ram to process, and if we run out of either, the site will load very slowly and the server may even crash. Maxclients therefore needs to be set so that you will run out of client slots before you run out of ram.
Each client slot is another copy of apache running on your server, and depending on what modules you’ve loaded into it, each will use up a different amount of ram. Your goal in general is to reduce the amount of per-client ram usage while still having the modules you need installed to run your web service properly. Since cgi-proxy uses perl, mod_perl is absolutely necessary to get any kind of reasonable performance level. mod_perl does throw an entire copy of perl into each apache process, so ram usage is quite high when you use mod_perl, especially with cgi proxy. However, you gain massive savings in cpu usage, so it isn’t worth considering going without mod_perl. The exact benefits and pitfalls of mod_perl are best left to another article.
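For reference, running a CGI script under mod_perl usually comes down to a short httpd.conf stanza like the following. This is a sketch for mod_perl 1.x’s Apache::Registry; the paths and directory names are placeholders, not values from any particular server.

```apache
# Load mod_perl (assumes it was built as a DSO; skip if compiled in statically)
LoadModule perl_module modules/mod_perl.so

# Run scripts in this directory under the persistent perl interpreter
Alias /perl/ /var/www/perl/
<Location /perl/>
    SetHandler perl-script
    PerlHandler Apache::Registry   # ModPerl::Registry under mod_perl 2.x
    PerlSendHeader On
    Options +ExecCGI
</Location>
```

With a stanza like this, each apache child keeps a compiled copy of the script in memory instead of forking a fresh perl interpreter on every hit, which is where the cpu savings come from.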
At this point, we can therefore say that your total ram needs are as follows:
Ram needs = Ram usage per client slot * number of slots needed + operating system overhead.
And, to find out the maxclients value you can support, you can use the following equation:
Maxclients = (Total system ram - Operating system overhead) / Ram usage per client slot.
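To make the formula concrete, here is a worked example. Every number below is an illustrative assumption, not a measurement from a real server:

```python
# Worked example of the maxclients equation above.
# All figures are assumptions for illustration only.
total_ram_mb = 2048      # a 2gb server
os_overhead_mb = 256     # operating system + other daemons
ram_per_slot_mb = 8      # one mod_perl-enabled apache child

maxclients = (total_ram_mb - os_overhead_mb) // ram_per_slot_mb
print(maxclients)  # 224
```

In practice you would measure the real per-child ram usage (e.g. with top or ps on your own server) rather than assume 8mb.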
So our goal here is to reduce usage per client slot, and reduce the total number of slots needed to serve a given amount of traffic. Reducing operating system overhead is outside of the scope of this article. Luckily there are a few things we can do with our apache configuration to reduce the number of client slots we’ll be using and to reduce the amount of ram each one takes up. In httpd.conf, here is what we’ll be editing:
Keepalive Timeout: This should be low on any busy site so that client slots free up for new users quickly. Keepalive timeout is the number of seconds a client slot will sit around unused, waiting for its client to make new requests, before it gives up and allows itself to be used by another client instead. The advantage to keepalives is that the same client doesn’t have to try to connect to your site over and over again to get multiple resources (images, web pages, etc). Performance will suffer greatly if you turn off keepalives entirely, but the default of 15 seconds is way, way too high for our usage. On my sites, I use a keepalive timeout of 1 second. This is long enough for a user downloading a single page to get all the page elements downloaded without having to continually reconnect to my server, but short enough to free up the slot quickly for other users.
Connect timeout: For the same reasons, the connection timeout (the Timeout directive in httpd.conf) should be low. The default of 300 seconds means a bad client can take up a client slot for 5 minutes without using it! Depending on my mood, I’ve set connect timeout between 5 and 15 seconds. You can start with a low number and raise it later if you notice users are getting timeout errors.
Requests per child: This should also be a low number. Other apache tuning guides recommend a number in the thousands or tens of thousands. That’s great when ram is not at a premium, or when clients tend not to use more ram as they serve more requests, as it means apache won’t spend much time killing and creating new apache instances. It is important to understand however, what effect this has on memory usage.
When a child apache process starts, most of the ram it takes up is actually shared with other programs (shared pages). As a process changes the data stored in these shared pages, it has to make a copy just for itself. As time goes on, a process will be using more and more real memory, and less shared memory. Therefore, to free up ram (at the cost of some cpu), we will eventually kill off an apache child and replace it with a new one. The more often you do this, the less ram you use, so for our purposes, we want a low requests / child process.
How low is low? For cgi proxy, anywhere from 50-150 works for me. Setting to 1000 or higher is a death wish for our proxy (but is fine, I’m told, for other kinds of websites). Keep in mind that a keepalive request does not count against this total requests / child, so you can actually get substantially more mileage out of each apache child process than this setting implies.
Max keepalive requests is a somewhat less important figure. This is the maximum number of items apache is willing to serve to a single client from a single connection. I have mine set to 100. Basically, you want to make sure that an entire pageview from a user can be served within a single http connection. This is so that the user doesn’t have to disconnect and renegotiate a new connection to your server in order to download everything needed for that pageview. Because the overhead in creating dozens of connections is so high, you will want to have keepalives turned on to 1 second instead of disabling keepalives altogether. The main reason you’d want a low keepalive requests value is to make sure no single user can monopolize a client slot and “own” it forever.
Minspare, Maxspare and Startservers don’t tend to matter as much as the other settings. Startservers should probably be pretty close to maxspare, but not more than maxspare. If startservers were higher than maxspare, then when you start apache it’s going to create all these new apache processes to satisfy startservers, and then immediately kill them to satisfy maxspare. I personally use a minspare of 20, maxspare of 100 and startservers of 100, but keep in mind these should be some reasonable proportion of your maxclients. The things to keep in mind are that you don’t want to waste ram by having too many idle clients sitting around, but you do want some extra ones around in case you get a burst of traffic. You also don’t want minspare and maxspare to be too close together, because then you’ll be constantly creating and killing apache processes, which eats up cpu time.
I know I said before I wouldn’t be giving you maxclients values handed down from a stone tablet, but you’ve got to start somewhere, so here’s what I use:
For cgi proxy, 200 is a reasonable maxclients value to test for on a server with 2gb ram. On my 8gb servers, I’m testing out 550 slots, but I figure I can probably go a bit higher. If you max your clients and you have more free ram, you can add more to maxclients. If you start using swap memory excessively (or even have a server crash as you run out of ram *and* swap), then you need to reduce your maxclients.
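Putting everything so far together, a starting httpd.conf fragment for a 2gb server might look like this. These are the starting values discussed above (Apache 1.3 / 2.0 prefork directive names), not gospel; adjust them based on what your own ram and cpu usage tell you:

```apache
# Tuned for cgi proxy under mod_perl -- starting values, not gospel
MaxClients           200   # reasonable for ~2gb ram; raise if ram stays free
KeepAlive            On
KeepAliveTimeout     1     # free up slots quickly
MaxKeepAliveRequests 100   # enough for one full pageview per connection
Timeout              10    # don't let dead clients hog a slot for 5 minutes
MaxRequestsPerChild  100   # recycle children often to reclaim ram
StartServers         100
MinSpareServers      20
MaxSpareServers      100
```

After changing any of these, restart apache and watch your memory and swap usage under real traffic before deciding the values are right.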
If you make use of ssl, or you have php or other large modules compiled into apache, then you will likely need fewer clients. For static page serving, or even php hosting, these maxclients values may seem abysmally low… and they are. But cgi proxy uses a lot of cpu, so if everything on your server is working properly, this number of clients should be enough to max out your cpu.
There’s a few other things we can do to performance tune our server to get maximum mileage out of it. We’ve already discussed how many client slots you *can* have with your given ram and configuration, and now we’re going to figure out how many we actually need. This can be found by the following equation:
Slots needed = (requests / second) * (seconds to process each request)
So if you want more requests / second (more traffic), you need to reduce the time apache spends serving each request, or you need to add more slots (more ram).
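To make the slots equation concrete, here is an illustrative calculation. The traffic and latency figures are made-up assumptions, chosen only to show how the numbers interact:

```python
# Worked example of the slots-needed equation above.
# Both inputs are assumed figures, not measurements.
requests_per_second = 40     # assumed peak traffic
seconds_per_request = 2.5    # assumed time to fetch, rewrite, and send a page

slots_needed = requests_per_second * seconds_per_request
print(slots_needed)  # 100.0
```

Notice that halving the time per request (say, by speeding up dns or adding squid) halves the slots you need for the same traffic, which is exactly why the optimizations below matter.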
Interestingly, the time it takes to process a request can be broken up into a few areas (a proxy adds a couple that a normal website wouldn’t have):
1) client connection handshaking
2) dns resolution (since we’re proxying content, we need to know the ip of the server the user is requesting content from)
3) downloading the target website from the internet
4) doing cgi-proxy related processing
5) sending the request result back to the user
As you can see, there are a lot of things that can conspire against you to use up valuable client slots.
Bad ping times or congested bandwidth between your server and the target website or client will make parts 1), 3) and 5) go slower, increasing seconds per request and requiring more client slots to fulfill the same number of requests. You may think this is not under your control, but actually, you can reduce the time apache spends on all three parts by setting up Squid to run as a reverse proxy (also known as web server acceleration mode). Doing this makes squid responsible for “spoon feeding” the cgi proxy results to your clients, so that apache doesn’t have to. Because squid uses less ram per connection than apache, our expensive apache client slots can go ahead and work on more important matters than dealing with client connections. Setting up squid is outside the scope of this article, but it is something worth investigating.
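Just to give a flavor of what that setup looks like, here is a minimal squid.conf sketch for accelerator mode, using Squid 2.6-style syntax. The hostname and ports are placeholders, and a real deployment needs proper ACLs and cache tuning beyond this:

```
# Squid answers on port 80 and hands requests to apache on 8080
http_port 80 accel defaultsite=www.example.com
cache_peer 127.0.0.1 parent 8080 0 no-query originserver name=apache

# Only accelerate our own site (placeholder domain)
acl our_site dstdomain www.example.com
http_access allow our_site
cache_peer_access apache allow our_site
```

With this in place, apache binds to 8080 instead of 80 and only ever talks to squid on localhost, so slow remote clients tie up cheap squid connections rather than expensive mod_perl children.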
A maxed cpu will make part 4) go slower, so if you’re maxing out your cpu, you will end up using more slots as each request takes longer to process. By having more slots you’ll be processing more requests, and each of those will go slower still. If you run out of real memory and start using the swapfile, the amount of time taken for cgi-proxy processing is going to go through the roof, exacerbating this problem. If you don’t have a reasonable upper bound on the number of requests your server is willing to process, this can cause your server to max out its ram *and* swap, and then crash.
Part 2) is less obvious, but if your dns resolvers are slow, or the first one in your list is down, every dns request will have to time out before your server can ask the next server in the list to fulfill it. This will substantially increase the time needed to serve every page. You will therefore get fewer pageviews as your users get impatient and leave, and your client slots will easily max out, as it is taking a lot longer to serve each page request.
Your goal therefore, is to resolve dns names as quickly as possible. Your isp or datacenter should have provided the ip addresses of their own resolvers that you can use. You can edit your resolvers in /etc/resolv.conf in most linux installations. If you have Bind installed on your server, you can make the first resolver in the list 127.0.0.1, which has your server resolve hostnames locally. This can speed things up dramatically. You will still need other resolvers in the list, because if you can’t resolve a name locally, bind will go ask another server to do it for you, and then cache the result for later use.
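For example, with a local bind cache plus your provider’s resolvers as fallback, /etc/resolv.conf would look something like this (the fallback addresses are placeholders for whatever your isp or datacenter gave you):

```
nameserver 127.0.0.1     # local bind cache, tried first
nameserver 192.0.2.10    # provider resolver (placeholder address)
nameserver 192.0.2.11    # provider resolver (placeholder address)
```

The local cache answers repeat lookups almost instantly, and the extra entries keep resolution working if bind is down or cannot answer.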
Finally, to save on ram, you should disable as many modules in apache as you can, while still having your proxy work correctly. The added modules and extensions for apache can be edited in httpd.conf. For cgi proxy, this means you definitely don’t want php compiled into the same server you run mod_perl on, if at all possible. You also should turn off the ssl related modules unless you actually use them, as these are major ram hogs as well. The rest of the common modules you can do without are fairly small, but in total, removing them can get you that last bit of power you want out of your server.
In httpd.conf, go down the list of modules and google for them. For most modules, the first google result should be the apache documentation for that module. After reading about the module, if you decide you can go without it, comment out the line, restart apache, and make sure your service still works as expected. Go through that for every module in the list, and you can improve your memory usage significantly.
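In practice, trimming modules just means commenting out LoadModule lines in httpd.conf. The module and file names below are examples for a DSO-based build; the exact list on your server will differ:

```apache
# Commented out: not needed for a bare mod_perl proxy server
#LoadModule php5_module      modules/libphp5.so
#LoadModule ssl_module       modules/mod_ssl.so
#LoadModule autoindex_module modules/mod_autoindex.so

# Kept: required for the proxy to work
LoadModule perl_module       modules/mod_perl.so
```

Comment out one module at a time, restart apache, and verify the proxy still works before moving on to the next, so a breakage is easy to trace back.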
WHEW! That was a lot to take in all at once wasn’t it? Congratulations on making it this far.
All these modifications make for a fairly good working environment for cgi proxy, but a pretty poor working environment for a typical webserver, which is why proxies should have their own dedicated servers tuned for their specific use. Shared hosting is an especially bad place for a proxy because a shared host needs to have all kinds of modules installed to satisfy the diverse needs of its customer base, as well as provide good management and reporting tools for the hosting company. Most hosting providers do not or will not install mod_perl for ram usage and security reasons. Without mod_perl, even the smallest web proxy can bring a server to its knees.
This article only scratches the surface as far as the different things you can do to optimize your server and get the most out of your limited computing resources. There are other, more advanced optimizations you could do, involving setting up complicated caching architectures, or significantly modifying the proxy code to either do less stuff, or do the same stuff faster.
If you’ve just got done eagerly implementing all the advice in this article and are thirsting for more great ideas, there is a book I would recommend: “Scalable Internet Architectures” by Theo Schlossnagle. I just finished reading it, and it is very informative about everything you would need to know (conceptually, at least) for building scalable internet architectures. In particular, the book goes into detail on using squid to improve serving performance, setting up failsafe hosting environments, and designing applications that scale. Anyone who thinks their web application may grow to use more than one server can benefit from the concepts detailed in this book.