Post by francis picabia
Post by Bob Proulx
Post by francis picabia
One of the most frustrating problems which can happen in apache is to
server reached MaxClients setting
Why is it frustrating?
Yes, maybe you don't know this condition.
I am familiar with the condition. But for me it is protection. I am
not maintaining anything interesting enough to be slashdotted. But
some of the sites do get attacked by botnets.
Post by francis picabia
Suppose you have hundreds of users who might decide to dabble in
some php and not know much more than their textbook examples? In
this case, part of the high connection rate is merely code running
on the server. It comes from the server's IP,
That would definitely be a lot of users to hit this limit all at one
time only from internal use. I could imagine a teacher assigning
homework all due at midnight that would cause a concentration of
effect though.
Post by francis picabia
so no, a firewall rate limit won't help. It is particularly annoying when
this happens after hours and we need to understand the situation
posthumously.
You would need to have a way to examine the log files and gain
knowledge from them. Lots of connections in a short period of time
would be the way to find that.
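As a rough illustration of the log examination described above, here is a minimal Python sketch that counts requests per minute so that bursts stand out. It assumes the access log uses the Apache common or combined format, with timestamps like [10/Oct/2011:13:55:36 -0700]. The function names are made up for this example and are not part of any Apache tool.

```python
import re
from collections import Counter

# Timestamp as written in Apache common/combined log format,
# e.g. [10/Oct/2011:13:55:36 -0700]. The capture group keeps
# everything up to the minute and drops seconds and time zone.
TIMESTAMP_RE = re.compile(
    r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [+-]\d{4}\]'
)

def requests_per_minute(lines):
    """Return a Counter mapping 'dd/Mon/yyyy:HH:MM' to request count."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP_RE.search(line)
        if match:  # lines without a recognizable timestamp are skipped
            counts[match.group(1)] += 1
    return counts

def busiest_minutes(lines, top=5):
    """Return the `top` busiest minutes as (minute, hits) pairs."""
    return requests_per_minute(lines).most_common(top)
```

Any iterable of lines works, so an open file handle on the access log (on Debian typically /var/log/apache2/access.log, though that depends on the local configuration) can be passed directly to busiest_minutes(). A minute with far more hits than its neighbors is a reasonable place to start reading the raw log.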
The default configuration is designed to be useful to the majority of
users. That is why it is a default. But especially if you are a
large site and need industrial strength configuration then you will
probably need to set some industrial strength configuration.
Post by francis picabia
Post by Bob Proulx
Is that an error? Or is that protection against a denial of service
attack? I think it is protection.
It does protect the OS, but it doesn't protect apache. Apache stops
taking new connections, and it is just as good as if the system had
burned to the ground in terms of what the outside world sees.
First, I don't quite follow you. If Apache has been limited because
there aren't enough system resources (such as ram) for it to run a
zillion processes, then if it were to try to run a zillion processes
it would hurt everything. By hurt I mean either swapping and
thrashing or having the out of memory killer invoked or other bad
things. That isn't good for anyone.
Secondly Apache continues to service connections. It isn't stuck and
it isn't "burned to the ground". So that is incorrect. It just won't
spawn any new service processes. The existing processes will continue
to process queued requests from the network. They will run at machine
speed to handle the incoming requests as fast as they can. Most web
browsers will simply see slower responses. If the server can't respond
before the browser's timeout then of course the browser will
time out. That is the classic "slashdot" effect. But that isn't to
say that Apache stops responding. A large number of clients will get
serviced.
There isn't any magic pixie dust to sprinkle on. The machine can only
do so much. At some point eBay had to buy a second server. :-)
Post by francis picabia
MaxClients isn't much of a feature from the standpoint of running a service
as the primary purpose of the server. When the maximum is reached,
apache does nothing to get rid of the problem. It can just stew there
not resolving any hits for the next few hours. It is not as useful say
as the OOM killer in the Linux kernel.
I read your words but I completely disagree with them. The apache
server doesn't take a nap for a few hours. Of course if you are
slashdotted then the wave will continue for many hours or days but
that is simply the continuing hits from the masses. But if you have
already max'd out your server then your server is max'd. Unlike
people the machine is good at math and knows that giving 110% is just
psychological but not actually possible. And as far as the OOM killer
goes, well, it is never a good thing.
Post by francis picabia
Post by Bob Proulx
The default for Debian's apache2 configuration is MaxClients 150.
That is fine for many systems but way too high for many light weight
virtual servers for example. Every Apache process consumes memory.
The amount of memory will depend upon your configuration (whether mod
php or other modules are installed) but values between 20M and 50M are
typical. On the low end of 20M per process hitting 150 clients means
use of 3000M (that is roughly three gig) of memory. If you only had a 512M ram
server instance then this would be a serious VM thrash, would slow
your server to a crawl, and would generally be very painful. The
default MaxClients 150 is probably suitable for any system with 1.5G
or more of memory. On a 4G machine the default should certainly be
fine. On a busier system you would need additional performance
tuning.
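The sizing arithmetic in the quoted paragraph can be written out as a small Python sketch. It is a worst-case estimate only: it assumes every process really uses the full per-process figure, while in practice processes share some memory pages. The function names and the reserved_mb parameter are invented for this illustration.

```python
def worst_case_memory_mb(max_clients, per_process_mb):
    """Memory needed if every allowed process is running at once."""
    return max_clients * per_process_mb

def max_clients_for(ram_mb, per_process_mb, reserved_mb=0):
    """Largest MaxClients that fits in RAM under the worst-case estimate.

    reserved_mb is a hypothetical allowance for the OS and other
    services. It is an assumption of this sketch, not an Apache setting.
    """
    usable_mb = ram_mb - reserved_mb
    if usable_mb <= 0 or per_process_mb <= 0:
        return 0
    return usable_mb // per_process_mb

# The figures used in the post: 150 clients at 20M each.
print(worst_case_memory_mb(150, 20))   # 3000
# A 512M virtual server at 20M per process.
print(max_clients_for(512, 20))        # 25
```

By this estimate a 512M instance supports about 25 processes at 20M each before it starts to swap, which is why the default of 150 is too high for small virtual servers.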
Ours was at 256, already tuned. So the problem is, as I stated, not
about raising the limit, but about troubleshooting the source of
the problem.
I think you have some additional problem above and beyond hitting
MaxClients. If you are having some issue that is causing your system
to be completely unresponsive for hours as you say then something else
is going wrong.
Post by francis picabia
Post by Bob Proulx
Look in your access and error logs for a high number of simultaneous
clients. Tools such as munin, awstats, webalizer and others may be
helpful. I use those in addition to scanning the logs directly.
I hate munin. Too much overhead for the systems where we
are already close to performance issues when there is
peak traffic. I already poll the load and it had not increased.
I should add a scan of the memory usage.
I prefer cacti for this sort of thing.
As I simply listed munin as an example it isn't something I say is
required. But munin on a node is quite low overhead. On the server
where it is producing graphs it can be a burden. But on individual
nodes it is quite light weight. Of course there are a lot of
available plugins. If a particular plugin is problematic (I don't
know of any) I would simply disable that one plugin.
Bob
why the server stopped responding. It would be better
at the moment MaxClients was reached. And yes, in my experience,
a restart of apache service to make it responsive again. I've run
sports scores in a browser client. It doesn't require a slashdot
and the expected user numbers.
with extended information. I wait to see the problem and take
a look at what URLs are associated with the W state.
Thanks for the responses from all.
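For the W state mentioned above, a minimal Python sketch of counting worker states from mod_status follows. It assumes mod_status is enabled and that the machine-readable report (commonly served at /server-status?auto, though the path depends on the local configuration) has already been fetched into a string. The function names are hypothetical. The sketch only counts states. The URLs tied to each W worker appear only in the full server-status page when ExtendedStatus is on, and this code does not parse that page.

```python
from collections import Counter

def scoreboard_counts(status_text):
    """Count worker states from the 'Scoreboard:' line of ?auto output.

    Each character is one slot: '_' is an idle worker, 'W' is sending
    a reply, 'R' is reading a request, 'K' is keepalive, and '.' is an
    open slot with no process.
    """
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            return Counter(line.split(":", 1)[1].strip())
    return Counter()  # no scoreboard line found

def workers_in_state(status_text, state="W"):
    """Number of slots currently in the given scoreboard state."""
    return scoreboard_counts(status_text)[state]
```

Polling this every minute and logging the W count alongside the load would show whether workers were piling up in the W state before MaxClients was reached.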