PHP op-code caches / accelerators: Drupal large site case study

For a large Drupal site, one of the biggest performance boosts available is installing an op-code cache/accelerator.

PHP op-code caches / Accelerators

Since PHP is an interpreted language, every page access has to load the script, parse it, compile it into op-codes, then execute it.

This load/parse/compile cycle can add up to a lot of processing time, especially when you have lots of page accesses.

Op-code caches/accelerators eliminate this load/parse/compile time by doing the work once, keeping the compiled version of the script in memory (or on disk), and reusing it the next time a page requests that script.

This has two main benefits:

  • Page processing times can be much shorter in many cases.
  • Decreased load on server resources, mainly CPU time and memory consumption, which is important for a large site, or when a site suffers the Slashdot/Digg Effect.

These are compelling reasons to use an op-code cache.

Of course, an op-code cache will not help your site if your bottleneck is not CPU or memory. For example, if your bottleneck is in the database or the disk, the op-code cache will not directly help. However, since scripts start and finish faster with a cache than without one, you may experience less contention in some cases.
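The compile-once idea described above can be sketched in a few lines. This is only an analogy written in Python, not how APC, eAccelerator, or XCache are implemented: the class name OpcodeCache is made up for illustration, and the choice to recompile when the file's modification time changes is an assumption of this sketch.

```python
import os
import tempfile


class OpcodeCache:
    """Toy compile-once cache keyed by script path (illustrative only)."""

    def __init__(self):
        self._entries = {}  # path -> (mtime_ns, compiled code object)
        self.compiles = 0   # how many times a script was actually compiled

    def load(self, path):
        """Return compiled code for path, compiling only when needed."""
        mtime = os.stat(path).st_mtime_ns
        entry = self._entries.get(path)
        if entry is not None and entry[0] == mtime:
            return entry[1]  # cache hit: skip the load/parse/compile steps
        with open(path, "r", encoding="utf-8") as fh:
            source = fh.read()
        code = compile(source, path, "exec")
        self.compiles += 1
        self._entries[path] = (mtime, code)
        return code

    def run(self, path):
        """Execute the (possibly cached) code and return its globals."""
        scope = {}
        exec(self.load(path), scope)
        return scope


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "page.py")
        with open(script, "w", encoding="utf-8") as fh:
            fh.write("result = 6 * 7\n")
        cache = OpcodeCache()
        for _ in range(3):
            cache.run(script)
        print("page requests: 3, compiles:", cache.compiles)
```

Every request still executes the script; only the repeated load/parse/compile work is skipped, which is why the savings show up as CPU time rather than as fewer database queries.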

Case study: Large Drupal site with APC and eAccelerator

Drupal as an application does a lot of processing for every page load. If you have lots of add-on modules, there is even more processing.

As a case study, a Drupal site that receives hundreds of thousands of page views per day was tested with and without an op-code cache.

The site runs on a Dual Opteron 246, 2 GB of memory, using Ubuntu Edgy Eft 6.10, Apache 2.0.55, PHP 5.1.6, and MySQL 5.0.24.

The following CPU and memory graphs were produced by Munin, a highly recommended near real time monitoring tool. Note that because the server has two CPUs, the scale is from 0 to 200% and not 100% as one would expect.

Resource Utilization without an op-code cache

Without an accelerator, the CPU utilization was as follows. You can see that during peak hours the server CPUs were consistently above 3/4 utilization (169% out of 200%), with an average of 1/2 (103% out of 200%).
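Because the graph stacks both CPUs on a 0 to 200% scale, it can help to convert the readings to a whole-server percentage. The following is a small sketch of that arithmetic; the function name normalized_utilization is hypothetical, and the two inputs are the peak and average figures quoted above.

```python
def normalized_utilization(stacked_percent, cpu_count):
    """Convert a stacked multi-CPU percentage to a 0-100% whole-server figure."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return stacked_percent / cpu_count


peak = normalized_utilization(169, 2)     # 84.5
average = normalized_utilization(103, 2)  # 51.5
print(f"peak: {peak}%  average: {average}%")
```

So the peak corresponds to roughly 85% of the whole server, and the average to roughly half.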

As for memory utilization, here is the graph without any op-code cache. The important part is the green section (apps), which is the amount used by applications. The memory available is the blue (unused), and the file system cache (cache) is the orange part. You can see how the applications part moves up from 0.5GB to around 0.9 GB during peak hours.

Resource Utilization with eAccelerator 0.9.5

After installing eAccelerator 0.9.5 from source, the CPU utilization was far lower on a day with comparable page views.

You can see that the maximum utilization was around 80% of 200%, with only short spurts to 100%. This means there is now room for growth on the server, instead of it being on the brink of resource shortages.

The memory utilization is also considerably lower, with even less variance. The average is 0.29 GB out of 2 GB, leaving a lot of room not only for growth, but also for MySQL to cache as much of its tables in memory as possible.

As you can see, this is far better resource utilization than without an op-code cache.

A nice summary of CPU utilization for the entire week can be seen in the following graph.

Note that Monday and Tuesday were without any op-code cache, Wednesday was with APC (3.0.13) starting at midnight, and Thursday and Friday were with eAccelerator. Note the height of the Mon/Tue peaks vs. the mere hills on Thu/Fri.

Note: Ignore the red and magenta parts in the graph, since this was a one-off non-web load run with the "nice" command.

Drawbacks of PHP op-code caches: Segmentation Faults

One common drawback of most op-code caches is that they often cause the web server to crash by causing a segmentation fault. From that point on, one Apache process is unavailable. This causes either an error 500 (server error) or the dreaded blank pages (White Screen of Death).

When this happens, you will see a message like this in Apache's error log:

[Fri Mar 02 03:55:04 2007] [notice] child pid 30253 exit signal Segmentation fault (11)
[Fri Mar 02 03:55:04 2007] [notice] child pid 30256 exit signal Segmentation fault (11)
[Fri Mar 02 03:55:04 2007] [notice] child pid 30257 exit signal Segmentation fault (11)
[Fri Mar 02 03:55:04 2007] [notice] child pid 30393 exit signal Segmentation fault (11)

The downward blip in the last memory graph above was due to such a case, with eAccelerator crashing Apache.

The only remedy at this point is to restart the web server.

To limit downtime, there are solutions out there that detect this condition and automatically restart Apache when it happens. One such solution is the logwatcher script. With it, a crash causes about one minute of downtime. Depending on the nature of your site, this may or may not be an acceptable solution.
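As a rough illustration of the detection step only, here is a minimal sketch that scans error log lines for the segmentation fault notices shown above. It is not the logwatcher script itself: the function names and the threshold parameter are assumptions, and the actual restart command is left out because it depends on the system.

```python
import re

# Matches notices of the form shown in the error log excerpt above.
SEGFAULT_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] \[notice\] child pid (?P<pid>\d+) "
    r"exit signal Segmentation fault \(11\)"
)


def find_segfaults(lines):
    """Return the pids of child processes reported as segfaulted."""
    pids = []
    for line in lines:
        match = SEGFAULT_RE.match(line.strip())
        if match:
            pids.append(int(match.group("pid")))
    return pids


def should_restart(lines, threshold=1):
    """Decide whether a restart is warranted (threshold is a hypothetical knob)."""
    return len(find_segfaults(lines)) >= threshold


sample = [
    "[Fri Mar 02 03:55:04 2007] [notice] child pid 30253 exit signal Segmentation fault (11)",
    "[Fri Mar 02 03:55:04 2007] [notice] child pid 30256 exit signal Segmentation fault (11)",
]
print(find_segfaults(sample), should_restart(sample))
```

A real watcher would read only the lines appended since its last check, so that old notices do not trigger repeated restarts.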

Which one to choose: APC or eAccelerator or XCache?

As one blogger puts it: choose your poison!

The choice of which cache to use depends on several factors. All of them do the job fairly well. The difference in speed is not very noticeable.

Empirical observations on the above site suggest that pages feel faster with APC, although a controlled benchmark failed to confirm this.

The graphs above show that eAccelerator provides better CPU and memory savings. As for memory, eAccelerator uses about 5 MB less memory per Apache process than APC does, which can add up to significant savings.
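To see how the per-process difference adds up, here is a back-of-the-envelope sketch. The 5 MB figure comes from the observation above; the count of 50 Apache processes is a hypothetical value chosen for illustration, not a measurement from this site.

```python
def total_savings_mb(per_process_mb, apache_processes):
    """Total memory difference across all Apache processes, in MB."""
    return per_process_mb * apache_processes


# 50 processes is an assumed figure; substitute your own process count.
print(total_savings_mb(5, 50))  # 250
```

On a server with 2 GB of memory, a difference of that size would be a noticeable share of the total.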

A benchmark of Drupal with PHP APC vs. eAccelerator was conducted by 2bits. It includes Drupal 5.1, as well as the current HEAD with some menu system enhancements, and tests Drupal without any op-code caches/accelerators, with APC, and with eAccelerator. This is not a live site, but a test environment.

Other points to consider:

  • APC is maintained by core PHP developers, including Rasmus and others. APC does not utilize a disk cache, unlike both eAccelerator and XCache (although it is configurable).
  • XCache is fairly new, although it seems to have momentum behind it.
  • All of them suffer from the above drawback of segmentation faults.

Conclusion

For a large site receiving tens of thousands of page views per day or more, a PHP op-code cache is a must. Which one to use is not an easy question to answer, but they are pretty similar. Segmentation faults plague all of them as well, but workarounds are available.

Resources

These are links and articles on PHP op-code caches/accelerators. This page focuses on free op-code caches. There are some commercial closed source ones out there.

Comments

A question

Thank you for this very useful article. I have been looking for ways to extend resources on my VPS which has just 256mb guaranteed RAM.

Just one question. Does this have to be implemented once on the server, or for each domain that runs PHP applications? For example, if I have a multi-site Drupal and CiviCRM install, should I install an opcode cache on just the domain holding the Drupal/CiviCRM codebase, or on each domain that runs off the single codebase?

Thanks,
Venkat

Thanks for the clarification,

Thanks for the clarification, Khalid. This set of articles on optimization is much appreciated.

Venkat.

op-code caches and php as cgi

Is it true that eAccelerator will not work if PHP is run as CGI? I recompiled Apache (through WHM) to run PHP with phpsuexec. This changed PHP from running as an Apache module to running as CGI, and although a phpinfo.php showed eAccelerator as installed, all its values were disabled. A quick check on webhostingtalk and other forums confirmed this.

Venkat

can small sites benefit?

I see "large scale sites" mentioned a few times. What basic tests can we run to see what, if any, optimization is beneficial in our individual situations?

I run a number of low-volume, small sites on a single linux box: two in Drupal 4.7 and several in 5.1, both Drupal instances multisite. I am also serving up a few low-volume "standard php sites" with minimal processing (header/footer includes, etc) off the same box.

Depends

Yes, small sites can benefit, if there are many of them on a single server.

Opcode caches do two things:

  • Speed the processing of PHP scripts (including Drupal)
  • Reduce the memory consumption per Apache process.

Depending on how many pages you serve, and whether you get spikes in traffic (e.g. from landing on Digg's front page), it may be worth considering.

-- 2bits -- Drupal consulting

drupal cache is a great place to begin

Thanks for the great info and followup confidence in opcode caching for a multisite albeit small setup.

Before I dive into setting up my opcode cache, I wanted to share my glowing delight in Drupal core caching alone ("Normal" setting). Response times went down from 1782-4614ms/request to 132-249ms/request on a particular Drupal page built on a view! CPU spiked, but not really long enough for "top" to catch it over 60%. I'm thrilled with Drupal cache.

Different areas of optimization

Drupal's cache and PHP's op-code cache optimize different areas.

PHP op-code caches optimize CPU utilization for the PHP code load/parse/tokenize steps, and the memory associated with these.

Drupal's cache optimizes database queries. The entire page is stored in a table, and the next time a request comes in, a single query is done rather than many. Note that this only works for anonymous users, not for authenticated (logged in) users. It also gets cleared every time a node or comment is posted.

So, depending on which area is the bottleneck in your site, one would help or the other, or both.

It is important to know where the bottleneck(s) are before optimizing. Otherwise you are barking up the wrong tree.

-- 2bits -- Drupal consulting

Hi

Hi,

I am receiving following errors in my apache logs:

[Mon Feb 11 00:36:28 2008] [notice] child pid 5751 exit signal Segmentation fault (11)
[Mon Feb 11 00:36:29 2008] [notice] child pid 14333 exit signal Segmentation fault (11)
[Mon Feb 11 00:36:31 2008] [notice] child pid 5750 exit signal Segmentation fault (11)
[Mon Feb 11 00:38:15 2008] [notice] child pid 30220 exit signal Segmentation fault (11)
[Mon Feb 11 00:38:18 2008] [notice] child pid 30225 exit signal Segmentation fault (11)
[Mon Feb 11 00:38:30 2008] [notice] child pid 14318 exit signal Segmentation fault (11)

The error occurs when I enable the comment module in my Drupal CMS installation. It only occurs while the comment module is enabled, and Apache serves a blank page for any blog post that has comments. The problem started when I got a cPanel backup restored, as posted here - http://drupal.org/node/209107

Do you think this could be because of op-code caches? My web hosting provider and the Drupal support forum seem to be clueless!

Oh man, yeah! Thank

Oh man, yeah! Thank you! Not just for the graphs, but for the munin link! I kept trying to remember what its name was, but I couldn't quite place it.

INSTALLING.

APC and Disk Cache

hi,

i've been experimenting with disk cache and APC on my dedicated server, and it seems disk caching is faster. on one of the sites we handle, we have seen that with disk caching (enhanced), the homepage loads in only 3ms. when we had APC installed on our server and used it for database caching, the page loaded slower (118ms). although this is, like you said, not noticeable, the site (http://www.kokeytechnology.com) usually suffers sluggishness when traffic spikes. we went back to using disk caching for Page Caching, and we've seen loading times improve from 118ms to 18ms.

we're using W3 Total Cache to run our Wordpress blogs, by the way.

thanks!

Confusion of caching code vs. data

You are confusing two different things that APC does.

One is code caching, and this is what the original article is focused on. Without such op-code caching, PHP takes longer to read, parse and tokenize scripts, which adds CPU and memory load. Using APC avoids this extra overhead, speeds up your site, and reduces resource utilization.

The other is using APC's user cache as a data cache for Drupal data objects. This article is not about this aspect of APC.