advcache and memcached benchmarks with Drupal

Robert Douglass has been working on two very interesting and promising modules that should boost Drupal's performance significantly. One is the memcache module and the other is Advcache (Advanced Cache).
Robert posted a request for benchmarks, with an overview of the modules, so here they are.
The benchmarks ended up comparing Drupal's built-in caching with and without memcache. Advanced Cache could not be tested on a plain 5.1 install.

Testing environment

We used a simple setup, with memcached on the same server that is running Drupal. In a real production environment, memcached would be distributed over several servers, and Drupal would be running on one or more web servers, and a database server.

The hardware consists of the following:

  • AMD64 3000+
  • 1GB RAM
  • WD 160GB SATA

The software is as follows:

  • Ubuntu Feisty 7.04, AMD 64bit kernel
  • PHP 5.2.1
  • MySQL 5.0.38
  • eAccelerator 0.9.5.1
  • Drupal 5.1

The tests were simple too: each consisted of 500 requests to the home page as an anonymous user, with a concurrency of 5.

The command used is ab -n500 -c5 http://example.com/

I used -c3 in some cases, when the system started thrashing due to memory starvation (e.g. no caching at all).
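As a minimal sketch, the two invocations could be assembled like this in Python. The helper name build_ab_command is hypothetical; only the -n and -c flags and the example.com URL come from the command shown above.

```python
import shlex


def build_ab_command(requests, concurrency, url):
    """Return the ab argument list for one benchmark run (hypothetical helper)."""
    return ["ab", f"-n{requests}", f"-c{concurrency}", url]


# The default run, then the reduced-concurrency fallback used when the box thrashes.
for concurrency in (5, 3):
    print(shlex.join(build_ab_command(500, concurrency, "http://example.com/")))
```

The returned list can be passed to subprocess.run on a machine where ab is installed.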

I used the official Drupal 5.1, with the following modules enabled:

block
blog
comment
contact
customerror
drupal
feedback
filter
forum
forward
gsitemap
image
menu
node
nodewords
page
path
pathauto
ping
poll
profile
search
sections
service_links
sitemenu
spam
statistics
subscriptions
syndication
system
taxonomy
tracker
update_status
upload
urlfilter
user
video
video_image
watchdog

The site has 8409 published nodes, and 65544 published comments. The home page, which was the page we tested, has a gallery with 30 images in it, in the form of node teasers.

The site currently runs live on a Dual Opteron dedicated server with 2GB of RAM.

Test 1a: Cache disabled, concurrency 3

No caching of any kind is enabled in this test. This is how a default install behaves, for example on shared hosting.

 

Document Path:          /
Document Length: 41935 bytes

Concurrency Level: 3
Time taken for tests: 144.445923 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 21231000 bytes
HTML transferred: 20967500 bytes
Requests per second: 3.46 [#/sec] (mean)
Time per request: 866.676 [ms] (mean)
Time per request: 288.892 [ms] (mean, across all concurrent requests)
Transfer rate: 143.53 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 271 865 317.9 844 3227
Waiting: 267 860 318.2 839 3223
Total: 271 865 317.9 844 3227

Percentage of the requests served within a certain time (ms)
50% 844
66% 973
75% 1048
80% 1118
90% 1290
95% 1419
98% 1556
99% 1679
100% 3227 (longest request)

It yields about 3.5 requests per second, and 95% of the requests were served within 1.42 seconds, which is slow for a single page.
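The headline figures in an ab report follow from its totals by simple arithmetic. The sketch below recomputes them from the Test 1a totals. The function name is made up for illustration, and the formulas are the usual reading of ab's output, not something the report itself states.

```python
def ab_summary(complete_requests, seconds, concurrency):
    """Recompute ab's headline figures from its totals."""
    rps = complete_requests / seconds
    # Mean time per request across all concurrent requests, in milliseconds.
    ms_across_all = seconds / complete_requests * 1000
    # Mean time per request as seen by one of the concurrent clients.
    ms_per_request = ms_across_all * concurrency
    return round(rps, 2), round(ms_per_request, 3), round(ms_across_all, 3)


# Test 1a: 500 requests in 144.445923 seconds at concurrency 3.
print(ab_summary(500, 144.445923, 3))  # (3.46, 866.676, 288.892)
```

The three values match the "Requests per second" line and the two "Time per request" lines of the Test 1a report.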

Test 1b: No caching, concurrency 5

We could run this test with 5 concurrent users but only with difficulty, because of operating system thrashing caused by memory starvation. We had to restart Apache, MySQL and memcached in order to do so.

Document Path:          /
Document Length: 42024 bytes

Concurrency Level: 5
Time taken for tests: 165.299650 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 21275500 bytes
HTML transferred: 21012000 bytes
Requests per second: 3.02 [#/sec] (mean)
Time per request: 1652.996 [ms] (mean)
Time per request: 330.599 [ms] (mean, across all concurrent requests)
Transfer rate: 125.69 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 302 1650 4395.6 1675 65084
Waiting: 297 1642 4395.5 1669 65079
Total: 302 1650 4395.6 1675 65084

Percentage of the requests served within a certain time (ms)
50% 1675
66% 1688
75% 1691
80% 1699
90% 1735
95% 1947
98% 2321
99% 2800
100% 65084 (longest request)

As you can see, we barely do 3 requests per second, and 95% of requests take up to 1.95 seconds to finish.

Test 2a: Normal cache enabled, concurrency 3

In this test, the normal Drupal caching was enabled by visiting Administer -> Site configuration -> Performance -> Caching mode, and setting it to Normal instead of None.

This is a configuration that can be enabled in shared hosting environments.

Document Path:          /
Document Length: 41935 bytes

Concurrency Level: 3
Time taken for tests: 4.426980 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 21263442 bytes
HTML transferred: 21009435 bytes
Requests per second: 112.94 [#/sec] (mean)
Time per request: 26.562 [ms] (mean)
Time per request: 8.854 [ms] (mean, across all concurrent requests)
Transfer rate: 4690.56 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 8 7.3 7 22
Processing: 7 17 7.9 15 34
Waiting: 0 8 7.2 7 22
Total: 21 26 5.1 22 34

Percentage of the requests served within a certain time (ms)
50% 22
66% 32
75% 32
80% 32
90% 33
95% 33
98% 33
99% 33
100% 34 (longest request)

With normal cache enabled, things improve considerably to 112.9 requests per second, and 95% of the requests are served in 33 milliseconds. A dramatic improvement for anonymous users.
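To put a number on the improvement, the "Requests per second" lines of two reports can be compared directly. This is a rough sketch: the function name is hypothetical, and the regular expression assumes the line format seen in the reports above.

```python
import re


def requests_per_second(report):
    """Extract the 'Requests per second' figure from ab's text report."""
    match = re.search(r"Requests per second:\s*([\d.]+)", report)
    if match is None:
        raise ValueError("no 'Requests per second' line found")
    return float(match.group(1))


# Lines copied from the Test 1a and Test 2a reports.
no_cache = requests_per_second("Requests per second: 3.46 [#/sec] (mean)")
normal_cache = requests_per_second("Requests per second: 112.94 [#/sec] (mean)")

print(f"speedup: {normal_cache / no_cache:.1f}x")  # speedup: 32.6x
```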

Test 2b: Normal cache enabled, concurrency 5

This is the same as the test above, but with a concurrency of 5. Note that this run issued 1000 requests instead of 500. With caching enabled, the system is now able to handle more concurrent requests.

Document Path:          /
Document Length: 41935 bytes

Concurrency Level: 5
Time taken for tests: 8.018402 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 42442000 bytes
HTML transferred: 41935000 bytes
Requests per second: 124.71 [#/sec] (mean)
Time per request: 40.092 [ms] (mean)
Time per request: 8.018 [ms] (mean, across all concurrent requests)
Transfer rate: 5168.98 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 15 11.6 14 44
Processing: 7 24 12.0 23 56
Waiting: 0 15 11.5 14 44
Total: 36 39 6.6 37 56

Percentage of the requests served within a certain time (ms)
50% 37
66% 37
75% 37
80% 37
90% 55
95% 55
98% 55
99% 55
100% 56 (longest request)

We get 124 requests per second, and 55 milliseconds for 95% of the requests.

Test 3a: Aggressive cache enabled, concurrency 3

Aggressive caching uses fewer resources, and is very useful for anonymous users. However, it has some side effects too. Read the documentation for it carefully before you use it.

Document Path:          /
Document Length: 41933 bytes

Concurrency Level: 3
Time taken for tests: 3.319873 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 21304880 bytes
HTML transferred: 21050366 bytes
Requests per second: 150.61 [#/sec] (mean)
Time per request: 19.919 [ms] (mean)
Time per request: 6.640 [ms] (mean, across all concurrent requests)
Transfer rate: 6266.81 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 6 5.3 5 16
Processing: 5 13 5.7 11 25
Waiting: 0 6 5.3 5 16
Total: 8 19 3.6 16 25

Percentage of the requests served within a certain time (ms)
50% 16
66% 23
75% 23
80% 23
90% 23
95% 23
98% 24
99% 24
100% 25 (longest request)

Aggressive caching enables the server to do 150 requests per second, and 95% of the requests will be served in 23 milliseconds.

Test 3b: Aggressive cache enabled, concurrency 5

With aggressive caching and a concurrency of 5, results are still good.

Document Path:          /
Document Length: 42024 bytes

Concurrency Level: 5
Time taken for tests: 3.126275 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 21265500 bytes
HTML transferred: 21012000 bytes
Requests per second: 159.93 [#/sec] (mean)
Time per request: 31.263 [ms] (mean)
Time per request: 6.253 [ms] (mean, across all concurrent requests)
Transfer rate: 6642.73 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 11 9.0 10 32
Processing: 5 18 10.8 17 136
Waiting: 0 11 10.5 10 134
Total: 21 30 7.7 27 136

Percentage of the requests served within a certain time (ms)
50% 27
66% 27
75% 39
80% 40
90% 40
95% 40
98% 40
99% 40
100% 136 (longest request)

We get almost 160 requests per second, and 95% of requests are done within 40 milliseconds.

Test 4a: memcache with normal cache, concurrency 3

We are now back to Normal caching, but with memcache instead of the database cache. On a site that has a lot of disk I/O, using RAM and memcache for caching avoids hitting the disk. Several memcache servers can be used over a private network.

For memcache, we used the Ubuntu-supplied packages, installed like this:

aptitude install memcached php5-memcache 

We then modified the $conf variable in settings.php to be as follows, which is the simplest possible memcache configuration for Drupal:

$conf = array(
  // Use the memcache module's cache backend instead of the database cache.
  'cache_inc' => './sites/all/modules/memcache/memcache.inc',
);

The interesting part is that the memcache.module need NOT be enabled, just the .inc file.

The setting in Administer -> Site configuration -> Performance -> Caching mode is set to Normal. 

Document Path:          /
Document Length: 41779 bytes

Concurrency Level: 3
Time taken for tests: 4.288611 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 21227572 bytes
HTML transferred: 20973058 bytes
Requests per second: 116.59 [#/sec] (mean)
Time per request: 25.732 [ms] (mean)
Time per request: 8.577 [ms] (mean, across all concurrent requests)
Transfer rate: 4833.73 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 7 7.1 7 23
Processing: 7 17 19.0 15 406
Waiting: 0 8 19.1 7 404
Total: 15 24 18.6 22 421

Percentage of the requests served within a certain time (ms)
50% 22
66% 23
75% 23
80% 33
90% 33
95% 34
98% 35
99% 35
100% 421 (longest request)

We get 116 requests per second, and 95% of the requests take 34 milliseconds or less.

Test 4b: memcache with normal cache, concurrency 5

This is the same test as above, but with more concurrency.

Document Path:          /
Document Length: 41779 bytes

Concurrency Level: 5
Time taken for tests: 4.524459 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 21185306 bytes
HTML transferred: 20931279 bytes
Requests per second: 110.51 [#/sec] (mean)
Time per request: 45.245 [ms] (mean)
Time per request: 9.049 [ms] (mean, across all concurrent requests)
Transfer rate: 4572.48 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 18 43.0 15 486
Processing: 7 24 25.3 23 525
Waiting: 0 15 11.6 15 51
Total: 30 43 48.8 38 525

Percentage of the requests served within a certain time (ms)
50% 38
66% 38
75% 38
80% 38
90% 56
95% 56
98% 56
99% 525
100% 525 (longest request)

Requests per second drop slightly to 110.5, and the time within which 95% of requests are served rises to 56 milliseconds.

Test 5a: memcache with aggressive cache, concurrency 5

Combining memcache with aggressive cache and high concurrency, results are still good. 

Document Path:          /
Document Length: 42022 bytes

Concurrency Level: 5
Time taken for tests: 3.187112 seconds
Complete requests: 500
Failed requests: 0
Write errors: 0
Total transferred: 21264500 bytes
HTML transferred: 21011000 bytes
Requests per second: 156.88 [#/sec] (mean)
Time per request: 31.871 [ms] (mean)
Time per request: 6.374 [ms] (mean, across all concurrent requests)
Transfer rate: 6515.62 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 12 9.3 10 32
Processing: 5 19 9.7 17 40
Waiting: 0 12 9.2 10 31
Total: 26 31 6.1 27 40

Percentage of the requests served within a certain time (ms)
50% 27
66% 39
75% 39
80% 39
90% 39
95% 39
98% 40
99% 40
100% 40 (longest request)

We do 156 requests per second, and 95% of requests are finished in 39 milliseconds. 

Advanced cache (advcache) module

We could not test the advcache module for two reasons:

  • It requires a more extensive setup, since the benchmarks have to use a logged in user's cookie.
  • It is synched to the DRUPAL-5 tag in CVS, not the released Drupal 5.1 tarball.

Conclusions

From the above tests, one can conclude the following:

  • Using normal caching is significantly better than no caching at all, for anonymous users.
  • Using aggressive caching provides much better performance than using the normal cache. Note that there are some side effects for some modules.
  • Using memcached is not that different from using database caching. However, only one page was benchmarked, so Linux or the database probably kept the few rows involved cached in memory, which may have skewed these results.
  • The memcache module is not needed, just the .inc file.
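The comparison behind these conclusions can be laid out side by side. The sketch below ranks the concurrency-5 runs by requests per second, using the figures from the reports above; the labels are informal names chosen for this example. Keep in mind that the normal-cache run issued 1000 requests while the others issued 500.

```python
# Requests per second at concurrency 5, copied from the ab reports above.
results = {
    "no caching": 3.02,
    "normal cache": 124.71,
    "aggressive cache": 159.93,
    "memcache + normal cache": 110.51,
    "memcache + aggressive cache": 156.88,
}

baseline = results["no caching"]
ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)

# One row per configuration: throughput and speedup over no caching.
for name, rps in ranked:
    print(f"{name:<28} {rps:>7.2f} req/s  {rps / baseline:>5.1f}x")
```

In this single-page test, the memcache runs land within a few percent of the database-backed runs in the same caching mode.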

Remember that this is not an optimal setup, just a comparative benchmark. If you can afford it, use multiple servers with lots of RAM for memcache, and a dedicated server for the database, also with lots of RAM.

Also remember that the scenario is different for authenticated users, which we hope to have the time to cover in a future benchmark, perhaps with the advanced caching module too.

Comments

Some notes

Thanks for benchmarking! This is truly exciting stuff. Some notes about advcache. The module alone doesn't do anything... it only starts to do advanced caching if you apply the patches that are included. For benchmarking purposes it's easiest just to apply the all_patches.patch, which includes caching for nodes, comments, taxonomy, forums and search. The advcache patches are specifically targeted at improving the performance of *authenticated* users, so it is not surprising to me that you didn't see much performance gain due to advcache since you tested for the anonymous user. Try applying the all_patches.patch and running ab with the -C PHPSESSID=12345.... parameter. It is best to get the PHPSESSID from a normal user, not admin, so log onto your site as a normal site user and then look at your cookie to get the actual session hash. Finally, the configuration that you specified for the memcache setup, where there is one memcached instance and all bins go into the default cluster, is synonymous with no configuration at all (ie the default state), so you don't technically need to specify 'memcache_servers' and 'memcache_bins' in this case. Great work!

Memory?

What's the (memory) size of your Apache / Mysql & Memcache processes if you're not able to have a concurrency higher than three at some moments? That seems very low to me on your hardware. On our dedicated server on www.tik.be with 1 GB RAM, we can easily run ab with -c 100 and serve 220.70 requests/s. That's a Drupal 5.1 with normal caching and CiviCRM and lots of other modules installed...

Here it is with concurrency 100

This is a development machine that has lots of stuff on it. After turning off the stuff that is not needed, and some tuning, here is a test using the command:

ab -n7500 -c100 http://example.com

This is with normal cache.

Document Path:          /
Document Length: 42024 bytes

Concurrency Level: 100
Time taken for tests: 72.547443 seconds
Complete requests: 7500
Failed requests: 5
(Connect: 0, Length: 5, Exceptions: 0)
Write errors: 0
Non-2xx responses: 5
Total transferred: 318780040 bytes
HTML transferred: 314978660 bytes
Requests per second: 103.38 [#/sec] (mean)
Time per request: 967.299 [ms] (mean)
Time per request: 9.673 [ms] (mean, across all concurrent requests)
Transfer rate: 4291.10 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 13 24.0 0 197
Processing: 74 943 807.3 914 46375
Waiting: 59 890 779.4 879 46371
Total: 74 957 807.3 927 46438

Percentage of the requests served within a certain time (ms)
50% 927
66% 967
75% 973
80% 977
90% 990
95% 1024
98% 1136
99% 1357
100% 46438 (longest request)

As you can see, concurrency is 100 and the server still handles 100+ requests per second.

The live server for this site, which is a highly tuned Dual Opteron, handles 59,000 page views an hour on a busy day, and 930,000 page views a day.
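For a rough sense of scale, those traffic figures can be converted to per-second rates. This is only a back-of-the-envelope sketch: a page view is not the same thing as a single ab request, and the variable names are chosen for this example.

```python
page_views_per_hour = 59_000   # busy-hour figure quoted above
page_views_per_day = 930_000   # busy-day figure quoted above

peak_per_second = page_views_per_hour / 3600
average_per_second = page_views_per_day / 86_400

print(f"busy hour: {peak_per_second:.1f} page views/s")    # busy hour: 16.4 page views/s
print(f"whole day: {average_per_second:.1f} page views/s") # whole day: 10.8 page views/s
```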

-- 2bits -- Drupal consulting

Other considerations

Two points:

1) On a complex drupal install, the memcache backend can make a huge difference by not burdening your site with a lot of mysql i/o pulling all the cached table data for logged-in users. Just not loading the variables table can have a big impact.

2) I've seen some divergence in performance when scaling upward in benchmarking w/concurrency. When you're hitting a lot of different pages (e.g. the mysql query cache isn't doing basically the same thing as memcached) you see some benefits from serving static values out of cache. But, if yr #1 hit is the frontpage (or the like), drupal's built in cache is great. In this context, boost is even better since it doesn't engage PHP in the first place.

All this makes me think maybe the ideal (on a single box) is some way to use both boost and memcache. The performance improvement for logged-in users from having all cache queries pulled from RAM is quite significant. But if you're talking about a site with public-facing pages and the possibility that you get hit with the traffic firehose (e.g. frontpaged by digg), it's hard to get any better than boost for serving a million copies of the same anonymous page.

Anyway, food for thought.