We discussed the topic of the open buffet binge syndrome, where web sites "gorge" on contributed modules, and cause site performance to be slow and memory utilization to be high.
We encountered an interesting case recently while doing work for a client. The main symptom was page execution time that is high (800 to 1200 ms), but database query time that is low.
The normal case is where the total page execution time is high, and
a significant portion of that is due to database queries. Using the
devel module we can identify those and work on tuning the module, or
the database to squeeze more performance.
In this case, it was the other way around. For example:
Executed 175 queries in 43.46 milliseconds. Page execution time was 1150.73 ms.
Why would queries take only 43 ms, yet the whole page execution is 1150 ms? Where are the 1107 ms spent? This is too high.
So, in order to troubleshoot this we replicated the site to our test servers, and proceeded to investigate. Our test server runs APC, so PHP time should not be high at all.
Using xdebug and kcachegrind, we narrowed down the problem to the drupal_load() function, where most of the time was spent.
Seeing that this function does include_once() for every module, that was the place to investigate.
Measuring include_once
First we changed the drupal_load() function to look like this:
function drupal_load($type, $name) { static $files = array(), $total_time = 0; if (isset($files[$type][$name])) { return TRUE; } $timer_id = 'include ' . $name; timer_start($timer_id); $filename = drupal_get_filename($type, $name); if ($filename) { include_once "./$filename"; $time = timer_read($timer_id); $total_time += $time; print "$name: $time, total: $total_time\n"; $files[$type][$name] = TRUE; return TRUE; } return FALSE; }
This does not change the behavior, but gives us back the time required to load each module using the include_once() function, as well as a running total.
The output, using APC, looked like this (some module names obscured so as not to identify the web site in public):
0.63 date_api
0.64 views_bonus_tag_cloud
0.71 imagecache_profiles
0.75 flatforum
0.85 [censored]
1.01 views_bonus_grid
1.08 help
1.11 token
1.12 auto_nodetitle
1.21 googleanalytics
1.28 optionwidgets
1.37 ed_readmore
1.41 search404
1.51 thickbox
1.61 custom_breadcrumbs
1.61 simplemenu
1.65 mail_edit
1.74 menutrails
1.92 watchdog
2.16 views_alpha_pager
2.17 blockcache
2.17 text
2.22 scheduler
2.35 front_page
2.42 path
2.47 number
2.5 tagadelic
2.55 persistent_login
2.56 image_attach
2.96 calendar_ical
3.08 image_gallery
3.29 nodereference
3.32 link
3.33 feedback
3.36 service_links
3.37 votingapi
3.91 fieldgroup
4.02 fivestar
4.09 imagefield
4.2 [censored]
4.68 menu
4.74 imagecache
4.79 block
4.93 [censored]
5.25 panels
5.52 logintoboggan
5.73 profile
5.82 upload
5.83 views_theme_wizard
6.07 content
6.21 image
6.45 views_bookmark
6.58 location_views
6.65 search
7.46 tinymce
7.52 forum
7.63 imce
8.24 subscriptions
8.76 filter
9.24 devel
9.29 akismet
9.45 taxonomy
9.54 aggregator
10.27 pathauto
10.36 img_assist
11.66 timeline
11.9 comment
12.01 calendar
12.56 views
12.86 views_ui
12.88 date
13.99 system
16.94 privatemsg
18.32 node
19.38 user
27.09 views_filterblock
79.98 location
172.97 views_rss
174.64 book859.9 Total
Wow! Out of 1150 ms, 860 are spent just loading the modules!
Without APC
So, we disabled APC and got 286.8 ms.
What? no op-code cache/accelerator is better than with APC? Unbelievable ...
APC settings for include_once
Consulting Google's search, it seems that APC has a performance problem with include_once, as mentioned here and here. Versions 3.0.12 and later have a new parameter for improving performance. In a nutshell, this goes like this in your php.ini or apc.ini.
apc.apc.stat = 0 apc.include_once_override = 1
So, we changed APC to have those settings, yet we did not see much difference. Still 800+ ms.
Xcache performance
We then decided to try another op-code cache. We decided to use Xcache, since it is maintained (unlike the best performer, eAccelerator). As a bonus Ubuntu has xcache in the repository, so it can be installed using apt, without the need to built it locally.
The results were very encouraging: total page execution was 288.29 ms, and the drupal_load() was only 66.81 ms.
Example:
Executed 175 queries in 31.03 milliseconds. Page execution time was 288.29 ms.
We now know that APC is indeed the culprit.
Back to APC
But why do the APC settings not improve performance. Something else is at play here.
So, we decided to change APC to use 64 MB of shared memory instead of the default 30 MB. APC uses this memory to cache the parsed and tokenized scripts for later calls. This was done by adding:
apc.shm_size = 64
The results were amazing: total page execution was 253.77 ms, and out of that, only 37.38 ms were from drupal_load()!
Example:
Executed 175 queries in 37.48 milliseconds. Page execution time was 253.77 ms.
This is faster than Xcache, as expected.
No doubt the number of modules uses overflowed the default 30MB, and hence caching was next to useless. Increasing that solved the problem.
What is strange is that removing apc.stat and apc.include_once_override did not seem to have any negative effect on drupal_load()'s performance.
Conclusion
Based on the above tests, we can make the following conclusions:
- We have a happy client, since page generation times are now down from 1150 ms to 254 ms.
- If in doubt, use the apc.php administration front end for APC to know exactly how much of the allocated shared memory you are using.
- Experiment with apc.stat and apc.include_once_override to see if they help in your setup.
- Xcache is a viable option, specially if you find installing APC daunting.
Update
Article updated on Dec 1, 2007.
- The patch to drupal_load() has been modified to print the running total. Output from both cases can be seen below (700 ms vs. 38 ms).
- Below, you will find screenshots from apc.php. Fragmentation is what hurts performance, not the size of memory per se. But it also seems that fragmentation happens when memory is low (in this case, 38MB or less).
Results from drupal_load() with APC using 30 MB
Total is 700 ms.
aggregator: 10.35, total: 10.35 block: 5.21, total: 15.56 book: 6.42, total: 21.98 comment: 13.48, total: 35.46 filter: 10, total: 45.46 forum: 7.97, total: 53.43 help: 1.13, total: 54.56 menu: 5.18, total: 59.74 node: 19.82, total: 79.56 path: 2.57, total: 82.13 profile: 6.5, total: 88.63 search: 7.31, total: 95.94 system: 15.87, total: 111.81 taxonomy: 10.4, total: 122.21 upload: 6.27, total: 128.48 user: 20.93, total: 149.41 watchdog: 2.09, total: 151.5 akismet: 10.77, total: 162.27 [...]: 0.86, total: 163.13 calendar: 13.3, total: 176.43 calendar_ical: 3.31, total: 179.74 content: 6.82, total: 186.56 nodereference: 3.56, total: 190.12 number: 2.67, total: 192.79 optionwidgets: 1.41, total: 194.2 text: 2.43, total: 196.63 date: 14.67, total: 211.3 date_api: 0.67, total: 211.97 googleanalytics: 1.37, total: 213.34 image_attach: 2.71, total: 216.05 image_gallery: 3.38, total: 219.43 image: 6.82, total: 226.25 img_assist: 11.72, total: 237.97 link: 3.51, total: 241.48 panels: 5.82, total: 247.3 pathauto: 11.4, total: 258.7 persistent_login: 2.78, total: 261.48 blockcache: 2.35, total: 263.83 custom_breadcrumbs: 1.74, total: 265.57 [...]: 4.54, total: 270.11 ed_readmore: 1.51, total: 271.62 feedback: 3.81, total: 275.43 fivestar: 4.38, total: 279.81 flatforum: 0.81, total: 280.62 front_page: 2.75, total: 283.37 imagecache: 5.38, total: 288.75 imagecache_profiles: 0.8, total: 289.55 imagefield: 4.49, total: 294.04 imce: 8.74, total: 302.78 [...]: 5.32, total: 308.1 location_views: 6.95, total: 315.05 location: 84.15, total: 399.2 logintoboggan: 6.05, total: 405.25 menutrails: 1.86, total: 407.11 mail_edit: 1.8, total: 408.91 privatemsg: 18.84, total: 427.75 scheduler: 2.37, total: 430.12 search404: 135.56, total: 565.68 service_links: 3.56, total: 569.24 subscriptions: 8.72, total: 577.96 thickbox: 1.7, total: 579.66 timeline: 12.29, total: 591.95 views_alpha_pager: 2.07, total: 594.02 views_bookmark: 6.43, total: 600.45 views_filterblock: 27.55, total: 628 votingapi: 3.39, total: 631.39 simplemenu: 1.67, total: 633.06 tagadelic: 3.37, total: 636.43 tinymce: 11, total: 647.43 views_rss: 1.82, total: 649.25 views_theme_wizard: 6.23, total: 655.48 views_ui: 14.82, total: 670.3 views_bonus_grid: 1.04, total: 671.34 views_bonus_tag_cloud: 0.62, total: 671.96 auto_nodetitle: 1.17, total: 673.13 fieldgroup: 4.06, total: 677.19 token: 1.13, total: 678.32 views: 13.09, total: 691.41 devel: 9.72, total: 701.13
Results from drupal_load() with APC using 48MB
Total is 38 ms, vast improvement.
aggregator: 0.26, total: 0.26 block: 0.13, total: 0.39 book: 0.15, total: 0.54 comment: 0.27, total: 0.81 filter: 0.19, total: 1 forum: 0.2, total: 1.2 help: 0.07, total: 1.27 menu: 0.09, total: 1.36 node: 0.33, total: 1.69 path: 0.08, total: 1.77 profile: 0.17, total: 1.94 search: 0.2, total: 2.14 system: 0.21, total: 2.35 taxonomy: 0.29, total: 2.64 upload: 0.15, total: 2.79 user: 0.54, total: 3.33 watchdog: 0.1, total: 3.43 akismet: 0.28, total: 3.71 [...]: 0.07, total: 3.78 calendar: 0.15, total: 3.93 calendar_ical: 0.09, total: 4.02 content: 0.12, total: 4.14 nodereference: 0.08, total: 4.22 number: 0.07, total: 4.29 optionwidgets: 0.06, total: 4.35 text: 0.07, total: 4.42 date: 0.24, total: 4.66 date_api: 0.14, total: 4.8 googleanalytics: 0.07, total: 4.87 image_attach: 0.1, total: 4.97 image_gallery: 0.12, total: 5.09 image: 0.17, total: 5.26 img_assist: 0.22, total: 5.48 link: 0.1, total: 5.58 panels: 0.12, total: 5.7 pathauto: 0.38, total: 6.08 persistent_login: 0.16, total: 6.24 blockcache: 0.12, total: 6.36 custom_breadcrumbs: 0.09, total: 6.45 [...]: 0.1, total: 6.55 ed_readmore: 0.09, total: 6.64 feedback: 0.11, total: 6.75 fivestar: 0.1, total: 6.85 flatforum: 0.08, total: 6.93 front_page: 0.07, total: 7 imagecache: 0.11, total: 7.11 imagecache_profiles: 0.17, total: 7.28 imagefield: 0.22, total: 7.5 imce: 0.29, total: 7.79 [...]: 1.27, total: 9.06 location_views: 0.24, total: 9.3 location: 20.82, total: 30.12 logintoboggan: 0.22, total: 30.34 menutrails: 0.15, total: 30.49 mail_edit: 0.16, total: 30.65 privatemsg: 0.41, total: 31.06 scheduler: 0.15, total: 31.21 search404: 0.14, total: 31.35 service_links: 0.16, total: 31.51 subscriptions: 0.24, total: 31.75 thickbox: 0.15, total: 31.9 timeline: 0.72, total: 32.62 views_alpha_pager: 0.19, total: 32.81 views_bookmark: 0.21, total: 33.02 views_filterblock: 2.42, total: 35.44 votingapi: 0.19, total: 35.63 simplemenu: 0.14, total: 35.77 tagadelic: 0.19, total: 35.96 tinymce: 0.17, total: 36.13 views_rss: 0.13, total: 36.26 views_theme_wizard: 0.2, total: 36.46 views_ui: 0.25, total: 36.71 views_bonus_grid: 0.13, total: 36.84 views_bonus_tag_cloud: 0.13, total: 36.97 auto_nodetitle: 0.14, total: 37.11 fieldgroup: 0.16, total: 37.27 token: 0.16, total: 37.43 views: 0.44, total: 37.87 devel: 0.32, total: 38.19
Screenshots of apc.php
The following screenshots are taken during the tests. They show that what really hurts is fragmentation.
APC, default settings (i.e. 30 MB shm_size). Note the fragmentation.
APC, shm_size set to 38. Fragmentation changes from one page to the other, but page time is always 800 ms to 1000 ms.
APC with 38MB, as above, but shows fragmentation. Performance is poor.
APC, shm_size set to 48. No fragmentation and page loading time is in the low 200s.
Other links
A blog post about APC memory fragmentation caused by the default shmmax parameter not being enough.
Comments
Khalid
Not observed
Fri, 2007/12/14 - 17:47I have not observed a similar behavior with eAccelerator.
Note that while eAccelerator sure has lower memory utilization per Apache process as well as better performance, but is plagued by more segmentation faults (crashes) specially with newer versions of PHP. APC is more stable and more maintained.
--
2bits -- Drupal and Backdrop CMS consulting
Visitor (not verified)
APC is more stable and more maintained.
Wed, 2007/12/26 - 14:36Hi, Khalid. I saw this article and install apc with my server but is not vast surprised like you said.
apc.stat = 0
apc.include_once_ovrride = 1
apc.shm_size = 48(or 64)
drual_load() execute detail data below:
1.29 aggregator total:1.29
1.55 block total:2.84
0.87 blog total:3.71
1.43 color total:5.14
18.83 comment total:23.97
2.17 filter total:26.14
1.63 forum total:27.77
0.99 help total:28.76
1 locale total:29.76
1.58 menu total:31.34
2.38 node total:33.72
1.94 path total:35.66
1.34 profile total:37
2.02 search total:39.02
0.98 statistics total:40
2.69 system total:42.69
3.58 taxonomy total:46.27
2.48 upload total:48.75
4.99 user total:53.74
19.88 watchdog total:73.62
2.59 advcache total:76.21
3.24 advuser total:79.45
3.09 archive total:82.54
2.25 author total:84.79
2.53 avatar_selection total:87.32
2.58 bbcode total:89.9
25.66 blogcore total:115.56
6.2 blogger total:121.76
2.89 bloginfo total:124.65
2.53 browse_member total:127.18
27.82 buddylist total:155
3.16 captcha total:158.16
3.61 image_captcha total:161.77
3.36 text_captcha total:165.13
2.07 commentmail total:167.2
40.44 comment_list total:207.64
9.71 favorite total:217.35
20.36 flag_content total:237.71
3.72 forumblock total:241.43
3.46 forum_category total:244.89
3.12 forum_feature total:248.01
2.35 guest total:250.36
3.35 guestbook total:253.71
3.93 imce total:257.64
3.3 invite total:260.94
2.97 legal total:263.91
20.64 loginstatus total:284.55
2.71 login_destination total:287.26
2.68 memcache total:289.94
2.93 mimemail total:292.87
4.4 privatemsg total:297.27
3.46 search_config total:300.73
9.75 send total:310.48
19.32 solblock total:329.8
3.01 solchat total:332.81
2.59 solfm total:335.4
3.93 solmap total:339.33
3.64 solmoderator total:342.97
3.03 solredirect total:346
2.3 solstatistics total:348.3
2.63 soltag total:350.93
3.81 soluser total:354.74
20.31 sol_customcategory total:375.05
3.34 sol_report total:378.39
2.28 sol_switchtemplate total:380.67
3.13 spotlight total:383.8
3.56 troll total:387.36
2.73 troll_ban_ip total:390.09
3.4 userlink total:393.49
20.9 views_ui total:414.39
20.91 webfm total:435.3
2.88 pathauto total:438.18
3.33 jscalendar total:441.51
2.25 token total:443.76
4.19 views total:447.95
3.75 devel total:451.7
Total:451~~~,I don't understand why? can you give me some suggestion?
Khalid
WSOD on low memory
Thu, 2008/01/03 - 09:58Frederic Marand has an interesting, possibly related, case where a VPS with low memory suffers White Screen of Death (WSOD) caused by APC segmentation faults, when the cache fills up and is purged.
--
2bits -- Drupal and Backdrop CMS consulting
Visitor (not verified)
reducing memory use by core modules
Sat, 2008/01/05 - 11:23the modules here: http://wtanaka.com/drupal may be useful to some people for reducing Drupal's memory usage
Khalid
Contribute them?
Sat, 2008/01/05 - 11:52This is very helpful, but also strange.
Why not contribute the core ones (taxonomy, system, watchdog, ...etc.) as patches to their respective core module, so everyone benefits, rather than running a forked version?
Also, see my comment on Drupal 6.x and the use of accelerators.
--
2bits -- Drupal and Backdrop CMS consulting
Visitor (not verified)
Contributing Large Patches to Core
Thu, 2008/01/10 - 03:57I would warmly welcome these changes being folded into core.
But would a diff of such size be accepted into Drupal 5?
Khalid
No, but still useful
Thu, 2008/01/10 - 10:08Well, it would not be accepted now, because of two reasons.
One is that Drupal 5 is in maintenance mode, where drastic features are not included.
Two is that Drupal 6 has a different scheme for splitting modules, so why would Drupal 5 use a different potentially incompatible scheme?
If you have contributed that early in the Drupal 6 cycle, or took up the task of splitting the Drupal 6 after the menu changes came in, your would would have been in base Drupal 6 now.
But should the stop you from creating an issue on Drupal.org and attaching your patches? No. Just create an issue and attach the diff files to it, describing what you have done and why. I did the same for the Drupal 5.1 backport of watchdog hook for custom logging and alerts via module, and although it never got in Drupal 5 (nor should it have), others took it and integrated in their distributions (e.g. Bryght).
--
2bits -- Drupal and Backdrop CMS consulting
Visitor (not verified)
Do you see any reason NOT to
Mon, 2009/01/26 - 13:10Do you see any reason NOT to replace include_once() with include() in drupal_load in bootstrap.inc?
I'm seeing a slight improvement in page load times and no (obvious) failures yet after doing so.
I too am seeing huge load times in drupal_load, but I think this must be due to the underlying file system (nfs).
Khalid
Has been proposed
Mon, 2009/01/26 - 13:20Yes, it has been proposed by someone (chx?).
How much of an improvement there is (i.e. milliseconds before/after).
Visitor (not verified)
My ballpark
Mon, 2009/01/26 - 14:31My ballpark observations:
With include_once() : 1000 - 1700 ms, average ~1400 ms
With include() : 650 - 1000 ms, average ~850 ms
Not a bad improvement in my particular case. Needs more testing.
Pages