How Drupal's node_access table can negatively impact site performance

Recently, a client approached 2bits.com for a consulting on performance.

Introduction: a popular site is a victim to its success

The site had 13,000+ nodes, with almost 12,000 of them being forum posts.

+------------+-------+
| type       | cnt   |
+------------+-------+
| forum      | 11994 |
| image      |   351 |
| weblink    |   269 |
| feed_item  |   153 |
| story      |   105 |
| resume     |    77 |
| poll       |    47 |
| blog       |    36 |
| page       |    26 |
| faq        |    24 |
| job        |    23 |
| chatroom   |     3 |
| simplenews |     2 |
| feed       |     1 |
+------------+-------+ 

The site uses the job search module with the resume access and job access modules to keep things private. Moreover, they have the taxonomy access control module to restrict some premium content to permium members.

+--------------+-------+
| realm        | cnt   |
+--------------+-------+
| term_access  | 32209 |
| resume_view  |    51 |
| resume_owner |    51 |
| job_view     |    19 |
+--------------+-------+

As you can see, the number of rows related to resumes and jobs is not the issue, but the number of rows from taxonomy access is where the bulk of the data is.

The main complaint was about the forum's main page. It took more than 11 seconds to generate the page for about 20 forums.

Executed 283 queries in 11515.54 milliseconds. Page execution time was 11662.6 ms. 

Diagnosis: identifying the problem

First monitoring the server, then using the devel module, we were able toidentify the problem. The heavy queries were as follows, each preceded with the number of milliseconds it took to generate each query:

683.98 1 forum_get_forums SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 1 AND na.realm = 'job_view') OR (na.gid = 11284 AND na.realm = 'resume_owner') OR (na.gid = 2 AND na.realm = 'term_access'))) AND ( n.status = 1 AND n.type='forum' AND tn.tid = 26 ) ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

683.96 1 forum_get_forums SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 1 AND na.realm = 'job_view') OR (na.gid = 11284 AND na.realm = 'resume_owner') OR (na.gid = 2 AND na.realm = 'term_access'))) AND ( n.status = 1 AND n.type='forum' AND tn.tid = 28 ) ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

673.34 1 forum_get_forums SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 1 AND na.realm = 'job_view') OR (na.gid = 11284 AND na.realm = 'resume_owner') OR (na.gid = 2 AND na.realm = 'term_access'))) AND ( n.status = 1 AND n.type='forum' AND tn.tid = 5 ) ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

673.15 1 forum_get_forums SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 1 AND na.realm = 'job_view') OR (na.gid = 11284 AND na.realm = 'resume_owner') OR (na.gid = 2 AND na.realm = 'term_access'))) AND ( n.status = 1 AND n.type='forum' AND tn.tid = 27 ) ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

672.71 1 forum_get_forums SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 1 AND na.realm = 'job_view') OR (na.gid = 11284 AND na.realm = 'resume_owner') OR (na.gid = 2 AND na.realm = 'term_access'))) AND ( n.status = 1 AND n.type='forum' AND tn.tid = 6 ) ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

671.58 1 forum_get_forums SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 1 AND na.realm = 'job_view') OR (na.gid = 11284 AND na.realm = 'resume_owner') OR (na.gid = 2 AND na.realm = 'term_access'))) AND ( n.status = 1 AND n.type='forum' AND tn.tid = 7 ) ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

671.41 1 forum_get_forums SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 1 AND na.realm = 'job_view') OR (na.gid = 11284 AND na.realm = 'resume_owner') OR (na.gid = 2 AND na.realm = 'term_access'))) AND ( n.status = 1 AND n.type='forum' AND tn.tid = 34 ) ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

671.31 1 forum_get_forums SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 1 AND na.realm = 'job_view') OR (na.gid = 11284 AND na.realm = 'resume_owner') OR (na.gid = 2 AND na.realm = 'term_access'))) AND ( n.status = 1 AND n.type='forum' AND tn.tid = 8 ) ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

671 1 forum_get_forums SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 1 AND na.realm = 'job_view') OR (na.gid = 11284 AND na.realm = 'resume_owner') OR (na.gid = 2 AND na.realm = 'term_access'))) AND ( n.status = 1 AND n.type='forum' AND tn.tid = 4 ) ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

670.71 1 forum_get_forums SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 1 AND na.realm = 'job_view') OR (na.gid = 11284 AND na.realm = 'resume_owner') OR (na.gid = 2 AND na.realm = 'term_access'))) AND ( n.status = 1 AND n.type='forum' AND tn.tid = 3 ) ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

The problem here is that for each of the above queries, the node_access table was being joined, and since it has 32,000 rows, together with the other tables in the query, this caused a temporary table and a filesort operation to be performance, which is very slow.

Since the forums are the most active part of the site, this was of course slowing down the entire site and having a bad user experience.

Solution: Bybass the node_access table

The solution to this problem was a bit unconventional. Since all the forums are public, joining the node_access table would not add any value for forum posts. So, the core forum module was rewritten to bypass the node_access joins entirely, while keeping the new module compatible with the database tables and settings of the core forum module.

This showed immediate improvement: From 11,000 ms, we are down to 1300 only. 

Executed 235 queries in 1221.66 milliseconds. Page execution time was 1345.35 ms. 

The queries are much lighter on the server now:

389.93  1  xforum_get_forums  SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid WHERE n.status = 1 AND n.type='forum' AND tn.tid = 26 ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

50.44  1  xforum_get_forums  SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid WHERE n.status = 1 AND n.type='forum' AND tn.tid = 7 ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

44.59  1  xforum_get_forums  SELECT ncs.last_comment_timestamp, IF (ncs.last_comment_uid != 0, u2.name, ncs.last_comment_name) AS last_comment_name, ncs.last_comment_uid FROM node n INNER JOIN users u1 ON n.uid = u1.uid INNER JOIN term_node tn ON n.nid = tn.nid INNER JOIN node_comment_statistics ncs ON n.nid = ncs.nid INNER JOIN users u2 ON ncs.last_comment_uid=u2.uid WHERE n.status = 1 AND n.type='forum' AND tn.tid = 28 ORDER BY ncs.last_comment_timestamp DESC LIMIT 0, 1

So, no wonder you have node access as #8 on the community's wish list of improvements for Drupal 7. Not only do we need to make it easier to develop for, and use, but also we need it to be efficient.

Contents: 

Comments

bypass the forum module

Personally, I would have written this article as 'bypass the forum module'. It has some awful queries with or without node access. The suckage here can be blamed on both systems. I have seen node_access scale well with non forum applications.

True

True.

Node access scales well when modules using it do not overuse it. Seems that taxonomy access control is one of those who overuse it. Job/Resume access do not.

Forum has some bad queries in it. Coupled with tracker's queries, it bogs sites down. Drupal.org itself suffered from both.

Some of that is because of features, e.g. showing number of new comments, new posts, ...etc.

So, either we reduce the features or find a way to preserve them with less load on the database.

Not easy.
--
2bits -- Drupal consulting

Forum Module

I would be readily inclined to agree when 3 Drupal experts tell me the Forum module is a disaster, after all, I have no technical background and a relative newby to Drupal and MySQL.

The reason I struggle to accept this answer is the fact that about 3 weeks ago I was running the forums (virtually same # of nodes, same traffic) on a weaker server (2 AMD cores, 4GB or RAM) with 1-2 second page loads and posts/writes to the database.

This tells me that something I did (or one of my web developers did) messed up the tables in my DB. Whether that was installing Forum Access and ACL, xml sitemap, FeedAPI / simplefeed is beyond me. What it does feel like is that even after Forum Access and ACL were uninstalled and Permissions Rebuilt, the access table and subsequent MySQL queries were significantly more complex than they previously were...almost as if there was another filter / access table in front of / on top of the regular node access table that was magnifying or multiplying the query complexity.

Fast forward to today and I am on 8 Optiron CPUs with 4GB or RAM and no solution in site. The "server load" is manageable for now, but the page loads on the forums page http://www.wallstreetoasis.com/forums is still usually more than 5 seconds and the track tab on the profile pages takes usually more than 10 seconds to load -- I truly think this is a sympton of the problem with the access tables, not the cause of the problem.

Khalid's creative solutions made the site usable, but I somehow still feel like I am running on a broken engine with turbo boosters attached (all the caching going on). The question is: is the Forum Module the Broken engine, or a table in the DB? I would put my money on a table in the DB.

Anyways, not good to be a case study for shitty forum performance :)

Any other thoughts you could share to potentially help optimize the tables or indexing would be welcomed with open arms.

Thanks guys,
Patrick

Indeed

Indeed I did. Thanks, Khalid!

Hey Patric, you might want

Hey Patric, you might want to try disabling XML Sitemap and watch performance fly......

What about doing what we did with cache?

It would be very interesting to try splitting the node_access table up by content type (i.e. node_access_forum, node_access_blog etc). This would complicate the management (and rewriting) a bit, but would give the double benefit of both shorter tables (leading to often non-linear performance increases) and also ensuring that one content type does not suffer because another content type (with a lot of nodes) is using node access.

Of course, more generalized improvements in node_access are welcome too :)

OK.. it's pretty much agreed

OK.. it's pretty much agreed that the forum module leaves a lot to be desired.

so what are my options?

i don't need forum posters to have the traditional drupal user functionality for the posting main content / sections of my site. that's reserved for admins, article writers etc.

however, i DO want a secure forum that forces posters to register. the forum users don't have to be "drupal users" as long as i have some control over the posting process. i also want the forum to be "integrated" into my site so that it looks like it's part of the site event hough it may be a separate php based forum.

1) what Open source forum would one recommend in place of the forum module?

2) is it easy to integrate into an existing drupal site both from a look and an url path approach (e.g making it play nicely with the rest of the site's drupal index.php with clean urls)?

any links or tips welcome
thanks

node_access is detrimental to Drupal's long term survival

I tried to create workable alternatives to node_access without destroying the system in D6, but I was stonewalled. In fact there was a belief that utilizing the node_access system even *more* was the direction to go, and I couldn't get anyone to seem to care that that particular join can cause very negative performance problems.

We'll have to see what happens in D7. This is a big deal to me.

Looking forward to it

Earl

You did a great job with Drupal 5 when node access modules can coexist together with less suckage than what used to be in 4.7.

As I said in the other comments, some modules over use node access. Perhaps that is one of the issues?

It is a tough one though. Can't think of an easy way to make the permissions granular enough yet not slow for performance.

I am looking forward to your node access changes for D7.
--
2bits -- Drupal consulting