Presentation: 2.8 Million page views per day, 70 million per month, one server!

The session that I proposed for DrupalCon San Francisco 2010, "2.4 million page views per day, 60 million pages per month, one server", was accepted, and I just finished giving it.

Note that from the time we proposed the talk, until we actually delivered it, the site hit new records (2.8 million page views per day, 70 million page views per month).

Here are the slides from the session, which I designed to be useful on their own, even for those who did not attend.

The video of the slideshow, with the soundtrack, is now uploaded, and you can watch it below:


Comments

Good stuff, thanks. So all

Good stuff, thanks. So all tables are MyISAM except the three you listed (users, sessions, votingapi_vote), which are InnoDB?

Selective use of InnoDB

For this particular site, yes: those are the only InnoDB tables, and all the other tables are MyISAM.

For other sites, more tables need to be InnoDB, for example, another site had the following:

comments
node
node_comment_statistics
node_counter
sessions
term_node
url_alias
users
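For anyone who wants to try the same selective approach, here is a minimal sketch that generates the MySQL statements for converting a chosen list of tables. The helper name `engine_statements` is hypothetical, the table lists are simply the ones mentioned in this thread, and it assumes no table prefix is in use. It only prints the statements; it does not connect to a database.

```python
# Sketch only: builds ALTER TABLE statements for a list of tables.
# The helper name and the variable names are assumptions made for this example.

# Tables that are InnoDB on the site from the presentation.
INNODB_TABLES_SITE_A = ["users", "sessions", "votingapi_vote"]

# Tables that are InnoDB on the other site mentioned above.
INNODB_TABLES_SITE_B = [
    "comments",
    "node",
    "node_comment_statistics",
    "node_counter",
    "sessions",
    "term_node",
    "url_alias",
    "users",
]


def engine_statements(tables, engine="InnoDB"):
    """Return one ALTER TABLE statement per table name."""
    return [f"ALTER TABLE `{name}` ENGINE={engine};" for name in tables]


if __name__ == "__main__":
    # Print the statements so they can be reviewed before running them.
    for statement in engine_statements(INNODB_TABLES_SITE_B):
        print(statement)
```

Changing the engine rebuilds the table, so on a large table it is worth trying this on a copy of the database first.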

Why do you plan to use

Why do you plan to use Varnish instead of nginx as the frontend reverse proxy? And more importantly, why didn't you plan to use it from the beginning?

About the website, is its layout complex? I can read from your graphs that each page uses 100 KB on average, which is not much for an "entertainment" website.

Well, a small Drupal site is fast, and scalable. But a large one is totally different. Great work and thank you.

We did not plan to use

We did not plan to use Varnish from the beginning simply because it was not needed. We still do not need it at present, but we will be ready with it when we do.

For nginx (and lighttpd, ...etc.), we basically like to stay with the version that is in the repository of the distro we are using, to make upgrades simple, and minimal. The nginx version in Ubuntu 8.04 LTS is old. Perhaps 10.04 will have a better version. So we don't compile things from source unless there is no other way. Simplicity is the key here.

The layout is based on Marinelli, with one column on the right hand side, and a main content column. No panels or anything like that. The front page size is around 200K.

Why custom module instead of CCK?

I was wondering what the performance hit is for using CCK vs. creating a custom module. Did you go with a custom module simply to help reduce the number of total modules needed, or does CCK slow down a high-traffic Drupal site that much?

Thanks,
Jason

Simplicity

The main goal was simplicity, which leads to maintainability. This means less stuff to keep up to date, and less stuff to worry about when upgrading the site. Also, less code to load for each page request and fewer hooks firing.

It was not an "Aha! that is what is making the site slow", rather an example of how to simplify: for 3 fields for one content type, it was not worth having so many modules around.

Figures have increased

As I commented on the Lullabot podcast, the figures have increased over the summer. They are:

Peak daily numbers:
3.42 million page views per day
839.9K visits per day

Monthly figures:
92.4 million page views per month.
22.8 million visits per month.

Most days are above 3 million page views, and the lowest traffic day over the last 30 days is 2.7 million.

Still on the single server that I presented back in April, and doing well.
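To put these figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes a 30-day month and spreads traffic evenly over the day, so real peak-hour rates will be higher than the average it shows. Page views are also not the same thing as server requests.

```python
# Rough arithmetic on the figures quoted above.
# Assumptions: a 30-day month, and traffic spread evenly over 24 hours.

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_IN_MONTH = 30  # assumption, not a figure from the site

peak_daily_page_views = 3_420_000
monthly_page_views = 92_400_000
monthly_visits = 22_800_000

# Average page views per second on the peak day.
avg_rate_peak_day = peak_daily_page_views / SECONDS_PER_DAY

# Average page views per day over the month.
avg_daily_page_views = monthly_page_views / DAYS_IN_MONTH

# Average page views per visit over the month.
pages_per_visit = monthly_page_views / monthly_visits

print(f"Peak day average: {avg_rate_peak_day:.1f} page views per second")
print(f"Monthly average: {avg_daily_page_views:,.0f} page views per day")
print(f"Page views per visit: {pages_per_visit:.2f}")
```

So the peak day works out to roughly 40 page views per second on average, and the monthly average of about 3.08 million per day is consistent with most days being above 3 million.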

Works for me

Works for me, and I am on Linux. You are the only one who reported a problem. Must be something on your end.

Seriously? It's not a big deal

Of all my development projects, a few still serve over 2.5 million pages per day. The most recent is on .NET 3.5, on a single server! That includes scheduled tasks and mail. If you know how to program well and efficiently, I don't see why this topic is presentation-worthy. My most recent project is on the upward ramp of a viral campaign for a television show. I expect to experience more than 3.5 million hits per day in the next week. Yes, single server.

Uncool dude. Your presentation is bragging about nothing.

Been there already

We already surpassed 3 million page views a day.

Check out our recent presentation on 3.4 million page views a day, 92 million a month, one server and Drupal. This is organic growth, not just a traffic spike.

You, having come from the proprietary .NET background, do not know how knowledge sharing works in the open source universe.

For a taste of how others find these presentations beneficial, check out the comments they left here and there.

An armchair critic belittles the efforts of others. Let us see how you share the knowledge you learn.

I value you.

Khalid,

I watched your presentation and then read your reply to this thread. I really appreciate all the work you have done for the open source community, how you explain yourself and your methods so clearly, and how you respond to subterfuge so well.

I have sent a request for a quote on a proprietary site I need developed and I sincerely hope you email me back.

Thanks again,

Cory

Good and bad

There are some very interesting points in this presentation, but in the end I don't see the point. If you are going to make a custom module instead of using CCK, not use Panels, and maybe use Views, there is really no point in using Drupal then; just use WordPress and you have all the same with better UR, more stability and less memory/CPU need. Or build something custom. The only reason to use Drupal is Views, Panels, and CCK. Everything else is just better on other systems, as they were not planned to support such complexity, which your client obviously doesn't need. Again, thanks for the good info on Drupal optimization :-)

Of course not

Of course we did not rewrite CCK, but since it was used just for a handful of fields for one content type, we opted to simplify the site and use a custom module to add these fields to that one content type and eliminate several modules in the process.

This approach is feasible in a subset of sites only, and not all sites. This case is one of them.

The site does use views.

What would be the reason to

What would be the reason to use Drupal then? I know that is not the main point of discussion here, but it really looks like they could use a much less complex system than Drupal and that way optimize CPU and memory use much further.