With the "Cloud" being in vogue currently, we see a lot of clients asking for cloud solutions, mostly Amazon AWS. Sadly, this is usually done without a proper evaluation of whether the cost is reasonable, or whether the technology is suitable for their specific needs.
Amazon AWS provides some unique and compelling features. Among those are: instant provisioning of virtual servers, billing for used resources only, ability to provision more instances on demand, a wide variety of instance types, and much more.
We certainly like Amazon AWS for development and testing work, and for specific use cases such as seasonal sites.
For most high traffic sites though, Amazon AWS can be overly expensive, and not performant enough.
Before you decide to use Amazon, spend some time studying the various Amazon instance types, and the pricing that will be incurred. You may be surprised!
Here is a case study of a client that was on Amazon until recently, when we moved them to a more custom setup, with great results.
The site was originally hosted at another big name hosting company, but unfortunately the site went down several times due to data center power issues there.
After moving to Amazon AWS, with the setup below, the site was a bit sluggish, and when traffic spikes happened, the setup could not cope with the increased load ...
Amazon AWS Setup
The setup relied on Amazon's Elastic Load Balancer (ELB) in front of the site.
Behind the load balancer, there were a total of 4 instances, varying in type.
First, there were two web servers, each one of them m1.large.
Another m1.small instance acted as the NFS server for both web servers.
Finally another m1.large instance housed the MySQL database.
- 2 x web servers (each m1.large)
- 1 x MySQL database server (m1.large)
- 1 x NFS server (m1.small)
The cost was high compared to the features: EC2 computing cost alone for the instances was around $920 per month.
Additionally, there were 331 million I/O requests costing $36 per month. ELB was an additional $21 per month.
Storage and bandwidth brought the total to $990 per month.
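To make the breakdown of that bill easier to see, here is a minimal sketch that simply adds up the figures quoted above. The storage and bandwidth amount is not stated separately in the original bill summary; it is inferred here as the remainder, and all amounts are approximate.

```python
# Monthly AWS charges quoted in the text, in US dollars (all approximate).
ec2_instances = 920   # two m1.large web, one m1.large MySQL, one m1.small NFS
io_requests = 36      # charge for 331 million I/O requests
elb = 21              # Elastic Load Balancer
total_bill = 990      # total, including storage and bandwidth

itemized = ec2_instances + io_requests + elb

# Inferred as the remainder; the text does not give this figure directly.
storage_and_bandwidth = total_bill - itemized

# Effective rate per million I/O requests, derived from the two quoted figures.
io_rate_per_million = io_requests / 331

print(f"Itemized charges:        ${itemized}")
print(f"Storage and bandwidth:   ${storage_and_bandwidth} (inferred)")
print(f"EC2 share of total bill: {ec2_instances / total_bill:.0%}")
print(f"I/O cost per million:    ${io_rate_per_million:.3f}")
```

The instances themselves account for the overwhelming majority of the bill, which is why the choice of instance types matters so much.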
The drawbacks of such a setup are many:
First, there is complexity: there are many components here, and each required patching with security updates, monitoring of disk space and performance, and other administration tasks.
Second, it was not fault tolerant. The only redundant part was the web tier, with two servers present. If either the database server or the NFS server crashed, the entire setup would stop serving pages.
Third, the cost was too high: a single well configured server can be had for roughly half the cost.
Fourth, Amazon's ELB forces the "www." prefix for the site, which is not a big deal for most sites, but some want to be known without that prefix.
Fifth, the performance was not up to par. The site was sluggish most of the time.
Finally, the setup was not able to handle traffic spikes adequately.
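The fault tolerance point can be illustrated with a small sketch. It models the setup under a simplifying assumption: the site serves pages only if at least one web server is running and both the MySQL and NFS servers are running. The component names are made up for illustration, and the ELB itself is left out of the model.

```python
# Components of the AWS setup described in the text (names are illustrative).
COMPONENTS = ["web1", "web2", "mysql", "nfs"]


def site_is_up(failed):
    """Return True if pages can still be served with the given components down.

    Assumed model: at least one web server must be running, and both the
    MySQL server and the NFS server must be running.
    """
    failed = set(failed)
    web_ok = not {"web1", "web2"} <= failed  # False only if both web servers are down
    return web_ok and "mysql" not in failed and "nfs" not in failed


# A single point of failure is any component whose loss alone takes the site down.
single_points_of_failure = [c for c in COMPONENTS if not site_is_up({c})]

print(single_points_of_failure)
```

Under this model, losing either web server is survivable, but the database server and the NFS server are each a single point of failure, so four instances bought redundancy for only one tier.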
After doing a full Drupal site performance assessment, 2bits.com recommended and implemented a new setup consisting of a single medium sized dedicated server for $536 per month.
The server has a quad core Xeon E5620 at 2.4GHz, 8GB of RAM, and 4 disks configured as two mirrored pairs.
We then did a full server installation, configuration, and tuning of the entire LAMP stack from scratch, on our recommended Ubuntu 8.04 LTS Server Edition. Instead of using Varnish, we used only memcached.
The setup is mostly like what we did for another much higher traffic site. You can read the details in our article "2.8 million page views per day: 70 million per month: one server!"
After consulting with the client, we recommended going a step further and using the remaining budget to implement a near fault tolerant setup with a second server. The second server is in another data center, and its monthly cost is $464.
The result is that we now have a satisfied client, happy with the new setup that is free of the headaches that they used to face when traffic spikes happened.
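As a quick check on the budget, the following sketch compares the monthly figures quoted in this article: the previous AWS total of $990, the $536 dedicated server, and the $464 second server. Nothing here goes beyond simple arithmetic on those numbers.

```python
# Monthly costs quoted in the article, in US dollars.
aws_total = 990          # previous AWS setup
primary_server = 536     # single dedicated server
secondary_server = 464   # second server in another data center

single_server_savings = aws_total - primary_server
two_server_total = primary_server + secondary_server

print(f"Single server: ${primary_server} per month, "
      f"${single_server_savings} less than AWS "
      f"({primary_server / aws_total:.0%} of the AWS bill)")
print(f"Two servers:   ${two_server_total} per month, "
      f"${two_server_total - aws_total} more than AWS")
```

In other words, the two server setup in separate data centers costs about the same as the old AWS cluster did.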
Here are the graphs showing a large spike in July 2010.
Two traffic spikes in Google Analytics. The traffic shot up from the normal 81,000 to 83,000 page views per day, to 244,000. The spike on July 12th was 179,000 page views.
Apache accesses per second, by day
Apache volume per second, by day
CPU utilization per day, no noticeable spike
Memcache utilization, showing that it took most of the load
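To put the Google Analytics figures in perspective, here is a rough sketch that expresses the two spikes as multiples of normal traffic. The per second figure is only a daily average of page views. It is not a peak rate, and it is not the same as the Apache accesses shown in the graphs, since one page view typically triggers several requests.

```python
# Page views per day, as reported by Google Analytics in the text.
normal_low, normal_high = 81_000, 83_000
peak_spike = 244_000
july_12_spike = 179_000

SECONDS_PER_DAY = 24 * 60 * 60


def multiple_of_normal(page_views):
    """Return (low, high) estimates of how many times normal traffic this is."""
    return page_views / normal_high, page_views / normal_low


for label, views in [("Peak spike", peak_spike), ("July 12th", july_12_spike)]:
    low, high = multiple_of_normal(views)
    avg_per_second = views / SECONDS_PER_DAY  # daily average, not a peak rate
    print(f"{label}: {low:.2f}x to {high:.2f}x normal, "
          f"about {avg_per_second:.2f} page views per second on average")
```

The larger spike was roughly three times the normal daily traffic, which the single server absorbed without a noticeable change in CPU utilization.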
Impact on Google Site Speed Metric
Some time back Google started tracking site speed in Analytics.
Here is one Drupal site that moved from a single dedicated server to an Amazon AWS cluster. The same code base is now slower in "the cloud". The site's Server Response Time was around 250 milliseconds. After the switch on April 25th, that response time doubled to 500 milliseconds, with spikes of nearly 1 second.
There are lots of good lessons learned from this case study:
A performance assessment is valuable before deciding what hosting to use. This will give a baseline and reveal any bottlenecks that your site may have.
In most cases, we advocate simplicity over complexity. Start simple, then add complexity when and where needed.
Try to make the most of vertical scaling, before you go horizontal.
Amazon AWS is great for development and specific use cases. It may not be your most cost effective option for high traffic sites though.
Memcache, used properly, will get you far on your journey to scalability and performance.
There are lots of links on the web about Amazon AWS and hosting LAMP on it.
Here are a select few recent Drupal specific presentations and podcasts:
- Barry Jaspan's DrupalCon San Francisco 2010 session on the challenges of hosting Drupal on Amazon AWS
- Lullabot podcast with Barry Jaspan on the same topic
- Acquia Amazon Web Services building blocks for Drupal applications and hosting