Wednesday, April 15, 2009

Clustering Drupal

We have been testing MySQL/Galera cluster with various benchmarks, and one exercise in our test plan is to measure clustering performance at the web application level. We picked Drupal as our first target application and composed a cluster from identical Drupal instances. Each Drupal node has a local MySQL database, and we cluster the databases with the Galera synchronous multi-master replication system. As a result, the effects of an HTTP request hitting any Drupal node are synchronously replicated to the whole cluster.

Alex wrote a detailed article about the benchmarking session. Here I present just an executive summary and go directly to the final results.

Test Platform
We tested the Drupal cluster on Amazon EC2 small instances. A small instance is not particularly suitable as a web platform because of its long latencies, but we got our baseline figures from this setup and decided to run higher-end tests later on.

Running the test shows a strong imbalance between Apache/Drupal and MySQL CPU usage. MySQL consumes just 5-10% of the CPU and the rest goes to Apache. Resource-wise, it would make sense to create a separate farm for the web servers and have a small MySQL cluster serving the farm. However, our test configuration has some advantages as well:
  • It is easy to set up, since each node is identical
  • Local MySQL gives a faster response to Drupal
  • It is possible to fall back to a single node

For the test session, we created clusters of 1-4 Drupal instances.

The Test
For testing, we used a JMeter test plan that runs three thread groups:
  1. Posters - create new pages in the system
  2. Commenters - read pages and add comments to the stories
  3. Browsers - just keep on reading pages in the system
The JMeter HTTP load goes through the glb load balancer to the Drupal cluster. Each HTTP request can hit any cluster node.


Results

Final results show quite linear scalability:

Nodes  Users  Request rate  Latency  Error rate
              (req/min)     (ms)     (%)
-----------------------------------------------
1      40     129           3950     0.07
2      80     259           3960     0.06
3      120    387           3700     0.05
4      160    514           3490     0.12



Throughput scales linearly up to four nodes; we did not test larger cluster sizes.
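To put a number on "linear", here is a minimal sketch that computes scaling efficiency from the figures in the results table, taking the single-node request rate as the baseline. The function name and the efficiency definition (measured rate divided by nodes times the one-node rate) are my own choices for illustration and were not part of the benchmark tooling.

```python
# Results copied from the table above:
# (nodes, users, request rate in req/min, latency in ms, error rate in %)
RESULTS = [
    (1, 40, 129, 3950, 0.07),
    (2, 80, 259, 3960, 0.06),
    (3, 120, 387, 3700, 0.05),
    (4, 160, 514, 3490, 0.12),
]


def scaling_efficiency(results):
    """Return {nodes: efficiency}, where 1.0 means perfectly linear scaling.

    Efficiency is the measured request rate divided by the rate we would
    expect if every node delivered the same throughput as the first row.
    """
    base_nodes, _, base_rate, _, _ = results[0]
    per_node_base = base_rate / base_nodes
    return {
        nodes: rate / (nodes * per_node_base)
        for nodes, _, rate, _, _ in results
    }


if __name__ == "__main__":
    for nodes, eff in scaling_efficiency(RESULTS).items():
        print(f"{nodes} node(s): efficiency {eff:.3f}")
```

With these numbers every cluster size stays within half a percent of the ideal value of 1.0, which is what the claim of linear scaling amounts to.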

Near Future
Alex has promised to continue the Drupal testing and run the tests on EC2 large instances to get reasonable latencies. Results from these experiments should appear in the near future.

We are also presenting Galera clustering at the Percona Performance Conference and can provide ad-hoc demonstrations there for anybody interested.