Tuesday, April 27, 2010

From Behind The Ash Curtain

Like many other MySQL conference visitors, the Codership team was stranded in the Bay Area by the volcano eruption in Iceland. For us, however, the extra time in Silicon Valley was not an issue: we were working on local assignments anyway, and staying put let us work with real focus during this period.

The "evacuation" from SFO also worked out really well for us, thanks to KLM and Air France. We simply went to SFO on Thursday (the 22nd) and got flights for the next morning. I actually got two bookings, which was a little embarrassing. There were many empty seats, and the CDG-HEL leg in particular was practically empty. The ash refugees had already left by train, I guess.
The return from SFO came somewhat too early for us, as we had holiday plans for the coming weekend and had to cancel the fun part. So this trip ended up as all work and no joy for us...

The conference itself was fun and very interesting. We were busy first preparing our presentation, then giving it, and finally sorting out the misconceptions it caused. But it was fun altogether, and Galera got a lot of attention there. We had many interesting conversations about Galera and replication strategies in general.
To my disappointment, the expo hall was half empty. I'm not sure this can be profitable, and it makes me wonder about the future of this conference.

On Friday the 16th, there were both Drizzle and MariaDB conferences, which we planned to visit but were too busy to catch. We wanted to sort out the differences between the Drizzle replication API and the wsrep API in great detail, but this work must be postponed a little.
We did, however, enter the MariaDB conference just as it was closing, and got into a short replication discussion with Kristian Nielsen, Sergei Golubchik and Paul McCullagh. MontyProgram is driving the replication API design and implementation, and we don't need to work inside the MariaDB code base. Codership will just provide the Galera plugin for the end solution.

Paul brought to my attention an interesting fact about PBXT: rollback is a low-effort operation with PBXT. This is very favorable for Galera replication (and optimistic concurrency control in general). Odds are that PBXT will scale better in a Galera cluster, even with hot-spot workloads.

Slides of our presentation are available here: http://en.oreilly.com/mysql2010/public/schedule/detail/13286

Saturday, March 20, 2010

Codership Visit in O'Reilly MySQL Conference


O'Reilly MySQL Conference & Expo 2010
I will be presenting Galera replication at the O'Reilly MySQL Conference & Expo on April 14. Here is the link to the presentation abstract:
Galera - Synchronous Multi-master Replication For InnoDB.
The presentation will be given jointly with Alexey Yurchenko and will focus mostly on the practicalities of managing a Galera cluster, such as:

  • How to download and install a MySQL/Galera cluster

  • Configuration options, clustering use cases and topologies

  • Managing a Galera cluster, joining node(s) to the cluster, backups, etc.

  • Application connectivity options

  • Monitoring Galera cluster, troubleshooting best practices


In a nutshell, this presentation will give you all you need to know to start using a Galera cluster as your application's data redundancy solution.
We plan to arrange a somewhat extended trip to the Bay Area, so we will have spare time for ad-hoc meetings. If you would like to have a f2f meeting with the Codership team, just get in touch. We can arrange Galera demonstrations / presentations, or, e.g., analyze your use case in detail, showing how Galera works for your application load.

See you in Santa Clara!

Thursday, February 11, 2010

For Those About to Galera - We Salute You!

Two new Galera presentations are available for download. The first is from the ever-famous FOSDEM 2010 conference in Brussels, where I visited the MySQL Developers' room and presented a 20-minute overview of the Galera project. This presentation contains a new 100% insert rate benchmark and synchronous WAN replication test results. Get your copy from here.

Yesterday, we presented Galera replication in a MySQL University session. The focus of this webinar is to describe our replication API (wsrep API) and our patch to the MySQL source code that supports the API. This is a MySQL-oriented and quite technical presentation. The session was recorded and you can play it on the MySQL University site: Galera presentation. The plain presentation slides (without the disturbing narration) are also available on the Codership site.

Thanks for all the feedback! If your comment/question seems to be of public interest, don't hesitate to post it to our mailing list.
We are now working on the 0.7.3 release and will publish it in the very near future (we still hope to squeeze the MySQL 5.1.43 merge and "LOAD DATA LOCAL..." support into the package). To ease installation, we also have Debian and RPM packages coming soon. They are needed for our very first cloud image, which we are working on.

Thursday, January 14, 2010

MySQL/Galera 0.7.1 Released

MySQL/Galera release 0.7.1 ships out.

This is a maintenance release with fixes for 9 issues, listed on the Launchpad release page:

https://launchpad.net/codership-mysql/0.7/0.7.1

It makes sense to upgrade if you suffer from any of the above.

The most notable changes are perhaps the fixes for running DDL and DML queries concurrently. The MySQL version has also been bumped two notches up, to 5.1.41.

Prebuilt binary downloads are available, as usual, on the Launchpad site:
https://launchpad.net/codership-mysql/+download.
Take care to pick the latest 0.7.1 version, as Launchpad seems to give precedence to the old 0.7 release (no matter how hard I try to configure LP...).

Friday, December 11, 2009

Galera Author Interviewed by Himself

We just made a major software release, but I still don't see journalists queuing outside our office. Looks like I have to do the hard work and interview myself. In the following, I'll give myself the rough reporter treatment:

So, what are we talking about?
MySQL/Galera release 0.7 - a synchronous multi-master clustering solution for InnoDB.

Downloads? Where?
E.g. here: https://launchpad.net/codership-mysql

Support?
Sure, here: www.codership.com/services/consulting
But, can't you ask any longer questions?

Oh, sorry, I assumed that you geek people prefer not to talk in natural language. But what is this Galera thingie good for? For whom would you suggest this release?
Practically any InnoDB user can potentially benefit from MySQL/Galera. There are no unnecessary tweaks to MySQL behavior. Odds are good that your application will notice no difference compared with vanilla MySQL.

If high availability is your need, Galera provides that out of the box, due to synchronous replication. After committing, the data is safe in every active cluster node, simple as that.

And if more performance is needed, Galera can boost your data access considerably. Note that Galera scales even write-intensive workloads. However, hot spots are poison for this replication method. If the workload contains focused hot spots, the number of write-accepting masters should be reduced.
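
To illustrate why hot spots hurt, here is a toy sketch of a first-committer-wins certification test. It is only an illustration under simplifying assumptions, not Galera's actual implementation: the function name `certify`, the tuple layout and the key names are all made up for this example.

```python
def certify(write_sets):
    """Toy first-committer-wins certification (illustrative only).

    write_sets is a totally ordered list of (txn_id, snapshot_seqno, keys).
    A transaction fails certification if any key it wrote was committed
    by another transaction after the snapshot it started from.
    """
    last_commit = {}          # key -> seqno of the last committed writer
    committed, aborted = [], []
    seqno = 0
    for txn_id, snapshot, keys in write_sets:
        if any(last_commit.get(key, 0) > snapshot for key in keys):
            aborted.append(txn_id)      # conflict: must roll back
            continue
        seqno += 1
        for key in keys:
            last_commit[key] = seqno
        committed.append(txn_id)
    return committed, aborted


# Three concurrent transactions (all started from snapshot 0) on one hot row...
hot = [("t1", 0, {"counter"}), ("t2", 0, {"counter"}), ("t3", 0, {"counter"})]
# ...versus three concurrent transactions touching different rows.
spread = [("t1", 0, {"a"}), ("t2", 0, {"b"}), ("t3", 0, {"c"})]

print(certify(hot))     # (['t1'], ['t2', 't3'])
print(certify(spread))  # (['t1', 't2', 't3'], [])
```

With the hot row, only the first writer survives and the others must roll back and retry, which is why fewer write-accepting masters help in that case.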

Is it good for production, anything to worry about?
We tested this during a focused test session after the 0.7pre release, and we are quite happy with the stability. Two issues were postponed to a future maintenance release. There is an obvious issue when running DDL and DML concurrently in the cluster. That should be avoided, if it ever was in your plans.

But no matter how much we test in the laboratory, for production use it is essential to evaluate with the real application and with a test load that closely simulates production use.

How stable is it, can I go into the engine room and pull out cables wildly?
Yes! The 0.7 release was designed to be fault tolerant and can recover from most expected and unexpected situations. It tolerates even ad-hoc engine room visits.

Does it support the InnoDB plugin?
This build is based on MySQL 5.1.39, and the InnoDB plugin is in there. We have enabled the InnoDB plugin in the build and also did some compatibility tests with it. No issues surfaced, but our testing was quite minimal. E.g., no performance testing has been run with the plugin version.
By default, MySQL/Galera will start with the built-in innobase engine. There is a configuration sample in the distribution showing how the InnoDB plugin can be loaded, if you want to play with it.

Everybody is talking of this emerging MariaDB, any plans on supporting that?
Yes, plans and even actions. MariaDB version will be available here: https://launchpad.net/codership-maria

Everybody is talking of PostgreSQL, any plans on supporting that?
PostgreSQL has been on our roadmap from the very beginning. However, reality bites, and in practice MySQL development has eaten all our resources so far. We plan to get PostgreSQL development rolling in the near future, but it sure would help if some experienced PostgreSQL partner would join in this development.

Is this a cry for help, or what?
Yes

So, what's next?
Next on the schedule is maintenance release 0.7.1, ETA before the end of the year. It will mostly address issues in running DDL and DML concurrently. In general, the maintenance release cycle will be kept as short as possible.
The next major release will be 0.8, which has features for a considerably faster node join operation (currently we are limited by mysqldump speed...).
MariaDB porting will also continue with added effort. One more cup of coffee, and I will promise a MariaDB port in the December time frame.

Thanks! What are we?
You are welcome

And what was it?
It was a pleasure

Friday, May 15, 2009

MySQL/Galera Release 0.6

MySQL/Galera release 0.6 shipped out today.

MySQL/Galera is a synchronous multi-master clustering solution for the InnoDB storage engine, offering uncompromised performance and, thanks to its certification-based replication model, scalability even with write-intensive workloads.

We have tested MySQL/Galera 0.6 with a number of benchmarks. Here is a summary of a sysbench oltp benchmark run on clusters of 1-4 nodes of Amazon EC2 large instances: sysbench results. Scalability is remarkable here, and many other benchmarks show a similar performance gain.

The 0.6 release adds the following new features over the earlier Demo-2 release:

  • Merged with MySQL 5.1.33
  • Full DDL replication using "total order isolation" mode
  • Workaround for Drupal issue #282555. The fix simply retries the failed autoinc insert query
  • ...and some bug fixes to go
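
The retry idea behind the autoinc workaround can be sketched as follows. This is a generic illustration, not the actual patch: `TransientInsertError`, `insert_with_retry` and the retry limit are hypothetical names and values chosen for the example.

```python
class TransientInsertError(Exception):
    """Hypothetical error standing in for a failed auto-increment insert."""


def insert_with_retry(do_insert, max_attempts=3):
    """Call do_insert() until it succeeds or the attempts run out.

    Returns (result, attempts_used); re-raises the last error if every
    attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return do_insert(), attempt
        except TransientInsertError:
            if attempt == max_attempts:
                raise


# Usage: an insert that fails once and then succeeds on the retry.
calls = {"n": 0}

def flaky_insert():
    calls["n"] += 1
    if calls["n"] < 2:
        raise TransientInsertError("autoinc insert failed")
    return "row inserted"

print(insert_with_retry(flaky_insert))  # ('row inserted', 2)
```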


MySQL/Galera 0.6 is a binary Linux release (both 32- and 64-bit builds are available) and can be found at: Codership Downloads. This release has passed a number of feature and performance tests, e.g. with Drupal benchmarks.

You can evaluate MySQL/Galera 0.6 with minimal effort. Just install and configure MySQL/Galera on each node in your cluster. Then start the group communication daemon and all MySQL servers. The MySQL/Galera cluster is functional at this point: you can load your data into one cluster node, and it will replicate to the whole cluster. Then start your application and connect to any node(s). You can also put a load balancer in front to balance connections between the nodes. We have good experience with Galera Load Balancer (glb: Codership Downloads), but in practice any TCP-level load balancer will do as well.
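
As a rough sketch of what balancing connections between nodes means, the snippet below picks cluster nodes in round-robin order and skips nodes marked as down. It is not how glb works internally; the class name and the node addresses are hypothetical.

```python
import itertools


class RoundRobinPicker:
    """Pick cluster nodes in turn, skipping nodes marked as down."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._cycle = itertools.cycle(self._nodes)
        self._down = set()

    def mark_down(self, node):
        self._down.add(node)

    def mark_up(self, node):
        self._down.discard(node)

    def next_node(self):
        # Try each node at most once per call before giving up.
        for _ in range(len(self._nodes)):
            node = next(self._cycle)
            if node not in self._down:
                return node
        raise RuntimeError("no cluster node available")


# Hypothetical node addresses, for illustration only.
picker = RoundRobinPicker(["node1:3306", "node2:3306", "node3:3306"])
print([picker.next_node() for _ in range(4)])
# ['node1:3306', 'node2:3306', 'node3:3306', 'node1:3306']
```

Because every node accepts writes, the application can connect to whichever node the picker returns.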

The next Galera release will be 0.7. It is under R&D, with a deadline at the end of June. The 0.7 release will be open sourced and is functionally quite complete, offering, e.g., node join capabilities for the cluster. Galera is cooking well at the moment.

Wednesday, April 15, 2009

Clustering Drupal

We have been testing MySQL/Galera cluster with various benchmarks, and one exercise in our test plan is to try clustering performance at the web application level. We picked Drupal as our first target application and composed a cluster from identical Drupal instances. Each Drupal node has a local MySQL database, and we cluster the databases with the Galera synchronous multi-master replication system. As a result, the effects of HTTP requests hitting any Drupal node will be synchronously replicated to the whole cluster.

Alex wrote a detailed article about the benchmarking session. I present here just an executive summary and go directly to the final results.

Test Platform
We tested the Drupal cluster with Amazon EC2 small instances. A small instance is not particularly suitable as a web platform due to long latencies, but we got our baseline figures from this setup and decided to run more high-end tests later on.

Running the test shows a strong imbalance between Apache/Drupal and MySQL CPU usage. MySQL consumes just 5-10% of the CPU and the rest goes to Apache. Resource-wise, it would make sense to create a separate farm for web servers and have a small MySQL cluster serving the farm. However, our test configuration has some advantages as well:
  • It is easy to set up, as each node is identical
  • Local MySQL gives a faster response to Drupal
  • It is possible to fall back to a single node

For the test session, we created clusters from 1-4 Drupal instances.

The Test
For testing, we used a JMeter test, which runs three thread groups:
  1. Posters - create new pages in the system
  2. Commenters - read pages and add comments to the stories
  3. Browsers - just keep on reading pages in the system
The JMeter HTTP load goes through the glb load balancer to the Drupal cluster. Each HTTP request can hit any cluster node at will.


Results

Final results show quite linear scalability:

Nodes  Users  Request rate  Latency  Error rate
              (req/min)     (ms)     (%)
-----------------------------------------------
1      40     129           3950     0.07
2      80     259           3960     0.06
3      120    387           3700     0.05
4      160    514           3490     0.12



Scaling continues linearly up to four nodes; we did not try larger cluster sizes.
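
As a quick check of the linearity claim, the request rates from the results table can be compared against an ideal "N times the single-node rate" line. The numbers are the ones from the table; the helper name `scaling_efficiency` is just for this sketch.

```python
# (nodes, users, request rate in req/min), taken from the results table.
RESULTS = [(1, 40, 129), (2, 80, 259), (3, 120, 387), (4, 160, 514)]


def scaling_efficiency(results):
    """Return {nodes: measured rate / (nodes * single-node rate)}.

    A value of 1.0 means perfectly linear scaling relative to the
    first (baseline) row.
    """
    base_nodes, _, base_rate = results[0]
    per_node_base = base_rate / base_nodes
    return {nodes: rate / (nodes * per_node_base) for nodes, _, rate in results}


for nodes, eff in scaling_efficiency(RESULTS).items():
    print(f"{nodes} node(s): {eff:.3f}")
```

Every cluster size stays within half a percent of the ideal line, which is what "quite linear" means here.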

Near Future
Alex promised to continue with Drupal testing and run the tests with EC2 large instances to get reasonable latencies. Results from these experiments should appear in the near future.

We are also presenting Galera clustering in the Percona Performance Conference and can provide ad-hoc demonstrations for anybody interested there.