Monday, January 28, 2008

CTDB doesn't do Failover and this is a good thing

The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.


These are my personal thoughts on CTDB and failover.


CTDB, while originally developed only as a component to clusterize Samba and to allow building a highly scalable all-active cluster, has evolved into a full-blown HA solution.

Just today I was thinking about what CTDB actually does and how. When thinking about it I realized that, technically speaking, clustered Samba doesn't really do failover at all. And that this is, in my opinion, a good thing.

Hey, what am I saying? I don't do failover for this cluster HA solution and I claim this is a good thing? That sounds absolutely insane, doesn't it? Well, not really. Let me try to explain...


CIFS is a very tricky protocol to clusterize since CIFS is so very stateful. In many existing HA solutions that make a Samba-based NAS server robust against failures, what people often do is build an Active/Passive solution: two nodes that are almost identical, both with a CIFS server and both connected to the same storage backend, but only one of the two nodes is active at any given time.
You then have a 2 node cluster with one active node and one passive node, and the passive node is in some semi-dormant state, waiting for when it needs to be brought online.

The idea is then that when the active node fails for some reason (a HW failure, say), you do a failover: you shut down the active node completely and boot up the passive node.
So far so good, but assuming you have the simplest HA solution and your passive Linux box with Samba is in a powered-off state, how can you be sure that the passive node will actually boot correctly and start up?
In general, I don't think you can be 100% sure that the passive node will start up correctly, since some kind of fault could have developed on the passive node while it was dormant, and maybe you won't detect that this fault exists until you actually try to bring the passive node up. At that stage it is too late, since the active node has already failed and the passive node just has to come up without any faults.


CTDB-Samba is completely different. CTDB-Samba does not use an Active/Passive concept. Instead it is an "All-Active" cluster where all nodes are running at all times and actively serving (the same) data. This could be a two node cluster or a multinode cluster.

If we compare a two node CTDB-Samba cluster with an active/passive Samba failover pair, the difference is that in the 2 node CTDB-Samba cluster BOTH nodes would normally be active at the same time, and the workload from all attached clients is distributed/shared across the two active nodes.

This means that if one of the nodes fails, we are not really doing a failover; instead we just redistribute/migrate all the clients from the failed node so that they are taken over by the node that remains. This means that when one node in our two node cluster fails, only 50% of the attached clients will experience a disruption (since only half of the active clients were attached to the failed node), instead of 100% of the clients as in the active/passive Samba setup.

Fewer clients are disrupted.
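
To put numbers on that, here is a small Python sketch of the arithmetic. It is only a toy model: the function names are made up for this post, and it assumes the clients are spread evenly across the nodes and that exactly one node fails.

```python
from fractions import Fraction


def disrupted_fraction_all_active(num_nodes: int) -> Fraction:
    """Share of clients disrupted when ONE node of an all-active cluster fails.

    Assumption (not something CTDB guarantees): clients are spread evenly,
    so each node carries 1/num_nodes of them.
    """
    if num_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return Fraction(1, num_nodes)


def disrupted_fraction_active_passive() -> Fraction:
    """In an active/passive pair every client sits on the one active node."""
    return Fraction(1, 1)
```

For the two node case this gives 1/2 for the all-active cluster against 1 for the active/passive pair, which is the 50% versus 100% above. Under the same even-spread assumption, the share shrinks further as nodes are added.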

Second, when the node fails and I migrate the clients over to the other node, that other node is already active and hosting/serving data. That other node is already running and may have been running for quite a while! For me this means much less uncertainty. There is no longer a question of "will the passive node start up?" since we never start the other node up! It is already running and it is (hopefully) running fine. It is like having an automatic verification that all systems are go PRIOR to actually implementing the failover procedure.
I.e. I know pretty confidently that the "failover" should work, since the node that will take over the workload is already active and verified to be completely healthy.
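
The idea can be sketched in a few lines of Python. This is a toy model and not how CTDB is implemented: the `Node` class, the `healthy` flag and `take_over_clients` are all hypothetical names invented for illustration. The point it shows is that the clients of the failed node only ever land on nodes that are already up and serving, and nothing gets booted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical model of a cluster node; not a CTDB data structure.
    name: str
    healthy: bool = True
    clients: list = field(default_factory=list)


def take_over_clients(nodes, failed_name):
    """Move the failed node's clients onto nodes that are already running."""
    failed = next((n for n in nodes if n.name == failed_name), None)
    if failed is None:
        raise ValueError(f"unknown node: {failed_name}")
    failed.healthy = False

    # Only nodes that are up and serving right now are candidates.
    survivors = [n for n in nodes if n.healthy]
    if not survivors:
        raise RuntimeError("no healthy node left to take over the clients")

    moved = list(failed.clients)
    failed.clients.clear()
    for client in moved:
        # Assumed policy: hand each client to the least loaded survivor.
        target = min(survivors, key=lambda n: len(n.clients))
        target.clients.append(client)
    return moved
```

In a two node cluster with two clients on each node, failing one node moves just its two clients to the other node. The two clients that were already on the surviving node are never touched.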

I think this is a pretty cool side effect of having an All-Active cluster!


Sorry for using the "failover" word several times above since I don't think "failover" is a very accurate word to describe what CTDB does, but I don't have any better word available. :-)


Comments?

Oh, by the way, feel free to try out CTDB/Samba. It is very cool software and can be downloaded from ctdb.samba.org.

Introduction

Hello world.

This is my first blog entry.
My name is Ronnie Sahlberg, I work as an open source developer for the Linux Technology Centre at IBM. My main focus is to work with Samba and the CTDB component of Samba, but I also hack on Wireshark from time to time.
This is my blog.
The postings on this site are my own and don’t necessarily represent
IBM’s positions, strategies or opinions.


So what is CTDB and what is it that I do?
CTDB (Clustered Trivial DataBase) is a very thin and VERY fast database that is developed for Samba in order to clusterize Samba. What CTDB does is make it possible for Samba to run and serve the same data from several different hosts in your network AT THE SAME TIME!

This means that with CTDB, Samba suddenly becomes a clustered service where all nodes in the cluster are active and export the same Samba shares read-write at the same time. This allows for both very high performance (if we manage to scale well, which we do) as well as pretty interesting reliability features.

On this blog I will talk about various issues and topics I work with related to CTDB and what I do with CTDB to make it "tick".