GFS – Goodgrief where’s the documentation File System
I struggled for most of Friday and a few more hours tonight (and probably more time last week that I can’t recall) setting up [http://en.wikipedia.org/wiki/Global_File_System GFS] for work. GFS is a cluster filesystem, where more than one box shares the same filesystem, which is useful for HA (High Availability) clustering.
Now an additional complexity here is that instead of proper shared storage such as an iSCSI or Fibre Channel SAN, we are using [http://en.wikipedia.org/wiki/DRBD DRBDv8] – this basically provides RAID-1 over the network (one “disk” on one server, the other mirrored “disk” on the other server). v8 allows both nodes to be primary – writing data to the virtual disk (i.e. both copies) simultaneously. Of course, both boxes must be careful not to overwrite data the other box has just written (which is where clustered filesystems come in).
I’m writing this lot up to save other people some hassle.
We use Debian etch, which has a 2.6.18 kernel and the GFSv1 filesystem. However, DRBDv8 needs a later kernel (I used 2.6.22). _That_ doesn’t like GFS very much but does have GFSv2. So I used redhat-cluster, only to discover that it wants 2.6.23 or later! Arrrghh. This all took most of a day to figure out, as it only became a little clearer after tracking down obscure errors and compile issues.
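To avoid rediscovering the kernel requirement the hard way, a quick check along these lines may help. It is only a sketch: the 2.6.23 minimum comes from my experience above, and the function names (`parse_kernel_version`, `kernel_is_new_enough`) are made up for illustration.

```python
import platform
import re

# Minimum kernel that redhat-cluster wanted in the setup described above.
MIN_KERNEL = (2, 6, 23)


def parse_kernel_version(release):
    """Return the leading numeric part of a release string as a tuple of ints.

    "2.6.18-6-686" becomes (2, 6, 18); a missing third component counts as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part or 0) for part in match.groups())


def kernel_is_new_enough(release, minimum=MIN_KERNEL):
    """True if the release string is at or above the minimum version."""
    return parse_kernel_version(release) >= minimum


if __name__ == "__main__":
    release = platform.release()
    try:
        ok = kernel_is_new_enough(release)
    except ValueError:
        ok = False
    print("%s: %s" % (release, "ok" if ok else "too old or unrecognised"))
```

It compares version numbers only; it says nothing about whether the GFS2 and DLM modules were actually built for that kernel.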
So, the steps you need to take are:
* backport openais to etch from sid
* install libvirt-dev from backports.org
* compile up 2.6.24
** I had to make my own (as Ethernet bonding is broken in 2.6.24 as shipped – there is a patch in the Fedora kernel SRPM)
* get redhat-cluster source from sid, edit debian/rules:
** now build the redhat-cluster package
I used the [http://gfs.wikidev.net/DRBD_Cookbook GFS wiki] instructions for most of the GFS setup; however, a few hints:
* use cman and gfs2-tools; do _not_ use the old GFSv1 and RH cluster packages (gfs-tools, clvm, cman, fence)
* /etc/cluster/cluster.conf has _dreadful_ documentation. Here is what I use:
<?xml version="1.0"?>
<cluster name="amq_test" config_version="1">
  <cman expected_votes="2" two_node="1">
    <!--<multicast addr="188.8.131.52"/>-->
  </cman>
  <clusternodes>
    <clusternode name="node1.manage.example.com" nodeid="1" votes="1">
      <!--<multicast addr="184.108.40.206" interface="eth0"/>-->
      <fence>
        <method name="lan">
          <device name="ipmi" ipaddr="10.3.2.65" auth="password" login="root" passwd="..." />
        </method>
        <!-- If all else fails, make someone do it manually -->
        <method name="human">
          <device name="last_resort" ipaddr="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.manage.example.com" nodeid="2" votes="1">
      <!--<multicast addr="220.127.116.11" interface="eth0"/>-->
      <fence>
        <method name="lan">
          <device name="ipmi" ipaddr="10.3.2.35" auth="password" login="root" passwd="..." />
        </method>
        <!-- If all else fails, make someone do it manually -->
        <method name="human">
          <device name="last_resort" ipaddr="node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="ipmi" agent="fence_ipmi"/>
    <fencedevice name="last_resort" agent="fence_manual"/>
  </fencedevices>
  <rm/>
</cluster>
* if you receive an error on mount of “/sbin/mount.gfs2: can’t connect to gfs_controld: Connection refused” then install and start the cman daemon
* if cman spends ages “Waiting for fenced to join the fence group.” then ensure that it is also running on the other node, and more importantly, be aware that the clusternode name in cluster.conf must be in DNS (or /etc/hosts) and resolve to an IP reachable from the other node (yes, that’s really quite naff)
* finally, ensure you have gfs2-tools installed before mounting otherwise you receive this kernel error (and you need to reboot to fix it):
GFS2: fsid=: Trying to join cluster "lock_dlm", "amq_test:gfsdisk"
kmem_cache_create: duplicate cache dlm_conn
Pid: 3493, comm: mount Not tainted 2.6.24 #1
[ ] kmem_cache_create+0x36a/0x39f
[ ] dlm_lowcomms_start+0xa7/0x5eb [dlm]
[ ] schedule+0x64e/0x68d
[ ] rb_insert_color+0x4c/0xad
[ ] schedule_timeout+0x13/0x8d
[ ] enqueue_task_fair+0x16/0x24
[ ] dlm_new_lockspace+0x95/0x751 [dlm]
[ ] dlm_scand+0x0/0x57 [dlm]
[ ] kobject_register+0x28/0x2d
[ ] gdlm_kobject_setup+0x51/0x71 [lock_dlm]
[ ] gdlm_mount+0x2c8/0x373 [lock_dlm]
[ ] gfs2_mount_lockproto+0xe1/0x12e [gfs2]
[ ] gfs2_lm_mount+0x92/0x1f7 [gfs2]
[ ] gfs2_glock_cb+0x0/0x135 [gfs2]
[ ] fill_super+0x386/0x511 [gfs2]
[ ] snprintf+0x1f/0x22
[ ] disk_name+0x30/0x83
[ ] get_sb_bdev+0xcc/0x10a
[ ] gfs2_get_sb+0x21/0x3e [gfs2]
[ ] fill_super+0x0/0x511 [gfs2]
[ ] vfs_kern_mount+0x7f/0xf6
[ ] do_kern_mount+0x35/0xbb
[ ] do_mount+0x5d8/0x63a
[ ] do_generic_mapping_read+0x2ab/0x3b2
[ ] mntput_no_expire+0x11/0x66
[ ] link_path_walk+0xa9/0xb3
[ ] find_lock_page+0x19/0x6d
[ ] filemap_fault+0x213/0x384
[ ] __alloc_pages+0x59/0x2d5
[ ] copy_mount_options+0x26/0x109
[ ] sys_mount+0x77/0xae
[ ] sysenter_past_esp+0x5f/0x85
=======================
dlm: cannot start dlm lowcomms -12
lock_dlm: dlm_new_lockspace error -12
GFS2: fsid=: can't mount proto=lock_dlm, table=amq_test:gfsdisk, hostdata=
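The name resolution gotcha from the hints above is easy to test for before starting cman. Here is a rough sketch that reads the clusternode names out of cluster.conf and reports any that do not resolve. The helper names (`node_names`, `unresolvable_nodes`, `check_cluster_conf`) are my own invention and not part of any cluster tooling. It only checks resolution on the box it runs on, so run it on both nodes.

```python
import socket
import xml.etree.ElementTree as ET


def node_names(root):
    """Return the name attribute of every <clusternode> under the given root."""
    return [node.get("name") for node in root.iter("clusternode")]


def unresolvable_nodes(names, resolve=socket.gethostbyname):
    """Return the names the resolver cannot turn into an address.

    The resolver is a parameter so a stand-in can be used for testing.
    """
    missing = []
    for name in names:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing


def check_cluster_conf(path="/etc/cluster/cluster.conf",
                       resolve=socket.gethostbyname):
    """Parse cluster.conf and list clusternode names that do not resolve."""
    root = ET.parse(path).getroot()
    return unresolvable_nodes(node_names(root), resolve)
```

Calling `check_cluster_conf()` on each node should return an empty list. It does not prove that the resolved IP is reachable from the other node, which still needs a ping or similar.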