GFS – Goodgrief where’s the documentation File System

I struggled for most of Friday and a few more hours tonight (and probably more time last week that I can't recall) setting up [http://en.wikipedia.org/wiki/Global_File_System GFS] for work. GFS is a cluster filesystem – one that several boxes share amongst themselves – useful for HA (High Availability) clustering.

Now an additional complexity here is that instead of proper shared storage such as an iSCSI or Fibre Channel SAN, we are using [http://en.wikipedia.org/wiki/DRBD DRBDv8] – this basically provides RAID-1 over the network (one “disk” on one server, the other mirrored “disk” on the other server). v8 allows both nodes to be primary – writing data to the virtual disk (i.e. both copies) simultaneously. Of course, both boxes must be careful not to overwrite data the other box has just written (which is where clustered filesystems come in).
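For reference, dual-primary mode is switched on in the DRBD resource definition. Here is a minimal sketch of the relevant part of /etc/drbd.conf – the resource name, backing disks and replication addresses are placeholders for illustration, not our real config:

resource r0 {
  protocol C;                    # synchronous replication; required for dual-primary
  net {
    allow-two-primaries;         # DRBDv8: let both nodes be Primary at once
  }
  on node1 {
    device    /dev/drbd0;
    disk      /dev/sda3;         # backing device (placeholder)
    address   10.3.2.1:7788;     # replication link (placeholder)
    meta-disk internal;
  }
  on node2 {
    device    /dev/drbd0;
    disk      /dev/sda3;
    address   10.3.2.2:7788;
    meta-disk internal;
  }
}

After the initial sync, each node promotes itself with "drbdadm primary r0". From then on it is entirely up to the cluster filesystem to keep concurrent writes coherent.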

I’m writing this lot up to save other people some hassle.

We use Debian etch, which has a 2.6.18 kernel and the GFSv1 filesystem. However, DRBDv8 needs a later kernel (I used 2.6.22). _That_ kernel, in turn, doesn't like GFSv1 very much, but it does have GFSv2. So I used the newer redhat-cluster source, only to discover that it wants 2.6.23 or later! Arrrghh. This all took most of a day to figure out, as it was only after tracking down obscure errors and compile issues that it all became a little clearer.

So, the steps you need to take are (sketched as shell commands after this list):
* backport openais to etch from sid
* install libvirt-dev from backports.org
* compile up 2.6.24
** I had to build my own kernel package (as Ethernet bonding is broken in 2.6.24 as shipped – there is a patch in the Fedora kernel SRPM)
* get redhat-cluster source from sid, edit debian/rules:
** --kernel_src=/usr/src/linux-headers-2.6.24-1-amd64
** now build the redhat-cluster package
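In case it saves anyone some typing, here is roughly what those steps looked like as commands on my box. Treat it as a sketch under the assumptions above – package names, versions and paths (especially the kernel headers directory) will differ on your system:

# backport openais from sid (needs a sid deb-src line in sources.list)
apt-get source openais
cd openais-*/ && dpkg-buildpackage -rfakeroot -us -uc && cd ..
dpkg -i openais_*.deb libopenais*.deb

# libvirt-dev from backports.org
apt-get -t etch-backports install libvirt-dev

# (build and install the patched 2.6.24 kernel here)

# redhat-cluster from sid, pointed at the new kernel headers
apt-get source redhat-cluster
cd redhat-cluster-*/
# edit debian/rules: --kernel_src=/usr/src/linux-headers-2.6.24-1-amd64
dpkg-buildpackage -rfakeroot -us -uc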

I mostly followed the [http://gfs.wikidev.net/DRBD_Cookbook GFS wiki] instructions for the GFS part; however, a few hints:
* use the cman and gfs2-tools packages; do _not_ use the old GFSv1 and RH cluster packages (gfs-tools, clvm, cman, fence) that ship with etch (see the install sketch after the config below)
* /etc/cluster/cluster.conf has _dreadful_ documentation. Here is what I use:

<?xml version="1.0"?>
<cluster name="amq_test" config_version="1">
  <cman expected_votes="2" two_node="1">
    <!--<multicast addr="224.0.0.8"/>-->
  </cman>
  <clusternodes>
    <clusternode name="node1.manage.example.com" nodeid="1" votes="1">
      <!--<multicast addr="224.0.0.8" interface="eth0"/>-->
      <fence>
        <method name="lan">
          <device name="ipmi" ipaddr="10.3.2.65" 
          auth="password" login="root" passwd="..." />
        </method>
        <!-- If all else fails, make someone do it manually -->
        <method name="human">
          <device name="last_resort" ipaddr="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.manage.example.com" nodeid="2" votes="1">
      <!--<multicast addr="224.0.0.8" interface="eth0"/>-->
      <fence>
        <method name="lan">
          <device name="ipmi" ipaddr="10.3.2.35" 
          auth="password" login="root" passwd="..." />
        </method>
        <!-- If all else fails, make someone do it manually -->
        <method name="human">
          <device name="last_resort" ipaddr="node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="ipmi" agent="fence_ipmi"/>
    <fencedevice name="last_resort" agent="fence_manual"/>
  </fencedevices>
  <rm/>
</cluster>
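For completeness, creating and mounting the filesystem went roughly like this. A sketch, assuming DRBD exposes /dev/drbd0 and that you want two journals (one per node); the -t argument must be clustername:fsname and match cluster.conf, hence "amq_test:gfsdisk":

# install the cman and gfs2-tools packages built earlier, then:
mkfs.gfs2 -p lock_dlm -t amq_test:gfsdisk -j 2 /dev/drbd0   # one journal per node

/etc/init.d/cman start            # on both nodes, before mounting
mount -t gfs2 /dev/drbd0 /mnt/gfs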

* if you receive the error “/sbin/mount.gfs2: can’t connect to gfs_controld: Connection refused” on mount, then install and start the cman daemon
* if cman spends ages “Waiting for fenced to join the fence group.” then ensure that cman is also running on the other node, and, more importantly, be aware that the clusternode name in cluster.conf must be in DNS (or /etc/hosts) and resolve to an IP reachable from the other node (yes, that’s really quite naff) – a couple of sanity checks are sketched after the kernel trace below
* finally, ensure you have gfs2-tools installed before mounting, otherwise you receive this kernel error (and you need to reboot to fix it):

GFS2: fsid=: Trying to join cluster "lock_dlm", "amq_test:gfsdisk"
kmem_cache_create: duplicate cache dlm_conn
Pid: 3493, comm: mount Not tainted 2.6.24 #1
 [] kmem_cache_create+0x36a/0x39f
 [] dlm_lowcomms_start+0xa7/0x5eb [dlm]
 [] schedule+0x64e/0x68d
 [] rb_insert_color+0x4c/0xad
 [] schedule_timeout+0x13/0x8d
 [] enqueue_task_fair+0x16/0x24
 [] dlm_new_lockspace+0x95/0x751 [dlm]
 [] dlm_scand+0x0/0x57 [dlm]
 [] kobject_register+0x28/0x2d
 [] gdlm_kobject_setup+0x51/0x71 [lock_dlm]
 [] gdlm_mount+0x2c8/0x373 [lock_dlm]
 [] gfs2_mount_lockproto+0xe1/0x12e [gfs2]
 [] gfs2_lm_mount+0x92/0x1f7 [gfs2]
 [] gfs2_glock_cb+0x0/0x135 [gfs2]
 [] fill_super+0x386/0x511 [gfs2]
 [] snprintf+0x1f/0x22
 [] disk_name+0x30/0x83
 [] get_sb_bdev+0xcc/0x10a
 [] gfs2_get_sb+0x21/0x3e [gfs2]
 [] fill_super+0x0/0x511 [gfs2]
 [] vfs_kern_mount+0x7f/0xf6
 [] do_kern_mount+0x35/0xbb
 [] do_mount+0x5d8/0x63a
 [] do_generic_mapping_read+0x2ab/0x3b2
 [] mntput_no_expire+0x11/0x66
 [] link_path_walk+0xa9/0xb3
 [] find_lock_page+0x19/0x6d
 [] filemap_fault+0x213/0x384
 [] __alloc_pages+0x59/0x2d5
 [] copy_mount_options+0x26/0x109
 [] sys_mount+0x77/0xae
 [] sysenter_past_esp+0x5f/0x85
 =======================
dlm: cannot start dlm lowcomms -12
lock_dlm: dlm_new_lockspace error -12
GFS2: fsid=: can't mount proto=lock_dlm, table=amq_test:gfsdisk, hostdata=
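Those two gotchas boil down to a couple of quick sanity checks – the hostnames here are the ones from my cluster.conf above, so substitute your own:

# "can't connect to gfs_controld" => cman isn't running on this node
/etc/init.d/cman start

# fenced stuck joining the fence group => check cman is up on the OTHER
# node too, and that each clusternode name resolves to a reachable IP
getent hosts node1.manage.example.com
getent hosts node2.manage.example.com
cman_tool nodes                  # should eventually list both nodes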

Comments

Comment from adrian
Time: Tuesday 12 February, 2008, 21:09

We’ve seen bad corruption (correct filenames, incorrect length and data). Underlying DRBD device was identical on both boxes.

We then also needed this patch:
http://lkml.org/lkml/2008/1/24/197

We’ll see how this goes – the silent corruption makes me _really_ nervous!

Comment from Kalaiselvan
Time: Thursday 23 October, 2008, 09:46

Dear Sir,

We have two servers with Red Hat Enterprise Linux 5. We would like to configure clustering using these two servers. If I add a user with the useradd command, it should affect both servers.

Kindly advise me on how to configure this type of clustering.

Comment from adrian
Time: Sunday 26 October, 2008, 11:30

This isn’t a “ask your random Linux question” website. Your comment has nothing to do with this article. The answer is to use LDAP but please don’t post random stuff like this.

Comment from sr
Time: Tuesday 3 February, 2009, 08:59

Hey,

I’ve been looking around for a couple of days now trying to understand this cluster.conf. You already said the documentation is dreadful, but maybe you can post some more of the information you’ve got… because the documentation still sucks… and I can’t get it to work.

Comment from Brandon
Time: Thursday 9 July, 2009, 14:54

Hello,
Thanks for putting this online. I agree – the Red Hat docs are just awful.

I’ve been struggling with this for a few days and am starting to doubt any assumptions.

In your cluster.conf, you have these lines:

<device name="ipmi" ipaddr="10.3.2.65" …
<device name="ipmi" ipaddr="10.3.2.35" …

Are the IPs listed the IP addresses of node01 and node02 respectively?

Thanks again!
~B

Comment from adrian
Time: Wednesday 22 July, 2009, 20:47

Nope – those are the addresses of each node’s IPMI “service processor”/Lights Out Management/BMC.

Most servers have one – it allows you to check fan status and, most importantly for a cluster, to power machines off and on. It’s therefore used for STONITH (Shoot The Other Node In The Head).
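You can poke the BMC by hand with ipmitool to convince yourself fencing will actually work – a sketch using node1’s BMC address from the cluster.conf above (the password is whatever you set in the passwd attribute):

# query power state via node1's BMC over the LAN interface
ipmitool -I lan -H 10.3.2.65 -U root -P '...' chassis power status

# this is effectively what the fence agent does to a misbehaving node
ipmitool -I lan -H 10.3.2.65 -U root -P '...' chassis power off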

Comment from Bernard
Time: Thursday 11 March, 2010, 16:06

Dear Sir,
Thanks for putting this online. I was trying to mount GFS2 without starting cman, and I thought I had a problem with the locking mechanism, since lock_nolock was working.

Anyway… thanks again!