
High availability with Heartbeat and DRBD on Debian

Configuration is based on Linux-HA software

More information: http://www.linux-ha.org/

Author: Philipp Noack

Note: If you clone the first server after setup you will NOT get it running. The second server has to be installed exactly like the first one!

1. Debian install : My setup was:

	/boot ext3 with 100 MB and boot-flag
	/ ext3 with 5 GB
	swap with 4 GB (depending on memory)
	/var ext3 with 5 GB
	/var2 with the rest of the space (the partition was created, but not formatted yet!)

2. Install the bigmem kernel (only if the machine has more than 4 GB of memory)

Type "apt-cache search linux-image bigmem" and choose the kernel that matches your system.

3. Network config: /etc/network/interfaces (eth1 will be for DRBD (RAID over TCP/IP) / heartbeat)

auto lo eth0 eth1

iface lo inet loopback

iface eth0 inet static
address 172.30.86.???

iface eth1 inet static
address 192.168.1.???

4. Install heartbeat:

	aptitude install heartbeat

5. Install DRBD: Add Debian Backports for DRBD8 to sources.list (DRBD8 is included in Debian since lenny)

deb http://www.backports.org/debian etch-backports main contrib non-free

Install packages : drbd8-source, drbd8-utils

	aptitude -t etch-backports install drbd8-source
	aptitude -t etch-backports install drbd8-utils

Build the kernel module (this has to be repeated whenever the kernel is updated):

	module-assistant auto-install drbd8

Reboot with the new kernel.

6. Edit the DRBD config (official documentation: http://www.drbd.org/docs/install/). Here is my config as an example.

Important: You will find the name of the var2 partition in /etc/fstab. Mine was /dev/cciss/c0d0p7.

global {
    usage-count yes;
}

common {
  syncer { rate 700000K; }
}

resource r0 {

  protocol C;

  handlers {
    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
  }

  startup {
    degr-wfc-timeout 120;    # 2 minutes.
  }

  disk {
    on-io-error   detach;
  }

  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }

  syncer {
    al-extents 257;
  }

  on mbops01 {
    device    /dev/drbd0;
    disk      /dev/cciss/c0d0p6;
    meta-disk internal;
  }

  on mbops02 {
    device    /dev/drbd0;
    disk      /dev/cciss/c0d0p6;
    meta-disk internal;
  }
}
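
Every section in drbd.conf is delimited by braces, and a missing "}" is an easy mistake when copying a config like the one above. Here is a minimal sketch of a sanity check. The helper name check_braces is my own (it is not part of DRBD), and it assumes that no braces appear inside comments or quoted strings, because those would be counted as well.

```shell
#!/bin/sh
# check_braces FILE: hypothetical helper, not shipped with DRBD.
# Succeeds when FILE contains as many "{" as "}" characters.
check_braces() {
    file=$1
    # tr -cd deletes everything except the given character;
    # the trailing tr strips the padding some wc versions print.
    open=$(tr -cd '{' < "$file" | wc -c | tr -d ' ')
    close=$(tr -cd '}' < "$file" | wc -c | tr -d ' ')
    if [ "$open" -ne "$close" ]; then
        echo "unbalanced braces in $file: $open opening, $close closing" >&2
        return 1
    fi
    echo "braces balanced in $file: $open pairs"
}

# Example call (standard location of the config):
# check_braces /etc/drbd.conf
```

On a machine where DRBD is already installed, "drbdadm dump all" parses the real configuration and prints it back, which is the more thorough check.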

Adjust the permissions of the DRBD tools:

	chgrp haclient /sbin/drbdsetup
	chmod o-x /sbin/drbdsetup
	chmod u+s /sbin/drbdsetup

	chgrp haclient /sbin/drbdmeta
	chmod o-x /sbin/drbdmeta
	chmod u+s /sbin/drbdmeta

7. Initialize DRBD (on both machines):

	drbdadm create-md r0

Check the result with “cat /proc/drbd”.
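
To check the state from a script instead of reading /proc/drbd by eye, the cs: (connection state) and ds: (disk state) fields can be pulled out of the status line. This is a minimal sketch. The helper name drbd_state is my own, and the sample line in the comment only illustrates the assumed format; it was not captured from a real system.

```shell
#!/bin/sh
# drbd_state FILE MINOR: hypothetical helper.
# Prints "<connection state> <disk states>" for DRBD device number MINOR,
# read from FILE (normally /proc/drbd). Fails if the device is not listed.
# Assumed line format (illustrative only):
#  0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
drbd_state() {
    file=$1
    minor=$2
    awk -v m="$minor:" '
        $1 == m {
            cs = ""; ds = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
            print cs, ds
            found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Example call for /dev/drbd0:
# drbd_state /proc/drbd 0
```

The sketch deliberately ignores the role field and any sync progress lines; it only reports what is needed to see whether both disks are connected and up to date.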

Warning: Do this step on the master server ONLY!

	drbdadm -- --overwrite-data-of-peer primary r0

8. Create filesystem /dev/drbd0 (only on the master server again):

	mkfs -t ext3 /dev/drbd0

9. Install Opsview incl. apache2 (or see the official Debian documentation at http://docs.opsview.org/doku.php?id=opsview2.14:debian-installation).

Add the following lines to sources.list:

	deb http://apt.opsview.org/debian etch main
	deb http://ftp.debian.org/debian etch non-free

Then run “apt-get update” and “apt-get install opsview”.

To quote the original documentation: “Once Opsview has been installed, a Catalyst web server should be listening on port 3000. The Apache web server can then be used as a proxy to make Opsview available on port 80 (http) - this also provides a significant improvement in performance as static content is then served directly by apache rather than via the perl Catalyst web server.”

	apt-get install libapache2-mod-proxy-html
	a2enmod proxy 
	a2enmod proxy_http
	a2enmod proxy_html
	/etc/init.d/apache2 force-reload

10. Remove Opsview, MySQL and Apache from the runlevels so that they no longer start automatically at boot (heartbeat starts them for us now):

	update-rc.d -f opsview remove
	update-rc.d -f opsview-web remove
	update-rc.d -f mysql remove
	update-rc.d -f apache2 remove
	update-rc.d -f opsview-agent remove
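
After these calls, no start links for the services should be left in the runlevel directories. Here is a minimal sketch that checks this, assuming the usual SysV layout with links named S<two digits><service>. Both helper names are my own.

```shell
#!/bin/sh
# has_start_link RCDIR SERVICE: hypothetical helper.
# Succeeds if RCDIR contains a start link named S<two digits><SERVICE>.
has_start_link() {
    rcdir=$1
    service=$2
    for link in "$rcdir"/S[0-9][0-9]"$service"; do
        # If the glob matched nothing, $link holds the literal pattern,
        # which does not exist as a file.
        if [ -e "$link" ] || [ -L "$link" ]; then
            return 0
        fi
    done
    return 1
}

# report_start_links RCDIR SERVICE...: prints one line for every service
# that would still be started at boot from RCDIR.
report_start_links() {
    rcdir=$1
    shift
    for service in "$@"; do
        if has_start_link "$rcdir" "$service"; then
            echo "$service still has a start link in $rcdir"
        fi
    done
}

# Example call for runlevel 2, the Debian default:
# report_start_links /etc/rc2.d opsview opsview-web mysql apache2 opsview-agent
```

No output means heartbeat is the only thing left that starts these services.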

11. Configure heartbeat: /etc/ha.d/ha.cf

debugfile /var/log/ha-debug
logfile /var/log/ha-log

keepalive 2
deadtime 30
warntime 10
initdead 120
auto_failback off

bcast eth1

# This is a ping test in our network to check which server can ping it

node mbops01
node mbops02

respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
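
Heartbeat expects every node entry to match the output of "uname -n" on the corresponding machine. Here is a minimal sketch that checks whether a name is listed in a ha.cf file. The helper name node_listed is my own.

```shell
#!/bin/sh
# node_listed HA_CF NAME: hypothetical helper.
# Succeeds if NAME appears on a "node" line of the given ha.cf file.
node_listed() {
    cf=$1
    name=$2
    awk -v n="$name" '
        # A node line may list one or several names.
        $1 == "node" {
            for (i = 2; i <= NF; i++) if ($i == n) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$cf"
}

# Example call, to be run on each of the two servers:
# node_listed /etc/ha.d/ha.cf "$(uname -n)" || echo "this host is not listed in ha.cf"
```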

The file /etc/ha.d/haresources:

mbops01 drbddisk::r0 Filesystem::/dev/drbd0::/var2::ext3 mysql opsview opsview-web apache2
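
The order on this line matters: heartbeat starts the resources from left to right and stops them from right to left, so DRBD becomes primary before /var2 is mounted, and MySQL only starts once its data is there. A minimal sketch that prints both orders for a haresources file; the helper name resource_order is my own.

```shell
#!/bin/sh
# resource_order HARESOURCES: hypothetical helper.
# For every non-comment line, prints the resources in start order
# (left to right) and in stop order (right to left).
# The first field is the preferred node and is skipped.
resource_order() {
    file=$1
    awk '
        !/^#/ && NF > 1 {
            printf "start:"
            for (i = 2; i <= NF; i++) printf " %s", $i
            printf "\n"
            printf "stop:"
            for (i = NF; i >= 2; i--) printf " %s", $i
            printf "\n"
        }
    ' "$file"
}

# Example call:
# resource_order /etc/ha.d/haresources
```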

The file /etc/ha.d/authkeys:

auth 3
3 md5 anypassword

Then set the file permissions:

	chmod 600 /etc/ha.d/authkeys
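
The shared secret should be hard to guess, and the file must be identical on both nodes. Here is a minimal sketch that generates a random key and writes the file with mode 600 from the start. It keeps the md5 method and the key number 3 from my example; the helper name write_authkeys is my own.

```shell
#!/bin/sh
# write_authkeys TARGET: hypothetical helper.
# Writes a heartbeat authkeys file with a random 32-character key.
write_authkeys() {
    target=$1
    # Hash 512 random bytes to get 32 hex characters.
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | cut -c1-32)
    (
        # Create the file unreadable for group and others.
        umask 077
        printf 'auth 3\n3 md5 %s\n' "$key" > "$target"
    )
    # Also covers the case where the file already existed with wider permissions.
    chmod 600 "$target"
}

# Example call on the first server:
# write_authkeys /etc/ha.d/authkeys
```

Run it on one server only and copy the resulting file to the other one, otherwise the two nodes end up with different keys.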

11. Moving data:

	cd /usr/local/
	tar cvzf nagios.tar.gz nagios 
	mv nagios.tar.gz /var2
	rm -r nagios
	cd /var2
	tar xvzf nagios.tar.gz
	ln -s /var2/nagios /usr/local/nagios

Do the same with /usr/local/opsview-web and with /var/lib/mysql.
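
Since the same move has to be repeated for several directories, it can be wrapped in a small function. This is a minimal sketch under a few assumptions: the services are stopped, /var2 is mounted from the DRBD device on the primary, and cp -a is used in place of the tar round trip shown above. The helper name relocate_dir is my own.

```shell
#!/bin/sh
# relocate_dir SRC DEST_ROOT: hypothetical helper.
# Copies directory SRC into DEST_ROOT, removes the original and leaves
# a symlink at the old location pointing to the new one.
relocate_dir() {
    src=$1
    dest_root=$2
    name=$(basename "$src")
    dest=$dest_root/$name
    # Refuse to run twice: a symlink means the move already happened.
    if [ -L "$src" ] || [ ! -d "$src" ]; then
        echo "$src is not a real directory, skipping" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "$dest already exists, skipping" >&2
        return 1
    fi
    # cp -a preserves owners, modes and timestamps.
    cp -a "$src" "$dest" || return 1
    rm -rf "$src" || return 1
    ln -s "$dest" "$src"
}

# Example calls:
# relocate_dir /usr/local/nagios /var2
# relocate_dir /usr/local/opsview-web /var2
# relocate_dir /var/lib/mysql /var2
```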

12. Replace NRPE agents (to be done on both machines in primary mode)

The opsview-agent needs the var2 partition to run, so you need to use another NRPE agent. Install the NRPE server and plugins:

	apt-get install nagios-nrpe-server nagios-plugins-basic
	rm /etc/nagios/nrpe.cfg
	cp /var2/nagios/etc/nrpe.cfg /etc/nagios/nrpe.cfg

Now edit the paths in nrpe.cfg: replace /usr/local/nagios/libexec with /usr/lib/nagios/plugins.

	vim /etc/nagios/nrpe.cfg
	/etc/init.d/nagios-nrpe-server restart
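
Instead of editing the file by hand, the path replacement can also be scripted, which helps because it has to be done on both machines. This is a minimal sketch; the helper name fix_nrpe_paths is my own, and it keeps a backup of the original file with the suffix .bak.

```shell
#!/bin/sh
# fix_nrpe_paths CFG: hypothetical helper.
# Rewrites the Opsview plugin path to the Debian plugin path in CFG
# and keeps the untouched original as CFG.bak.
fix_nrpe_paths() {
    cfg=$1
    # "#" is used as the sed delimiter because the paths contain slashes.
    sed -i.bak 's#/usr/local/nagios/libexec#/usr/lib/nagios/plugins#g' "$cfg"
}

# Example call, followed by the restart shown above:
# fix_nrpe_paths /etc/nagios/nrpe.cfg
```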

13. Solving problems

- Disk-flush errors on RAID systems: Add the line “no-disk-flushes;” to the disk section in drbd.conf:

resource r0 {
	disk {
		no-disk-flushes;
	}
}

- Apache2 proxy doesn't work:

	cp /usr/share/doc/opsview/apache2-proxy.conf /etc/apache2/sites-available/opsview
	ln -s /etc/apache2/sites-available/opsview /etc/apache2/sites-enabled/opsview

Customize the config file (remove the comments and adjust the IPs). Do this on both machines, then do a takeover so that heartbeat restarts apache2.

- MySQL doesn't start/stop on a node: the passwords in /etc/mysql/debian.cnf have to match on both nodes

- Delete filesystem if there are problems with it

dd if=/dev/zero bs=1M count=1 of=/dev/cciss/????; sync

- Problems with resources r1, r2 …: Delete the line 'after "r2";' in drbd.conf