Diskless Machine (PxE Boot)

PxE is a way of booting an OS kernel over the network rather than from a local disk. It consists of one or more PxE servers, which hold the kernel and initial ramdisk images, and PxE clients, which are the actual hosts where user applications and services run. The setup is depicted in Figure-1: the PxE server runs tftp and dhcp services, and on the client side only a PXE-capable NIC is required. I will use Arch Linux as the PxE server, since it is one of the most lightweight and flexible Linux distributions, and a CentOS 6 kernel image for the PxE client.

Figure-1: PxE Boot

Installing Necessary Packages:

First, we need to install the necessary packages on the PxE server.

Installing tftp:

The root directory for the tftp service is /srv/tftp.

[root@archbox netctl]# pacman -S tftp-hpa
[root@archbox netctl]# systemctl enable tftpd.service
Created symlink /etc/systemd/system/multi-user.target.wants/tftpd.service → /usr/lib/systemd/system/tftpd.service.
[root@archbox netctl]# systemctl start tftpd.service

Installing dhcp:

You may get an error when starting the dhcpd4 service because we have not configured it yet.

[root@archbox netctl]# pacman -S dhcp
[root@archbox netctl]# systemctl enable dhcpd4.service
[root@archbox netctl]# systemctl start dhcpd4.service
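
If the start command fails at this point (for example because /etc/dhcpd.conf does not exist yet), the reason can be read from the unit status; a quick check, assuming standard systemd tooling:

[root@archbox netctl]# systemctl status dhcpd4.service
[root@archbox netctl]# journalctl -u dhcpd4.service -n 20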

Installing NFS:

There are many options for booting the system over the network, such as NFS, HTTP, and iSCSI. We will boot the host via NFS.

[root@arch2 etc]# pacman -S nfs-utils
[root@arch2 etc]# systemctl enable nfs-server.service 
Created symlink /etc/systemd/system/multi-user.target.wants/nfs-server.service → /usr/lib/systemd/system/nfs-server.service.
[root@arch2 etc]# systemctl start nfs-server.service 

Installing Syslinux

Syslinux is a package that contains several boot loaders, such as SYSLINUX (FAT filesystem bootloader), EXTLINUX (ext2/3/4, Btrfs, and XFS filesystem bootloader), PXELINUX (network PXE bootloader), and ISOLINUX (ISO-9660 bootloader for CD/DVD).

[root@arch2 etc]# pacman -S syslinux

Configuration on the PxE Server:

dhcp:

[root@arch2 ~]# cat /etc/dhcpd.conf
allow booting;
allow bootp;
option domain-name-servers 192.168.122.1, 172.16.25.1;
option subnet-mask 255.255.255.0;
option routers 172.16.25.1;
subnet 172.16.25.0 netmask 255.255.255.0 {
  range 172.16.25.100 172.16.25.110;
}
next-server 172.16.25.60;
filename "pxelinux.0";

tftp:

As indicated before, /srv/tftp is the root directory of the tftp server, so everything needed to boot over the network, such as OS kernel images and initial ramdisk images, will be stored in this directory and its subdirectories.

[root@arch2 syslinux]# pwd
/usr/lib/syslinux
[root@arch2 syslinux]# ls -ltr
total 16
drwxr-xr-x 2 root root 4096 Jan 14 20:47 efi32
drwxr-xr-x 2 root root 4096 Jan 14 20:47 efi64
drwxr-xr-x 2 root root 4096 Jan 14 20:47 diag
drwxr-xr-x 2 root root 4096 Jan 14 20:47 bios

In order to boot the kernel over the network properly, we need to copy some Syslinux binaries to /srv/tftp.

[root@arch2 syslinux]# cp /usr/lib/syslinux/bios/pxelinux.0 /srv/tftp
[root@arch2 syslinux]# cp /usr/lib/syslinux/bios/menu.c32 /srv/tftp
[root@arch2 syslinux]# cp /usr/lib/syslinux/bios/libutil.c32 /srv/tftp
[root@arch2 syslinux]# cp /usr/lib/syslinux/bios/ldlinux.c32 /srv/tftp
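
To confirm that the files are actually reachable over TFTP, you can pull one of them back with the tftp-hpa client from any host on the network (assuming the client binary is installed; the -c option runs a single get command):

[root@arch2 ~]# tftp 172.16.25.60 -c get pxelinux.0
[root@arch2 ~]# ls -l pxelinux.0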

We also need to create a special folder named pxelinux.cfg under /srv/tftp, which holds the configuration for the GRUB-like boot menu. There is also a directory named rhce6 under /srv/tftp which holds our PxE client's kernel and initrd image.

#For more information see /usr/share/doc/syslinux/pxelinux.txt

[root@arch2 syslinux]# mkdir -p /srv/tftp/pxelinux.cfg
[root@arch2 syslinux]# cat << EOF > /srv/tftp/pxelinux.cfg/default
default menu.c32
timeout 30
prompt 1

label diskless6
 kernel /rhce6/vmlinuz-2.6.32-696.10.3.el6.x86_64
 append initrd=/rhce6/netboot6.img rw root=nfs:172.16.25.60:/srv/diskless/rhce6 selinux=0 enforcing=0
EOF

NFS:

The NFS server will be used by the kernel to mount the root (/) filesystem. All our applications and OS configuration live there, so I created a folder named diskless/rhce6 under /srv.

[root@arch2 rhce6]# cat /etc/exports
/srv/diskless/rhce6 *(rw,no_root_squash,no_subtree_check)
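
After editing /etc/exports, re-export the list and verify it with the standard nfs-utils tools:

[root@arch2 rhce6]# exportfs -rav
[root@arch2 rhce6]# showmount -e 172.16.25.60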

Creating a special initrd for the PxE client:

Here is where the magic comes in. It is very important to create an initrd image that is able to mount the root (/) filesystem over NFS. Note that we do not do this on the PxE server: I have a running CentOS 6 machine with the same kernel version as the host that will be used as the PxE client.

We need to install dracut-network package.

[root@node01 ~]# yum install dracut-network

Creating an initrd image:

[root@node01 ~]# dracut -f /boot/netboot6.img `uname -r` root=dhcp root-path=nfs:172.16.25.60:/srv/diskless/rhce6
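
To make sure the freshly built image really contains the network/NFS boot support, it can be inspected with lsinitrd, which ships with dracut (a quick sanity check, not a required step):

[root@node01 ~]# lsinitrd /boot/netboot6.img | grep -i nfs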

The only remaining step is copying the kernel image (vmlinuz) and the initrd (in this case netboot6.img) to /srv/tftp/rhce6 on the PxE server.

[root@arch2 rhce6]# pwd
/srv/tftp/rhce6
[root@arch2 rhce6]# ls
netboot6.img  vmlinuz-2.6.32-696.10.3.el6.x86_64

Copying images to the PxE Server:

We need to copy the CentOS root (/) filesystem to the PxE server, which serves it as an NFS export.

[root@node01 ~]# rsync -vaHAXSz --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} / root@172.16.25.60:/srv/diskless/rhce6

Changing Hostname:

[root@arch2 ~]# cat /srv/diskless/rhce6/etc/sysconfig/network
NETWORKING=yes
HOSTNAME=diskless.sfp.local

fstab:

We need to modify fstab to suit the PxE boot.

[root@arch2 ~]# cat /srv/diskless/rhce6/etc/fstab
172.16.25.60:/srv/diskless/rhce6/  /            nfs     defaults       1 1
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0

Network:

I configured the PxE client for DHCP, so it is automatically given an IP address by the dhcp server.

[root@arch2 ~]# cat /srv/diskless/rhce6/etc/sysconfig/network-scripts/ifcfg-eth0 
TYPE=ethernet
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=dhcp

PxE client Options:

Create a new virtual machine in KVM (virt-manager) and choose “Network Boot (PXE)”.

Choose the virtio device model for the interface in the NIC section.

Experiment

Finally, I am ready to fire up my virtual guest without adding a disk.

Booting over Network:

Disk Status:

IP:

That’s all for now. Happy booting 🙂

 

How to Create Red Hat HA Cluster Part -IV

This is the last part of the clustering series regarding Red Hat HA cluster. In this post, an HA NFS service will be created in the cluster.

Creating a Fail-over Domain:

[root@node01 ~]# ccs -h localhost --addfailoverdomain NFS
[root@node01 ~]# ccs -h localhost --addfailoverdomainnode NFS node01-hb.cls.local
[root@node01 ~]# ccs -h localhost --addfailoverdomainnode NFS node02-hb.cls.local

Creating Resources.

As you can see in the nfsclient resource below, we specify the target as * (all hosts). This is not a secure configuration, so you should instead specify the network addresses that you want to allow, as shown in the sketch after these commands.

[root@node01 ~]# ccs -h localhost --addresource lvm name=nfs-halvm vg_name=vgcls_nfs lv_name=lvcls_nfs
[root@node01 ~]# ccs -h localhost --addresource fs name=nfs-fs mountpoint="/mnt/nfs-export" fstype="ext4" device="/dev/vgcls_nfs/lvcls_nfs" force_fsck="0" force_unmount="0"
[root@node01 ~]# ccs -h localhost --addresource nfsserver name=nfs-server
[root@node01 ~]# ccs -h localhost --addresource nfsclient name=nfs-client options="rw,no_root_squash" target="*"
[root@node01 ~]# ccs -h localhost --addresource ip address="192.168.122.20" monitor_link="1"
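
For a tighter configuration, the same nfsclient resource could be created with a network target instead of * (the subnet below is only an example; adjust it to your client network):

[root@node01 ~]# ccs -h localhost --addresource nfsclient name=nfs-client options="rw,no_root_squash" target="192.168.122.0/24"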

Creating Service Groups

[root@node01 ~]# ccs -h localhost --addservice  HANFS domain="NFS" nfslock=1 exclusive=0 recovery=relocate autostart=1
[root@node01 ~]# ccs -h localhost --addsubservice HANFS ip ref=192.168.122.20
[root@node01 ~]# ccs -h localhost --addsubservice HANFS lvm ref=nfs-halvm
[root@node01 ~]# ccs -h localhost --addsubservice HANFS lvm:fs ref=nfs-fs
[root@node01 ~]# ccs -h localhost --addsubservice HANFS lvm:fs:nfsserver ref=nfs-server
[root@node01 ~]# ccs -h localhost --addsubservice HANFS lvm:fs:nfsserver:nfsclient ref=nfs-client

Sync cluster configuration to the other nodes.

[root@node01 ~]# ccs -h localhost --sync --activate

Final Cluster Configuration:

[root@node01 ~]# ccs -h localhost --getconf
<cluster config_version="127" name="ankara-cluster">  
  <fence_daemon/>  
  <clusternodes>    
    <clusternode name="node01-hb.cls.local" nodeid="1">      
      <fence>        
        <method name="FMET_XVM">          
          <device domain="node01" name="FDEV_XVM1"/>          
        </method>        
      </fence>      
    </clusternode>    
    <clusternode name="node02-hb.cls.local" nodeid="2">      
      <fence>        
        <method name="FMET_XVM">          
          <device domain="node02" name="FDEV_XVM2"/>          
        </method>        
      </fence>      
    </clusternode>    
  </clusternodes>  
  <cman expected_votes="1" two_node="1"/>  
  <fencedevices>    
    <fencedevice agent="fence_xvm" name="FDEV_XVM1"/>    
    <fencedevice agent="fence_xvm" name="FDEV_XVM2"/>    
  </fencedevices>  
  <rm>    
    <failoverdomains>      
      <failoverdomain name="name=httpd" nofailback="0" ordered="0" restricted="0"/>      
      <failoverdomain name="NFS" nofailback="0" ordered="0" restricted="0">        
        <failoverdomainnode name="node01-hb.cls.local"/>        
        <failoverdomainnode name="node02-hb.cls.local"/>        
      </failoverdomain>      
    </failoverdomains>    
    <resources>      
      <clusterfs device="UUID=996a0360-1895-2c53-b4ed-876151027b61" fstype="gfs2" mountpoint="/data/httpd" name="httpdgfs2"/>      
      <ip address="192.168.122.10" monitor_link="yes" sleeptime="10"/>      
      <lvm lv_name="lvcls_nfs" name="nfs-halvm" vg_name="vgcls_nfs"/>      
      <fs device="/dev/vgcls_nfs/lvcls_nfs" force_fsck="0" force_unmount="0" fstype="ext4" mountpoint="/mnt/nfs-export" name="nfs-fs"/>      
      <nfsserver name="nfs-server"/>      
      <nfsclient name="nfs-client" options="rw,no_root_squash" target="*"/>      
      <ip address="192.168.122.20" monitor_link="1"/>      
    </resources>    
    <service domain="httpd" name="httpd-resources" recovery="relocate">      
      <ip ref="192.168.122.10">        
        <clusterfs ref="httpdgfs2"/>        
      </ip>      
    </service>    
    <service autostart="1" domain="NFS" exclusive="0" name="HANFS" nfslock="1" recovery="relocate">      
      <ip ref="192.168.122.20"/>      
      <lvm ref="nfs-halvm">        
        <fs ref="nfs-fs">          
          <nfsserver ref="nfs-server">            
            <nfsclient ref="nfs-client"/>            
          </nfsserver>          
        </fs>        
      </lvm>      
    </service>    
  </rm>  
  <quorumd label="qdisk"/>  
</cluster>

Cluster Status:

[root@node01 ~]# clustat 
Cluster Status for ankara-cluster @ Sat Jan 13 19:54:54 2018
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 node01-hb.cls.local                                                 1 Online, Local, rgmanager
 node02-hb.cls.local                                                 2 Online, rgmanager
 /dev/block/8:16                                                     0 Online, Quorum Disk

 Service Name                                                  Owner (Last)                                                  State         
 ------- ----                                                  ----- ------                                                  -----         
 service:HANFS                                                 node01-hb.cls.local                                           started       
 service:httpd-resources                                       node01-hb.cls.local                                           started       
[root@node01 ~]# 


df -hP

#on node01
#df -hP 
/dev/mapper/vgcls_httpd-lv_httpd 1020M  131M  890M  13% /data/httpd
/dev/mapper/vgcls_nfs-lvcls_nfs   988M  1.4M  936M   1% /mnt/nfs-export

Cluster IP NFS HA

[root@node01 ~]# ping -c3 192.168.122.20
PING 192.168.122.20 (192.168.122.20) 56(84) bytes of data.
64 bytes from 192.168.122.20: icmp_seq=1 ttl=64 time=0.035 ms
64 bytes from 192.168.122.20: icmp_seq=2 ttl=64 time=0.023 ms
64 bytes from 192.168.122.20: icmp_seq=3 ttl=64 time=0.042 ms

Mounting HA NFS:

[root@KVMHOST ~]# mount.nfs nfscls:/mnt/nfs-export /mnt/nfscls/
[root@oc7133075241 ~]# df -hP
(...omitted)
Filesystem                           Size  Used Avail Use% Mounted on

nfscls:/mnt/nfs-export               988M  1.3M  936M   1% /mnt/nfscls
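
If the export should come back after a reboot of the client, an /etc/fstab entry along these lines can be used on the client (a sketch; nfscls is the cluster VIP alias from /etc/hosts and the mount point is the one used above):

#/etc/fstab on the NFS client
nfscls:/mnt/nfs-export   /mnt/nfscls   nfs   defaults,_netdev   0 0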

Migrating Cluster Service to Other Node:

[root@node01 ~]# clusvcadm 
usage: clusvcadm [command]

Resource Group Control Commands:
  -v                     Display version and exit
  -d <group>             Disable <group>.  This stops a group
                         until an administrator enables it again,
                         the cluster loses and regains quorum, or
                         an administrator-defined event script
                         explicitly enables it again.
  -e <group>             Enable <group>
  -e <group> -F          Enable <group> according to failover
                         domain rules (deprecated; always the
                         case when using central processing)
  -e <group> -m <member> Enable <group> on <member>
  -r <group> -m <member> Relocate <group> [to <member>]
                         Stops a group and starts it on another
                         cluster member.
  -M <group> -m <member> Migrate <group> to <member>
                         (e.g. for live migration of VMs)
  -q                     Quiet operation
  -R <group>             Restart a group in place.
  -s <group>             Stop <group>.  This temporarily stops
                         a group.  After the next group or
                         or cluster member transition, the group
                         will be restarted (if possible).
  -Z <group>             Freeze resource group.  This prevents
                         transitions and status checks, and is 
                         useful if an administrator needs to 
                         administer part of a service without 
                         stopping the whole service.
  -U <group>             Unfreeze (thaw) resource group.  Restores
                         a group to normal operation.
  -c <group>             Convalesce (repair, fix) resource group.
                         Attempts to start failed, non-critical 
                         resources within a resource group.
Resource Group Locking (for cluster Shutdown / Debugging):
  -l                     Lock local resource group managers.
                         This prevents resource groups from
                         starting.
  -S                     Show lock state
  -u                     Unlock resource group managers.
                         This allows resource groups to start.

Migrating httpd-resources service to node02.

[root@node01 ~]# clusvcadm -r httpd-resources -m node02-hb.cls.local
Trying to relocate service:httpd-resources to node02-hb.cls.local...Success
service:httpd-resources is now running on node02-hb.cls.local

[root@node01 ~]# clustat 
Cluster Status for ankara-cluster @ Sat Jan 13 20:11:10 2018
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 node01-hb.cls.local                                                 1 Online, Local, rgmanager
 node02-hb.cls.local                                                 2 Online, rgmanager
 /dev/block/8:16                                                     0 Online, Quorum Disk

 Service Name                                                  Owner (Last)                                                  State         
 ------- ----                                                  ----- ------                                                  -----         
 service:HANFS                                                 node01-hb.cls.local                                           started       
 service:httpd-resources                                       node02-hb.cls.local                                           started

Testing

[root@oc7133075241 ~]# curl http://ankara-cluster
<h1> Hello Ankara-Cluster</h1>
<h2>Tue, 14 2017</h2>

Cluster Operations:

#Checking the fence configuration. You can run this command only on the master node.
[root@node01 ~]# fence_check 
fence_check run at Sat Jan 13 20:05:49 CET 2018 pid: 1022
Testing node01-hb.cls.local method 1: success
Testing node02-hb.cls.local method 1: success

Finding Master Nodeid:

[root@node01 ~]# cman_tool services
fence domain
member count  2
victim count  0
victim now    0
master nodeid 1
wait state    none
members       1 2 

dlm lockspaces
name          httpdgfs2
id            0x2b180060
flags         0x00000008 fs_reg
change        member 1 joined 1 remove 0 failed 0 seq 1,1
members       1 

name          rgmanager
id            0x5231f3eb
flags         0x00000000 
change        member 2 joined 1 remove 0 failed 0 seq 2,2
members       1 2 

name          clvmd
id            0x4104eefa
flags         0x00000000 
change        member 2 joined 1 remove 0 failed 0 seq 2,2
members       1 2 

gfs mountgroups
name          httpdgfs2
id            0xf8e8df07
flags         0x00000048 mounted
change        member 1 joined 1 remove 0 failed 0 seq 1,1
members       1 

 

Showing Nodes:

[root@node01 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   0   M      0   2018-01-13 19:41:11  /dev/block/8:16
   1   M    728   2018-01-13 19:41:06  node01-hb.cls.local
   2   M    732   2018-01-13 19:41:30  node02-hb.cls.local

 

 

How to Create Red Hat HA Cluster Part -III

In this post, we will do the actual work to finish the cluster configuration: starting the cluster services, configuring the clustered LVM disks, and creating the fail-over domain, resources, and service groups.

Two-Node Cluster

As indicated in the previous posts, a two-node cluster is a special form of cluster because of split-brain situations, so we will add special configuration for it.

#If you are configuring a two-node cluster, you can execute the following command to allow a single node to maintain quorum (for example, if one node fails):

[root@node01 ~]# ccs -h host --setcman two_node=1 expected_votes=1

Starting Cluster Services:

After configuring quorum, fencing and special two-node cluster configuration we need to start cluster services in order.

#Start the cluster services on each node in the order below.
[root@node01 ~]# service cman start
[root@node01 ~]# service clvmd start #if CLVM has been used to create clustered volumes
[root@node01 ~]# service gfs2 start #if you are using Red Hat GFS2
[root@node01 ~]# service rgmanager start #if you are using high-availability (HA) services (rgmanager).

Stopping Cluster Services

Stopping the cluster services in the correct order is also important.

[root@node01 ~]# service rgmanager stop #if you are using high-availability (HA) services (rgmanager).
[root@node01 ~]# service gfs2 stop #if you are using Red Hat GFS2
[root@node01 ~]# umount -at gfs2 #if you are using Red Hat GFS2 in conjunction with rgmanager, to ensure that any GFS2 files mounted during rgmanager startup (but not unmounted during shutdown) were also unmounted.
[root@node01 ~]# service clvmd stop #if CLVM has been used to create clustered volumes
[root@node01 ~]# service cman stop 

Test:

After starting the cluster services, we can check whether our nodes are available using the clustat command. For now, we do not have any fail-over domain, resources, or service groups.

[root@node01 ~]# clustat 
Cluster Status for ankara-cluster @ Thu Jan 11 20:57:34 2018
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 node01-hb.cls.local                         1 Online, Local, rgmanager
 node02-hb.cls.local                         2 Online, rgmanager
 /dev/block/8:16                             0 Online, Quorum Disk

Clustered LVM Configuration

It is very important to configure the disk devices correctly: since the nodes in the cluster may write to the shared disk at the same time, a misconfigured node can corrupt the data.

To use LVM on the cluster nodes, do not forget to change locking_type in /etc/lvm/lvm.conf from 1 to 3 on each node.
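
Instead of editing the file by hand, the helper shipped with the LVM packages can flip this setting for you (an alternative, not a requirement):

[root@node01 ~]# lvmconf --enable-cluster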

[root@node01 ~]# grep locking_type /etc/lvm/lvm.conf 
	# Configuration option global/locking_type.
	#     when to use it instead of the configured locking_type.
	locking_type = 3
	# Attempt to use built-in cluster locking if locking_type 2 fails.
	# Use locking_type 1 (local) if locking_type 2 or 3 fail.
	# locking_type 1 viz. local file-based locking.
	# The external locking library to use for locking_type 2.

Restart clvmd service on each node.

[root@node01 ~]# service clvmd restart

Initializing disks for the clustered LVM

#You can run the commands below on any one of the nodes
[root@node01 ~]# pvcreate /dev/sdc
[root@node01 ~]# pvcreate /dev/sdd

Creating a Volume group

Creating a volume group for the clustered environment is almost the same as for an unclustered one; the only difference is that we use -cy to mark the volume group as clustered.

[root@node01 ~]# vgcreate -cy vgcls_httpd /dev/sdc
[root@node01 ~]# vgcreate -cy vgcls_nfs /dev/sdd

Creating a Logical Volumes and Activating

[root@node01 ~]# lvcreate -n lv_httpd -l 100%vg vgcls_httpd
  Logical volume "lv_httpd" created.
[root@node01 ~]# lvcreate -n lvcls_nfs -l 100%vg vgcls_nfs
  Logical volume "lvcls_nfs" created.
[root@node01 ~]# lvchange -ay vgcls_httpd/lv_httpd
[root@node01 ~]# lvchange -ay vgcls_nfs/lvcls_nfs

#Scanning and rebuilding lvm caches
[root@node01 ~]# vgscan --mknodes -v



Creating GFS2 Files System

Even though we have a cluster-aware underlying device, we still need a file system at the OS level.

[root@node01 ~]# mkfs.gfs2 -t ankara-cluster:httpdgfs2 -j 2 -J 64 /dev/vgcls_httpd/lv_httpd
Device:                    /dev/vgcls_httpd/lv_httpd
Blocksize:                 4096
Device Size                1.00 GB (261120 blocks)
Filesystem Size:           1.00 GB (261118 blocks)
Journals:                  2
Resource Groups:           4
Locking Protocol:          "lock_dlm"
Lock Table:                "ankara-cluster:httpdgfs2"
UUID:                      996a0360-1895-2c53-b4ed-876151027b61
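
One easy-to-miss detail: the mount point that the clusterfs resource will use later has to exist on every node before rgmanager tries to mount the file system there (a small preparation step, shown as a sketch):

#Run on node01 and node02
[root@node01 ~]# mkdir -p /data/httpd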

Creating a Failover-Domain:

A fail-over domain determines which cluster nodes are allowed to run which services in the cluster.

#Creating a failover domain
[root@node01 ~]# ccs -h localhost --addfailoverdomain name=httpd

Listing Resources

There are quite a few resource types. You can list the available resource options as follows.

[root@node01 ~]# ccs -h localhost --lsserviceopts
service - Defines a service (resource group).
ASEHAagent - Sybase ASE Failover Instance
SAPDatabase - Manages any SAP database (based on Oracle, MaxDB, or DB2)
SAPInstance - SAP instance resource agent
apache - Defines an Apache web server
bind-mount - Defines a bind mount.
clusterfs - Defines a cluster file system mount.
fs - Defines a file system mount.
ip - This is an IP address.
lvm - LVM Failover script
mysql - Defines a MySQL database server
named - Defines an instance of named server
netfs - Defines an NFS/CIFS file system mount.
nfsclient - Defines an NFS client.
nfsexport - This defines an NFS export.
nfsserver - This defines an NFS server resource.
openldap - Defines an Open LDAP server
oracledb - Oracle 10g/11g Failover Instance
oradg - Oracle Data Guard Failover Instance
orainstance - Oracle 10g Failover Instance
oralistener - Oracle 10g Listener Instance
postgres-8 - Defines a PostgreSQL server
samba - Dynamic smbd/nmbd resource agent
script - LSB-compliant init script as a clustered resource.
tomcat-6 - Defines a Tomcat server
vm - Defines a Virtual Machine

Adding Resources

[root@node01 ~]# ccs -h localhost --addresource clusterfs name=httpdgfs2 fstype=gfs2 mountpoint=/data/httpd device=UUID="996a0360-1895-2c53-b4ed-876151027b61"

[root@node01 ~]# ccs -h localhost --addresource ip address=192.168.122.10 monitor_link=yes sleeptime=10

Creating a Service Group

The VIP address will be our parent resource, which will depend on the filesystem; therefore, the VIP will only be brought up if the filesystem mounts successfully.

Pay careful attention to the order in which the service dependencies are added. Bear in mind that the cluster starts services from the bottom to the top; hence, all leaf nodes need to be started before the parent node can start.

#Creating a service group
[root@node01 ~]# ccs -h localhost --addservice httpd-resources domain=httpd recovery=relocate
[root@node01 ~]# ccs -h localhost --addsubservice httpd-resources ip ref=192.168.122.10
[root@node01 ~]# ccs -h localhost --addsubservice httpd-resources ip:clusterfs ref=httpdgfs2
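
Before syncing, the recorded nesting can be double-checked with ccs; the --lsservices switch prints the configured service groups together with their resources:

[root@node01 ~]# ccs -h localhost --lsservices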

Finally, sync the cluster configuration to the other nodes.

[root@node01 ~]# ccs -h localhost --sync --activate

Experiment

[root@node01 ~]# clustat 
Cluster Status for ankara-cluster @ Thu Jan 11 22:08:38 2018
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 node01-hb.cls.local                                                 1 Online, Local, rgmanager
 node02-hb.cls.local                                                 2 Online, rgmanager
 /dev/block/8:16                                                     0 Online, Quorum Disk

 Service Name                                                  Owner (Last)                                                  State         
 ------- ----                                                  ----- ------                                                  -----         
 service:httpd-resources                                       node01-hb.cls.local                                           started       

 

Cluster Configuration on /etc/cluster/cluster.conf

<?xml version="1.0"?>
<cluster config_version="127" name="ankara-cluster">
	<fence_daemon/>
	<clusternodes>
		<clusternode name="node01-hb.cls.local" nodeid="1">
			<fence>
				<method name="FMET_XVM">
					<device domain="node01" name="FDEV_XVM1"/>
				</method>
			</fence>
		</clusternode>
		<clusternode name="node02-hb.cls.local" nodeid="2">
			<fence>
				<method name="FMET_XVM">
					<device domain="node02" name="FDEV_XVM2"/>
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<cman expected_votes="1" two_node="1"/>
	<fencedevices>
		<fencedevice agent="fence_xvm" name="FDEV_XVM1"/>
		<fencedevice agent="fence_xvm" name="FDEV_XVM2"/>
	</fencedevices>
	<rm>
		<failoverdomains>
			<failoverdomain name="name=httpd" nofailback="0" ordered="0" restricted="0"/>
		</failoverdomains>
		<resources>
			<clusterfs device="UUID=996a0360-1895-2c53-b4ed-876151027b61" fstype="gfs2" mountpoint="/data/httpd" name="httpdgfs2"/>
			<ip address="192.168.122.10" monitor_link="yes" sleeptime="10"/>
		</resources>
		<service domain="httpd" name="httpd-resources" recovery="relocate">
			<ip ref="192.168.122.10">
				<clusterfs ref="httpdgfs2"/>
			</ip>
		</service>
	</rm>
	<quorumd label="qdisk"/>
</cluster>

File system

The disk /dev/mapper/vgcls_httpd-lv_httpd is automatically mounted on /data/httpd without configuring the fstab file.

df -hP
/dev/mapper/vgcls_httpd-lv_httpd 1020M 131M 890M 13% /data/httpd

IP

As you can see, the cluster IP (192.168.122.10) is listed on the active node, which is node01.

ip a s
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:2a:26:69 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.100/24 brd 192.168.122.255 scope global eth0
    inet 192.168.122.10/24 scope global secondary eth0

Apache Configuration

The only thing we need to do is install Apache on each node in the fail-over domain and put the web pages (DocumentRoot) into the /data/httpd mount point on the active node.

You also need to sync your Apache configuration (/etc/httpd/conf.d) to the other Apache-related nodes in the cluster whenever you modify it. You can use rsync or scp for this, as sketched below.
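
A minimal sketch of such a sync from node01 to node02 (paths assumed from the sample configuration below; the configtest call is just a sanity check):

[root@node01 ~]# rsync -av /etc/httpd/conf.d/ node02:/etc/httpd/conf.d/
[root@node01 ~]# ssh node02 'service httpd configtest'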

Sample Apache Configuration on Each Apache related node.

[root@node01 ~]# cat /etc/httpd/conf.d/00-default.conf 
<Directory /data/httpd/default/www>
	#Require all granted
	Allow from all
</Directory>

<VirtualHost _default_:80>
	ServerName ankara-cluster
	DocumentRoot /data/httpd/default/www
</VirtualHost>

/etc/hosts

Contents of the /etc/hosts file on each cluster node and on user workstations that need to access the web pages and the NFS export.

192.168.122.100 node01
192.168.123.100 node01-hb
192.168.122.200 node02
192.168.123.200 node02-hb

192.168.122.10 ankara-cluster
192.168.122.20 nfscls

Testing Web Page

[root@oc7133075241 conf.d]# curl http://ankara-cluster
<h1> Hello Ankara-Cluster</h1>
<h2>Tue, 14 2017</h2>

That is all for now. In the next post, we will configure HA NFS in the cluster environment. For the previous post, see Part II.

 


 

How to Create Red Hat HA Cluster Part -II

In the previous post, we prepared the cluster environment: two cluster nodes, three shared disks, and two network interfaces on each node, and we installed the necessary packages on the nodes. In this part, we start building the cluster configuration.

Creating Quorum Disk

We need to initialize one of our three disks, /dev/sdb in this case, as a quorum disk. As stated in the previous part, all disks are shared, so you can initialize the disk from any one of the cluster nodes.

[root@node01 ~]# mkqdisk -c /dev/sdb -l qdisk
mkqdisk v3.0.12.1

Writing new quorum disk label 'qdisk' to /dev/sdb.
WARNING: About to destroy all data on /dev/sdb; proceed [N/y] ? y
Initializing status block for node 1...
Initializing status block for node 2...
Initializing status block for node 3...
Initializing status block for node 4...
Initializing status block for node 5...
Initializing status block for node 6...
Initializing status block for node 7...
Initializing status block for node 8...
Initializing status block for node 9...
Initializing status block for node 10...
Initializing status block for node 11...
Initializing status block for node 12...
Initializing status block for node 13...
Initializing status block for node 14...
Initializing status block for node 15...
Initializing status block for node 16...

You can check the quorum disk with the mkqdisk -L command. We labeled our quorum disk qdisk; this label will be used in the cluster configuration in a forthcoming post.

[root@node01 ~]# mkqdisk -L
mkqdisk v3.0.12.1

/dev/block/8:16:
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0-0:
/dev/disk/by-path/pci-0000:00:09.0-virtio-pci-virtio2-scsi-0:0:0:0:
/dev/sdb:
	Magic:                eb7a62c2
	Label:                qdisk
	Created:              Wed Dec  6 14:00:51 2017
	Host:                 node01.cls.local
	Kernel Sector Size:   512
	Recorded Sector Size: 512
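
For reference, registering this label with the cluster can later be done with ccs's --setquorumd switch; this is shown here only as a sketch, and it corresponds to the <quorumd label="qdisk"/> element that appears in the final cluster configuration in the later parts:

[root@node01 ~]# ccs -h localhost --setquorumd label=qdisk
[root@node01 ~]# ccs -h localhost --sync --activate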

Creating a Cluster Environment.

In this section we start building our cluster configuration: giving the cluster a name, configuring fencing, configuring the quorum disk, and so on. To configure the cluster we will use ccs (Cluster Configuration System).

Before using ccs, we need to set a password for the ricci user and start the ricci service on all cluster nodes.

#Set password for the ricci user on all nodes.
passwd ricci

Starting ricci Service.

#Start ricci service on all nodes and enable it to start at boot time.
service ricci start
chkconfig ricci on

Giving a name to the cluster (ankara-cluster):

 

[root@node01 ~]# ccs -h localhost --createcluster ankara-cluster

Adding nodes to the cluster.

[root@node01 ~]# ccs -h localhost --addnode node01-hb.cls.local
Node node01-hb.cls.local added.
[root@node01 ~]# ccs -h localhost --addnode node02-hb.cls.local
Node node02-hb.cls.local added.

Checking config

ccs -h localhost --getconf

So far, all cluster configuration is kept on the first node, since we issued the commands there. In order to push it to the other nodes, we need to sync the cluster configuration.

[root@node01 ~]# ccs -h localhost --sync --activate

Configuring Fencing:

This was the hardest part for me, as I was using virtual guests as cluster nodes. There are several fencing agents for virtual guests on the Internet, but none of them worked for me; after quite a bit of searching, I got fencing working with fence_xvm. There are plenty of fencing agents you can choose from to match your infrastructure.

[root@node01 ~]# ccs -h localhost --lsfenceopts

To see a specific agent's options, you can issue the command below. I will use fence_xvm for our cluster environment.

[root@node01 ~]# ccs -h localhost --lsfenceopts fence_xvm
fence_xvm - Fence agent for virtual machines
  Required Options:
  Optional Options:
    option: No description available
    debug: Specify (stdin) or increment (command line) debug level
    ip_family: IP Family ([auto], ipv4, ipv6)
    multicast_address: Multicast address (default=225.0.0.12 / ff05::3:1)
    ipport: Multicast or VMChannel IP port (default=1229)
    retrans: Multicast retransmit time (in 1/10sec; default=20)
    auth: Authentication (none, sha1, [sha256], sha512)
    hash: Packet hash strength (none, sha1, [sha256], sha512)
    key_file: Shared key file (default=/etc/cluster/fence_xvm.key)
    port: Virtual Machine (domain name) to fence
    use_uuid: Treat [domain] as UUID instead of domain name. This is provided for compatibility with older fence_xvmd installations.
    action: Fencing action (null, off, on, [reboot], status, list, monitor, metadata)
    timeout: Fencing timeout (in seconds; default=30)
    delay: Fencing delay (in seconds; default=0)
    domain: Virtual Machine (domain name) to fence (deprecated; use port)

In order to configure fence_xvm, we need some configuration both on the virtualization host (the physical machine) and on the virtual guests (the cluster nodes).

# Install the packages below on the KVM host, which runs RHEL 7.3
yum install fence-virt fence-virtd fence-virtd-multicast fence-virtd-libvirt

Configuring the Firewall(on the KVM host)

firewall-cmd --permanent --zone=trusted --change-interface=virbr0
firewall-cmd --reload

Or

[root@kvmhost ~]# firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.122.100" port port="1229" protocol="tcp" accept'
success

[root@kvmhost ~]# firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.122.100" port port="1229" protocol="udp" accept'
[root@kvmhost ~]# firewall-cmd --reload

Create random shared key on the KVM host.

#Create a random shared key:

  mkdir -p /etc/cluster
  touch /etc/cluster/fence_xvm.key
  chmod 0600 /etc/cluster/fence_xvm.key
  dd if=/dev/urandom bs=512 count=1 of=/etc/cluster/fence_xvm.key

Configure fence by issuing fence_virtd -c

#On the KVM host
fence_virtd -c

Enable and start service

#On the KVM host

systemctl enable fence_virtd
systemctl start fence_virtd

Configuration after fence_virtd -c.(on the KVM host)

[root@kvmhost ~]# cat /etc/fence_virt.conf
backends {
	libvirt {
		uri = "qemu:///system";
	}

}

listeners {
	multicast {
		port = "1229";
		family = "ipv4";
		interface = "virbr0";
		address = "225.0.0.12";
		key_file = "/etc/cluster/fence_xvm.key";
	}

}

fence_virtd {
	module_path = "/usr/lib64/fence-virt";
	backend = "libvirt";
	listener = "multicast";
}

Testing on the KVM host

[root@kvmhost ~]# fence_xvm -o list
node01                           60b3d846-8508-47d3-90e3-a3d5702ef523 on
node02                           68d3554f-f2fc-4a1e-8a1a-4e1a46987700 on

Configuring Virtual Guests(Cluster nodes)

Install the packages below on each cluster node.

yum install fence-virt fence-virtd

Copy the shared key from the KVM host to each virtual guest as /etc/cluster/fence_xvm.key:

Clear iptables (or open the fencing port) on the guests.
Copy fence_xvm.key from the KVM host into each guest's /etc/cluster folder (create this folder if it does not exist), as sketched below.
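
A hedged example of pushing the key from the KVM host to both guests (IP addresses taken from the topology in Part I):

[root@kvmhost ~]# scp /etc/cluster/fence_xvm.key root@192.168.122.100:/etc/cluster/
[root@kvmhost ~]# scp /etc/cluster/fence_xvm.key root@192.168.122.200:/etc/cluster/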

Testing

Issue the command below on all virtual guests to see whether fencing is configured correctly (we have not configured it on the cluster itself yet).

[root@node01 cluster]# fence_xvm -o list
node01               60b3d846-8508-47d3-90e3-a3d5702ef523 on
node02               68d3554f-f2fc-4a1e-8a1a-4e1a46987700 on

Reboot and Power-off test. (On the KVM host or on the cluster nodes)

#On the KVM host (or on one of the cluster nodes)
fence_xvm -o reboot -H "23edf335-a80a-4a7c-b6c9-4ad5ad79e02d"
fence_xvm -o off -H "23edf335-a80a-4a7c-b6c9-4ad5ad79e02d"

Configuring Fencing on the Cluster Environment

After successfully configuring the KVM host and the virtual guests for fencing, we can configure fencing for the cluster itself.

Adding Fence Device

[root@node01 ~]# ccs -h localhost --addfencedev FDEV_XVM1  agent=fence_xvm
[root@node01 ~]# ccs -h localhost --addfencedev FDEV_XVM2  agent=fence_xvm

Adding Method

[root@node01 ~]# ccs -h localhost --addmethod FMET_XVM node01-hb.cls.local
[root@node01 ~]# ccs -h localhost --addmethod FMET_XVM node02-hb.cls.local

 

[root@node01 ~]# ccs -h localhost --addfenceinst FDEV_XVM1 node01-hb.cls.local FMET_XVM domain=node01
[root@node01 ~]# ccs -h localhost --addfenceinst FDEV_XVM2 node02-hb.cls.local FMET_XVM domain=node02

Sync the configuration

[root@node01 ~]# ccs -h localhost --sync --activate

Testing fencing (node02 will be fenced, i.e. rebooted, by node01):

[root@node01 cluster]# fence_node node02-hb.cls.local
fence node02-hb.cls.local success

 

In the next post we will create a fail-over domain, resources, and service groups. At the end, we will have a fully functional HA cluster environment. For the first part of the tutorial, see Part I.

For the next part, see Part III.

 

How to Create Red Hat HA Cluster Part -I

This post is actually the first of a series of several parts. At the end of the series, we will have a Red Hat HA cluster that runs Apache and NFS.

In this tutorial I will be using two CentOS 6 guests, named node01 and node02, on a KVM host. Each node has two interfaces: one for giving users access to the services and the other for the heartbeat.

Topology:

As depicted in Figure-1, it is a two-node cluster with shared disks and a fencing agent. Each node in the cluster has the hosts entries below.

#/etc/hosts entry on each node.
192.168.122.100 node01.cls.local
192.168.123.100 node01-hb.cls.local
192.168.122.200 node02.cls.local
192.168.123.200 node02-hb.cls.local

Figure-1: Simple two-node cluster

As depicted in Figure-1, there are three shareable disks created on the KVM host (you could also use iSCSI disks). Because it is a two-node cluster, it is a special cluster type that is highly susceptible to split-brain situations if the heartbeat network fails. To help prevent this, one disk is set up as a quorum disk. The minimum size for a quorum disk is 10MB; 512MB is used in this topology.

Besides the quorum disk, we have two separate shared disks of 1GB each. One will be used for the Apache DocumentRoot and the other for the NFS export.

Fencing:

For a proper cluster configuration we also need to configure fencing. Fencing is the disconnection of a node from the shared storage. If communication with a single node in the cluster fails, the other nodes must restrict or release access to resources that the failed node may have access to. This may not be possible by contacting the failed node, as it may be unresponsive, so it needs to be cut off externally. This is accomplished by a fence agent. A fence device is an external device that the cluster can use to restrict access to shared resources by an errant node, or to issue a hard reboot of that node.

Installing Cluster Software on each node:

In order to configure the cluster, we need to install some software packages on each node.

[root@node01 ~]# yum install rgmanager lvm2-cluster gfs2-utils ccs

Firewall:

The firewall service is disabled on each node in this case for ease of configuration, but that is not acceptable for a production environment. Cluster services on each node must be able to communicate with each other, so adding the correct firewall rules is vital.

You can use the following rules to allow cluster traffic through the iptables firewall for the various cluster components. For cman (openais/corosync), UDP ports 5404 and 5405 are used to receive multicast traffic.

#For cman

iptables -I INPUT -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT

#For ricci:

iptables -I INPUT -p tcp -m state --state NEW -m multiport --dports 11111 -j ACCEPT

#For modcluster:

iptables -I INPUT -p tcp -m state --state NEW -m multiport --dports 16851 -j ACCEPT

#For gnbd:

iptables -I INPUT -p tcp -m state --state NEW -m multiport --dports 14567 -j ACCEPT

#For luci:

iptables -I INPUT -p tcp -m state --state NEW -m multiport --dports 8084 -j ACCEPT

#For DLM:

iptables -I INPUT -p tcp -m state --state NEW -m multiport --dports 21064 -j ACCEPT

#For ccsd:

iptables -I INPUT -p udp -m state --state NEW -m multiport --dports 50007 -j ACCEPT
iptables -I INPUT -p tcp -m state --state NEW -m multiport --dports 50008 -j ACCEPT


#After apply the rules above, we need to save firewall and restart the firewall service.

service iptables save ; service iptables restart

This is the first part of a series of posts about Red Hat HA. In the next part, we will initialize the quorum disk, configure the fence agent, and build the cluster configuration.