Adding a Shared Disk on KVM

Hello,

In this very short post, I will share with you how to create a shared disk on a KVM host the ugliest way :). It is a prerequisite for the next post, in which I will show you how to build a two-node cluster on Red Hat. I will use the latest version of CentOS 6 for the OS.
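In short, the ugliest way boils down to creating a plain raw image on the KVM host and attaching it to both guests in shareable mode. The sketch below is only an outline of that idea; the guest names (node01, node02), the image path and the 1 GiB size are assumptions, so adjust them to your environment.

# create a raw image on the KVM host (path and size are placeholders)
qemu-img create -f raw /var/lib/libvirt/images/shared.img 1G

# attach the same image to both guests as a shareable disk with host caching disabled
virsh attach-disk node01 /var/lib/libvirt/images/shared.img vdb --cache none --mode shareable --persistent
virsh attach-disk node02 /var/lib/libvirt/images/shared.img vdb --cache none --mode shareable --persistent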

Removing and Rescanning SCSI Disk

Hi,

Sometimes the storage team needs to upgrade HBAs and storage arrays, and during that work some of the multipath paths may not be active. In such a situation, as a system administrator, it is safer to set the device offline manually before any HBA or storage upgrade happens, rather than letting the disk path fail abruptly.

multipath -ll

mpathb (1IET     00010001) dm-2 IET,VIRTUAL-DISK
size=1020M features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 2:0:0:1 sdb 8:16 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 3:0:0:1 sdc 8:32 active ready running

In this post I will use /dev/sdb for the disk operation.

Setting the device offline:

In this case I will set /dev/sdb offline.

echo offline > /sys/block/sdb/device/state

Checking Disk State:

[root@node02 ~]# cat /sys/block/sdb/device/state 
offline

In order to update the multipath devices, we need to flush the multipath maps and rescan them (multipath -F; multipath).

[root@node02 ~]# multipath -F;multipath
Oct 18 23:00:35 | mpatha: ignoring map
create: mpathb (1IET     00010001) undef IET,VIRTUAL-DISK
size=1020M features='0' hwhandler='0' wp=undef
`-+- policy='round-robin 0' prio=1 status=undef
  `- 3:0:0:1 sdc 8:32 undef ready running
multipath -ll
mpathb (1IET     00010001) dm-2 IET,VIRTUAL-DISK
size=1020M features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 3:0:0:1 sdc 8:32 active ready running

After the storage and HBA operation, we set the device back to the running state.

Setting the device to the running state

echo running > /sys/block/sdb/device/state
[root@node02 etc]# cat /sys/block/sdb/device/state 
running

Checking multipath after setting the device to the running state

[root@node02 etc]# multipath -ll -v2
mpathb (1IET     00010001) dm-2 IET,VIRTUAL-DISK
size=1020M features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=enabled
| `- 2:0:0:1 sdb 8:16 active ready running
`-+- policy='round-robin 0' prio=1 status=active
  `- 3:0:0:1 sdc 8:32 active ready running

 

Deleting a disk device

Sometimes we no longer need a disk device and want to remove it from the system. For safety, it is highly recommended to set the device offline before deleting it.

[root@node02 etc]# echo offline > /sys/block/sdb/device/state

Deleting the device

echo 1 > /sys/block/sdb/device/delete
[root@node02 etc]# fdisk -l /dev/sdb
[root@node02 etc]# 

If you want to add the disk again:

Rescan the disk

echo "- - -" > /sys/class/scsi_host/host2/scan
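If you are not sure which SCSI host the disk hangs off, you can simply rescan all of them; the loop below is a small sketch of that idea and is harmless on hosts that have nothing new to report.

for h in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$h"; done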

dmesg output

sd 2:0:0:1: [sdb] Attached SCSI disk

 

multipath -ll
mpathb (1IET     00010001) dm-2 IET,VIRTUAL-DISK
size=1020M features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 2:0:0:1 sdb 8:16 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 3:0:0:1 sdc 8:32 active ready running

Sending Mail

Hello,

In this short post, I will share a simple Ansible playbook for sending mail. I wrote this playbook to inform the Project Office and clients about OS updates. Ansible has a mail module for exactly this; the only things we need to define are the SMTP server, the SMTP port, the subject, the mail body, the sender and the recipients.

Example Usage:

[root@7133075243 ~]# ansible-playbook main.yml -i inventory/coreutils.ini -e INV=inventory/coreutils.ini -k -s

send_mail.yml

---

- name: 'Sending Mail'
  mail:
    host: 10.19.9.5
    port: 25
    subject: OS Updates
    from: technologymanagement@manintheit.org (OSupdate)
    to: PO@manintheit.org
    charset: utf8
    body: "{{ lookup('file', '/home/ansible/playbooks/inventory/body.txt', INV) }}"
  run_once: True
  delegate_to: localhost
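If you just want to check that the SMTP relay accepts mail before wiring the task into a playbook, the mail module can also be run ad-hoc from the control node. This is only a sketch; the host and addresses are the ones from the playbook above.

ansible localhost -m mail -a "host=10.19.9.5 port=25 from=technologymanagement@manintheit.org to=PO@manintheit.org subject='Test mail' body='Test from Ansible'"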

body.txt

Hello all,
There will be an OS update in short time. No server outage expected.
In case of problem please send mail to technologymanagement@manintheit.org. Please find affected servers below.


.

main.yml

---
- hosts: all
  pre_tasks:
    - include: send_mail.yml
  tasks:
    - name: "UPDATE core-utils sec."
      shell: yum update --advisory RHSA-2017:2685

coreutils.ini (Ansible host inventory)

[targets]
server1
server2
server3

 

Sample Mail Output:

Hello all, 
There will be an OS update in short time. No server outage expected. 
In case of problem please send mail to technologymanagement@manintheit.org. Please find 
affected servers below. 



. 


.,[targets]
server1
server2
server3

 

Happy mailing and patching 🙂


Multipath with iSCSI

Hi Folks!

In this post, I will show how to set up an iSCSI target and an iSCSI initiator the multipath way. This gives us highly available storage: if one link goes down, the other link kicks in and takes over. In this experiment I will be using two CentOS 6 systems, each with two network interface cards, running on a KVM hypervisor. I also added a 1.5 GiB disk (/dev/sdb in my case) on node01 for the iSCSI storage pool.

1-Setup

I configured the IPs of my network cards as below.

node01(iSCSI target)

eth0: 192.168.100.30/24

eth1: 192.168.100.31/24

node02(iSCSI initiator)

eth0: 192.168.100.50/24

eth1: 192.168.100.51/24

Setting up the disk for the iSCSI target.

I will not delineate how to partition the disk 🙂

1- Partition the disk via the fdisk utility and set the partition type to Linux LVM (0x8e); a scripted alternative is sketched right after these steps.

2- Create LVM disk

pvcreate /dev/sdb1

vgcreate vgscsi /dev/sdb1

lvcreate -l 100%FREE -n lvscsi vgscsi (use all vgscsi pool)

Finally, I have a logical volume at /dev/vgscsi/lvscsi.
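As mentioned above, if you prefer to script the partitioning instead of walking through fdisk interactively, parted can do the same in one shot. This is a sketch that assumes the target disk really is /dev/sdb; double-check before running it.

parted -s /dev/sdb mklabel msdos mkpart primary 1MiB 100% set 1 lvm on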

Note: For this experiment I turned off the firewall. If you want to keep the firewall enabled, you need to open port 3260/tcp on node01, since iSCSI uses port 3260/tcp by default.
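For reference, opening that port on CentOS 6 with the default iptables firewall would look roughly like this:

root@node01:~# iptables -I INPUT -p tcp --dport 3260 -j ACCEPT
root@node01:~# service iptables save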

Hereafter, I will call node01 the iSCSI target and node02 the iSCSI initiator.

Installation

Install the “scsi-target-utils” package on the iSCSI target.

root@node01:~# yum install scsi-target-utils

Configuration of the iSCSI target.

Edit the /etc/tgt/targets.conf file. The two IP addresses below belong to the iSCSI initiator; this sets up the backing store and the ACL.

#/etc/tgt/targets.conf

<target iqn.2017-09.sfp.local:node1.target1>
backing-store /dev/vgscsi/lvscsi
initiator-address 192.168.100.50
initiator-address 192.168.100.51
</target>

Configure the tgtd service to start at boot time and then start it.

root@node01:~# chkconfig tgtd on
root@node01:~# service tgtd start
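To double-check that the target and its LUN are actually exported, scsi-target-utils ships a show command; the output should list the target IQN, the backing store and the allowed initiator addresses.

root@node01:~# tgt-admin --show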

Setting up the Initiator

You may get an error from the multipathd service because there is no /etc/multipath.conf configuration file yet. You can copy a sample from /usr/share/doc/device-mapper-multipath-0.4.9. For this experiment you do not need to edit the multipath.conf file.

root@node02:~# yum install iscsi-initiator-utils
root@node02:~# yum install device-mapper-multipath
root@node02:~# cp /usr/share/doc/device-mapper-multipath-0.4.9/multipath.conf /etc/multipath.conf
root@node02:~# service multipathd start
root@node02:~# service iscsi start
root@node02:~# service iscsid start
root@node02:~# chkconfig iscsi on
root@node02:~# chkconfig iscsid on

Discovery and logging in to the iSCSI target:

root@node02:~# iscsiadm -m discovery -t sendtargets -p 192.168.100.30
root@node02:~# iscsiadm -m discovery -t sendtargets -p 192.168.100.31
root@node02:~# iscsiadm -m node -T iqn.2017-09.sfp.local:node1.target1 -p 192.168.100.30 --login
root@node02:~# iscsiadm -m node -T iqn.2017-09.sfp.local:node1.target1 -p 192.168.100.31 --login

If you did not get any errors so far, you are good to go for multipath. The nice thing about the iSCSI initiator is that it detects the multiple paths automatically; you do not need to do anything else. You can check it with the multipath -l command.

In my case the multipath device is named mpathb, which bundles the physical paths to the target. From now on we do not use the /dev/sdX devices directly anymore; we use /dev/mapper/mpathX instead.

root@node02:~# multipath -l
mpathb (1IET     00010001) dm-2 IET,VIRTUAL-DISK
size=1.5G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 2:0:0:1 sdb 8:16 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 3:0:0:1 sdc 8:32 active ready running

Format the disk on the Initiator

You can partition the multipath device directly.

fdisk /dev/mapper/mpathb

After partitioning the disk, you may find that the kernel fails to re-read the partition table. In that case you either reboot the system or run partprobe to push the partition table changes to the kernel. I am not willing to reboot the system.

partprobe is a program that informs the operating system kernel of partition table changes. You need to install the parted package and run the partprobe command.

root@node02:~# yum install parted
root@node02:~# partprobe

After partitioning the disk, my partition is named mpathp1. It may be different in your case.

Create an ext4 filesystem on the partition and mount it.

root@node02:~# mkfs.ext4 /dev/mapper/mpathp1
root@node02:~# mkdir -p /media/disk
root@node02:~# mount -o _netdev /dev/mapper/mpathp1 /media/disk
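If you want the mount to come back after a reboot, an /etc/fstab entry with the _netdev option does the trick (a sketch; the device and mount point are the ones from my setup):

#/etc/fstab
/dev/mapper/mpathp1   /media/disk   ext4   _netdev   0 0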

Experiment:

For this experiment I bring down one of the NICs on the initiator and check whether I am still able to write to the disk. To test it, I use a very primitive one-liner.

root@node02:~# while true; do echo "HElloWorld" >> /media/disk/test.txt; sleep 4; done

Let's first bring down the NIC eth1.

root@node02:~# ifdown eth1
Oct  1 10:29:48 node02 kernel: connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4340297368, last ping 4340302368, now 4340307368
Oct  1 10:29:48 node02 kernel: connection2:0: detected conn error (1011)
Oct  1 10:29:49 node02 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4340298016, last ping 4340303016, now 4340308016
Oct  1 10:29:49 node02 kernel: connection1:0: detected conn error (1011)
Oct  1 10:29:49 node02 iscsid: Kernel reported iSCSI connection 2:0 error (1011 - ISCSI_ERR_CONN_FAILED: iSCSI connection failed) state (3)
Oct  1 10:29:49 node02 iscsid: Kernel reported iSCSI connection 1:0 error (1011 - ISCSI_ERR_CONN_FAILED: iSCSI connection failed) state (3)
Oct  1 10:30:24 node02 iscsid: connection2:0 is operational after recovery (3 attempts)
Oct  1 10:30:25 node02 iscsid: connect to 192.168.100.30:3260 failed (No route to host)

You can see that the first path is marked as failed.

root@node02:~# multipath -l
mpathb (1IET 00010001) dm-2 IET,VIRTUAL-DISK
size=1.5G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 2:0:0:1 sdb 8:16 failed undef unknown
`-+- policy='round-robin 0' prio=0 status=active
 `- 3:0:0:1 sdc 8:32 active undef unknown

I am still able to write to the file.

[root@node02 ~]# wc -l /media/disk/test.txt 
254 /media/disk/test.txt
[root@node02 ~]# wc -l /media/disk/test.txt 
255 /media/disk/test.txt

Bring the first NIC back up and bring down the second NIC

[root@node02 ~]# ifup eth1

#/var/log/messages contents after bringing up the NIC

Oct  1 10:34:53 node02 iscsid: connection1:0 is operational after recovery (46 attempts)
Oct  1 10:34:55 node02 multipathd: mpathb: sdb - directio checker reports path is up
Oct  1 10:34:55 node02 multipathd: 8:16: reinstated
Oct  1 10:34:55 node02 multipathd: mpathb: remaining active paths: 2

Let's bring down the second NIC (eth2).

[root@node02 ~]# ifdown eth2

#/var/log/messages contents after bringing down the NIC
Oct  1 10:21:08 node02 multipathd: 8:32: reinstated
Oct  1 10:21:08 node02 multipathd: mpathb: remaining active paths: 2
Oct  1 10:21:48 node02 kernel: connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4339817248, last ping 4339822248, now 4339827248
Oct  1 10:21:48 node02 kernel: connection2:0: detected conn error (1011)
Oct  1 10:21:49 node02 iscsid: Kernel reported iSCSI connection 2:0 error (1011 - ISCSI_ERR_CONN_FAILED: iSCSI connection failed) state (3)

Oct  1 10:22:24 node02 iscsid: connect to 192.168.100.31:3260 failed (No route to host)

You can see that the second path is marked as failed this time.

root@node02:~# multipath -l
mpathb (1IET     00010001) dm-2 IET,VIRTUAL-DISK
size=1.5G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| `- 2:0:0:1 sdb 8:16 active undef unknown
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 3:0:0:1 sdc 8:32 failed undef unknown

I counted the lines a bunch of times, and it looks like I am still able to write.

[root@node02 ~]# wc -l /media/disk/test.txt 
339 /media/disk/test.txt
[root@node02 ~]# wc -l /media/disk/test.txt 
340 /media/disk/test.txt