Multipath with iSCSI

Hi Folks!

In this post, I will set up an iSCSI target and an iSCSI initiator with multipath. This gives us highly available storage: if one link goes down, the other link takes over. In this experiment I use two CentOS 6 systems running under the KVM hypervisor, each with two network interface cards. I also added a 1.5 GiB disk (/dev/sdb for me) to node01 for the iSCSI storage pool.

1-Setup

I configured the IP addresses of my network cards as below.

node01(iSCSI target)

eth0: 192.168.100.30/24

eth1: 192.168.100.31/24

node02(iSCSI initiator)

eth0: 192.168.100.50/24

eth1: 192.168.100.51/24

Setting up the disk for the iSCSI target.

I will not go into detail on how to partition the disk 🙂

1- Partition the disk with the fdisk utility and set the partition type to Linux LVM (0x8e)

2- Create the LVM volume

pvcreate /dev/sdb1

vgcreate vgscsi /dev/sdb1

lvcreate -l 100%FREE -n lvscsi vgscsi    # use all free space in vgscsi

Finally, I have a logical volume at /dev/vgscsi/lvscsi

Note: For this experiment I turned off the firewall. If you want to keep the firewall enabled, you need to open port 3260/tcp on node01, because iSCSI uses port 3260/tcp by default.
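If you would rather keep the firewall on, the rule for port 3260/tcp can be limited to the two initiator addresses. The helper below is only a sketch: iscsi_fw_rules is a name I made up, it assumes the default INPUT chain of iptables on CentOS 6, and it only prints the commands so you can review them before running anything.

```shell
# iscsi_fw_rules: hypothetical helper, prints one iptables rule per initiator IP.
# Nothing is applied here; review the output, then pipe it to sh as root on node01.
iscsi_fw_rules() {
  for ip in "$@"; do
    echo "iptables -I INPUT -p tcp -s $ip --dport 3260 -j ACCEPT"
  done
}

# Usage on node01 (as root):
#   iscsi_fw_rules 192.168.100.50 192.168.100.51 | sh
#   service iptables save
```

Printing instead of executing keeps the sketch safe to try on any machine.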

From here on, I will refer to node01 as the iSCSI target and node02 as the iSCSI initiator.

Installation

Install the “scsi-target-utils” package on the iSCSI target.

root@node01:~# yum install scsi-target-utils

Configuration of iSCSI target.

Edit the /etc/tgt/targets.conf file with the vim editor. The two IP addresses below belong to the iSCSI initiator. This sets up the backing store and the ACL.

#/etc/tgt/targets.conf

<target iqn.2017-09.sfp.local:node1.target1>
    # one LUN backed by the logical volume created above
    backing-store /dev/vgscsi/lvscsi
    # ACL: allow both addresses of the initiator
    initiator-address 192.168.100.50
    initiator-address 192.168.100.51
</target>
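The ACL part of a target definition is just one initiator-address line per allowed IP. As a rough sketch, a small shell function can print such a definition for any number of initiator addresses. The name tgt_stanza is mine and is not part of scsi-target-utils.

```shell
# tgt_stanza: hypothetical helper that prints a targets.conf definition.
# $1 = target IQN, $2 = backing store, remaining args = allowed initiator IPs
tgt_stanza() {
  iqn=$1; store=$2; shift 2
  echo "<target $iqn>"
  echo "    backing-store $store"
  for ip in "$@"; do
    echo "    initiator-address $ip"
  done
  echo "</target>"
}

# Usage: print the definition used in this post
#   tgt_stanza iqn.2017-09.sfp.local:node1.target1 /dev/vgscsi/lvscsi \
#       192.168.100.50 192.168.100.51
```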

Configure the tgtd service to start at boot time, then start it.

root@node01:~# chkconfig tgtd on
root@node01:~# service tgtd start
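Before moving on to the initiator, it is worth checking that tgtd answers on both target addresses. Here is a minimal sketch, assuming bash (it relies on the /dev/tcp redirection of bash) and the timeout command from coreutils. port_open is just a name I picked.

```shell
# port_open: succeed if a TCP connection to host:port can be opened within 3 seconds.
# $1 = host, $2 = port
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage from node02:
#   for ip in 192.168.100.30 192.168.100.31; do
#     if port_open "$ip" 3260; then echo "$ip: 3260 open"; else echo "$ip: 3260 closed"; fi
#   done
```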

Setting up the Initiator

You may get an error from the multipathd service because there is no configuration file at /etc/multipath.conf. You can copy a sample from /usr/share/doc/device-mapper-multipath-0.4.9 once the device-mapper-multipath package is installed. For this experiment you do not need to edit the multipath.conf file.

root@node02:~# yum install iscsi-initiator-utils
root@node02:~# yum install device-mapper-multipath
root@node02:~# cp /usr/share/doc/device-mapper-multipath-0.4.9/multipath.conf /etc/multipath.conf
root@node02:~# service multipathd start
root@node02:~# service iscsi start
root@node02:~# service iscsid start
root@node02:~# chkconfig iscsi on
root@node02:~# chkconfig iscsid on

Discovery and login to the iSCSI target:

root@node02:~# iscsiadm -m discovery -t sendtargets -p 192.168.100.30
root@node02:~# iscsiadm -m discovery -t sendtargets -p 192.168.100.31
root@node02:~# iscsiadm -m node -T iqn.2017-09.sfp.local:node1.target1 -p 192.168.100.30 --login
root@node02:~# iscsiadm -m node -T iqn.2017-09.sfp.local:node1.target1 -p 192.168.100.31 --login
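Both logins should leave you with two sessions to the same target, one per portal. As a quick sanity check you can count them in the output of iscsiadm -m session. The sketch below assumes the usual "tcp: [id] portal,tpgt iqn" line layout of that command. count_sessions is a hypothetical helper name.

```shell
# count_sessions: count logged-in sessions for one target.
# $1 = target IQN; reads `iscsiadm -m session` output on stdin.
# Assumes each session line looks like: tcp: [id] portal,tpgt iqn
count_sessions() {
  awk -v iqn="$1" '$1 == "tcp:" && $4 == iqn { n++ } END { print n + 0 }'
}

# Usage on node02:
#   n=$(iscsiadm -m session | count_sessions iqn.2017-09.sfp.local:node1.target1)
#   if [ "$n" -eq 2 ]; then echo "both paths logged in"; else echo "only $n session(s)"; fi
```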

If you have not hit any errors so far, you are ready for multipath. The nice thing is that multipathd automatically detects that the two iSCSI sessions lead to the same disk and groups them into one multipath device. You do not need to do anything. You can check it with the multipath -l command.

In my case the multipath device is named mpathb. It is a device-mapper device that sits on top of both paths to the target. From now on we no longer use the /dev/sdX names; we use /dev/mapper/mpathX instead.

root@node02:~# multipath -l
mpathb (1IET     00010001) dm-2 IET,VIRTUAL-DISK
size=1.5G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 2:0:0:1 sdb 8:16 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 3:0:0:1 sdc 8:32 active ready running
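If you want to check the path states from a script instead of reading the tree by eye, the multipath -l output can be reduced to one line per path. This is a sketch that assumes the layout shown above, where each path line contains the H:C:T:L address followed by the device name, major:minor and state. path_states and active_paths are names I made up.

```shell
# path_states: read `multipath -l` output on stdin, print "<device> <state>" per path.
# A path line is recognised by its H:C:T:L field, for example 2:0:0:1.
path_states() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) { print $(i + 1), $(i + 3); break }
  }'
}

# active_paths: count the paths whose state is "active".
active_paths() {
  path_states | awk '$2 == "active" { n++ } END { print n + 0 }'
}

# Usage on node02:
#   multipath -l | path_states
#   multipath -l mpathb | active_paths
```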

Format the disk on the Initiator

You can now partition the mpath disk.

fdisk /dev/mapper/mpathb

After partitioning the disk, the kernel may fail to re-read the partition table. In that case you need to either reboot the system or run partprobe to tell the kernel about the partition table changes. I did not want to reboot the system.

partprobe is a program that informs the operating system kernel of partition table changes. It is part of the parted package, so install parted and then run partprobe.

root@node02:~# yum install parted
root@node02:~# partprobe

After partitioning, my partition is named mpathbp1. It may be different in your case.

Create an ext4 filesystem on the partition and mount it.

root@node02:~# mkfs.ext4 /dev/mapper/mpathbp1
root@node02:~# mkdir -p /media/disk
root@node02:~# mount -o _netdev /dev/mapper/mpathbp1 /media/disk
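A mount done by hand like this does not survive a reboot. One way to make it persistent is an /etc/fstab line that carries the same _netdev option, so the mount waits for the network. The function below is a sketch only: add_fstab_entry is a hypothetical name, and the optional third argument lets you try it on a copy of fstab first.

```shell
# add_fstab_entry: append an ext4 line with _netdev unless the mount point is already listed.
# $1 = device, $2 = mount point, $3 = fstab file (defaults to /etc/fstab)
add_fstab_entry() {
  dev=$1; mnt=$2; fstab=${3:-/etc/fstab}
  # Look for an existing, uncommented entry with the same mount point.
  if awk -v m="$mnt" '$1 !~ /^#/ && $2 == m { found = 1 } END { exit !found }' "$fstab"; then
    echo "entry for $mnt already present in $fstab" >&2
    return 0
  fi
  printf '%s %s ext4 _netdev 0 0\n' "$dev" "$mnt" >> "$fstab"
}

# Usage on node02 (PART is whatever partition device you ended up with under /dev/mapper):
#   cp /etc/fstab /tmp/fstab.test
#   add_fstab_entry "$PART" /media/disk /tmp/fstab.test
```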

Experiment:

For this experiment I bring down one of the NICs on the initiator and check whether I am still able to write to the disk. To test it, I wrote a very primitive script.

root@node02:~# while true; do echo "HElloWorld" >> /media/disk/test.txt; sleep 4; done
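A variation of this loop that writes a timestamp on every line makes it possible to see afterwards how long the writes stalled during a failover. This is only a sketch: probe_write and max_gap are names I invented, and /media/disk/probe.txt in the usage is a hypothetical file name.

```shell
# probe_write: append COUNT lines "<epoch seconds> HelloWorld" to FILE,
# sleeping INTERVAL seconds after each write.
# $1 = file, $2 = count, $3 = interval in seconds
probe_write() {
  file=$1; count=$2; interval=$3
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "$(date +%s) HelloWorld" >> "$file"
    i=$((i + 1))
    sleep "$interval"
  done
}

# max_gap: print the largest gap in seconds between consecutive lines of FILE.
max_gap() {
  awk 'BEGIN { gap = 0 }
       NR > 1 && $1 - prev > gap { gap = $1 - prev }
       { prev = $1 }
       END { print gap }' "$1"
}

# Usage on node02:
#   probe_write /media/disk/probe.txt 300 4 &
#   ... bring a NIC down and up again ...
#   max_gap /media/disk/probe.txt
```

A result close to the write interval means the failover was barely noticeable. A much larger value is roughly the time the I/O was blocked.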

Let's first bring down the NIC eth1.

root@node02:~# ifdown eth1
Oct  1 10:29:48 node02 kernel: connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4340297368, last ping 4340302368, now 4340307368
Oct  1 10:29:48 node02 kernel: connection2:0: detected conn error (1011)
Oct  1 10:29:49 node02 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4340298016, last ping 4340303016, now 4340308016
Oct  1 10:29:49 node02 kernel: connection1:0: detected conn error (1011)
Oct  1 10:29:49 node02 iscsid: Kernel reported iSCSI connection 2:0 error (1011 - ISCSI_ERR_CONN_FAILED: iSCSI connection failed) state (3)
Oct  1 10:29:49 node02 iscsid: Kernel reported iSCSI connection 1:0 error (1011 - ISCSI_ERR_CONN_FAILED: iSCSI connection failed) state (3)
Oct  1 10:30:24 node02 iscsid: connection2:0 is operational after recovery (3 attempts)
Oct  1 10:30:25 node02 iscsid: connect to 192.168.100.30:3260 failed (No route to host)

You can see that the first path is now marked as failed.

root@node02:~# multipath -l
mpathb (1IET     00010001) dm-2 IET,VIRTUAL-DISK
size=1.5G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=enabled
| `- 2:0:0:1 sdb 8:16 failed undef unknown
`-+- policy='round-robin 0' prio=0 status=active
  `- 3:0:0:1 sdc 8:32 active undef unknown

I am still able to write to the file.

[root@node02 ~]# wc -l /media/disk/test.txt 
254 /media/disk/test.txt
[root@node02 ~]# wc -l /media/disk/test.txt 
255 /media/disk/test.txt

Bring up the first NIC and bring down the second NIC

[root@node02 ~]# ifup eth1

#/var/log/messages contents after bringing up the NIC

Oct  1 10:34:53 node02 iscsid: connection1:0 is operational after recovery (46 attempts)
Oct  1 10:34:55 node02 multipathd: mpathb: sdb - directio checker reports path is up
Oct  1 10:34:55 node02 multipathd: 8:16: reinstated
Oct  1 10:34:55 node02 multipathd: mpathb: remaining active paths: 2
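/var/log/messages gets noisy during these tests, so it helps to keep only the lines that mark a path going down or coming back. The filter below is a sketch built from the message texts that appear in the logs in this post. failover_events is a hypothetical name, and other versions of iscsid or multipathd may word their messages differently.

```shell
# failover_events: keep only the iSCSI and multipath lines that show a path failing or recovering.
# Reads log lines on stdin.
failover_events() {
  grep -E 'detected conn error|is operational after recovery|checker reports path is|reinstated|remaining active paths'
}

# Usage on node02:
#   failover_events < /var/log/messages
#   tail -f /var/log/messages | failover_events
```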

Let's bring down the second NIC (eth2).

[root@node02 ~]# ifdown eth2

#/var/log/messages contents after bringing down the NIC
Oct  1 10:21:08 node02 multipathd: 8:32: reinstated
Oct  1 10:21:08 node02 multipathd: mpathb: remaining active paths: 2
Oct  1 10:21:48 node02 kernel: connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4339817248, last ping 4339822248, now 4339827248
Oct  1 10:21:48 node02 kernel: connection2:0: detected conn error (1011)
Oct  1 10:21:49 node02 iscsid: Kernel reported iSCSI connection 2:0 error (1011 - ISCSI_ERR_CONN_FAILED: iSCSI connection failed) state (3)

Oct  1 10:22:24 node02 iscsid: connect to 192.168.100.31:3260 failed (No route to host)

You can see that this time the second path is marked as failed.

root@node02:~# multipath -l
mpathb (1IET     00010001) dm-2 IET,VIRTUAL-DISK
size=1.5G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| `- 2:0:0:1 sdb 8:16 active undef unknown
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 3:0:0:1 sdc 8:32 failed undef unknown

I counted the lines a bunch of times. It looks like I am still able to write.

[root@node02 ~]# wc -l /media/disk/test.txt 
339 /media/disk/test.txt
[root@node02 ~]# wc -l /media/disk/test.txt 
340 /media/disk/test.txt