Connect KVM over GRE

Hi Folks,

As you may know, libvirt virtual network switches operate in NAT mode by default (IP masquerading rather than SNAT or DNAT). In this mode, virtual guests can communicate with the outside world, but computers external to the host cannot initiate connections to the guests inside while the virtual network switch is operating in NAT mode. One solution is to create a virtual switch in routed mode. There is, however, another option that does not require changing the operating mode of the underlying virtual switch: creating a GRE tunnel between the hosts.
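
For reference, you can confirm that the default libvirt network is in NAT mode with virsh; the output should contain a forward element with mode='nat' (trimmed here, the exact contents depend on your setup):

tesla@otuken:~$ virsh net-dumpxml default
<network>
  <name>default</name>
  <forward mode='nat'/>
  <bridge name='virbr0' stp='on' delay='0'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
  ...
</network>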

What is GRE?

GRE (Generic Routing Encapsulation) is a communication protocol that provides a virtual point-to-point connection. It is a very simple and effective method of transporting data over a public network. You can use a GRE tunnel in cases such as the following:

  • Use of multiple protocols over a single-protocol backbone
  • Providing workarounds for networks with limited hops
  • Connection of non-contiguous subnetworks
  • Being less resource demanding than its alternatives (e.g. IPsec VPN)

Reference: https://www.incapsula.com/blog/what-is-gre-tunnel.html

Example of GRE encapsulation (Reference: https://www.incapsula.com/blog/what-is-gre-tunnel.html)

I have created a GRE tunnel to connect to some of the KVM guests from an external host. Figure-2 depicts what my topology looks like.

Figure-2 Connecting KVM guests over GRE Tunnel

I have two physical hosts, one running the Mint and the other the Ubuntu GNU/Linux distribution. KVM is running on the Ubuntu host.

GRE Tunnel configuration on GNU/Linux hosts

Before creating a GRE tunnel, we need to load the ip_gre module on both GNU/Linux hosts.

mint@mint$ sudo modprobe ip_gre
tesla@otuken:~$ sudo modprobe ip_gre
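
You can verify that the module is loaded with lsmod; the gre and ip_tunnel modules are typically pulled in as dependencies:

mint@mint$ lsmod | grep gre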

Configuring the physical interface on both nodes

mint@mint$ sudo ip addr add 100.100.100.1/24 dev enp0s31f6
tesla@otuken:~$ sudo ip addr add 100.100.100.2/24 dev enp2s0
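
At this point the two hosts should be able to reach each other directly over the physical link, which is worth verifying before building the tunnel on top of it:

mint@mint$ ping -c 3 100.100.100.2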

Configuring GRE Tunnel (On the first node)

mint@mint$ sudo ip tunnel add tun0 mode gre remote 100.100.100.2 local 100.100.100.1 ttl 255
mint@mint$ sudo ip link set tun0 up
mint@mint$ sudo ip addr add 10.0.0.10/24 dev tun0
mint@mint$ sudo ip route add 10.0.0.0/24 dev tun0
mint@mint$ sudo ip route add 192.168.122.0/24 dev tun0

Configuring GRE Tunnel (On the Second Node)

tesla@otuken:~$ sudo ip tunnel add tun0 mode gre remote 100.100.100.1 local 100.100.100.2 ttl 255
tesla@otuken:~$ sudo ip link set tun0 up
tesla@otuken:~$ sudo ip addr add 10.0.0.20/24 dev tun0
tesla@otuken:~$ sudo ip route add 10.0.0.0/24 dev tun0
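
A quick sanity check of the tunnel itself is to ping the peer's tunnel address from either side and to inspect the tunnel parameters:

tesla@otuken:~$ ping -c 3 10.0.0.10
tesla@otuken:~$ ip tunnel show tun0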

As the GRE protocol adds an additional 24 bytes of header, it is highly recommended to lower the MTU on the tunnel interface. A commonly recommended MTU value is 1400.
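
On both hosts this can be done with ip link, for example:

mint@mint$ sudo ip link set tun0 mtu 1400
tesla@otuken:~$ sudo ip link set tun0 mtu 1400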

Also, do not forget to check the iptables rules on both hosts so that GRE traffic (IP protocol 47) and forwarding to the libvirt bridge are not blocked.
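
Depending on your existing ruleset, you may need rules along the lines below: the first accepts the GRE protocol itself (IP protocol 47) and applies to both hosts, while the other two (on the KVM host only) allow forwarding between the tunnel and the libvirt bridge. Adjust the interface names to your own setup.

tesla@otuken:~$ sudo iptables -A INPUT -p gre -j ACCEPT
tesla@otuken:~$ sudo iptables -I FORWARD -i tun0 -o virbr0 -j ACCEPT
tesla@otuken:~$ sudo iptables -I FORWARD -i virbr0 -o tun0 -j ACCEPT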

Experiment:

Once the configuration was completed, I successfully pinged the KVM guest (192.168.122.35) and transferred a file over SSH (scp). You can download the Wireshark pcap file here.
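
The test can be reproduced with commands along these lines; the guest user name and file names are placeholders, and the capture is taken on the physical interface so that the GRE encapsulation (IP protocol 47) is visible:

mint@mint$ ping -c 3 192.168.122.35
mint@mint$ scp testfile.txt user@192.168.122.35:/tmp/
mint@mint$ sudo tcpdump -i enp0s31f6 -w gre.pcap ip proto 47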

DRBD (without clustering)

Do you need transparent, real-time replication of block devices, without specialty hardware and without paying anything?

If your answer is YES, DRBD is your solution. DRBD is a distributed replicated storage system for the Linux platform. It is implemented as a kernel driver, several userspace management applications, and some shell scripts. DRBD is traditionally used in high availability (HA) clusters.

In this post, I am going to create HA cluster block storage. Switch-over will be handled manually, but in the next post I will add cluster software. I have two Debian systems for this lab. Figure-1 depicts the sample architecture.

Figure-1 Sample HA Block Storage

Reference: https://www.ibm.com/developerworks/jp/linux/library/l-drbd/index.html

Installing DRBD packages:

Install drbd8-utils on each node.

root@debian1:~# apt-get install drbd8-utils 
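
You can confirm that both the userspace tools and the kernel module are available before continuing; the exact versions will depend on your Debian release:

root@debian1:~# dpkg -l drbd8-utils
root@debian1:~# modinfo drbd | head -n 3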

Add the hostnames to the /etc/hosts file on each node.

192.168.122.70 debian1
192.168.122.71 debian2

Creating the backing storage:

Instead of adding dedicated disk storage, we create a file on each node and use it as the backing storage via a loop device.

root@debian1:~# mkdir /replicated
root@debian1:~# dd if=/dev/zero of=drbd.img bs=1024K count=512
root@debian2:~# mkdir /replicated
root@debian2:~# dd if=/dev/zero of=drbd.img bs=1024K count=512
root@debian1:~# losetup /dev/loop0 /root/drbd.img
root@debian2:~# losetup /dev/loop0 /root/drbd.img
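
Note that loop device mappings created with losetup do not survive a reboot, so they have to be recreated (or set up via your init system) after each boot. You can list the current mappings on each node with:

root@debian1:~# losetup -a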

Configuring DRBD:

Add the configuration below on each node. Note that the lower-level disk is the loop device we created above (/dev/loop0), as DRBD expects a block device.

root@debian1:~# cat /etc/drbd.d/replicated.res
resource replicated {
        protocol C;
        on debian1 {
                device /dev/drbd0;
                disk /dev/loop0;
                address 192.168.122.70:7788;
                meta-disk internal;
        }
        on debian2 {
                device /dev/drbd0;
                disk /dev/loop0;
                address 192.168.122.71:7788;
                meta-disk internal;
        }
}
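
A quick way to check the configuration for syntax errors on each node is to let drbdadm parse it and dump it back:

root@debian1:~# drbdadm dump replicated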


Initializing the metadata storage (on each node):

root@debian1:~# drbdadm create-md replicated
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
root@debian1:~# 

Make sure that the drbd service is running on both nodes (for example with systemctl status drbd):

● drbd.service - LSB: Control DRBD resources.
   Loaded: loaded (/etc/init.d/drbd; generated; vendor preset: enabled)
   Active: active (exited) since Fri 2019-02-01 15:32:34 +04; 6min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 1399 ExecStop=/etc/init.d/drbd stop (code=exited, status=0/SUCCESS)
  Process: 1420 ExecStart=/etc/init.d/drbd start (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/drbd.service
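
The init script normally brings up the resources defined under /etc/drbd.d/. If the resource does not show up in /proc/drbd, it can be brought up manually with drbdadm:

root@debian1:~# drbdadm up replicated

Once the resource is up, lsblk on either node shows drbd0 stacked on top of the loop device: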

root@debian2:~# lsblk 
NAME              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0               7:0    0  512M  0 loop 
└─drbd0           147:0    0  512M  1 disk 
sr0                11:0    1 1024M  0 rom  
vda               254:0    0   10G  0 disk 
└─vda1            254:1    0   10G  0 part 
  ├─vgroot-lvroot 253:0    0  7.1G  0 lvm  /
  └─vgroot-lvswap 253:1    0  976M  0 lvm  [SWAP]

DRBD uses only one node at a time as the primary node, where reads and writes can be performed. We will first promote node 1 to primary.

root@debian1:~# drbdadm primary replicated --force

root@debian1:~# cat /proc/drbd 
version: 8.4.7 (api:1/proto:86-101)
srcversion: AC50E9301653907249B740E
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:516616 nr:0 dw:0 dr:516616 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:7620
[==================>.] sync'ed: 99.3% (7620/524236)K
finish: 0:00:00 speed: 20,968 (13,960) K/sec
root@debian1:~# cat /proc/drbd
version: 8.4.7 (api:1/proto:86-101)
srcversion: AC50E9301653907249B740E
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:524236 nr:0 dw:0 dr:524236 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[===================>] sync'ed:100.0% (0/524236)K
finish: 0:00:00 speed: 20,300 (13,792) K/sec

Initializing the Filesystem:

root@debian1:~# mkfs.ext4 /dev/drbd0 
root@debian1:~# mount /dev/drbd0 /replicated/

Do not forget to format the disk partition (in this case /dev/drbd0) on the first node only. Do not issue the mkfs.ext4 command on the second node again.
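
To make the switch-over test below more convincing, you can drop a marker file into the mounted filesystem on the first node; the file name here is arbitrary:

root@debian1:~# echo "written on debian1" > /replicated/marker.txt
root@debian1:~# sync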

Switching over to the Second Node

#On first node:
root@debian1:~# umount /replicated
root@debian1:~# drbdadm secondary replicated
#On second node:
root@debian2:~# drbdadm primary replicated
root@debian2:~# mount /dev/drbd0 /replicated/
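
On the second node you can now confirm that it has taken over the primary role and that the data written on debian1 is there:

root@debian2:~# cat /proc/drbd | grep ro:
root@debian2:~# ls /replicated/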

Switching back to the First Node:

#On second node:
root@debian2:~# umount /replicated
root@debian2:~# drbdadm secondary replicated
#On first node:
root@debian1:~# drbdadm primary replicated
root@debian1:~# mount /dev/drbd0 /replicated/