KVM provisioning with Jenkins and Terraform

In this CI/CD exercise, we are going to provision a virtual guest on a KVM host using Terraform and Jenkins.

Terraform does not have an official provider for KVM, so we are going to use a third-party provider for this.

I assume that the host running Terraform is able to connect to the KVM hypervisor over SSH without password authentication. To that end, the KVM hypervisor has already been configured for login with an SSH private key.

terraform-provider-libvirt is a compiled binary, which needs to be put into the <PROJECT FOLDER>/terraform.d/plugins/linux_amd64 folder. You can find the compiled releases here, or you can compile it yourself from source. We are going to create the following folder structure for Terraform.

libvirt/
├── libvirt.tf
└── terraform.d
    └── plugins
        └── linux_amd64
            └── terraform-provider-libvirt

#libvirt.tf
provider "libvirt" {
    uri = "qemu+ssh://tesla@oregon.anatolia.io/system"
}

resource "libvirt_volume" "centos7-test" {
  name   = "centos7-test"
  format = "qcow2"
  pool   = "KVMGuests"
  source = "http://oregon.anatolia.io/qcow2-repo/centos71.qcow2"
}

resource "libvirt_domain" "centos7-test" {
  autostart = "true"
  name      = "centos7-test"
  memory    = "2048"
  vcpu      = 2
  running   = "true"

  network_interface {
    hostname     = "centos7-test"
    network_name = "default"
  }

  disk {
    volume_id = "${libvirt_volume.centos7-test.id}"
  }

  console {
    type        = "pty"
    target_type = "virtio"
    target_port = "1"
  }
}

Creating a Pipeline in Jenkins

In this section, we create the pipeline script that needs to be defined in the Jenkins Pipeline job.

pipeline{
    agent {label 'master'}
    stages{
        stage('TF Init'){
            steps{
                sh '''
                cd /data/projects/terraform/libvirt
                terraform init
                '''
            }
        }
        stage('TF Plan'){
            steps{
                sh '''
                cd /data/projects/terraform/libvirt
                terraform plan -out createkvm
                '''
            }
        }

        stage('Approval') {
            steps {
                script {
                    def userInput = input(id: 'Approve', message: 'Do you want to Approve?', parameters: [ [$class: 'BooleanParameterDefinition', defaultValue: false, description: 'Apply terraform', name: 'confirm'] ])
                }
           }
        }
        stage('TF Apply'){
            steps{
                sh '''
                cd /data/projects/terraform/libvirt
                terraform apply createkvm
                '''
            }
        }
    }
}




You may notice that the domain name is centos7-test but the hostname is centos71. This is because I used one of the templates that I had created before; the address of the template is defined in the source argument of the libvirt.tf file. In the next post, I will integrate this with cloud-init, which allows a machine to configure itself at first boot. That way, even the machine customization will be done automatically.

Creating VLANs on KVM with OpenVswitch

VLAN is a crucial L2 network technology for segmenting broadcast domains; in the end it gives you better network utilization and security. If you are familiar with VMware technology, you can create a port group on a dVS or standard switch. But if you need to segregate your network on a KVM hypervisor, you need some other packages. In this tutorial I will show you how to create VLANs by using Open vSwitch and integrating it with KVM.

For this post, I assume that you already have Open vSwitch installed on your system. If not, follow here. I am also assuming that you have a physical NIC to bridge to the virtual bridge (switch) created via Open vSwitch. By doing that, your guests can connect to the outside world.

tesla@ankara:~$ sudo ovs-vsctl -V
ovs-vsctl (Open vSwitch) 2.12.0
DB Schema 8.0.0

Creating a Virtual Bridge with Openvswitch

$ sudo ovs-vsctl add-br OVS0 

Adding the Physical NIC to the OVS0 Bridge

$ sudo ovs-vsctl add-port OVS0 enp0s31f6

In order to integrate the bridge created by Open vSwitch with KVM, we need to create an XML configuration file that will be defined in KVM. You can see my configuration below.

<network>
  <name>OVS0</name>
  <forward mode='bridge'/>
  <bridge name='OVS0'/>
  <virtualport type='openvswitch'/>
  <portgroup name='VLAN10'>
    <vlan>
      <tag id='10'/>
    </vlan>
  </portgroup>
  <portgroup name='VLAN20'>
    <vlan>
      <tag id='20'/>
    </vlan>
  </portgroup>
  <portgroup name='VLAN30'>
    <vlan>
      <tag id='30'/>
    </vlan>
  </portgroup>
  <portgroup name='VLAN40'>
    <vlan>
      <tag id='40'/>
    </vlan>
  </portgroup>
  <portgroup name='VLAN99'>
    <vlan>
      <tag id='99'/>
    </vlan>
  </portgroup>
  <portgroup name='VLAN100'>
    <vlan>
      <tag id='100'/>
    </vlan>
  </portgroup>
  <portgroup name='TRUNK'>
    <vlan trunk='yes'>
      <tag id='10'/>
      <tag id='20'/>
      <tag id='30'/>
      <tag id='40'/>
      <tag id='99'/>
      <tag id='100'/>
    </vlan>
  </portgroup>
</network>

As per the XML configuration above, we are creating VLANs with IDs 10, 20, 30, 40, 99, and 100, plus a TRUNK port group that carries all of them.
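Since the port-group blocks are repetitive, they can also be generated with a short script instead of written by hand. Below is a sketch using Python's standard xml.etree library; the network name and VLAN IDs match the configuration above (note that ElementTree emits double-quoted attributes, which libvirt accepts just as well as single quotes):

```python
import xml.etree.ElementTree as ET

def build_network(name, vlan_ids):
    """Build a libvirt <network> definition with one portgroup per VLAN plus a trunk."""
    net = ET.Element("network")
    ET.SubElement(net, "name").text = name
    ET.SubElement(net, "forward", mode="bridge")
    ET.SubElement(net, "bridge", name=name)
    ET.SubElement(net, "virtualport", type="openvswitch")
    for vid in vlan_ids:
        pg = ET.SubElement(net, "portgroup", name="VLAN%d" % vid)
        vlan = ET.SubElement(pg, "vlan")
        ET.SubElement(vlan, "tag", id=str(vid))
    trunk = ET.SubElement(net, "portgroup", name="TRUNK")
    vlan = ET.SubElement(trunk, "vlan", trunk="yes")
    for vid in vlan_ids:
        ET.SubElement(vlan, "tag", id=str(vid))
    return ET.tostring(net, encoding="unicode")

print(build_network("OVS0", [10, 20, 30, 40, 99, 100]))
```

The output can be saved as OVS0.xml and fed to virsh net-define exactly as shown below.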

Defining the configuration with virsh

virsh # net-define --file OVS0.xml 
Network OVS0 defined from OVS0.xml
virsh # net-autostart --network OVS0
Network OVS0 marked as autostarted
virsh # net-list 
 Name      State    Autostart   Persistent
--------------------------------------------
 default   active   yes         yes
 OVS0      active   yes         yes

After defining it, you will see that your XML file has been regenerated by libvirt, with a UUID added.

<!--
WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
  virsh net-edit OVS0
or other application using the libvirt API.
-->

<network>
  <name>OVS0</name>
  <uuid>a38bdd43-7fba-4e23-98f1-8c0ab83cff2c</uuid>
  <forward mode='bridge'/>
  <bridge name='OVS0'/>
  <virtualport type='openvswitch'/>
  <portgroup name='VLAN10'>
    <vlan>
      <tag id='10'/>
    </vlan>
  </portgroup>
  <portgroup name='VLAN20'>
    <vlan>
      <tag id='20'/>
    </vlan>
  </portgroup>
  <portgroup name='VLAN30'>
    <vlan>
      <tag id='30'/>
    </vlan>
  </portgroup>
  <portgroup name='VLAN40'>
    <vlan>
      <tag id='40'/>
    </vlan>
  </portgroup>
  <portgroup name='VLAN99'>
    <vlan>
      <tag id='99'/>
    </vlan>
  </portgroup>
  <portgroup name='VLAN100'>
    <vlan>
      <tag id='100'/>
    </vlan>
  </portgroup>
  <portgroup name='TRUNK'>
    <vlan trunk='yes'>
      <tag id='10'/>
      <tag id='20'/>
      <tag id='30'/>
      <tag id='40'/>
      <tag id='99'/>
      <tag id='100'/>
    </vlan>
  </portgroup>
</network>

Experiments

Let’s check on virt-manager if we are able to see the port groups.

Capturing Packets with Wireshark on the Physical NIC Connected to OVS0

Compiling Archer T600U Plus WiFi USB Adapter on GNU/Linux with dkms

In this very short post, I am going to show you how to compile the TP-Link USB adapter module on GNU/Linux with dkms. I am using Pop!_OS 19.10.

$ sudo apt install git dkms
$ git clone https://github.com/aircrack-ng/rtl8812au.git
$ cd rtl8812au
$ sudo ./dkms-install.sh

If everything goes well, you should get output similar to the below.

About to run dkms install steps...

Creating symlink /var/lib/dkms/rtl8812au/5.6.4.2/source ->
                 /usr/src/rtl8812au-5.6.4.2

DKMS: add completed.

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area...
'make' -j8 KVER=5.3.0-20-generic KSRC=/lib/modules/5.3.0-20-generic/build........
cleaning build area...

DKMS: build completed.

88XXau.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.3.0-20-generic/updates/dkms/

depmod....

DKMS: install completed.
Finished running dkms install steps.
$ ip link show
...
3: wlx34e894b147cc: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2312 qdisc mq state UP mode DORMANT group default qlen 1000

Write Your own Custom Plugin on check_mk

check_mk is an open-source monitoring solution with hundreds of checks that enable you to monitor your IT infrastructure. It also allows you to configure most of your monitoring-related activities from the graphical interface called “WATO”. Most of the time it automatically discovers a system once it is added to the check_mk inventory. But not always! Nevertheless, we are still able to monitor such a system by writing a custom plugin. In this post, I will show you how to write your own custom plugin for check_mk. As an experiment, I used my home router, a Ubiquiti EdgeRouter X, and enabled SNMP version 2 on it.

Before writing a plugin, we need to decide what we are going to monitor, and then figure out the related SNMP OIDs. In this post, I am going to monitor the bandwidth usage of the interface eth0 on my router. To find the related OIDs you can use the snmpwalk tool. When you run it with the correct parameters, the system gives you all the information. If you have not downloaded the necessary MIB files on your system, you will see all the elements with their numeric OIDs, which are hard to interpret.

#snmpwalk -v2c -c public 192.168.1.1 . | less
iso.3.6.1.2.1.1.1.0 = STRING: "EdgeOS v1.10.8.5142457.181120.1809"
iso.3.6.1.2.1.1.2.0 = OID: iso.3.6.1.4.1.41112.1.5
iso.3.6.1.2.1.1.3.0 = Timeticks: (11759) 0:01:57.59
iso.3.6.1.2.1.1.4.0 = STRING: "root"
iso.3.6.1.2.1.1.5.0 = STRING: "ubnt"
iso.3.6.1.2.1.1.6.0 = STRING: "Unknown"
iso.3.6.1.2.1.1.7.0 = INTEGER: 14
iso.3.6.1.2.1.1.8.0 = Timeticks: (18) 0:00:00.18
iso.3.6.1.2.1.1.9.1.2.1 = OID: iso.3.6.1.2.1.10.131
iso.3.6.1.2.1.1.9.1.2.2 = OID: iso.3.6.1.6.3.11.3.1.1
iso.3.6.1.2.1.1.9.1.2.3 = OID: iso.3.6.1.6.3.15.2.1.1
...(omitted)

If your output looks like the above, you may need to download the MIB files. On GNU/Linux you can install the snmp-mibs-downloader package. After that, snmpwalk gives you all the information translated into text notation, which is more meaningful for us.

# apt-get install snmp-mibs-downloader
# download-mibs
# snmpwalk -v2c -c public 192.168.1.1 . | less

SNMPv2-MIB::sysDescr.0 = STRING: EdgeOS v1.10.8.5142457.181120.1809
SNMPv2-MIB::sysObjectID.0 = OID: SNMPv2-SMI::enterprises.41112.1.5
SNMPv2-MIB::sysUpTime.0 = Timeticks: (44730) 0:07:27.30
SNMPv2-MIB::sysContact.0 = STRING: root
SNMPv2-MIB::sysName.0 = STRING: ubnt
SNMPv2-MIB::sysLocation.0 = STRING: Unknown
SNMPv2-MIB::sysServices.0 = INTEGER: 14
SNMPv2-MIB::sysORLastChange.0 = Timeticks: (18) 0:00:00.18
SNMPv2-MIB::sysORID.1 = OID: SNMPv2-SMI::transmission.131
SNMPv2-MIB::sysORID.2 = OID: SNMPv2-SMI::snmpModules.11.3.1.1
SNMPv2-MIB::sysORID.3 = OID: SNMPv2-SMI::snmpModules.15.2.1.1
SNMPv2-MIB::sysORID.4 = OID: SNMPv2-SMI::snmpModules.10.3.1.1



As I am going to monitor the bandwidth usage of the interface eth0, I need to find the related OID numbers, which are shown below.

#Text Notation
IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: switch0
IF-MIB::ifDescr.3 = STRING: imq0
IF-MIB::ifDescr.4 = STRING: eth0
IF-MIB::ifDescr.5 = STRING: eth1
IF-MIB::ifDescr.6 = STRING: eth2
IF-MIB::ifDescr.7 = STRING: eth3
IF-MIB::ifDescr.8 = STRING: eth4
IF-MIB::ifDescr.9 = STRING: eth1.20
IF-MIB::ifDescr.10 = STRING: eth1.10


IF-MIB::ifInOctets.1 = Counter32: 38770
IF-MIB::ifInOctets.2 = Counter32: 0
IF-MIB::ifInOctets.3 = Counter32: 0
IF-MIB::ifInOctets.4 = Counter32: 307201
IF-MIB::ifInOctets.5 = Counter32: 0
IF-MIB::ifInOctets.6 = Counter32: 0
IF-MIB::ifInOctets.7 = Counter32: 0
IF-MIB::ifInOctets.8 = Counter32: 0
IF-MIB::ifInOctets.9 = Counter32: 0
IF-MIB::ifInOctets.10 = Counter32: 0

IF-MIB::ifOutOctets.1 = Counter32: 38770
IF-MIB::ifOutOctets.2 = Counter32: 424
IF-MIB::ifOutOctets.3 = Counter32: 0
IF-MIB::ifOutOctets.4 = Counter32: 295279
IF-MIB::ifOutOctets.5 = Counter32: 0
IF-MIB::ifOutOctets.6 = Counter32: 0
IF-MIB::ifOutOctets.7 = Counter32: 0
IF-MIB::ifOutOctets.8 = Counter32: 0
IF-MIB::ifOutOctets.9 = Counter32: 0
IF-MIB::ifOutOctets.10 = Counter32: 0

#OID Notation

.1.3.6.1.2.1.2.2.1.2.1 = STRING: lo
.1.3.6.1.2.1.2.2.1.2.2 = STRING: switch0
.1.3.6.1.2.1.2.2.1.2.3 = STRING: imq0
.1.3.6.1.2.1.2.2.1.2.4 = STRING: eth0
.1.3.6.1.2.1.2.2.1.2.5 = STRING: eth1
.1.3.6.1.2.1.2.2.1.2.6 = STRING: eth2
.1.3.6.1.2.1.2.2.1.2.7 = STRING: eth3
.1.3.6.1.2.1.2.2.1.2.8 = STRING: eth4
.1.3.6.1.2.1.2.2.1.2.9 = STRING: eth1.20
.1.3.6.1.2.1.2.2.1.2.10 = STRING: eth1.10

.1.3.6.1.2.1.2.2.1.10.1 = Counter32: 55901
.1.3.6.1.2.1.2.2.1.10.2 = Counter32: 0
.1.3.6.1.2.1.2.2.1.10.3 = Counter32: 0
.1.3.6.1.2.1.2.2.1.10.4 = Counter32: 967790
.1.3.6.1.2.1.2.2.1.10.5 = Counter32: 0
.1.3.6.1.2.1.2.2.1.10.6 = Counter32: 0
.1.3.6.1.2.1.2.2.1.10.7 = Counter32: 0
.1.3.6.1.2.1.2.2.1.10.8 = Counter32: 0
.1.3.6.1.2.1.2.2.1.10.9 = Counter32: 0
.1.3.6.1.2.1.2.2.1.10.10 = Counter32: 0

.1.3.6.1.2.1.2.2.1.16.1 = Counter32: 55901
.1.3.6.1.2.1.2.2.1.16.2 = Counter32: 424
.1.3.6.1.2.1.2.2.1.16.3 = Counter32: 0
.1.3.6.1.2.1.2.2.1.16.4 = Counter32: 1031485
.1.3.6.1.2.1.2.2.1.16.5 = Counter32: 0
.1.3.6.1.2.1.2.2.1.16.6 = Counter32: 0
.1.3.6.1.2.1.2.2.1.16.7 = Counter32: 0
.1.3.6.1.2.1.2.2.1.16.8 = Counter32: 0
.1.3.6.1.2.1.2.2.1.16.9 = Counter32: 0
.1.3.6.1.2.1.2.2.1.16.10 = Counter32: 0


As depicted above, we need the interface name and the interface's ifInOctets and ifOutOctets values to monitor the bandwidth usage.
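To make that mapping concrete, here is a small standalone Python sketch (not check_mk code) that pairs the ifDescr entries with their ifInOctets/ifOutOctets counters by SNMP index, using a couple of the sample values shown above:

```python
# Pair ifDescr with ifInOctets/ifOutOctets by their SNMP index,
# using a few of the sample values from the snmpwalk output above.
if_descr      = {4: "eth0", 5: "eth1"}   # .1.3.6.1.2.1.2.2.1.2.<index>
if_in_octets  = {4: 307201, 5: 0}        # .1.3.6.1.2.1.2.2.1.10.<index>
if_out_octets = {4: 295279, 5: 0}        # .1.3.6.1.2.1.2.2.1.16.<index>

def build_rows(descr, inoct, outoct):
    """Return (name, in_octets, out_octets) rows ordered by SNMP index."""
    return [(descr[i], inoct[i], outoct[i]) for i in sorted(descr)]

rows = build_rows(if_descr, if_in_octets, if_out_octets)
print(rows)  # [('eth0', 307201, 295279), ('eth1', 0, 0)]
```

This is exactly the row shape (interface, inoctets, outoctets) that the check_mk plugin below iterates over.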

Math!

We also need some basic math to derive the rate, because SNMP gives us the traffic values as counter-based octets. So, we need to know how to convert them.

Formula:

          (Counter2 - Counter1)
   bps = ----------------------- * 8      (bits per second)
             (Time2 - Time1)

If you still do not understand, you can check this page. It is a piece of cake! 🙂

Actually, check_mk has a very nice function that keeps track of the time and counter values and computes the delta automatically, so we do not need to store anything ourselves. We only need to pass the correct values to the get_rate() function.
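As an illustration of what get_rate() does conceptually, here is a minimal standalone Python sketch (not the actual check_mk implementation) that remembers the previous (time, counter) sample per key and returns the per-second delta, which is then multiplied by 8 as in the formula above; the sample timestamps and counters are hypothetical:

```python
# Minimal stand-in for check_mk's get_rate(): remember the previous
# (timestamp, counter) sample per key and return the per-second delta.
_last_sample = {}

def get_rate_sketch(key, now, counter):
    """Return counter units per second since the previous sample (0.0 on first call)."""
    prev = _last_sample.get(key)
    _last_sample[key] = (now, counter)
    if prev is None or now <= prev[0]:
        return 0.0  # first sample (or clock went backwards): no rate yet
    return (counter - prev[1]) / (now - prev[0])

r1 = get_rate_sketch("RX.eth0", 100.0, 307201)  # first call -> 0.0
r2 = get_rate_sketch("RX.eth0", 160.0, 367201)  # 60000 octets over 60 s
bps = 8.0 * r2                                   # octets/s -> bits/s, as in the check
print(r2, bps)  # 1000.0 8000.0
```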

A custom plugin needs to be placed into the correct folder:

/omd/sites/<your site>/local/share/check_mk/checks

I created my plugin, called “edge_router_bw”:

#!/usr/bin/env python
import time

edge_router_default_bw_values = (30.0, 35.0, 30.0, 35.0)
def inventory_edge_router_bw(info):
    for interface, inoctets, outoctets  in info:
        yield interface, "edge_router_default_bw_values"

def check_edge_router_bw(item, params, info):
    warntx, crittx, warnrx, critrx = params
    for interface, inoctets, outoctets  in info:
        if interface == item:
            this_time = time.time()
            if interface == "eth0":
                rx = 8.0 * get_rate("RX.%s" % interface, this_time, float(inoctets))
                tx = 8.0 * get_rate("TX.%s" % interface, this_time, float(outoctets))
                perfdata = [("RX", float(rx)/1024.0), ("TX", float(tx)/1024.0)]
                tx = float(tx)/1024.0
                rx = float(rx)/1024.0
                if rx >= critrx or tx >= crittx:
                    return 2, ("RX: %.2f Kbps, TX: %.2f Kbps" % (rx, tx)), perfdata
                elif rx >= warnrx or tx  >= warntx:
                    return 1, ("RX: %.2f Kbps, TX: %.2f Kbps" % (rx, tx)), perfdata
                else:
                    return 0, ("RX: %.2f Kbps, TX: %.2f Kbps" % (rx, tx)), perfdata
            
check_info["edge_router_bw"] = {
    "check_function"        : check_edge_router_bw,
    "inventory_function"    : inventory_edge_router_bw,
    "service_description"   : "Edge Router NICs bandwith %s",
    "snmp_info"             : ( ".1.3.6.1.2.1.2.2.1", [ "2", "10", "16"] ),
    "has_perfdata"          : True,
    "group"                 : "edge_router_bw",
}
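The threshold logic in check_edge_router_bw can be exercised in isolation. The following standalone sketch reproduces the same state decision (0 = OK, 1 = WARNING, 2 = CRITICAL) with the default parameter tuple; the sample rates are hypothetical:

```python
# Standalone reproduction of the threshold comparison used by
# check_edge_router_bw (rates in Kbps; params in check_mk tuple order).
def bw_state(rx_kbps, tx_kbps, params):
    """Map RX/TX rates to a Nagios-style state code."""
    warntx, crittx, warnrx, critrx = params
    if rx_kbps >= critrx or tx_kbps >= crittx:
        return 2  # CRITICAL
    if rx_kbps >= warnrx or tx_kbps >= warntx:
        return 1  # WARNING
    return 0      # OK

params = (30.0, 35.0, 30.0, 35.0)  # edge_router_default_bw_values
print(bw_state(2.45, 6.59, params))  # 0 -> OK
print(bw_state(31.0, 5.0, params))   # 1 -> WARNING (RX above 30)
print(bw_state(40.0, 5.0, params))   # 2 -> CRITICAL (RX above 35)
```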

One of the nice features of check_mk is that you can set your threshold values for warning and critical levels and then change these levels from WATO, with no more editing of files. For that, you need to create the configuration file below, which allows you to make changes from WATO.

Create a file in /omd/sites/<your site>/local/share/check_mk/web/plugins/wato/.

I created a file called check_param_router_edge_bw.py

register_check_parameters(
        subgroup_networking,
        "edge_router_bw",
        _("Edge router Bandwidth Kbps"),
        Tuple(
            title = _("Edge Router Interface Bandwidth"),
            elements = [
                Float(title = _("Set WARNING if TX above Kbps"), minvalue = 0.0, maxvalue = 10000.0, default_value = 30.0),
                Float(title = _("Set CRITICAL if TX  above Kbps"), minvalue = 0.0, maxvalue = 10000.0, default_value = 35.0),
                Float(title = _("Set WARNING if RX above Kbps"), minvalue = 0.0, maxvalue = 10000.0, default_value = 30.0),
                Float(title = _("Set CRITICAL if RX above Kbps"), minvalue = 0.0, maxvalue = 10000.0, default_value = 35.0),
            ]),
            TextAscii(
                title = _("Interface Bandwidth Kbps"),
                allow_empty = False),
            "first"
)

Once everything is in place, we can check our plugin. Run it with the --debug option to see whether there is an error in the script.

OMD[monitoring]:~/local/share/check_mk/checks$ cmk --debug --checks=edge_router_bw -I edgerouter

If there is no error, we can inventory the host in check_mk.

OMD[monitoring]:~/local/share/check_mk/checks$ cmk -IIv edgerouter
Discovering services on edgerouter:
edgerouter:
   10 edge_router_bw
   10 edge_router_params
    1 hr_cpu
    4 hr_fs
    1 hr_mem
    2 if64
    1 snmp_info
    1 snmp_uptime

Finally, testing the plugin on the command line.

OMD[monitoring]:~/local/share/check_mk/checks$ cmk -nvp edgerouter
Check_MK version 1.4.0p38
CPU utilization      OK - 1.5% used                                           (util=1.5;80;90;0;100)
Edge Router NICs bandwith eth0 OK - RX: 2.45 Kbps, TX: 6.59 Kbps                        (RX=2.453042;;;; TX=6.589328;;;;)

You can then log in to the check_mk graphical interface.

Experiment

As you see below, you can change the WARNING and CRITICAL levels from the WATO.

That’s all for now. Happy monitoring 🙂

Connect KVM over GRE

Hi Folks,

As you may know, libvirt virtual network switches operate in NAT mode by default (IP masquerading rather than SNAT or DNAT). In this mode, virtual guests can communicate with the outside world, but computers external to the host cannot initiate connections to the guests inside. One solution is creating a virtual switch in routed mode. But we still have one more option that does not change the underlying virtual switch operation mode: creating a GRE tunnel between the hosts.

What is GRE?

GRE (Generic Routing Encapsulation) is a communication protocol that provides a virtual point-to-point link. It is a very simple and effective method of transporting data over a public network. You can use a GRE tunnel in cases such as the following:

  • Use of multiple protocols over a single-protocol backbone
  • Providing workarounds for networks with limited hops
  • Connection of non-contiguous subnetworks
  • Being less resource demanding than its alternatives (e.g. IPsec VPN)

Reference: https://www.incapsula.com/blog/what-is-gre-tunnel.html

Example of GRE encapsulation

I have created a GRE tunnel to connect to some of the KVM guests from an external host. Figure-2 depicts how my topology looks.

Figure-2 Connecting KVM guests over GRE Tunnel

I have two physical hosts running the Mint and Ubuntu GNU/Linux distributions. KVM is running on the Ubuntu host.

GRE Tunnel configuration on GNU/Linux hosts

Before creating a GRE tunnel, we need to load the ip_gre module on both GNU/Linux hosts.

mint@mint$ sudo modprobe ip_gre
tesla@otuken:~$ sudo modprobe ip_gre

Configuring the physical interfaces on both nodes.

mint@mint$ sudo ip addr add 100.100.100.1/24 dev enp0s31f6
tesla@otuken:~$ sudo ip addr add 100.100.100.2/24 dev enp2s0

Configuring GRE Tunnel (On the first node)

mint@mint$ sudo ip tunnel add tun0 mode gre remote 100.100.100.2 local 100.100.100.1 ttl 255
mint@mint$ sudo ip link set tun0 up
mint@mint$ sudo ip addr add 10.0.0.10/24 dev tun0
mint@mint$ sudo ip route add 10.0.0.0/24 dev tun0
mint@mint$ sudo ip route add 192.168.122.0/24 dev tun0

Configuring GRE Tunnel (On the Second Node)

tesla@otuken:~$ sudo ip tunnel add tun0 mode gre remote 100.100.100.1 local 100.100.100.2 ttl 255
tesla@otuken:~$ sudo ip link set tun0 up
tesla@otuken:~$ sudo ip addr add 10.0.0.20/24 dev tun0
tesla@otuken:~$ sudo ip route add 10.0.0.0/24 dev tun0

As the GRE protocol adds an additional 24 bytes of header, it is highly recommended to lower the MTU on the tunnel interfaces. A commonly recommended MTU value is 1400.
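The 24 bytes come from the 20-byte outer IPv4 header plus the 4-byte base GRE header. A quick sketch of the arithmetic, assuming a standard 1500-byte Ethernet MTU:

```python
# GRE over IPv4 overhead, assuming a standard 1500-byte Ethernet MTU.
ETHERNET_MTU = 1500
OUTER_IPV4_HEADER = 20  # bytes added for the new outer IP header
GRE_HEADER = 4          # bytes for the base GRE header (no optional fields)

gre_overhead = OUTER_IPV4_HEADER + GRE_HEADER
max_tunnel_mtu = ETHERNET_MTU - gre_overhead
print(gre_overhead)    # 24
print(max_tunnel_mtu)  # 1476; setting 1400 simply leaves extra headroom
```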

Also do not forget to check iptables rules on both hosts.

Experiment:

Once the configuration was completed, I successfully pinged the KVM guest (192.168.122.35) and transferred a file over SSH (scp). You can download the Wireshark pcap file here.

DRBD(without clustering)

Do you need transparent, real-time replication of block devices, without the need for specialty hardware and without paying anything?

If your answer is YES, DRBD is your solution. DRBD is a distributed replicated storage system for the Linux platform. It is implemented as a kernel driver, several userspace management applications, and some shell scripts. DRBD is traditionally used in high-availability (HA) clusters.

In this post, I am going to create HA clustered block storage. Switch-over will be handled manually, but in the next post I will add cluster software. I have two Debian systems for this lab. The sample architecture is depicted in Figure-1.

Figure-1 Sample HA Block Storage

Reference:https://www.ibm.com/developerworks/jp/linux/library/l-drbd/index.html

Installing DRBD packages:

Install drbd8-utils on each node.

root@debian1:~# apt-get install drbd8-utils 

Add the hostnames to the /etc/hosts file on each node.

192.168.122.70 debian1
192.168.122.71 debian2

Creating a file system:

Instead of adding disk storage, we create a file and use it as backing storage on each node.

root@debian1:~# mkdir /replicated
root@debian1:~# dd if=/dev/zero of=drbd.img bs=1024K count=512
root@debian2:~# mkdir /replicated
root@debian2:~# dd if=/dev/zero of=drbd.img bs=1024K count=512
root@debian1:~# losetup /dev/loop0 /root/drbd.img
root@debian2:~# losetup /dev/loop0 /root/drbd.img

Configuring DRBD:

Add the configuration below on each node.

root@debian1:~# cat /etc/drbd.d/replicated.res
resource replicated {
    protocol C;
    on debian1 {
        device /dev/drbd0;
        disk /root/drbd.img;
        address 192.168.122.70:7788;
        meta-disk internal;
    }
    on debian2 {
        device /dev/drbd0;
        disk /root/drbd.img;
        address 192.168.122.71:7788;
        meta-disk internal;
    }
}


Initializing metadata storage (on each node):

root@debian1:~# drbdadm create-md replicated
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
root@debian1:~# 

Make sure that the drbd service is running on both nodes.

● drbd.service - LSB: Control DRBD resources.
   Loaded: loaded (/etc/init.d/drbd; generated; vendor preset: enabled)
   Active: active (exited) since Fri 2019-02-01 15:32:34 +04; 6min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 1399 ExecStop=/etc/init.d/drbd stop (code=exited, status=0/SUCCESS)
  Process: 1420 ExecStart=/etc/init.d/drbd start (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/drbd.service

root@debian2:~# lsblk 
NAME              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0               7:0    0  512M  0 loop 
└─drbd0           147:0    0  512M  1 disk 
sr0                11:0    1 1024M  0 rom  
vda               254:0    0   10G  0 disk 
└─vda1            254:1    0   10G  0 part 
  ├─vgroot-lvroot 253:0    0  7.1G  0 lvm  /
  └─vgroot-lvswap 253:1    0  976M  0 lvm  [SWAP]

DRBD uses only one node at a time as the primary node, where reads and writes can be performed. We will first promote node 1 to primary.

root@debian1:~# drbdadm primary replicated --force

root@debian1:~# cat /proc/drbd 
version: 8.4.7 (api:1/proto:86-101)
srcversion: AC50E9301653907249B740E
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:516616 nr:0 dw:0 dr:516616 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:7620
[==================>.] sync'ed: 99.3% (7620/524236)K
finish: 0:00:00 speed: 20,968 (13,960) K/sec
root@debian1:~# cat /proc/drbd
version: 8.4.7 (api:1/proto:86-101)
srcversion: AC50E9301653907249B740E
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:524236 nr:0 dw:0 dr:524236 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[===================>] sync'ed:100.0% (0/524236)K
finish: 0:00:00 speed: 20,300 (13,792) K/sec

Initializing the Filesystem:

root@debian1:~# mkfs.ext4 /dev/drbd0 
root@debian1:~# mount /dev/drbd0 /replicated/

Do not forget: format the DRBD device (in this case /dev/drbd0) on the first node only. Do not issue the mkfs.ext4 command on the second node again.

Switching-over the Second node

#On first node:
root@debian1:~# umount /replicated
root@debian1:~# drbdadm secondary replicated
#On second node:
root@debian2:~# drbdadm primary replicated
root@debian2:~# mount /dev/drbd0 /replicated

Switching-back the First Node:

#On second node:
root@debian2:~# umount /replicated
root@debian2:~# drbdadm secondary replicated
#On first node:
root@debian1:~# drbdadm primary replicated
root@debian1:~# mount /dev/drbd0 /replicated


Checking Connection without Telnet

Some minimal Linux distributions have no telnet client or similar utilities such as nc or ncat unless you install them. Most of the time we need to troubleshoot whether a server/service is accessible. Do not worry: bash has a built-in mechanism (the /dev/tcp and /dev/udp pseudo-device files) that works without installing the utilities above. Take a look at the examples below and adapt them to your case.

Checking TCP connection:

root@debian2:# echo > /dev/tcp/8.8.8.8/53 && echo "PORT IS OPEN" || echo "PORT IS NOT OPEN"
PORT IS OPEN
root@debian2:~# echo > /dev/tcp/google.com/80 && echo "PORT IS OPEN" || echo "PORT IS NOT OPEN"
PORT IS OPEN
root@debian2:~# echo > /dev/tcp/google.com/443 && echo "PORT IS OPEN" || echo "PORT IS NOT OPEN"
PORT IS OPEN
# Here I had to send SIGINT (Ctrl+C) to abort.
root@debian2:~# echo > /dev/tcp/google.com/123 && echo "PORT IS OPEN" || echo "PORT IS NOT OPEN"
^C-su: connect: Network is unreachable


Checking UDP Connection:

root@debian2:~# echo > /dev/udp/0.pool.ntp.org/123 && echo "PORT IS OPEN" || echo "PORT IS NOT OPEN"
PORT IS OPEN
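If Python happens to be installed on the box, the same TCP check can be sketched with the standard socket module as an alternative to the bash /dev/tcp trick (the host and port values are just examples):

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print("PORT IS OPEN" if tcp_port_open("8.8.8.8", 53, timeout=2.0) else "PORT IS NOT OPEN")
```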

Minimizing Docker Images with Multistage

When you build your own Docker image from a Dockerfile, each instruction creates a new layer on top of your base image with all its dependencies, so even a very tiny application may end up as a 1 GiB image. Such a large size is not desirable in a production environment, for the reasons below.

  • Large images take longer to download
  • Large images take up more disk space
  • Large images contain unnecessary components

 

How to Reduce Image Size ?

The answer is a multi-stage build. Multi-stage builds enable you to create smaller container images with better caching and a smaller security footprint. In this post, I will show you how to minimize your Docker image step by step. For this experiment, I wrote a very simple Go web application.

Let’s create the web application in a file named main.go:

package main

import (
    "fmt"
    "log"
    "net/http"
)

func handler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Hi there, I love %s!", r.URL.Path[1:])
    log.Printf("connection from:%s",r.RemoteAddr)
}

func main() {
    http.HandleFunc("/", handler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Creating a Dockerfile.

FROM  golang:alpine AS builder
WORKDIR /webapps/app
ADD . /webapps/app
RUN go build -o main .
EXPOSE 8080
CMD ["/webapps/app/main"]

Building docker image

tesla@otuken:~/DockerTraining/SimpleHello$ sudo docker build -t gokay/goweb:1 .

 

Figure-1 Image Size 317MB

As you can see in Figure-1, the image size of this simple application is 317 MB.

Multi-Stage Build:

In this section, I will show you how to reduce the Docker image size with a multi-stage build. The only thing we need to do is add a few lines to our Dockerfile.

FROM  golang:alpine AS builder
WORKDIR /webapps/app
ADD . /webapps/app
RUN go build -o main .


FROM alpine
WORKDIR /app
ADD . /app
COPY --from=builder /webapps/app/main /app
EXPOSE 8080
CMD ["/app/main"]

 

tesla@otuken:~/DockerTraining/SimpleHello$ sudo docker build -t gokay/goweb:1 .

After building our new image with the new Dockerfile, the image size is considerably reduced.

Figure-2 Image size 11MB

As you can see, the Docker image size is now 11 MB.

If this size is enough for you, you can skip the rest of the post. Since we wrote our application in Go, we can reduce the image size a bit more by disabling cgo and producing a statically linked binary, as below.

 

FROM  golang:alpine AS builder
WORKDIR /webapps/app
ADD . /webapps/app
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -ldflags '-extldflags "-static"' -o main .

FROM alpine
WORKDIR /app
ADD . /app
COPY --from=builder /webapps/app/main /app
EXPOSE 8080
CMD ["/app/main"]

 

Figure-3 Image size 10.9 MB

Reducing More ?

You can use the scratch image, which is the most minimal image, but I would recommend using Alpine, as it is a security-oriented Linux distribution.

FROM  golang:alpine AS builder
WORKDIR /webapps/app
ADD . /webapps/app
#RUN go build -o main .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -ldflags '-extldflags "-static"' -o main .
#EXPOSE 8080
#CMD ["/webapps/app/main"]


FROM scratch
WORKDIR /app
ADD . /app
COPY --from=builder /webapps/app/main /app
EXPOSE 8080
CMD ["/app/main"]

 

tesla@otuken:~/DockerTraining/SimpleHello$ sudo docker build -t gokay/goweb:1 .

Building RHEL7 Cluster Part-I

Hello Folks,

It has been a long time since I could create any post, so here is one about clustering with RHEL7, which uses Pacemaker as its high-availability cluster resource manager.

In this post, a two-node cluster is used. Each node has been configured to resolve the hostnames into IP addresses, and each node syncs its time from outside NTP servers. The setup is depicted in Figure-1.

Nodes:

pck1 – 192.168.122.22

pck2- 192.168.122.23

The ntpd daemon is used instead of chronyd.

Figure-1 Two-Node Cluster with Pacemaker

 

Installing Necessary Packages:

On both nodes;

[root@pcmk-1 ~]# yum update -y
[root@pcmk-1 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python

Configuring Firewall:

On both nodes;

[root@pcmk-1 ~]# firewall-cmd --permanent --add-service=high-availability
[root@pcmk-1 ~]# firewall-cmd --reload

Disabling Selinux:

On both nodes;

[root@pcmk-1 ~]# setenforce 0
[root@pcmk-1 ~]# sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config

Starting and Enabling the Cluster Service:

On both nodes;

[root@pck1 ~]# systemctl enable pcsd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.
[root@pck1 ~]# systemctl start pcsd.service

Configuring Cluster:

On both nodes, set a password for the user hacluster.

[root@pck1 ~]# passwd hacluster

On either node authenticate the nodes.

[root@pck1 ~]# pcs cluster auth pck1 pck2
Username: hacluster
Password: 
pck2: Authorized
pck1: Authorized

Naming the Cluster:

I named the cluster LATAM_GW.

[root@pck1 ~]# pcs cluster setup LATAM_GW pck1 pck2
Error: A cluster name (--name <name>) is required to setup a cluster
[root@pck1 ~]# pcs cluster setup --name LATAM_GW pck1 pck2
Destroying cluster on nodes: pck1, pck2...
pck1: Stopping Cluster (pacemaker)...
pck2: Stopping Cluster (pacemaker)...
pck2: Successfully destroyed cluster
pck1: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'pck1', 'pck2'
pck1: successful distribution of the file 'pacemaker_remote authkey'
pck2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
pck1: Succeeded
pck2: Succeeded

Synchronizing pcsd certificates on nodes pck1, pck2...
pck2: Success
pck1: Success
Restarting pcsd on the nodes in order to reload the certificates...
pck2: Success
pck1: Success
[root@pck1 ~]#

Starting the Cluster:

On either node.

[root@pck1 ~]# pcs cluster start --all
pck1: Starting Cluster...
pck2: Starting Cluster...

You can also start a specific node instead of all nodes in the cluster.

[root@pck1 ~]# pcs cluster start pck1
pck1: Starting Cluster...

Viewing the version of the cluster software:

[root@pck1 ~]# pacemakerd --features
Pacemaker 1.1.18-11.el7_5.3 (Build: 2b07d5c5a9)
 Supporting v3.0.14:  generated-manpages agent-manpages ncurses libqb-logging libqb-ipc systemd nagios  corosync-native atomic-attrd acls
[root@pck1 ~]# 

Checking the Current Cluster Configuration:

The cluster configuration is stored in XML format. You can view the current configuration with “pcs cluster cib”.

[root@pck2 ~]# pcs cluster cib 
<cib crm_feature_set="3.0.14" validate-with="pacemaker-2.10" epoch="5" num_updates="4" admin_epoch="0" cib-last-written="Sat Dec  1 14:55:56 2018" update-origin="pck2" update-client="crmd" update-user="hacluster" have-quorum="1" dc-uuid="2">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="false"/>
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.18-11.el7_5.3-2b07d5c5a9"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
        <nvpair id="cib-bootstrap-options-cluster-name" name="cluster-name" value="LATAM_GW"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="1" uname="pck1"/>
      <node id="2" uname="pck2"/>
    </nodes>
    <resources/>
    <constraints/>
  </configuration>
  <status>
    <node_state id="1" uname="pck1" in_ccm="true" crmd="online" crm-debug-origin="do_state_transition" join="member" expected="member">
      <lrm id="1">
        <lrm_resources/>
      </lrm>
    </node_state>
    <node_state id="2" uname="pck2" in_ccm="true" crmd="online" crm-debug-origin="do_state_transition" join="member" expected="member">
      <lrm id="2">
        <lrm_resources/>
      </lrm>
    </node_state>
  </status>
</cib>
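Since the CIB can get large, it is often handier to query just one section. A sketch using cibadmin, which ships with Pacemaker (the scope names correspond to the CIB sections shown above):

```shell
# Show only the <configuration> section of the CIB
cibadmin --query --scope configuration

# Show only the <resources> section (empty at this point)
cibadmin --query --scope resources
```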

Adding Resources to the Cluster:

In this post, a highly available (active-passive) Apache web server will be built. The first resource we need to create is ClusterIP, in other words a floating IP. A floating IP address is used to support failover in a high-availability cluster: the cluster is configured such that only the active member of the cluster “owns” or responds to that IP address at any given time.

On either node.

[root@pck1 ~]# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
    ip=192.168.122.24 cidr_netmask=32 op monitor interval=30s
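To confirm the resource came up, a quick check on either node (a sketch; the address below is the ClusterIP defined above):

```shell
# Which node is running the resource?
pcs status resources

# On the owning node, the floating IP should appear on an interface
ip addr show | grep 192.168.122.24
```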

Configuring Apache for the Active-Passive (HA) Cluster

Install httpd and wget on both nodes. wget is required by the resource agent to check the health of each node.

[root@pck1 ~]# yum install -y httpd wget

Create a webpage on each node as shown below.

On the first node:

Create index.html in /var/www/html:

 <html>
 <body>LATAMWEBGW running on - pck1.localdomain</body>
 </html>

On the second node:

Create index.html in /var/www/html:

<html>
 <body>LATAMWEBGW running on - pck2.localdomain</body>
 </html>

Configure Apache for server-status:

These pages will be requested by the cluster resource agent to check the health of the nodes.

On the first node, create the following configuration in /etc/httpd/conf.d/status.conf:

<Location /server-status>
    SetHandler server-status
    Require all denied
    Require ip 127.0.0.1
    Require ip ::1
    Require ip 192.168.122.22
</Location>

On the second node, create the following configuration in /etc/httpd/conf.d/status.conf:

<Location /server-status>
    SetHandler server-status
    Require all denied
    Require ip 127.0.0.1
    Require ip ::1
    Require ip 192.168.122.23
</Location>
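Before letting the resource agent rely on it, the status page can be checked locally on each node; a sketch (httpd is started manually only for this test, since the cluster, not systemd, will manage it later):

```shell
systemctl start httpd
wget -qO- http://localhost/server-status | head -n 5
systemctl stop httpd

# Do NOT "systemctl enable httpd": the cluster must control the service
```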

Creating a Resource for Apache

On either node:

pcs resource create LATAMWEBGW ocf:heartbeat:apache  \
      configfile=/etc/httpd/conf/httpd.conf \
      statusurl="http://localhost/server-status" \
      op monitor interval=1min

As you may know, we now have two resources: ClusterIP and LATAMWEBGW. If we do nothing special, the cluster manager balances these two resources by running them on different nodes, which is not what we want; the resources depend on each other. We need to add constraints to solve this problem.

Creating a Colocation Constraint:

A colocation constraint tells the cluster manager that the location of one resource depends on the location of another resource.

On either node

pcs constraint colocation add LATAMWEBGW with ClusterIP INFINITY

By issuing the command above, we told the cluster manager that the LATAMWEBGW resource must run on the same node as the ClusterIP resource.
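The recorded constraints can be listed at any time, which is a quick sanity check that the colocation rule took effect:

```shell
# List all location, ordering and colocation constraints
pcs constraint show
```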

The latest status of our cluster:

[root@pck1 conf.d]# pcs status
Cluster name: LATAM_GW
Stack: corosync
Current DC: pck1 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sun Dec  2 13:43:46 2018
Last change: Sun Dec  2 13:43:42 2018 by root via cibadmin on pck1

2 nodes configured
3 resources configured

Online: [ pck1 pck2 ]

Full list of resources:

 virsh-fencing	(stonith:fence_virsh):	Stopped
 ClusterIP	(ocf::heartbeat:IPaddr2):	Started pck1
 LATAMWEBGW	(ocf::heartbeat:apache):	Started pck1

Creating an Order Constraint:

We may still have an issue when sending requests to our web server. In addition to the colocation constraint, we need to tell the cluster software the order in which the resources should be started. This is called an order constraint.

[root@pck1 conf.d]# pcs constraint order ClusterIP then LATAMWEBGW
Adding ClusterIP LATAMWEBGW (kind: Mandatory) (Options: first-action=start then-action=start)
[root@pck1 conf.d]#

By issuing the command above, we tell the cluster manager which resource should be started first. After configuring the constraints, we can test whether our web server handles requests properly.

tesla@otuken:~$ curl http://latamwebgw
 <html>
 <body>LATAMWEBGW running on - pck1.localdomain</body>
 </html>

Relocating a Resource to Another Node:

Sometimes we need to relocate resources to another node for maintenance of the nodes.

[root@pck2 ~]# pcs status
Cluster name: LATAM_GW
Stack: corosync
Current DC: pck1 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Wed Dec  5 23:26:49 2018
Last change: Mon Dec  3 09:27:47 2018 by root via crm_resource on pck1

2 nodes configured
3 resources configured

Online: [ pck1 pck2 ]

Full list of resources:

 virsh-fencing	(stonith:fence_virsh):	Stopped
 ClusterIP	(ocf::heartbeat:IPaddr2):	Started pck1
 LATAMWEBGW	(ocf::heartbeat:apache):	Started pck1

Failed Actions:
* virsh-fencing_start_0 on pck1 'unknown error' (1): call=14, status=Error, exitreason='',
    last-rc-change='Wed Dec  5 21:26:32 2018', queued=0ms, exec=1404ms
* virsh-fencing_start_0 on pck2 'unknown error' (1): call=14, status=Error, exitreason='',
    last-rc-change='Wed Dec  5 21:26:36 2018', queued=0ms, exec=1398ms


Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

As you can see, the resources are currently running on pck1, the first node. Let’s relocate them to pck2, the second node.

[root@pck2 ~]# pcs resource move LATAMWEBGW pck2
[root@pck2 ~]# pcs status
Cluster name: LATAM_GW
Stack: corosync
Current DC: pck1 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Wed Dec  5 23:34:10 2018
Last change: Wed Dec  5 23:33:42 2018 by root via crm_resource on pck2

2 nodes configured
3 resources configured

Online: [ pck1 pck2 ]

Full list of resources:

 virsh-fencing	(stonith:fence_virsh):	Stopped
 ClusterIP	(ocf::heartbeat:IPaddr2):	Started pck2
 LATAMWEBGW	(ocf::heartbeat:apache):	Started pck2

As you can see, the LATAMWEBGW resource is now running on pck2, the second node.
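One caveat worth noting: pcs resource move works by adding a location constraint that pins the resource to the target node. Once maintenance is done, clear it so the cluster is again free to place the resource; a sketch:

```shell
# Show the location constraint that "move" left behind
pcs constraint location show

# Remove constraints created by move/ban for this resource
pcs resource clear LATAMWEBGW
```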

tesla@otuken:~$ curl http://latamwebgw
 <html>
 <body>LATAMWEBGW running on - pck2.localdomain</body>
 </html>


This is the end of the first part of the RHEL7 clustering series. We have not yet configured fencing, which is a crucial part of clustering. In the next post we will configure fencing, stickiness, and other cluster settings.

Happy Clustering 🙂

Sorting by Specific Column in PowerShell

Hello Folks,

I needed a script in PowerShell that sorts data by a specific column. I wanted to share it on my blog, as I have not seen many examples of this on the Internet.

Let’s say you have data as below and you want to sort it by a specific column. In this example, I sort the data by Elapsed Time.

Fields in the sample data are delimited by multiple spaces (in regex terms, \s+).

The fields are: Batch Name, Status, Stage, Batch Date, Start Time, End Time, Elapsed Time, Avg. Elapsed Time.

BNK/TEST001 0 A100-Application 20140325 20:01:38 20:01:38 0 0.2
BNK/TEST002 0 R050-Reporting 20140325 21:23:50 21:23:51 1 0.3
BNK/TEST003 0 D110-Start-of-Day 20140325 00:17:34 00:17:34 0 0.9
BNK/TEST004 0 D110-Start-of-Day 20140325 00:17:33 00:17:33 0 0.5
BNK/TEST005 0 S920-System-Wide 20140325 21:09:41 21:09:41 0 0.0
BNK/TEST006 0 S920-System-Wide 20140325 21:18:46 21:18:47 1 0.4
BNK/TEST007 0 S920-System-Wide 20140325 21:18:48 21:18:48 1 0.7
BNK/TEST008 0 S920-System-Wide 20140325 21:18:48 21:18:48 0 0.0
BNK/TEST009 0 S920-System-Wide 20140325 21:18:48 21:18:48 0 0.1
BNK/TEST010 0 S920-System-Wide 20140325 21:10:38 21:18:46 544 508.3


Sorting Script

# Read the raw file, split each line on runs of whitespace, build one
# object per batch entry, then sort numerically by elapsedTime and
# write a formatted table to results.txt.
Get-Content sample_data.txt | ForEach-Object {
    $Line = $_.Trim() -Split '\s+'
    New-Object -TypeName PSCustomObject -Property @{
        batchName      = $Line[0]
        #status        = $Line[1]
        stage          = $Line[2]
        batchDate      = $Line[3]
        startTime      = $Line[4]
        endTime        = $Line[5]
        elapsedTime    = [double]$Line[6]   # cast so the sort is numeric
        avgElapsedtime = [double]$Line[7]
    }
} | Sort-Object elapsedTime -Descending |
    Format-Table -Property batchName,stage,batchDate,startTime,endTime,elapsedTime -AutoSize |
    Out-String -Width 4096 |
    Out-File results.txt -Encoding default


Results

batchName   stage             batchDate startTime endTime  elapsedTime
---------   -----             --------- --------- -------  -----------
BNK/TEST010 S920-System-Wide  20140325  21:10:38  21:18:46         544
BNK/TEST007 S920-System-Wide  20140325  21:18:48  21:18:48           1
BNK/TEST006 S920-System-Wide  20140325  21:18:46  21:18:47           1
BNK/TEST002 R050-Reporting    20140325  21:23:50  21:23:51           1
BNK/TEST009 S920-System-Wide  20140325  21:18:48  21:18:48           0
BNK/TEST001 A100-Application  20140325  20:01:38  20:01:38           0
BNK/TEST008 S920-System-Wide  20140325  21:18:48  21:18:48           0
BNK/TEST003 D110-Start-of-Day 20140325  00:17:34  00:17:34           0
BNK/TEST004 D110-Start-of-Day 20140325  00:17:33  00:17:33           0
BNK/TEST005 S920-System-Wide  20140325  21:09:41  21:09:41           0
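As an aside, the same column sort can be done with plain GNU sort when PowerShell is not at hand; a small sketch against a few of the sample rows (field 7 is Elapsed Time):

```shell
# Recreate a few rows of the sample data
printf '%s\n' \
  'BNK/TEST001 0 A100-Application 20140325 20:01:38 20:01:38 0 0.2' \
  'BNK/TEST010 0 S920-System-Wide 20140325 21:10:38 21:18:46 544 508.3' \
  'BNK/TEST002 0 R050-Reporting 20140325 21:23:50 21:23:51 1 0.3' \
  > sample_data.txt

# Sort by the 7th whitespace-delimited field (Elapsed Time),
# numerically, in descending order
sort -k7,7nr sample_data.txt
```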


Happy Scripting 🙂