Kubernetes MetalLB Load Balancer with BGP Mode

In this post, MetalLB will be used as an on-premise load balancer in BGP mode to expose Kubernetes services to the outside world. As you know, if you have applications running on an on-premise Kubernetes cluster that need to be exposed externally, you need a LoadBalancer service, which is a bit tricky outside of cloud environments such as AWS, Azure, or Google Cloud, where it is quite easy. On-premise, you have to provide similar functionality yourself, for example with MetalLB or HAProxy. You can see my previous post on how to achieve this with HAProxy here. In this post, we are going to achieve the same thing with MetalLB. The figure below shows the basic concept of MetalLB.

MetalLB in BGP mode.

I am assuming that you already have Kubernetes installed on your infrastructure. You can start configuring MetalLB on your Kubernetes cluster with the instructions in the link. If you follow the steps in the link, you should get results like the ones below. MetalLB runs a speaker pod on each worker and master node and a controller pod on one of the nodes. As I understand from the MetalLB website, each speaker pod establishes a BGP session with the router.

[tesla@k8s-m1 ~]$ kubectl get pods -n metallb-system 
NAME                          READY   STATUS    RESTARTS   AGE
controller-57f648cb96-ncbww   1/1     Running   1          3h8m
speaker-5n2c4                 1/1     Running   0          3h8m
speaker-7mwx2                 1/1     Running   8          3h8m
speaker-c7fgm                 1/1     Running   0          3h8m
speaker-vxcbd                 1/1     Running   0          3h8m

The next step is creating a ConfigMap named config in the namespace metallb-system.

Creating a ConfigMap named config in the namespace metallb-system

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address: 10.5.100.254
      peer-asn: 65000
      my-asn: 65000
    address-pools:
    - name: default
      protocol: bgp
      avoid-buggy-ips: true
      addresses:
      - 10.5.120.0/24

I faced a problem: without the option avoid-buggy-ips: true, MetalLB allocated the IP 10.5.120.0, which is the network address of the pool.
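As an alternative (a sketch based on the MetalLB configuration format, not something I tested here), the pool can also be defined as an explicit range so that the network and broadcast addresses are never handed out, and the ConfigMap is applied as usual (the file name below is just an example):

address-pools:
- name: default
  protocol: bgp
  addresses:
  - 10.5.120.1-10.5.120.254   # explicit range instead of 10.5.120.0/24

[tesla@k8s-m1 ~]$ kubectl apply -f metallb-config.yaml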

Configuring BGP on the "Intelligent" Home Router

In my home lab, I am using a physical EdgeRouter X, which is able to run the BGP protocol with ECMP.

configure 
set protocols bgp 65000 parameters router-id 10.5.100.254
set protocols bgp 65000 neighbor 10.5.100.20 remote-as 65000
set protocols bgp 65000 neighbor 10.5.100.21 remote-as 65000
set protocols bgp 65000 neighbor 10.5.100.22 remote-as 65000
set protocols bgp 65000 maximum-paths ibgp 6
commit
save
exit

As this is my home lab, I just picked an AS number (65000) from the private range. For more info on ASNs, check here.

Experiment:

After deploying MetalLB and configuring the router for BGP, we can do some checks to see whether BGP peering is established successfully between the Kubernetes nodes and my home router.

ubnt@ubnt-R0:~$ show ip bgp summary 
BGP router identifier 10.5.100.254, local AS number 65000
BGP table version is 27
1 BGP AS-PATH entries
0 BGP community entries
1  Configured ebgp ECMP multipath: Currently set at 1
6  Configured ibgp ECMP multipath: Currently set at 6

Neighbor                 V   AS   MsgRcv    MsgSen TblVer   InQ   OutQ    Up/Down   State/PfxRcd
10.5.100.20              4 65000  494        485      27      0      0  03:25:32               1
10.5.100.21              4 65000  409        423      27      0      0  03:17:29               1
10.5.100.22              4 65000  494        486      27      0      0  03:25:32               1

Total number of neighbors 3

Total number of Established sessions 3
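Depending on the EdgeOS version, the routes learned from the speakers can also be inspected on the router; the Vyatta-style operational command below should show the service prefix with multiple next hops once ECMP is in effect (treat this as a sketch, not verified output):

ubnt@ubnt-R0:~$ show ip route bgp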

Checking the logs of one of the speaker pods:

[tesla@k8s-m1 ~]$ kubectl logs -f  -n metallb-system speaker-5n2c4 
{"caller":"bgp_controller.go:232","event":"updatedAdvertisements","ip":"10.5.120.1","msg":"making advertisements using BGP","numAds":1,"pool":"default","protocol":"bgp","service":"default/nginx","ts":"2020-08-30T09:59:48.873159314Z"}

Creating a Sample Deployment

The last step is creating a sample deployment to check whether MetalLB works as expected.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1
        ports:
        - name: http
          containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer
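Assuming the two manifests above are saved in a single file called nginx-demo.yaml (the file name is arbitrary), they can be applied with:

[tesla@k8s-m1 ~]$ kubectl apply -f nginx-demo.yaml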

[tesla@k8s-m1 ~]$ kubectl get deployments.apps 
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   1/1     1            1           145m
[tesla@k8s-m1 ~]$ kubectl get svc nginx 
NAME    TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
nginx   LoadBalancer   10.96.136.46   10.5.120.1    8080:30169/TCP   88m
gokay@angora:~$ curl http://10.5.120.1:8080 
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Last but not least, I had a problem while trying to access the application from the KVM host; the VMs were able to reach it without an issue. I solved the problem once I added a default route via my 'Intelligent' router on the KVM host.

default via 10.5.100.254 dev br100 > My ‘Intelligent router’
default via 192.168.0.1 dev wlp4s0 proto dhcp metric 600 > ‘My Wifi Router’
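If changing the default gateway on the KVM host is not desirable, a more targeted fix (a sketch, assuming br100 is the bridge facing the 'Intelligent' router) would be to add a static route only for the MetalLB address pool:

gokay@angora:~$ sudo ip route add 10.5.120.0/24 via 10.5.100.254 dev br100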

Highly available Load-balancer for Kubernetes Cluster On-Premise – II

In the first post of this series, HAProxy and Keepalived were installed, configured, and tested.

In this post, two stateless Kubernetes web applications will be deployed and domain names will be registered in DNS for these two web applications, to test whether the load balancer is working as expected.

Note: For my home-lab, I am using the domain nordic.io.

For the Kubernetes cluster, I am assuming that the nginx Ingress controller is deployed as a DaemonSet and is listening on ports 80 and 443 on each worker node.
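You can verify this with the commands below; the namespace ingress-nginx is only an assumption, so adjust it to wherever your controller is deployed:

[tesla@deployment ~]$ kubectl get daemonset -n ingress-nginx -o wide
[tesla@deployment ~]$ kubectl get pods -n ingress-nginx -o wide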

Deploying Kubernetes Web Applications:

apiVersion: v1
kind: Service
metadata:
  name: hello-kubernetes-svc
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: hello-kubernetes
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-kubernetes
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-kubernetes
  template:
    metadata:
      labels:
        app: hello-kubernetes
    spec:
      containers:
      - name: hello-kubernetes
        image: paulbouwer/hello-kubernetes:1.8
        ports:
        - containerPort: 8080
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: hello-kubernetes-ingress
  annotations:
    ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: helloworld.nordic.io  
    http:
      paths:
        - path: /
          backend:
            serviceName: hello-kubernetes-svc
            servicePort: 80

apiVersion: v1
kind: Service
metadata:
  name: whoami-svc
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: whoami
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: whoami
  name: whoami
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      run: whoami
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        run: whoami
    spec:
      containers:
      - image: yeasy/simple-web:latest
        name: whoami
      restartPolicy: Always
      schedulerName: default-scheduler
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: whoami-ingress
  annotations:
    ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: whoami.nordic.io  
    http:
      paths:
        - path: /
          backend:
            serviceName: whoami-svc
            servicePort: 80

Registering Web Apps to DNS:

Adding DNS records is one of the crucial parts. In order to use a single load-balancer IP for multiple services, we add CNAME records. You can see the BIND DNS configuration below.

vip1 IN A 10.5.100.50
helloworld IN CNAME vip1
whoami IN CNAME vip1
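After editing the zone, the usual BIND workflow applies: bump the zone's serial number, optionally validate the zone file, and reload it on the DNS server. The zone file path below is just an example for my setup:

named-checkzone nordic.io /var/named/nordic.io.zone
rndc reload nordic.io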

Experiment:

Checking DNS Records.

[tesla@deployment ~]$ nslookup helloworld
Server:		10.5.100.253
Address:	10.5.100.253#53

helloworld.nordic.io	canonical name = vip1.nordic.io.
Name:	vip1.nordic.io
Address: 10.5.100.50

[tesla@deployment ~]$ nslookup whoami
Server:		10.5.100.253
Address:	10.5.100.253#53

whoami.nordic.io	canonical name = vip1.nordic.io.
Name:	vip1.nordic.io
Address: 10.5.100.50

Testing Services:

Hello World App:

Whoami App:
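Besides checking in a browser, the services can also be tested from the command line. Both host names resolve to the same VIP, so it is the Host header that decides which Ingress rule serves the request:

[tesla@deployment ~]$ curl http://helloworld.nordic.io/
[tesla@deployment ~]$ curl http://whoami.nordic.io/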

Highly available Load-balancer for Kubernetes Cluster On-Premise – I

In this post, we are going to build a highly available HAProxy load balancer for our on-premise Kubernetes cluster. HAProxy will be used as the external load balancer, which takes requests from the outside world and sends them to the Kubernetes worker nodes, where the nginx ingress controller listens for incoming requests on ports 80 and 443.

Another crucial software component is Keepalived, which keeps the HAProxy load balancer highly available in case one of the HAProxy nodes goes down.

Keepalived is a robust Virtual Router Redundancy Protocol (VRRP) implementation for GNU/Linux.

To build the cluster, Ubuntu 18.04.4 was used. The diagram below shows how the environment looks.

Installing the Necessary Software Suites

# sudo apt-get install haproxy
# sudo apt-get install keepalived
# sudo systemctl enable haproxy
# sudo systemctl enable keepalived

Configuring Necessary Kernel Parameters

The configuration below is very important and must be applied on both nodes.

In order for the Keepalived service to forward network packets properly to the real servers, each node must have IP forwarding turned on in the kernel. Log in as root and change the line which reads net.ipv4.ip_forward = 0 in /etc/sysctl.conf to the following:

net.ipv4.ip_forward = 1

The changes take effect when you reboot the system. Load balancing with HAProxy and Keepalived at the same time also requires the ability to bind to an IP address that is nonlocal, meaning that it is not assigned to a device on the local system. This allows a running load-balancer instance to bind to an IP that is not local for failover. To enable this, edit the line in /etc/sysctl.conf that reads net.ipv4.ip_nonlocal_bind so that it reads as follows:

net.ipv4.ip_nonlocal_bind = 1

The changes take effect when you reboot the system.
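Alternatively, the same settings can be activated without a reboot; this assumes the two lines above are already present in /etc/sysctl.conf:

# sudo sysctl -w net.ipv4.ip_forward=1
# sudo sysctl -w net.ipv4.ip_nonlocal_bind=1
# sudo sysctl -p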

Configuring Keepalived on both nodes

Some Keepalived settings have to be changed accordingly on the second node, so check the commented lines in keepalived.conf.

global_defs {
   notification_email {
     admin@manintheit.org
   }
   notification_email_from keepalived@manintheit.org
   smtp_server localhost
   smtp_connect_timeout 30
   router_id ha1 #router_id ha2 on the second node(ha2)
   vrrp_skip_check_adv_addr
   vrrp_garp_interval 0.5
   vrrp_garp_master_delay 1
   vrrp_garp_master_repeat 5
   vrrp_gna_interval 0
   enable_script_security
   script_user root
   vrrp_no_swap
   checker_no_swap
}

# Script used to check if HAProxy is running
vrrp_script check_haproxy {
       script "/usr/bin/pgrep haproxy 2>&1 >/dev/null"
        interval 1
        fall 2
        rise 2
}

# Virtual interface
vrrp_instance VI_01 {
        state MASTER #state BACKUP on the second node(ha2)
        interface enp1s0
        virtual_router_id 120
        priority 101  #priority 100 on the second node(ha2). Higher number wins.
        nopreempt
        advert_int 1
        unicast_src_ip 10.5.100.51  #unicast_src_ip 10.5.100.52 on the second node(ha2)
        unicast_peer {
                10.5.100.52    #unicast_peer 10.5.100.51 on the second (ha2)
        }
        virtual_ipaddress {
                10.5.100.50/24 dev enp1s0 label enp1s0:ha-vip1
        }
        authentication {
                auth_type PASS
                auth_pass MANINTHEIT
        }
        track_script {
                check_haproxy
        }
}

HAProxy Config

# Default configuration directives in haproxy.cfg have been omitted.

frontend stats
    bind 10.5.100.51:9000
    mode http
    maxconn 10
    stats enable
    stats show-node
    stats hide-version
    stats realm Haproxy\ Statistics
    stats uri /hastats
    stats auth haadmin:haadmin

frontend k8s-service-pool
  mode tcp
  bind 10.5.100.50:80
  default_backend k8s-service-backend

backend k8s-service-backend
    mode tcp
    balance source
    server k8s-worker-01 10.5.100.21 check port 80 inter 10s rise 1 fall 2 
    server k8s-worker-02 10.5.100.22 check port 80 inter 10s rise 1 fall 2
    server k8s-worker-03 10.5.100.23 check port 80 inter 10s rise 1 fall 2
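Before restarting, it is a good idea to validate the configuration file; the path below is the default location on Ubuntu:

# sudo haproxy -c -f /etc/haproxy/haproxy.cfg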

Restart the keepalived and haproxy services on both nodes.

# sudo systemctl restart haproxy
# sudo systemctl restart keepalived

Experiment:

1- Let's check with the tcpdump utility whether the master node sends VRRP advertisement packets every second to all members of the VRRP group.

# tcpdump proto 112 -n

2- Let's check the interface IPs. As you can see, the first node (ha1) is the active node, as the VIP 10.5.100.50 is registered on it.
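A quick way to check this is to look at the addresses on the interface used in the Keepalived configuration above (enp1s0 in my setup); the VIP should only appear on the current master:

# ip addr show enp1s0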

The second part of the post will be published soon. Stay healthy!