KubeVirt:
KubeVirt is an upstream community project that provides a virtualization API for Kubernetes, making it possible to manage virtual machines alongside containers.
KubeVirt was developed to address the needs of development teams that have adopted, or want to adopt, Kubernetes but also need support for virtual machine-based workloads that cannot be easily containerized. More specifically, the technology provides a unified development platform where developers can build, modify, and deploy applications residing in both application containers and virtual machines in a common, shared environment.
Teams that rely on existing virtual machine-based workloads are empowered to containerize applications. With virtualized workloads placed directly in development workflows, teams can decompose them into microservices over time while still leveraging the remaining virtualized components as they are.
Service Mesh:
Red Hat OpenShift Service Mesh is based on Istio. It provides behavioral insight and operational control over the service mesh, offering a uniform way to connect, secure, and monitor microservice applications.
As a service mesh grows in size and complexity, it can become harder to understand and manage. Red Hat OpenShift Service Mesh adds a transparent communication layer to existing distributed applications without requiring any changes to the service code. You add Red Hat OpenShift Service Mesh support to services by deploying a special sidecar proxy throughout your environment; this proxy intercepts all network communication between the microservices, and the mesh is then configured through the control plane features.
Red Hat OpenShift Service Mesh provides a simpler way to create a network of deployed services that provides discovery, load balancing, service-to-service authentication, failure recovery, metrics, and monitoring. A service mesh also provides more complex operational functionality, including A/B testing, canary releases, rate limiting, access control, and end-to-end authentication.
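To give a concrete feel for that traffic control, the snippet below is a minimal sketch of a weighted routing rule that sends 10% of traffic for the reviews service (used later in this post) to v2. It assumes the v1 and v2 subsets are defined by destination rules from the standard Istio bookinfo sample; it is illustrative only and not part of the deployment described here.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10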
Bookinfo application:
In this blog post we are going to talk about the bookinfo application, which consists of four separate microservices running on Red Hat OpenShift.
This simple application displays information about a book, similar to a single catalog entry of an online book store. Displayed on the page is a description of the book, the book’s details (ISBN, number of pages, and so on), and a few book reviews.
The four microservices used are:
Productpage: The productpage microservice calls the details and reviews microservices to populate the page (written in Python).
Details: The details microservice contains the book information. This microservice runs on KubeVirt (written in Ruby).
Reviews: The reviews microservice contains the book reviews and calls the ratings microservice (written in Java).
Ratings: The ratings microservice contains the book ranking information that accompanies a book review.
Notice that the details microservice is actually running in the KubeVirt pod.
Requirements:
This blog post is based on the following software versions:
Red Hat OpenShift version 3.11
KubeVirt version 0.10
Service Mesh version 0.2
Deployment:
The following steps were used to set up this environment:
On a freshly installed OpenShift 3.11, use this link to enable Service Mesh.
After Service Mesh is up and running, use this GitHub repository to deploy KubeVirt.
After KubeVirt is deployed, apply bookinfo.yaml and bookinfo-gateway.yaml from this GitHub repository, as sketched below.
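For reference, the final step looks roughly like the commands below; this is a hedged sketch that assumes the manifests sit in your current directory and that the target project is named test, as in the rest of this post.
oc project test
oc apply -f bookinfo.yaml
oc apply -f bookinfo-gateway.yaml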
Walkthrough of the setup:
We are going to use the bookinfo sample application from the Istio website.
The Bookinfo application will run with one small change: the details service will run on a virtual machine inside our Kubernetes cluster! Once the environment is deployed, it looks like this:
Pods Details
[root@bastion ~]# oc get pods
NAME READY STATUS RESTARTS AGE
productpage-v1-57f4d6b98-gdhcf 2/2 Running 0 7h
ratings-v1-6ff8679f7b-5lxdh 2/2 Running 0 7h
reviews-v1-5b66f78dc9-j98pj 2/2 Running 0 7h
reviews-v2-5d6d58488c-7ww4t 2/2 Running 0 7h
reviews-v3-5469468cff-wjsj7 2/2 Running 0 7h
virt-launcher-vmi-details-bj9h6 3/3 Running 0 7h
Service details
[root@bastion ~]# oc get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
details ClusterIP 172.30.18.170 <none> 9080/TCP 12h
productpage ClusterIP 172.30.93.172 <none> 9080/TCP 12h
ratings ClusterIP 172.30.91.26 <none> 9080/TCP 12h
reviews ClusterIP 172.30.37.239 <none> 9080/TCP 12h
Virtual services details
[root@bastion ~]# oc get virtualservices
NAME AGE
bookinfo 12h
Gateway details
[root@bastion ~]# oc get gateway
NAME AGE
bookinfo-gateway 12h
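For context, bookinfo-gateway comes from the standard Istio bookinfo sample. Its core definition looks roughly like the sketch below (paraphrased from the upstream sample, so treat the field values as illustrative): it binds to the default Istio ingress gateway on port 80 for any host.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: bookinfo-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"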
As you can see above, “virt-launcher-vmi-details” is a KubeVirt VM pod running the details microservice!
This pod consists of the following containers:
volumeregistryvolume: provides the persistent storage for the VM and the initial QEMU image
compute: represents the virtualization layer running the VM instance
istio-proxy: the Envoy sidecar proxy used by Istio to enforce policies
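You can confirm these container names straight from the Pod spec; the command below is a hedged example that reuses the pod name from the earlier output and should print the three container names (istio-init is an init container, so it does not appear here).
[root@bastion ~]# oc get pod virt-launcher-vmi-details-bj9h6 -o jsonpath='{.spec.containers[*].name}'
volumeregistryvolume compute istio-proxy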
From the QEMU perspective, the virtual machine looks like this on the node:
[root@node2 ~]# ps -eaf | grep qemu
root 17147 17131 0 15:41 ? 00:00:00 /usr/bin/virt-launcher --qemu-timeout 5m --name vmi-details --uid 7176552f-02db-11e9-b39a-06a5cb0d3fce --namespace test --kubevirt-share-dir /var/run/kubevirt --ephemeral-disk-dir /var/run/kubevirt-ephemeral-disks --readiness-file /tmp/healthy --grace-period-seconds 15 --hook-sidecars 0 --use-emulation
root 17580 17147 0 15:42 ? 00:00:01 /usr/bin/virt-launcher --qemu-timeout 5m --name vmi-details --uid 7176552f-02db-11e9-b39a-06a5cb0d3fce --namespace test --kubevirt-share-dir /var/run/kubevirt --ephemeral-disk-dir /var/run/kubevirt-ephemeral-disks --readiness-file /tmp/healthy --grace-period-seconds 15 --hook-sidecars 0 --use-emulation --no-fork true
107 18025 17147 30 15:42 ? 00:18:26 /usr/bin/qemu-system-x86_64 -name guest=test_vmi-details <snip>user,id=testSlirp,net=10.0.2.0/24,dnssearch=test.svc.cluster.local,dnssearch=svc.cluster.local,dnssearch=cluster.local,dnssearch=ec2.internal,dnssearch=2306.internal,hostfwd=tcp::9080-:9080 -device e1000,netdev=testSlirp,id=testSlirp -msg timestamp=on
To look under the hood, we can inspect the iptables entries injected by kube-proxy on the node where the VM Pod is running.
[root@node2 ~]# iptables-save | grep -i details
-A KUBE-SEP-GRSCMVBAWU6K5QZC -s 10.1.10.123/32 -m comment --comment "test/details:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-GRSCMVBAWU6K5QZC -p tcp -m comment --comment "test/details:http" -m tcp -j DNAT --to-destination 10.1.10.123:9080
-A KUBE-SERVICES -d 172.30.46.76/32 -p tcp -m comment --comment "test/details:http cluster IP" -m tcp --dport 9080 -j KUBE-SVC-U5GFLZHENRTXEBQ7
-A KUBE-SVC-U5GFLZHENRTXEBQ7 -m comment --comment "test/details:http" -j KUBE-SEP-GRSCMVBAWU6K5QZC
From the following output we can see that in our lab the KubeVirt Pod is running with IP 10.1.10.123.
[root@bastion ~]# oc get pods virt-launcher-vmi-details-pb7fg -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
virt-launcher-vmi-details-pb7fg 3/3 Running 0 16m 10.1.10.123 node2.2306.internal <none>
From the iptables output above we can see how kube-proxy creates the masquerade and DNAT rules that redirect traffic for the details service, which in our lab has the cluster IP 172.30.46.76, to the Pod IP. This is identical to how Kubernetes routes traffic for any other service defined in the cluster.
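Another way to confirm the mapping between the service and the VM Pod is to list the service endpoints; this hedged example assumes output along the lines of our lab values, where the only endpoint is the VM Pod IP on port 9080.
[root@bastion ~]# oc get endpoints details
NAME      ENDPOINTS          AGE
details   10.1.10.123:9080   12h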
Looking at the details service definition, we can see that all the standard Kubernetes constructs are still in use. In this particular service definition, TCP port 9080 is exposed over the cluster IP.
[root@bastion ~]# oc get svc details -o yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"details"},"name":"details","namespace":"test"},"spec":{"ports":[{"name":"http","port":9080}],"selector":{"app":"details"}}}
  creationTimestamp: 2018-12-18T15:41:53Z
  labels:
    app: details
  name: details
  namespace: test
  resourceVersion: "1884449"
  selfLink: /api/v1/namespaces/test/services/details
  uid: 717339bc-02db-11e9-b39a-06a5cb0d3fce
spec:
  clusterIP: 172.30.46.76
  ports:
  - name: http
    port: 9080
    protocol: TCP
    targetPort: 9080
  selector:
    app: details
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
[root@bastion ~]#
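Because this is a plain ClusterIP service, any Pod in the mesh can reach the VM-hosted application through it. As a hedged illustration (the productpage container name and the /details/0 path come from the upstream bookinfo sources rather than from output shown in this post), you could query the details API from the productpage Pod:
[root@bastion ~]# oc exec -it productpage-v1-57f4d6b98-gdhcf -c productpage -- curl -s http://details:9080/details/0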
Taking a closer look at the actual KubeVirt Pod definition, we get a lot more detail than what we saw earlier:
We have a Pod named virt-launcher-vmi-details-pb7fg
This Pod contains the containers volumeregistryvolume (the volumes and QEMU image), compute (the instance of the QEMU image with the volumes), istio-proxy (the Istio sidecar), and istio-init. The last one is used only while the istio-proxy is instantiated, to inject its configuration.
[root@bastion ~]# oc get pods virt-launcher-vmi-details-pb7fg -o yaml
apiVersion: v1
kind: Pod
metadata:
  <snip>
  name: virt-launcher-vmi-details-pb7fg
  <snip>
    kind: VirtualMachineInstance
<snip>
  containers:
  <snip>
  - image: kubevirt/fedora-cloud-registry-disk-demo:latest
    imagePullPolicy: IfNotPresent
    name: volumeregistryvolume
    <snip>
    volumeMounts:
    - mountPath: /var/run/kubevirt-ephemeral-disks
      name: ephemeral-disks
  <snip>
  - image: docker.io/kubevirt/virt-launcher:v0.10.0
    imagePullPolicy: IfNotPresent
    name: compute
    ports:
    - containerPort: 9080
      name: http
      protocol: TCP
  <snip>
  - args:
    - proxy
    - sidecar
    - --configPath
    - /etc/istio/proxy
    - --binaryPath
    - /usr/local/bin/envoy
    - --serviceCluster
    - details
    - --drainDuration
    - 45s
    - --parentShutdownDuration
    - 1m0s
    - --discoveryAddress
    - istio-pilot.istio-system:15005
    - --discoveryRefreshDelay
    - 1s
    - --zipkinAddress
    - zipkin.istio-system:9411
    - --connectTimeout
    - 10s
    - --statsdUdpAddress
    - istio-statsd-prom-bridge.istio-system:9125
    - --proxyAdminPort
    - "15000"
    - --controlPlaneAuthPolicy
    - MUTUAL_TLS
    <snip>
    image: openshift-istio-tech-preview/proxyv2:0.2.0
    imagePullPolicy: IfNotPresent
    name: istio-proxy
  <snip>
  - image: openshift-istio-tech-preview/proxy-init:0.2.0
    imagePullPolicy: IfNotPresent
    name: istio-init
    resources: {}
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
      privileged: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
<snip>
  hostIP: 192.199.0.24
  <snip>
  podIP: 10.1.10.123
  qosClass: Burstable
[root@bastion ~]#
Now let's take a closer look at the virtual machine part of the bookinfo.yaml file. The following section creates an interface of type slirp, connects it to the pod network by matching the pod network name, and exposes our application port.
Note that in a real production scenario we do not recommend using a slirp interface, because it is inherently slow. Instead, we prefer a masquerade interface (placing the VM behind NAT inside the Pod), introduced with this PR; a sample masquerade definition is sketched after the slirp snippet below.
The main networking problem when using Istio with KubeVirt is that, in order to communicate with the QEMU process, new network interfaces (a Linux bridge and a tap device) are created in the Pod's network namespace. These interfaces are affected by the iptables rules (the PREROUTING chain that sends inbound traffic to the Envoy process). This problem can be addressed by this PR.
interfaces:
- name: testSlirp
  slirp: {}
  ports:
  - name: http
    port: 9080
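For comparison, once masquerade binding is available, the equivalent interface definition would look roughly like the hedged sketch below; the interface name default is illustrative and must match a network declared in the VMI's networks section.
interfaces:
- name: default
  masquerade: {}
  ports:
  - name: http
    port: 9080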
The following cloud-init snippet downloads, installs, and runs the details application:
- cloudInitNoCloud:
    userData: |-
      #!/bin/bash
      echo "fedora" | passwd fedora --stdin
      yum install git ruby -y
      git clone https://github.com/istio/istio.git
      cd istio/samples/bookinfo/src/details/
      ruby details.rb 9080 &
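Once the VM boots and cloud-init finishes, you can check the application from inside the guest. The following is a hedged sketch that assumes virtctl is installed on your workstation and uses the VMI name and the fedora password set by the cloud-init above.
virtctl console vmi-details
# log in as fedora / fedora, then inside the guest:
curl -s http://localhost:9080/details/0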
Note that our environment is configured to run the services with Istio enabled, with Envoy sidecars injected alongside each service. The following policy in the istio-sidecar-injector configmap ensures that sidecar injection happens by default:
[root@bastion ~]# oc -n istio-system edit configmap istio-sidecar-injector
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
config: "policy: enabled\ntemplate: |-\n initContainers:\n - name: istio-init\n
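With policy: enabled, sidecar injection happens by default, and an individual workload can still opt out with the standard Istio annotation. The sketch below is a hypothetical example; the myapp names and image are placeholders and are not part of this setup.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-no-sidecar
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp-no-sidecar
  template:
    metadata:
      labels:
        app: myapp-no-sidecar
      annotations:
        sidecar.istio.io/inject: "false"
    spec:
      containers:
      - name: myapp
        image: registry.example.com/myapp:latest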
And the lines below, when added to the istio-sidecar-injector configmap, limit the IP ranges intercepted by the sidecar to the cluster service and Pod networks, so that the cloud-init script can install the required packages and clone the external repository, i.e. https://github.com/istio/istio.git, directly.
\"traffic.sidecar.istio.io/includeOutboundIPRanges\" ]]\"\n [[ else -]]\n
\ - \"172.30.0.0/16,10.1.0.0/16\"\n
Note that all of the microservices are packaged with an Envoy sidecar that intercepts incoming and outgoing calls for the service, providing the hooks the Istio control plane needs to externally control routing, telemetry collection, and policy enforcement for the application as a whole.
Finally, the Bookinfo sample application will run, as shown below.
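If you prefer the command line to a browser, you can verify the application end to end through the Istio ingress gateway. This is a hedged sketch in which GATEWAY_URL is assumed to hold the host and port of your istio-ingressgateway route; the grep simply pulls out the page title.
curl -s http://$GATEWAY_URL/productpage | grep -o "<title>.*</title>"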
Distributed tracing:
Since Service Mesh is installed in our setup, let's see how we can monitor and troubleshoot microservices with Jaeger.
Jaeger:
Jaeger is an open source distributed tracing system. You use Jaeger for monitoring and troubleshooting microservices-based distributed systems. Using Jaeger you can perform a trace, which follows the path of a request through various microservices that make up an application. Jaeger is installed by default as part of the Service Mesh.
Below is an example of how the details microservice looks in Jaeger in terms of tracing and monitoring.
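To open the Jaeger console and explore these traces yourself, look up its route in the istio-system project; the exact route name can vary between releases, so this hedged example simply lists all routes.
[root@bastion ~]# oc get routes -n istio-system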
With OpenShift, your cloud, your way
Hybrid use cases that run virtual machines in containers can bring a lot of flexibility and advantages over the traditional approach of running a container platform alongside a separate virtualization platform. Using a service mesh can reduce the complexity of the joint solution. The main benefit of microservices is that they let us decompose an application into smaller services, which improves modularity.
By standardizing on OpenShift, you've also laid a foundation for the entire stack, along with service management and security, as well as simplified application procurement and deployment, across clouds and on premises. Stay tuned in the coming months for more about KubeVirt use cases along with Service Mesh, microservices, and OpenShift Container Platform.