Part 7 – Deploying EFK (Elasticsearch, Fluentd, Kibana) Stack on OKE

The Elasticsearch, Fluentd, Kibana (EFK) logging stack is one of the most popular combinations among open-source logging platforms. In fact, I would say the only real debate is around the log-shipping component, the F (Fluentd), which is sometimes swapped out for an L (Logstash). Otherwise, Kibana visualizing data indexed in Elasticsearch is the dominant pattern.

The good news is that it is quite easy to set up this stack on Kubernetes. I’m going to assume that you have an OKE Cluster at hand for this walkthrough. If you don’t, take a moment to read my previous post and spin one up.

I first tried the individual helm charts for each of these three platforms but did not have much success with any of them. Instead, I discovered that the Kubernetes project itself maintains a complete and up-to-date set of resource definitions for the stack, so I used those as the starting point for a Helm chart of my own.

In this article, I will talk about standing up this stack on OKE using that Helm chart.

Motivation

In the first six articles in our OKE series, we deployed our application by applying each resource manifest individually with kubectl.

This works, but it’s painful because we have to manually run a command for each resource in our Kubernetes application. This is prone to error because we might forget to deploy one resource or introduce a typo when writing our kubectl commands. As we add more parts to our application, the probability of these problems occurring increases.

You could avoid this by writing an automation script, but if you change the filenames or paths of your Kubernetes resources, then you need to update the script too.

The real problem is that we have to remember exactly how to deploy the application step by step. Our “application” (i.e., all of our Kubernetes resources packaged together) is something kubectl has no concept of. Helm fixes this by packaging all of those resources into a single versioned unit, a chart, that can be installed, upgraded, and deleted as one release.
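To make the contrast concrete: with plain kubectl, deploying a stack like ours means one command per resource file (these filenames are placeholders for illustration, not files from this series):

kubectl apply -f elasticsearch-statefulset.yaml
kubectl apply -f elasticsearch-service.yaml
kubectl apply -f fluentd-daemonset.yaml
kubectl apply -f kibana-deployment.yaml
kubectl apply -f kibana-service.yaml

With Helm, the same resources are packaged as a chart and deployed as a single release (the release and chart names here are hypothetical):

helm install --name efk ./efk-chart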

Fluentd vs Fluent Bit

EFK stack usually refers to Elasticsearch, Fluentd, and Kibana. Fluentd is a flexible log data collector. It supports various inputs, like log files or syslog, and many outputs, like Elasticsearch or Hadoop. Fluentd converts each log line into an event, and those events can be processed and enriched in the Fluentd pipeline. I have chosen Fluentd because there is a good Kubernetes metadata plugin for it. This plugin parses the filename of each container log file and uses that information to fetch additional metadata from the Kubernetes API. Metadata such as labels and annotations is attached to the log event as additional fields, so you can search and filter on this information.
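To illustrate what that looks like in practice, here is a minimal sketch of the relevant parts of a fluentd.conf, assuming the fluent-plugin-kubernetes_metadata_filter plugin is installed and Elasticsearch is reachable at a service named elasticsearch-logging (both assumptions for illustration, not values taken from the chart used later):

# Tail every container log file; the tag carries the file path,
# which the metadata filter parses to identify pod and namespace
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
  </parse>
</source>

# Enrich each event with labels, annotations, pod and namespace names
# fetched from the Kubernetes API
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

# Ship the enriched events to Elasticsearch in logstash-style daily indices
<match kubernetes.**>
  @type elasticsearch
  host elasticsearch-logging
  port 9200
  logstash_format true
</match>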

However, you can go ahead with Fluent Bit as well, which is much lighter and has built-in Kubernetes support. Fluent Bit can read Kubernetes or Docker log files from the file system or through the systemd journal, enrich logs with Kubernetes metadata, and deliver logs to third-party storage services such as Elasticsearch, InfluxDB, or plain HTTP endpoints.

To deploy Fluentd into the Kubernetes cluster I have chosen a DaemonSet. A DaemonSet ensures that exactly one copy of a pod is scheduled to each node. The Fluentd pod mounts the host's /var/log/containers/ directory (which symlinks into /var/lib/docker/containers/) to access the logs of all pods scheduled to that node, as well as a host volume for a Fluentd position file. The position file records which log lines have already been shipped to the central log store, so nothing is re-sent after a restart.
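A trimmed-down sketch of such a DaemonSet is below; the image tag is an assumption (a stock Fluentd image with the Elasticsearch plugin baked in), not the exact manifest from the chart used in this article:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
        volumeMounts:
        - name: varlog
          # container log symlinks, plus the Fluentd position file
          mountPath: /var/log
        - name: varlibdockercontainers
          # the actual Docker log data the symlinks point to
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers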

Prepare Helm & Tiller

I will be using helm from a CentOS VM running on my Mac.

curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get | bash

Tiller, the server portion of Helm, typically runs inside your Kubernetes cluster (in this case, it was already deployed on your OKE cluster). For development, it can also be run locally and configured to talk to your remote OKE cluster.

helm init
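Once helm init completes, a quick sanity check is to confirm that the client can reach Tiller; if both a Client and a Server version are reported, the two can talk to each other:

helm version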

Optionally, you can also add the Stable and Incubator repos, which host most community Helm charts.

# helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
# helm repo add stable https://kubernetes-charts.storage.googleapis.com
# helm repo list
NAME        URL
stable      https://kubernetes-charts.storage.googleapis.com
local       http://127.0.0.1:8879/charts
incubator   http://storage.googleapis.com/kubernetes-charts-incubator

You can also check whether your OKE cluster has Tiller up and running; it is deployed under the kube-system namespace.
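One way to verify, assuming the standard tiller-deploy deployment name that helm init creates:

kubectl get deployment tiller-deploy -n kube-system
kubectl get pods -n kube-system -l app=helm,name=tiller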

Deploying EFK Stack using Helm

How to deploy the chart on our OKE cluster is explained in the README file hosted on my Github repo.
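The exact commands live in that README, but the general shape of a Helm v2 install is sketched below; the repository URL placeholder, chart path, and release name are illustrative, not the real values:

git clone https://github.com/<your-account>/<efk-chart-repo>.git
cd <efk-chart-repo>
helm install --name efk --namespace logging .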

After it’s deployed, you can see the pods up and running on your OKE cluster.
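For example (the logging namespace is an assumption; use whichever namespace the chart was installed into):

kubectl get pods -n logging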

In the chart's configuration, Kibana is exposed through a Service of type LoadBalancer, so you should be able to get the public LB IP either from the OCI Console or from the command line. We will use the command line here to extract the public IP of the Kibana service.
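Assuming the Service is named kibana and lives in the logging namespace (both assumptions; adjust to your chart's values), a jsonpath query extracts the address directly:

kubectl get svc kibana -n logging -o jsonpath='{.status.loadBalancer.ingress[0].ip}'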

Set up the Kibana Dashboard

Pick up the External-IP from the output for the Kibana service, open a browser, and hit http://<external-ip>:5601. This should open the Kibana dashboard, where you need to create an index pattern.

  • Click on Set up index patterns.
  • Check the Include system indices checkbox.
  • Enter .monitoring-es* as the pattern and click Next Step.
  • Select Timestamp from the Time Filter field name dropdown.
  • Click Create index pattern.
  • Click the Discover side menu and see the data coming in.

From here you can create dashboards and visualizations as per your requirements. However, this won't give you system and application metrics; we will use Metricbeat for that. The next article covers that use case and how to deploy and use Metricbeat on OKE.

About Prasenjit Sarkar

Prasenjit Sarkar is a Product Manager at Oracle for its Public Cloud, with a primary focus on Cloud Strategy, Cloud Native Applications and API Platform. His primary focus is driving Oracle’s Cloud Computing business with commercial and public sector customers; helping to shape and deliver on a strategy to build broad use of Oracle’s Infrastructure as a Service (IaaS) offerings such as Compute, Storage, Network & Database as a Service. He is also responsible for developing public/private cloud integration strategies, customers’ Cloud Computing architecture vision, future state architectures, and implementable architecture roadmaps in the context of the public, private, and hybrid cloud computing solutions Oracle can offer.
