Collecting application logs

When you start managing an application, log collection and analysis are two important routines for keeping track of the application's status.

However, there are some difficulties when the application is managed by Docker/Kubernetes: because the log files are inside the container, it is not easy to access them from outside the container. In addition, if the application runs many pods under a replication controller, it is also difficult to trace in which pod an issue has happened.

One way to overcome this difficulty is to prepare a centralized log collection platform that accumulates and preserves the application logs. This recipe describes one of the popular log collection platforms, ELK (Elasticsearch, Logstash, and Kibana).

Getting ready

First, we need to prepare the Elasticsearch server. Then, the application will send its logs to Elasticsearch using Logstash. Finally, we will visualize the analysis results using Kibana.

Elasticsearch

Elasticsearch (https://www.elastic.co/products/elasticsearch) is one of the popular text search and analytics engines. There are some example YAML files provided with the Kubernetes source; let's download them using the curl command to set up Elasticsearch:

Note

An example YAML file is located on GitHub at https://github.com/kubernetes/kubernetes/tree/master/examples/elasticsearch.

# curl -L -O https://github.com/kubernetes/kubernetes/releases/download/v1.1.4/kubernetes.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   593    0   593    0     0   1798      0 --:--:-- --:--:-- --:--:--  1802
100  181M  100  181M    0     0  64.4M      0  0:00:02  0:00:02 --:--:-- 75.5M
# tar zxf kubernetes.tar.gz 
# cd kubernetes/examples/elasticsearch/
# ls
es-rc.yaml  es-svc.yaml  production_cluster  README.md  service-account.yaml

Create the ServiceAccount (service-account.yaml), and then create the Elasticsearch replication controller (es-rc.yaml) and service (es-svc.yaml) as follows:

# kubectl create -f service-account.yaml 
serviceaccount "elasticsearch" created

//As of Kubernetes 1.1.4, this YAML causes a validation error;
//therefore, append the --validate=false option
# kubectl create -f es-rc.yaml --validate=false
replicationcontroller "es" created

# kubectl create -f es-svc.yaml 
service "elasticsearch" created
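It may take a while for the Elasticsearch image to be pulled and the pod to start; you can watch its progress with the label selector used by the replication controller (the pod name suffix is generated, so yours will differ):

# kubectl get pods -l component=elasticsearch
NAME       READY     STATUS    RESTARTS   AGE
es-w0u3n   1/1       Running   0          1m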

Then, you can access the Elasticsearch interface via the Kubernetes service as follows:

//Elasticsearch is exposed on 192.168.45.152 in this example
# kubectl get service
NAME            CLUSTER_IP       EXTERNAL_IP   PORT(S)             SELECTOR                  AGE
elasticsearch   192.168.45.152                 9200/TCP,9300/TCP   component=elasticsearch   9s
kubernetes      192.168.0.1      <none>        443/TCP             <none>                    110d

//access to TCP port 9200
# curl http://192.168.45.152:9200/
{
  "status" : 200,
  "name" : "Wallflower",
  "cluster_name" : "myesdb",
  "version" : {
    "number" : "1.7.1",
    "build_hash" : "b88f43fc40b0bcd7f173a1f9ee2e97816de80b19",
    "build_timestamp" : "2015-07-29T09:54:16Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}
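You can also check the cluster state through the standard Elasticsearch health API; the exact values will depend on your cluster:

# curl http://192.168.45.152:9200/_cluster/health?pretty

A "status" : "green" (or "yellow" for a single-node cluster with unassigned replica shards) in the response means the cluster is ready to receive logs.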

Now, get ready to send an application log to Elasticsearch.

How to do it…

Let's use the sample application that was introduced in the Moving monolithic to microservices recipe in Chapter 5, Building a Continuous Delivery Pipeline. Prepare the Python Flask program as follows:

# cat entry.py

from flask import Flask, request 
app = Flask(__name__) 

@app.route("/") 
def hello(): 
    return "Hello World!" 

@app.route("/addition/<int:x>/<int:y>") 
def add(x, y): 
    return "%d" % (x+y) 

if __name__ == "__main__": 
    app.run(host='0.0.0.0') 

Use this application to send a log to Elasticsearch.
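When the application runs, Flask's built-in server writes an access log line to stderr for each request; it looks roughly like this (the exact format depends on the werkzeug version):

192.168.1.23 - - [01/Mar/2016 10:21:05] "GET /addition/3/5 HTTP/1.1" 200 -

This common-log-style line is what the Logstash grok pattern in the next section is written to parse.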

Logstash

To send the application logs to Elasticsearch, using Logstash (https://www.elastic.co/products/logstash) is the easiest way, because it converts the logs from a plain text format into the Elasticsearch (JSON) format.

Logstash needs a configuration file that specifies the Elasticsearch IP address and port number. In this recipe, Elasticsearch is managed by a Kubernetes service; therefore, the IP address and port number can be found through the following environment variables:

Item                        Environment Variable           Example
Elasticsearch IP address    ELASTICSEARCH_SERVICE_HOST     192.168.45.152
Elasticsearch port number   ELASTICSEARCH_SERVICE_PORT     9200
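Kubernetes injects these variables into every pod that starts after the elasticsearch service has been created; once a pod is running, you can confirm them with kubectl exec (the pod name below is hypothetical, and the exec syntax may differ slightly between kubectl versions):

# kubectl exec my-calc-elk-rc-cnivg env | grep ELASTICSEARCH_SERVICE
ELASTICSEARCH_SERVICE_HOST=192.168.45.152
ELASTICSEARCH_SERVICE_PORT=9200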

However, the Logstash configuration file doesn't support environment variables directly. Therefore, the Logstash configuration file uses the placeholders _ES_IP_ and _ES_PORT_ as follows:

# cat logstash.conf.temp 

input {
  stdin {}
}

filter {
  grok {
    match => {
      "message" => '%{IPORHOST:clientip} %{HTTPDUSER:ident} %{USER:auth} \[%{DATA:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)'
    }
  }
}

output {
  elasticsearch {
    hosts => ["_ES_IP_:_ES_PORT_"]
    index => "mycalc-access"
  }

  stdout { codec => rubydebug }
}
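To sanity-check the grok pattern in isolation, you can run Logstash with a minimal stdin-to-stdout configuration that contains only the grok filter (grok-test.conf is a hypothetical throwaway file; no Elasticsearch is needed):

# cat > grok-test.conf <<'EOF'
input { stdin {} }

filter {
  grok {
    match => {
      "message" => '%{IPORHOST:clientip} %{HTTPDUSER:ident} %{USER:auth} \[%{DATA:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)'
    }
  }
}

output { stdout { codec => rubydebug } }
EOF

# echo '127.0.0.1 - - [01/Mar/2016 10:21:05] "GET /addition/3/5 HTTP/1.1" 200 -' | logstash-2.2.2/bin/logstash -f grok-test.conf

If the pattern matches, the rubydebug output shows the extracted fields (clientip, verb, request, response, and so on); a _grokparsefailure tag in the output means the line did not match.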

Startup script

The startup script reads the environment variables and then replaces the placeholders with the real IP address and port number, as follows:

#!/bin/sh

TEMPLATE="logstash.conf.temp"
LOGSTASH="logstash-2.2.2/bin/logstash"

cat $TEMPLATE | sed "s/_ES_IP_/$ELASTICSEARCH_SERVICE_HOST/g" | sed "s/_ES_PORT_/$ELASTICSEARCH_SERVICE_PORT/g" > logstash.conf

python entry.py 2>&1 | $LOGSTASH -f logstash.conf
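You can dry-run the substitution outside the container by exporting the two variables yourself (the values below are examples):

# export ELASTICSEARCH_SERVICE_HOST=192.168.45.152
# export ELASTICSEARCH_SERVICE_PORT=9200
# sed "s/_ES_IP_/$ELASTICSEARCH_SERVICE_HOST/g; s/_ES_PORT_/$ELASTICSEARCH_SERVICE_PORT/g" logstash.conf.temp | grep hosts
    hosts => ["192.168.45.152:9200"]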

Dockerfile

Finally, prepare the Dockerfile as follows to build the sample application:

FROM ubuntu:14.04

# Update packages
RUN apt-get update -y

# Install Python Setuptools
RUN apt-get install -y python-setuptools git telnet curl openjdk-7-jre

# Install pip
RUN easy_install pip

# Bundle app source
ADD . /src
WORKDIR /src

# Download LogStash
RUN curl -L -O https://download.elastic.co/logstash/logstash/logstash-2.2.2.tar.gz

RUN tar -zxf logstash-2.2.2.tar.gz

# Add and install Python modules
RUN pip install Flask

# Expose
EXPOSE  5000

# Run
CMD ["./startup.sh"]

Docker build

Let's build the sample application using the docker build command:

# ls
Dockerfile  entry.py  logstash.conf.temp  startup.sh

# docker build -t hidetosaito/my-calc-elk .
Sending build context to Docker daemon  5.12 kB
Step 1 : FROM ubuntu:14.04
 ---> 1a094f2972de
Step 2 : RUN apt-get update -y
 ---> Using cache
 ---> 40ff7cc39c20
Step 3 : RUN apt-get install -y python-setuptools git telnet curl openjdk-7-jre
 ---> Running in 72df97dcbb9a

(skip…)

Step 11 : CMD ./startup.sh
 ---> Running in 642de424ee7b
 ---> 09f693436005
Removing intermediate container 642de424ee7b
Successfully built 09f693436005

//upload to Docker Hub using your Docker account
# docker login
Username: hidetosaito
Password: 
Email: [email protected]
WARNING: login credentials saved in /root/.docker/config.json
Login Succeeded

//push to Docker Hub
# docker push hidetosaito/my-calc-elk
The push refers to a repository [docker.io/hidetosaito/my-calc-elk] (len: 1)
09f693436005: Pushed 
b4ea761f068a: Pushed 

(skip…)

c3eb196f68a8: Image already exists 
latest: digest: sha256:45c203d6c40398a988d250357f85f1b5ba7b14ae73d449b3ca64b562544cf1d2 size: 22268
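Before deploying to Kubernetes, you can smoke-test the image locally by supplying the two environment variables that the cluster would normally inject (the addresses are examples and must point to a reachable Elasticsearch):

# docker run -d -p 5000:5000 \
  -e ELASTICSEARCH_SERVICE_HOST=192.168.45.152 \
  -e ELASTICSEARCH_SERVICE_PORT=9200 \
  hidetosaito/my-calc-elk
# curl http://localhost:5000/
Hello World!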

Kubernetes replication controller and service

Now, run this application on Kubernetes to send logs to Elasticsearch. First, prepare the YAML file that loads this application using a replication controller and a service, as follows:

# cat my-calc-elk.yaml 
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-calc-elk-rc
spec:
  replicas: 2
  selector:
    app: my-calc-elk
  template:
    metadata:
      labels:
        app: my-calc-elk
    spec:
      containers:
      - name: my-calc-elk
        image: hidetosaito/my-calc-elk
---
apiVersion: v1
kind: Service
metadata:
  name: my-calc-elk-service
spec:
  ports:
    - protocol: TCP
      port: 5000
  type: ClusterIP
  selector:
    app: my-calc-elk

Use the kubectl command to create the replication controller and service as follows:

# kubectl create -f my-calc-elk.yaml 
replicationcontroller "my-calc-elk-rc" created
service "my-calc-elk-service" created

Check the Kubernetes service to find the IP address of this application, as follows. In this example, it is 192.168.121.63:

# kubectl get service
NAME                  CLUSTER_IP        EXTERNAL_IP   PORT(S)             SELECTOR                  AGE
elasticsearch         192.168.101.143                 9200/TCP,9300/TCP   component=elasticsearch   15h
kubernetes            192.168.0.1       <none>        443/TCP             <none>                    19h
my-calc-elk-service   192.168.121.63    <none>        5000/TCP            app=my-calc-elk           39s

Let's access this application using the curl command as follows:

# curl http://192.168.121.63:5000/
Hello World!

# curl http://192.168.121.63:5000/addition/3/5
8
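To generate enough log entries to chart in Kibana later, you can hit the endpoints in a loop, for example:

# for i in $(seq 1 100); do curl -s http://192.168.121.63:5000/addition/$i/10 > /dev/null; done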

Kibana

Kibana (https://www.elastic.co/products/kibana) is a visualization tool for Elasticsearch. Download Kibana and specify the Elasticsearch IP address and port number to launch Kibana:

//Download Kibana 4.1.6
# curl -O https://download.elastic.co/kibana/kibana/kibana-4.1.6-linux-x64.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17.7M  100 17.7M    0     0  21.1M      0 --:--:-- --:--:-- --:--:-- 21.1M

//unarchive
# tar -zxf kibana-4.1.6-linux-x64.tar.gz

//Find Elasticsearch IP address
# kubectl get services
NAME                  CLUSTER_IP        EXTERNAL_IP   PORT(S)             SELECTOR                  AGE
elasticsearch         192.168.101.143                 9200/TCP,9300/TCP   component=elasticsearch   19h
kubernetes            192.168.0.1       <none>        443/TCP             <none>                    23h

//specify Elasticsearch IP address
# sed -i -e "s/localhost/192.168.101.143/g" kibana-4.1.6-linux-x64/config/kibana.yml

//launch Kibana
# kibana-4.1.6-linux-x64/bin/kibana 
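Kibana 4 listens on TCP port 5601 by default. Before opening the UI, you can confirm that the mycalc-access index has already received documents:

# curl http://192.168.101.143:9200/mycalc-access/_count?pretty

This returns a small JSON document containing the current document count for the index; the count will vary in your environment.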

Then, you will see the application logs. Create a chart as follows:

Note

This cookbook doesn't cover how to configure Kibana; please visit the official page to learn about the Kibana configuration via https://www.elastic.co/products/kibana.

(Screenshot: a Kibana chart built from the application logs)

How it works…

Now, the application logs are captured by Logstash, converted into the JSON format, and then sent to Elasticsearch.

Since Logstash is bundled in the application container, there are no problems when the replication controller increases the number of replicas (pods): it captures all the application logs with no configuration changes.
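For example, scaling out requires no logging changes, because every new pod starts its own Logstash process alongside the application:

# kubectl scale rc my-calc-elk-rc --replicas=4
replicationcontroller "my-calc-elk-rc" scaled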

All the logs will be stored in Elasticsearch as follows:

(Screenshot: the application logs stored in Elasticsearch)
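You can also retrieve the stored documents directly through the standard Elasticsearch search API, using the IP address from the earlier kubectl get services output:

# curl 'http://192.168.101.143:9200/mycalc-access/_search?pretty&size=2'

Each hit contains the fields extracted by grok (clientip, verb, request, response, and so on) alongside the original message.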

See also

This recipe covers how to integrate with the ELK stack. A centralized log collection platform is important in a Kubernetes environment: containers are easy to launch and are destroyed by the scheduler, but it is not easy to know which node runs which pod. Check out the following recipes:

  • Working with Kubernetes logs
  • Working with etcd log
  • The Moving monolithic to microservices recipe in Chapter 5, Building a Continuous Delivery Pipeline