      Deploy Fluentd on Kubernetes

      • Posted by Damian Igbe
      • Categories Cloud-native, Public Cloud
      • Date June 14, 2018

      Kubernetes Logging With Fluentd

      1. Introduction

      Fluentd solves a major problem in today's distributed and complex infrastructure: logging. This tutorial is a how-to on deploying logging in your Kubernetes infrastructure. System logs and application logs help you to understand the activities inside your Kubernetes cluster. Once logs are collected, they can be used for:

      • Security – logs may be needed for compliance
      • Monitoring – application and system logs can help you understand what is happening inside your cluster and help detect potential problems, e.g. monitoring memory usage
      • Troubleshooting and debugging – logs help you find and solve problems

      Like most modern applications, Kubernetes supports logging to help with debugging and monitoring. Kubernetes usually reads logs from the underlying container engine, such as Docker. How much log data Kubernetes collects therefore depends on the logging level enabled at the underlying container engine.
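
      For example, you can read a container's stdout/stderr streams directly with kubectl; the pod name below is hypothetical:

      $ kubectl logs my-app-pod                        # print the pod's collected stdout/stderr
      $ kubectl logs my-app-pod --follow --timestamps  # stream new log lines with timestamps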

      There are different types of logging:

      1. Local logging: This is writing to the standard output and standard error streams inside the container itself. The problem with this method of logging is that when the container dies or is evicted, you may no longer have access to the logs.
      2. Node-level logging: Node-level logging is when the container engine redirects everything from the container's stdout and stderr to another location. For example, the Docker container engine redirects the two streams to a logging driver. Log rotation is a good way to ensure that the logs don't clog the node. This method is better than local logging but still not a perfect solution, because logs remain localized on every node. The ideal solution is to have all the logs sent to a central location for centralized management.
      3. Cluster-level logging: This requires a separate backend to store, analyze, and query logs. The backend can be either inside or outside the cluster. A node-level logging agent (e.g. Fluentd) runs on each node and sends log data to a central logging backend. Typically, the logging agent is a container that has access to a directory with log files from all of the application containers on that node. Kubernetes does not provide a native backend to store and analyze logs, but many existing logging solutions exist that integrate well with a Kubernetes cluster, such as Elasticsearch and Stackdriver. A minimal sketch of this node-level agent pattern is shown after this list.
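
      To make the node-level agent pattern concrete, here is a minimal, illustrative DaemonSet sketch. The name, image and mount paths are assumptions for illustration only; the actual manifests used later in this tutorial live in the cloned repository.

      apiVersion: apps/v1
      kind: DaemonSet
      metadata:
        name: logging-agent               # hypothetical name
        namespace: kube-system
      spec:
        selector:
          matchLabels:
            app: logging-agent
        template:
          metadata:
            labels:
              app: logging-agent
          spec:
            containers:
            - name: fluentd
              image: fluent/fluentd        # illustrative; pin a specific tag in practice
              volumeMounts:
              - name: varlogcontainers
                mountPath: /var/log/containers
                readOnly: true
            volumes:
            - name: varlogcontainers
              hostPath:
                path: /var/log/containers  # node-local container log directory

      Because it is a DaemonSet, one copy of the agent runs on every node, which is exactly the cluster-level pattern described above.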

      2. Fluentd, Elasticsearch and Kibana

      This tutorial discusses how to perform Kubernetes cluster-level logging using Fluentd, Elasticsearch and Kibana. Fluentd is the logging agent deployed on every node. Fluentd collects the standard output and standard error of each container and sends the collected logs to Elasticsearch for analysis. Visualization is done in Kibana. The diagram below (most diagrams are from the Fluentd website) depicts a pictorial view of Fluentd, Elasticsearch and Kibana.


      2.1 What is Elasticsearch?

      Elasticsearch is a search engine that provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

      2.2 What is Kibana?

      Kibana is an open source data visualization plugin for Elasticsearch. It provides visualization capabilities on top of content indexed on an Elasticsearch cluster. Users can create bar, line and scatter plots, or pie charts and maps on top of large volumes of data.

      2.3 What is Fluentd?

      Fluentd is a free and open-source log collector that instantly enables you to have a 'Log Everything' architecture. It has three main attributes:

      • It unifies all facets of processing log data: collecting, filtering, buffering, and outputting logs across multiple sources and destinations.
      • It treats logs as JSON, a popular machine-readable format.
      • It is extensible and currently has over 600 plugins.

      Fluentd agents are deployed on every node to gather all of the logs that are stored within individual nodes in the Kubernetes cluster. The logs can usually be found under the /var/log/containers directory, as shown in the example below.
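
      For example, a listing on one of this tutorial's worker nodes might look like the following; the file shown is taken from the Elasticsearch output later in this tutorial, and the names follow the <pod>_<namespace>_<container>-<container-id>.log pattern:

      $ ls /var/log/containers/
      kibana-logging-5874ff6996-5wqfg_kube-system_kibana-logging-ecb40acfaf294458d95a48c7c0f6993536bfe598a70dd12a27fa22a596490a15.log
      ...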

      Below is the simplified architecture of Fluentd. Fluentd is pluggable, extensible and reliable. It can do buffering, HA and load balancing.

      Input: Tells Fluentd what to log.

      Engine: The main engine containing the common concerns of logging, e.g. buffering, error handling and message routing.

      Output: Where to send the output logs, in the correct format, e.g. MongoDB, PostgreSQL or Elasticsearch.

      The input and output are pluggable and plugins can be classified into Read, Parse, Buffer, Write and Format plugins. Plugins are further discussed below.

      As can be seen from the architecture, Fluentd collects logs from the different sources/applications to be logged. It can collect data from a virtually unlimited number of sources. The collected data is then output to the desired storage backend such as MySQL, MongoDB or PostgreSQL. This is illustrated in the diagram.
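
      As a minimal, illustrative sketch of this Input → Engine → Output flow (not the configuration used later in this tutorial; the path, pos_file and tag are assumptions), a td-agent.conf could tail the container log files and print every parsed event to Fluentd's own stdout:

      <source>
        # Input: follow container log files
        type tail
        path /var/log/containers/*.log
        pos_file /var/log/fluentd-sketch.pos
        format json
        tag sketch.*
      </source>

      <match sketch.**>
        # Output: print every event instead of shipping it to a backend
        type stdout
      </match>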

      2.4 Understanding Fluentd Logging

      Fluentd works with plugins to accomplish its mission. Some plugins are built in, but custom plugins can be developed since Fluentd is extensible. The different types of plugins are illustrated in the diagram below. A good reference for each of the plugins is on the Fluentd website.

      A brief description of each is presented here:

      Input: The entry point of data. This interface allows Fluentd to gather or receive data from external sources, e.g. log file content, data over TCP, built-in metrics, etc. It can also periodically pull data from data sources. A Fluentd event consists of a tag, a time and a record:

      • tag: Where an event comes from. For message routing
      • time: When an event happens. Epoch time
      • record: Actual log content. JSON object

      The input plugin is responsible for generating Fluentd events from the specified data sources; an example event is sketched below.
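
      For illustration (the tag, timestamp and record below are made up), an event generated by tailing a container log could be rendered in the form time tag: record as:

      2018-06-14 10:15:30 +0000 kubernetes.var.log.containers.myapp.log: {"log":"GET /healthz 200\n","stream":"stdout"}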

      Parser: Parsers enable the user to create their own parser formats to read custom data formats. They convert the unstructured data gathered from the Input interface into structured data. Parsers are optional and depend on the Input plugins.
      Filter: Filter plugins enable Fluentd to modify the event streams generated by the Input plugin (a grep sketch is shown after this list of interfaces). Example use cases are:

      1. Filtering out events by grepping the value of one or more fields.
      2. Enriching events by adding new fields.
      3. Deleting or masking certain fields for privacy and compliance.
      Buffer: By default, the data ingested by the Input plugins resides in memory until it is routed and delivered to an Output interface.
      Output: An output defines a destination for the data. There are three types of output plugins: Non-Buffered, Buffered, and Time Sliced.

      • Non-Buffered output plugins do not buffer data and immediately write out results.
      • Buffered output plugins maintain a queue of chunks (a chunk is a collection of events)
      • Time Sliced output plugins are  a type of Buffered plugin where the chunks are keyed by time.
      Formatter: Lets the user extend and re-use custom output formats.
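
      As an illustrative example of the Filter interface (the tag pattern and field name are assumptions, and a reasonably recent Fluentd is assumed), the built-in grep filter can keep only the events whose log field contains the word "warning":

      <filter kubernetes.**>
        type grep
        # keep only events whose "log" field matches /warning/
        <regexp>
          key log
          pattern /warning/
        </regexp>
      </filter>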

      2.5 Understanding how Fluentd Sends Kubernetes Logs to ElasticSearch

      The installation instructions to deploy Fluentd on Kubernetes are below, but it's important to first understand how Fluentd is configured. Fluentd will contact Elasticsearch on a well-defined URL and port, configured inside the Fluentd container. Three plugins are used here: Input, Filter and Output. The diagram below depicts the configuration architecture, and the different plugins are explained. The configuration file is called td-agent.conf; the 'td' stands for Treasure Data, the company behind Fluentd.

      The configuration file is located at /etc/td-agent/td-agent.conf

      2.5.1 Input Plugin:

      Here is the configuration in td-agent.conf to collect logs from /var/log/containers:

      <source>
        type tail
        path /var/log/containers/*.log
        pos_file fluentd-docker.pos
        time_format %Y-%m-%dT%H:%M:%S
        tag kubernetes.*
        format json
        read_from_head true
      </source>

      2.5.2 Filter Plugin

      To get more information out of Docker containers suitable for Kubernetes, a plugin called Kubernetes metadata is required.

      To install the filter plugin:

      gem install fluent-plugin-kubernetes_metadata_filter
      

      Here is the configuration in td-agent.conf to scrape additional Kubernetes parameters:

      <filter kubernetes.var.log.containers.**.log>
      type kubernetes_metadata
      </filter> 
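
      After this filter runs, each event carries metadata about the pod that produced it. The simplified record below is trimmed from the Elasticsearch query output shown later in this tutorial and illustrates the shape of the added docker and kubernetes fields:

      {
        "log": "WARNING: Tini has been relocated to /sbin/tini.\n",
        "stream": "stderr",
        "docker": {
          "container_id": "ecb40acfaf294458d95a48c7c0f6993536bfe598a70dd12a27fa22a596490a15"
        },
        "kubernetes": {
          "container_name": "kibana-logging",
          "namespace_name": "kube-system",
          "pod_name": "kibana-logging-5874ff6996-5wqfg",
          "host": "fluentdslave1"
        }
      }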

      2.5.3 Output Plugin

      For the output, the Elasticsearch plugin will be installed. Full details of the Elasticsearch plugin can be found here.

      Prepare the system for Ruby gems and then install the Elasticsearch plugin:

      sudo apt-get install ruby
      sudo apt-get install make libcurl4-gnutls-dev 
      sudo apt-get install build-essential 
      sudo apt-get install ruby2.3-dev
      gem install fluent-plugin-elasticsearch

      The configuration in td-agent.conf to send log files to Elasticsearch is here:

      <match **>
      type elasticsearch
      user "#{ENV['FLUENT_ELASTICSEARCH_USER']}"
      password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD']}"
      log_level info
      include_tag_key true
      host elasticsearch-logging
      port 9200
      logstash_format true
      # Set the chunk limit the same as for fluentd-gcp.
      buffer_chunk_limit 2M
      # Cap buffer memory usage to 2MiB/chunk * 32 chunks = 64 MiB
      buffer_queue_limit 32
      flush_interval 5s
      # Never wait longer than 30 seconds between retries.
      max_retry_wait 30
      # Disable the limit on the number of retries (retry forever).
      disable_retry_limit
      # Use multiple threads for processing.
      num_threads 8
      </match> 
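
      Note that the user and password above are read from environment variables at runtime. Below is a minimal sketch of how they might be supplied in the Fluentd container spec of the DaemonSet; the value and Secret name are hypothetical:

      env:
      - name: FLUENT_ELASTICSEARCH_USER
        value: "elastic"                        # hypothetical user
      - name: FLUENT_ELASTICSEARCH_PASSWORD
        valueFrom:
          secretKeyRef:
            name: elasticsearch-credentials     # hypothetical Secret holding the password
            key: password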

      Note that Fluentd, Elasticsearch and Kibana will be deployed as separate containers, so the Fluentd configuration above lives in the Fluentd container.

      3. Installing Fluentd, Elasticsearch and Kibana

      To deploy these services, let's use Kubernetes manifest files that are already publicly available. We need to create a deployment and a service for each of the applications. You can find the manifest files cloned to this GitHub location. Only small modifications were made to the YAML templates.

      The Kubernetes installation was performed following Kubernetes with KOPS, one of the earlier blog tutorials. One master and one worker node were used, but you can use as many nodes as desired. Fluentd is deployed as a DaemonSet, so whenever an additional node is added, it will join the cluster and start sending logs to the Elasticsearch service.

      Step 1: Clone the repository on your master Kubernetes node and then create  the deployments and service objects:

      kubectl create -f elastic-search-rc.yaml
      kubectl create -f elasticsearch-svc.yaml
      kubectl create -f kibana-rc.yaml
      kubectl create -f kibana-svc.yaml
      

      Step 2: Create the Fluentd DaemonSet:

      kubectl create -f fluentd-daemonset.yaml
      

      Step 3: Check that all is well and that all the Kubernetes objects are properly deployed:

      $ kubectl get pods --all-namespaces
      
      NAMESPACE   NAME          READY        STATUS RESTARTS AGE
      kube-system elasticsearch-logging-h68v6 1/1 Unknown 0 59d
      kube-system elasticsearch-logging-mpdkv 1/1 Running 5 59d
      kube-system etcd-fluentdmaster 1/1 Running 11 63d
      kube-system fluentd-es-1.24-2z7w5 1/1 Running 5 59d
      kube-system kibana-logging-5874ff6996-5wqfg 1/1 Running 5 59d
      kube-system kube-apiserver-fluentdmaster 1/1 Running 11 63d
      kube-system kube-controller-manager-fluentdmaster 1/1 Running 11 63d
      kube-system kube-dns-6f4fd4bdf-655tv 3/3 Running 30 63d
      kube-system kube-proxy-4ff9h 1/1 Running 10 63d
      kube-system kube-proxy-vclr9 1/1 Running 6 59d
      kube-system kube-scheduler-fluentdmaster 1/1 Running 11 63d
      kube-system weave-net-6w9sd 2/2 Running 19 59d
      kube-system weave-net-m24wm 2/2 Running 3 6h
      
      kubectl get svc --all-namespaces
      NAMESPACE NAME     TYPE      CLUSTER-IP      EXTERNAL-IP PORT(S) AGE
      default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 63d
      kube-system elasticsearch-logging ClusterIP 10.111.123.66 <none> 9200/TCP 63d
      kube-system kibana-logging NodePort 10.96.204.66 <none> 80:30560/TCP 63d
      kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 63d
      

      Step 4: Test Elasticsearch with basic query searches. If Elasticsearch is not working properly, Kibana will give errors when loaded in the browser. Note that the IP address is the service address of Elasticsearch, as can be seen in the kubectl get svc output above.

      curl 10.111.123.66:9200/_search?q=*pretty 
      curl 10.111.123.66:9200/_search?q=*warning
      
      The warning search will give a long output as follows:
      $ curl 10.111.123.66:9200/_search?q=*warning
      {"took":20,"timed_out":false,"_shards":{"total":6,"successful":6,"failed":0},"hits":{"total":5,"max_score":1.0,"hits":[{"_index":"logstash-2018.06.12","_type":"fluentd","_id":"AWP1MbMPybVhNMr2IQmo","_score":1.0,"_source":{"type":"log","@timestamp":"2018-06-12T18:10:50Z","tags":["warning","elasticsearch","admin"],"pid":6,"message":"No living connections","log":"{\"type\":\"log\",\"@timestamp\":\"2018-06-12T18:10:50Z\",\"tags\":[\"warning\",\"elasticsearch\",\"admin\"],\"pid\":6,\"message\":\"No living connections\"}\n","stream":"stdout","docker":{"container_id":"ecb40acfaf294458d95a48c7c0f6993536bfe598a70dd12a27fa22a596490a15"},"kubernetes":{"container_name":"kibana-logging","namespace_name":"kube-system","pod_name":"kibana-logging-5874ff6996-5wqfg","pod_id":"d6402b2f-3f9b-11e8-902e-08002728f91c","labels":{"k8s-app":"kibana-logging","pod-template-hash":"1430992552"},"host":"fluentdslave1","master_url":"https://10.96.0.1:443/api"},"tag":"kubernetes.var.log.containers.kibana-logging-5874ff6996-5wqfg_kube-system_kibana-logging-ecb40acfaf294458d95a48c7c0f6993536bfe598a70dd12a27fa22a596490a15.log"}},{"_index":"logstash-2018.06.12","_type":"fluentd","_id":"AWP1Ma5tybVhNMr2IQkj","_score":1.0,"_source":{"log":"WARNING: Tini has been relocated to /sbin/tini.\n","stream":"stderr","docker":{"container_id":"ecb40acfaf294458d95a48c7c0f6993536bfe598a70dd12a27fa22a596490a15"},"kubernetes":{"container_name":"kibana-logging","namespace_name":"kube-system","pod_name":"kibana-logging-5874ff6996-5wqfg","pod_id":"d6402b2f-3f9b-11e8-902e-08002728f91c","labels":{"k8s-app":"kibana-logging","pod-template-hash":"1430992552"},"host":"fluentdslave1","master_url":"https://10.96.0.1:443/api"},"@timestamp":"2018-06-12T18:10:06+00:00","tag":"kubernetes.var.log.containers.kibana-logging-5874ff6996-5wqfg_kube-system_kibana-logging-ecb40acfaf294458d95a48c7c0f6993536bfe598a70dd12a27fa22a596490a15.log"}},{"_index":"logstash-2018.06.12","_type":"fluentd","_id":"AWP1MbMPybVhNMr2IQmu","_score":1.0,"_source":{"type":"log","@timestamp":"2018-06-12T18:10:52Z","tags":["warning","elasticsearch","admin"],"pid":6,"message":"No living connections","log":"{\"type\":\"log\",\"@timestamp\":\"2018-06-12T18:10:52Z\",\"tags\":[\"warning\",\"elasticsearch\",\"admin\"],\"pid\":6,\"message\":\"No living connections\"}\n","stream":"stdout","docker":{"container_id":"ecb40acfaf294458d95a48c7c0f6993536bfe598a70dd12a27fa22a596490a15"},"kubernetes":{"container_name":"kibana-logging","namespace_name":"kube-system","pod_name":"kibana-logging-5874ff6996-5wqfg","pod_id":"d6402b2f-3f9b-11e8-902e-08002728
      

      You can exec into any of the containers if you need to troubleshoot a service:

      $ kubectl exec -it fluentd-es-1.24-2z7w5 --namespace=kube-system -- /bin/bash
      
      

      Step 5: If all goes well, put the IP of the Kibana service (obtained with kubectl get svc --all-namespaces) in your browser and you will see the Kibana dashboard. In this cluster the kibana-logging service is of type NodePort, so it can also be reached on any node's IP at port 30560, as shown in the service listing above.


      4. Conclusion

      This tutorial discussed how to deploy Fluentd, Kibana and Elasticsearch on a Kubernetes cluster. By following it, you will have a fully functional Kubernetes cluster together with logging. Fluentd is very important and is fast becoming the standard for logging in modern architectures, replacing syslog. If you like the tutorials, do subscribe to our blog and YouTube channel for more coming your way.

      Tags: Elasticsearch, Fluentd, fluentd logging, fluentd tutorials, Kibana, Kubernetes logging

      Damian Igbe
      Damian holds a PhD in Computer Science and has decades of experience in Information Technology and Cloud services. Damian holds a couple of certifications including AWS Certified Solutions Architect- Associate, AWS Certified Developer-Associate and AWS Certified SysOp-Associate. He is the founder and CTO of Cloud Technology Experts. When not writing or teaching or consulting, Damian likes running and spending time with the family.
