Update (Oct. 22, 2018): Since this post was written, several other solutions have popped up to help configure logging with Splunk including https://github.com/splunk/splunk-connect-for-kubernetes.

In my work as an open source developer on the Partner Catalyst Team within Microsoft, I get a chance to work with partners and help them succeed on Azure. Recently, we hosted a hackfest with a partner to help them migrate some of their workload to Kubernetes. The problem we helped the partner solve is common to any microservices architecture: with a multitude of services deployed across a variety of machines, how do you centralize their logs so that you can debug services when things go awry?

In the case of the partner I was working with, they were an internal platform team within a large corporation, tasked with figuring out how to migrate their existing services (some in containers, most not) to Kubernetes. For logging, we needed to come up with a strategy that would work for those existing services, which used an assortment of logging approaches, and push their logs to the team's Splunk instance.

The Kubernetes documentation on cluster-level logging is a good starting place to understand the various patterns. After much discussion, we came up with a two-pronged strategy:

  1. Node Logging Agent - For applications that log to stdout/stderr, Docker persists the logs under /var/lib/docker/containers on each node. A Splunk forwarder will run at the node level, forwarding logs from all containers.
  2. Sidecar Container Logging Agent - For applications that write their logs to a separate file (e.g. Apache access logs), a sidecar container running a Splunk forwarder will forward logs from a shared volume (see the sketch below).
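
For the sidecar approach, a minimal sketch of such a Pod might look like the following. The application name, image, and log path are hypothetical, and the forwarder would need a ConfigMap whose inputs.conf monitors the shared log path (here /var/log/app) rather than /var/log/containers; we reuse the splunk-forwarder-config name created later in this post purely for illustration:

     apiVersion: v1
     kind: Pod
     metadata:
       name: my-app                          # hypothetical application pod
     spec:
       containers:
       - name: my-app
         image: my-app:latest                # hypothetical image that writes /var/log/app/access.log
         volumeMounts:
         - mountPath: /var/log/app
           name: app-logs
       # sidecar: a Splunk Universal Forwarder watching the shared log volume
       - name: splunkuf
         image: splunk/universalforwarder:6.5.2-monitor
         env:
         - name: SPLUNK_START_ARGS
           value: "--accept-license --answer-yes"
         volumeMounts:
         - mountPath: /var/log/app
           readOnly: true
           name: app-logs
         - mountPath: /opt/splunk/etc/apps/splunkclouduf/default
           name: splunk-config
       volumes:
       - name: app-logs
         emptyDir: {}
       - name: splunk-config
         configMap:
           name: splunk-forwarder-config     # needs an inputs.conf pointed at /var/log/app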

The rest of this post will walk through how we set this up.

  1. Download Universal Forwarder Credentials
  • Log in to your Splunk instance, and on the home page, click on "Universal Forwarder".
  • Download the Universal Forwarder Credential Package.
  • Untar the .spl file (see the command below).
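
    The .spl package is just a gzipped tar archive, so you can extract it with tar (the filename below is a placeholder; substitute whatever your download is called):

        # .spl files are gzipped tarballs; extracting yields the credential files
        # (outputs.conf, limits.conf, cacert.pem, client.pem, server.pem)
        tar xvzf splunkclouduf.spl
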
  2. Create Config Map

    • Create a config map of the Splunk Universal Forwarder Credentials:

         kubectl create configmap splunk-forwarder-config --from-file {PATH-TO-YOUR-SPLUNK-CREDENTIALS} --dry-run -o yaml > splunk-forwarder-config.yaml
      
    • Don't apply it just yet! We'll need to make a slight addition. Add an inputs.conf to the data section of the ConfigMap. Here's what ours looked like:

        kind: ConfigMap
        apiVersion: v1
        metadata:
          name: splunk-forwarder-config
        data:
          cacert.pem: ...
          client.pem: ...
          limits.conf: ...
          outputs.conf: ...
          server.pem: ...
          inputs.conf: |
             # watch all files in <path>
             [monitor:///var/log/containers/*.log]
             # extract `host` from the first group in the filename
             host_regex = /var/log/containers/(.*)_.*_.*\.log
             # set source type to Kubernetes
             sourcetype = kubernetes
      

      Docker logs are persisted on the node under the following directory structure: /var/lib/docker/containers/{ContainerId}. These logs carry no knowledge of the Kubernetes cluster, which makes them awkward to work with: you would need a container ID just to search for a pod's logs.
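
      Each line in those files is a JSON object written by Docker's json-file logging driver; a single stdout line looks roughly like this (the request and timestamp are made up):

        {"log":"GET /healthz HTTP/1.1 200\n","stream":"stdout","time":"2017-04-25T22:41:34.633397849Z"}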

      Kubernetes resolves this by creating symbolic links that capture the pod name, namespace, container name, and Docker container ID: /var/log/containers/{PodName}_{Namespace}_{ContainerName}-{ContainerId}.log --> /var/lib/docker/containers/{ContainerId}/{ContainerId}-json.log. It is this directory that we will instruct our Splunk forwarder to watch.
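
      To make this concrete, a listing on one of the nodes might look like the following (names and IDs are made up); the host_regex in our inputs.conf captures the pod name, here my-app-3272071549-9bz2k, and reports it as the Splunk host field:

        # list the symlinks Kubernetes maintains for the running containers
        ls -l /var/log/containers/
        # my-app-3272071549-9bz2k_default_my-app-0a1b2c3d4e.log -> /var/lib/docker/containers/0a1b2c3d4e/0a1b2c3d4e-json.log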

    • Apply the config map:

        kubectl create -f splunk-forwarder-config.yaml
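
      You can verify that the ConfigMap contains all of its keys, including the inputs.conf we added:

        kubectl describe configmap splunk-forwarder-config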
      
  3. Create Splunk Forwarder DaemonSet

    We want to run the Splunk forwarder on every node to forward logs from /var/log/containers/*.log. We achieve this by creating a DaemonSet:

     apiVersion: extensions/v1beta1
     kind: DaemonSet
     metadata:
       name: splunk-forwarder-daemonset
     spec:
       template:
         metadata:
           labels:
              app: splunk-forwarder
         spec:
           containers:
           - name: splunkuf
             image: splunk/universalforwarder:6.5.2-monitor
             env:
             - name: SPLUNK_START_ARGS
               value: "--accept-license --answer-yes"
             - name: SPLUNK_USER
               value: root
             volumeMounts:
             - mountPath: /var/run/docker.sock
               readOnly: true
               name: docker-socket
             - mountPath: /var/lib/docker/containers
               readOnly: true
               name: container-logs
             - mountPath: /opt/splunk/etc/apps/splunkclouduf/default
               name: splunk-config
             - mountPath: /var/log/containers
               readOnly: true
               name: pod-logs
           volumes:
             - name: docker-socket
               hostPath:
                 path: /var/run/docker.sock
             - name: container-logs
               hostPath:
                 path: /var/lib/docker/containers
             - name: pod-logs
               hostPath:
                 path: /var/log/containers
             - name: splunk-config
               configMap:
                 name: splunk-forwarder-config
    
     • We're using Splunk's Universal Forwarder image.
     • As the files in /var/log/containers are symbolic links to files under /var/lib/docker/containers, we need to mount both paths in our Pod.
     • The config map is mounted as a volume in the Pod at /opt/splunk/etc/apps/splunkclouduf/default.
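
     Assuming the DaemonSet manifest above is saved as splunk-forwarder-daemonset.yaml (the filename is our own choice), create it and confirm that one forwarder pod is running on each node:

        kubectl create -f splunk-forwarder-daemonset.yaml
        # expect one splunk-forwarder pod per node
        kubectl get pods -l app=splunk-forwarder -o wide
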
  4. Start your workload on Kubernetes, and you should see the container logs being forwarded to your Splunk instance with sourcetype = kubernetes and host = {PodName}.
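
     For example, a Splunk search along these lines (the pod-name prefix my-app is hypothetical, and the index depends on where your outputs.conf sends events) should surface the forwarded container logs:

        sourcetype=kubernetes host=my-app*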