Kubernetes Logging with Splunk

In my work as an open source developer on the Partner Catalyst Team within Microsoft, I get a chance to work with partners and help them succeed on Azure. Recently, we hosted a hackfest with a partner to help them migrate some of their workloads to Kubernetes. The problem we helped them solve is common to any microservices architecture: with a multitude of services deployed across a variety of machines, how do you centralize their logs so that you can debug services when things go awry?

The partner I was working with was an internal platform team within a large corporation, tasked with figuring out a way to migrate their existing services (some in containers, most not) to Kubernetes. For logging, we needed a strategy that would work across their existing services, each of which logged in its own way, and push those logs to their Splunk instance.

The Kubernetes documentation on cluster-level logging is a good starting place to understand the various patterns. After much discussion, we came up with a two-pronged strategy:

  1. Node Logging Agent - For applications that log to stdout/stderr, Docker persists the logs under /var/lib/docker/containers on the node. A Splunk forwarder runs at the node level and forwards logs from all containers.
  2. Sidecar Container Logging Agent - For applications that write their logs to a file instead (e.g. Apache access logs), a sidecar container running a Splunk forwarder forwards logs from a shared volume.
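
As a rough illustration of the sidecar pattern in item 2, here is a minimal Python sketch that builds a Pod manifest in which the application and a forwarder sidecar share an emptyDir volume. The function name, the volume name `app-logs`, the application image `example/apache:latest`, and the log directory are all hypothetical; only the forwarder image comes from this post. A real sidecar would also need the forwarder credentials and an inputs.conf that monitors the shared directory, which are left out here.

```python
import json


def sidecar_pod(app_name, app_image, log_dir):
    """Build a Pod manifest (as a dict) with an app container and a forwarder sidecar.

    Both containers mount the same emptyDir volume at log_dir: the app writes
    its log files there and the sidecar reads them.
    """
    shared_mount = {"name": "app-logs", "mountPath": log_dir}  # hypothetical volume name
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": app_name},
        "spec": {
            "containers": [
                {
                    "name": app_name,
                    "image": app_image,
                    "volumeMounts": [shared_mount],
                },
                {
                    "name": "splunk-forwarder-sidecar",
                    "image": "splunk/universalforwarder:6.5.2-monitor",
                    # The sidecar only needs to read the application's log files.
                    "volumeMounts": [dict(shared_mount, readOnly=True)],
                },
            ],
            "volumes": [{"name": "app-logs", "emptyDir": {}}],
        },
    }


if __name__ == "__main__":
    # Hypothetical application image and log directory.
    pod = sidecar_pod("apache", "example/apache:latest", "/var/log/apache2")
    print(json.dumps(pod, indent=2))
```

Since kubectl accepts JSON manifests as well as YAML, the printed output could be saved to a file and passed to `kubectl create -f`.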

The following post will walk through how we set this up.

  1. Download Universal Forwarder Credentials

    • Log in to your Splunk instance and, on the home page, click on "Universal Forwarder".
    • Download the Universal Forwarder Credential Package.
    • Untar the .spl file.
  2. Create Config Map

    • Create a config map of the Splunk Universal Forwarder Credentials:

      kubectl create configmap splunk-forwarder-config --from-file {PATH-TO-YOUR-SPLUNK-CREDENTIALS} --dry-run -o yaml > splunk-forwarder-config.yaml
      
    • Don't apply it just yet! We'll need to make a slight addition. Add an inputs.conf entry to the data section of the ConfigMap. Here's what ours looked like:

      kind: ConfigMap
      apiVersion: v1
      metadata:
        name: splunk-forwarder-config
      data:
        cacert.pem: ...
        client.pem: ...
        limits.conf: ...
        outputs.conf: ...
        server.pem: ...
        inputs.conf: |
           # watch all files in <path>
           [monitor:///var/log/containers/*.log]
           # extract `host` from the first group in the filename
           host_regex = /var/log/containers/(.*)_.*_.*\.log
           # set source type to Kubernetes
           sourcetype = kubernetes
      

      Docker logs are persisted on the node under the following directory structure: /var/lib/docker/containers/{ContainerId}. These files carry no Kubernetes metadata, which makes them hard to search: you would need to know the container ID to find the logs for a given pod.

      Kubernetes resolves this by creating symbolic links whose names capture the pod name, namespace, container name, and Docker container ID: /var/log/containers/{PodName}_{Namespace}_{ContainerName}-{ContainerId}.log --> /var/lib/docker/containers/{ContainerId}/{ContainerId}-json.log. It is this directory that we will instruct our Splunk forwarder to watch.

    • Apply the config map:

      kubectl create -f splunk-forwarder-config.yaml
      
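
To make the host_regex from inputs.conf concrete, here is a small Python sketch that applies the same pattern to a symlink path and, separately, splits the filename into all four fields. The stricter `FIELDS_REGEX`, the function names, and the sample path are illustrative assumptions (including the assumption of a 64-character hex Docker container ID); Splunk itself only uses the first capture group of host_regex.

```python
import re

# Same pattern as host_regex in inputs.conf; group 1 becomes the Splunk `host`.
HOST_REGEX = re.compile(r"/var/log/containers/(.*)_.*_.*\.log")

# Stricter illustrative pattern that names every field in the symlink filename.
# Assumes a 64-character hex Docker container ID.
FIELDS_REGEX = re.compile(
    r"/var/log/containers/"
    r"(?P<pod>[^_]+)_(?P<namespace>[^_]+)_"
    r"(?P<container>.+)-(?P<container_id>[0-9a-f]{64})\.log$"
)


def splunk_host(path):
    """Return what host_regex would extract from path, or None if it does not match."""
    match = HOST_REGEX.search(path)
    return match.group(1) if match else None


def parse_log_path(path):
    """Split a /var/log/containers symlink path into its four fields, or return None."""
    match = FIELDS_REGEX.match(path)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Made-up pod, namespace, container name, and container ID.
    sample = "/var/log/containers/web-7d4b9_default_my-app-" + "0123456789abcdef" * 4 + ".log"
    print(splunk_host(sample))
    print(parse_log_path(sample))
```

The simple pattern works because pod names, namespaces, and container names cannot contain underscores, so the two underscores in the filename are unambiguous separators.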
  3. Create Splunk Forwarder DaemonSet

    We want to run the Splunk forwarder on every node to forward logs from /var/log/containers/*.log. We achieve this by creating a DaemonSet:

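    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: splunk-forwarder-daemonset
    spec:
      # apps/v1 requires a selector that matches the Pod template labels
      selector:
        matchLabels:
          app: splunk-forwarder
      template:
        metadata:
          labels:
            app: splunk-forwarder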
        spec:
          containers:
          - name: splunkuf
            image: splunk/universalforwarder:6.5.2-monitor
            env:
            - name: SPLUNK_START_ARGS
              value: "--accept-license --answer-yes"
            - name: SPLUNK_USER
              value: root
            volumeMounts:
            - mountPath: /var/run/docker.sock
              readOnly: true
              name: docker-socket
            - mountPath: /var/lib/docker/containers
              readOnly: true
              name: container-logs
            - mountPath: /opt/splunk/etc/apps/splunkclouduf/default
              name: splunk-config
            - mountPath: /var/log/containers
              readOnly: true
              name: pod-logs
          volumes:
            - name: docker-socket
              hostPath:
                path: /var/run/docker.sock
            - name: container-logs
              hostPath:
                path: /var/lib/docker/containers
            - name: pod-logs
              hostPath:
                path: /var/log/containers
            - name: splunk-config
              configMap:
                name: splunk-forwarder-config
    
    • We're using Splunk's Universal Forwarder image.
    • As the files in /var/log/containers are symbolic links into /var/lib/docker/containers, we need to mount both paths in our Pod.
    • The config map is mounted as a volume in the Pod at /opt/splunk/etc/apps/splunkclouduf/default.
  4. Start your workload on Kubernetes, and you should see the container logs being forwarded to your Splunk instance with sourcetype = kubernetes and host = {PodName}.
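
Each forwarded event is one raw line from the {ContainerId}-json.log file. Assuming the nodes use Docker's default json-file logging driver, every line is a JSON object with `log`, `stream`, and `time` keys. The sketch below shows how such a line could be unpacked; the function name and the sample line are made up for illustration.

```python
import json


def parse_docker_log_line(line):
    """Parse one line of a Docker json-file log into (time, stream, message)."""
    record = json.loads(line)
    # Docker stores the trailing newline of each message inside the "log" field.
    return record["time"], record["stream"], record["log"].rstrip("\n")


if __name__ == "__main__":
    # Made-up sample line in the json-file format.
    sample = '{"log":"GET /healthz 200\\n","stream":"stdout","time":"2017-03-01T12:00:00.000000000Z"}'
    timestamp, stream, message = parse_docker_log_line(sample)
    print(timestamp, stream, message)
```

Knowing this shape helps when writing Splunk searches, since the application's own message sits inside the `log` field and not at the top level of the event.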