Toader Sebastian

Tue, May 1, 2018


A complete guide to Kubernetes Operator SDK

At Banzai Cloud we are always looking for new and innovative technologies to support our users in their transition towards microservices deployed to Kubernetes, using Pipeline. In recent months we have partnered with CoreOS and Red Hat to work on operators; the project has just been open sourced today and is available on GitHub. If you read through this blog you’ll learn what an operator is and how to use the Operator SDK to develop one, through a concrete example that we built and use here at Banzai Cloud. There are also a few operators available on our GitHub, all built with the new Operator SDK.

tl;dr:

  • a new Kubernetes operator framework has been released today
  • we were actively involved in the new SDK and, as a result, we have released a few operators built on it
  • the operator discussed in this blog provides seamless, out of the box monitoring for any JVM based application, even one that doesn’t expose a scrape interface itself

Deploying and operating complex applications that consist of multiple interdependent components/services on Kubernetes is not always trivial with the constructs Kubernetes provides. As a simple example, if an application requires a minimum number of instances, that can be solved with Kubernetes deployments. However, if these instances have to be reconfigured or reinitialized at runtime whenever the number of instances changes (upscale/downscale), then we need to react to these events and perform the necessary reconfiguration steps. Trying to solve this with scripts that use the Kubernetes command line tools can easily get cumbersome, especially as we get closer to real life use cases where we have to deal with resiliency, log collection, monitoring, etc.

CoreOS introduced operators for automating the handling of these kinds of complex operational scenarios. In a nutshell, operators extend the Kubernetes API through the third party resources mechanism (custom resources) and give fine grained access to, and control over, what’s going on inside the cluster.

Before we go further, a few words about Kubernetes custom resources to better understand what an operator is. A resource in Kubernetes is an endpoint in the Kubernetes API that stores Kubernetes objects (e.g. Pod objects) of a certain kind (e.g. Pod). A custom resource is essentially a resource that can be added to Kubernetes to extend the basic Kubernetes API. Once a custom resource is installed, users can manage objects of that kind with kubectl, the same way as they do built-in Kubernetes resources like pods. There must be a controller that carries out the operations induced via kubectl; custom controllers are controllers for custom resources. To summarize, an operator is a custom controller that works with custom resources of a certain kind.
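To make this more tangible: a custom resource definition is itself just a Kubernetes object. Here is a minimal sketch with a made-up group and kind (as we’ll see below, the Operator SDK generates this manifest for us):

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  # the name must be of the form <plural>.<group>
  name: myresources.mycompany.com
spec:
  group: mycompany.com
  version: v1alpha1
  scope: Namespaced
  names:
    kind: MyResource
    listKind: MyResourceList
    plural: myresources
    singular: myresource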

CoreOS also developed an SDK for building such operators. The SDK eases the implementation of an operator, as it provides high level APIs to write the operational logic and generates the skeleton for it, saving developers from writing boilerplate code.

Let’s have a look at how we can use the Operator SDK.

First we need to install the Operator SDK on our development machine. If you’re ready for an adventure with the latest and greatest, install the CLI from the master branch. Once the CLI is installed, the development flow looks as follows:

  1. Create a new operator project
  2. Define the Kubernetes resources to watch
  3. Define the operator logic in a designated handler
  4. Update and generate code for custom resources
  5. Build and generate the operator deployment manifests
  6. Deploy the operator
  7. Create custom resources

Create a new operator project

Run the CLI to create a new operator project.

$ cd $GOPATH/src/github.com/<your-github-repo>/
$ operator-sdk new <operator-project-name> --api-version=<your-api-group>/<version> --kind=<custom-resource-kind>
$ cd <operator-project-name>
  • operator-project-name - the CLI generates the project skeleton under this directory
  • your-api-group - this is the Kubernetes API group for the custom resource handled by our operator (e.g. mycompany.com)
  • version - this is the Kubernetes API version for the custom resource handled by our operator (e.g. v1alpha1, v1beta1; see Kubernetes API versioning)
  • custom-resource-kind - the name of the custom resource type
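For example, a hypothetical invocation with the placeholders filled in (the project name, API group and kind below are made up):

$ cd $GOPATH/src/github.com/banzaicloud/
$ operator-sdk new my-operator --api-version=mycompany.com/v1alpha1 --kind=MyResource
$ cd my-operator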

Define the Kubernetes resources to watch

The main.go file placed under cmd/<operator-project-name> is the main entry point that starts and initializes the operator. This is the place to configure the list of resource types the operator is interested in getting notifications about from Kubernetes.
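A minimal sketch of what this looks like, using the made-up API group and kind from above (the import paths follow the early SDK project layout; the concrete main.go of our operator is shown later in this post):

package main

import (
    "context"

    stub "github.com/<your-github-repo>/my-operator/pkg/stub"
    sdk "github.com/operator-framework/operator-sdk/pkg/sdk"
)

func main() {
    // get notified about MyResource objects in the watched namespace
    sdk.Watch("mycompany.com/v1alpha1", "MyResource", "default", 0)
    // route the events to the handler defined in pkg/stub/handler.go
    sdk.Handle(stub.NewHandler())
    sdk.Run(context.TODO())
}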

Define the operator logic in a designated handler

The events related to the watched resources received from Kubernetes are channeled into func (h *Handler) Handle(ctx types.Context, event types.Event) error, defined in pkg/stub/handler.go. This is the place to implement the operator logic that reacts to the various events published by Kubernetes.

Each custom resource has a structure. The structure of the custom resource handled by our operator must be specified in types.go, which resides under pkg/apis/<api-group>/<version>. The Spec field is where we define the structure for the specification of the custom resource. There is also a Status field that is meant to be populated with information describing the state of the custom resource object.
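A minimal skeleton of types.go, again with the made-up kind (the concrete types.go of our operator is shown later in this post):

type MyResource struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata"`
	Spec              MyResourceSpec   `json:"spec"`
	Status            MyResourceStatus `json:"status,omitempty"`
}

type MyResourceSpec struct {
	// the desired state, populated from the spec section of cr.yaml
	Size int `json:"size"`
}

type MyResourceStatus struct {
	// the observed state, maintained by the operator
	Nodes []string `json:"nodes,omitempty"`
}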

The Operator SDK exposes functions for performing CRUD operations on Kubernetes resources:

  • query package - defines functions for retrieving Kubernetes resources available in the cluster
  • action package - defines functions for creating, updating and deleting Kubernetes resources

For more details on how to use these functions see the concrete operator example below.

Update and generate code for custom resources

Whenever a change is made to types.go, some generated code needs to be refreshed, as it depends on the types defined in types.go.

$ operator-sdk generate k8s

Build and generate the operator deployment manifests

Build the operator and generate deployment files.

$ operator-sdk build <your-docker-image>

A docker image containing the binary of your operator is built; this image needs to be pushed to a registry.
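For example:

$ docker push <your-docker-image>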

The deployment files for creating the custom resource and deploying the operator that handles it are generated under the deploy directory.

  • operator.yaml - this is for installing the custom resource definition and deploying the operator (custom controller). Any changes to this file will be overwritten whenever operator-sdk build <your-docker-image> is executed.
  • cr.yaml - this is for defining the spec of the custom resource. This will be unmarshalled into an object and passed to the operator.
  • rbac.yaml - this defines the RBAC rules to be created for the operator, in case the Kubernetes cluster has RBAC enabled.

Deploy the operator

$ kubectl create -f deploy/rbac.yaml
$ kubectl create -f deploy/operator.yaml

Create custom resources

Once the operator is running you can start creating custom resources of the kind your operator was implemented for. Populate the spec section of deploy/cr.yaml with the data you want to pass to the operator. The structure of spec must comply with the structure of the Spec field in types.go.

$ kubectl create -f deploy/cr.yaml

To see the custom resource objects in the cluster:

$ kubectl get <custom-resource-kind>

To see a specific custom resource instance:

$ kubectl get <custom-resource-kind> <custom-resource-object-name>

The Prometheus JMX Exporter case

Our PaaS, Pipeline, deploys applications to Kubernetes clusters and provides enterprise features like monitoring and centralized logging, to name a few.

For monitoring we use Prometheus to collect metrics from the applications we deploy. If you’re interested in why we chose Prometheus, read our monitoring blog series.

Applications may not publish metrics to Prometheus by themselves, so we faced the question of what we could do to enable publishing metrics to Prometheus out of the box for these apps. There is a handy component, the Prometheus JMX Exporter, written for Java applications, that can query data from mBeans via JMX and expose it in the format Prometheus requires.

The requirements here are:

  • identify pods that run Java applications which don’t publish metrics for Prometheus themselves
  • inject the Prometheus JMX Exporter Java agent into the application to expose metrics
  • provide a configuration for the Prometheus JMX Exporter Java agent that controls which metrics are published
  • make the Prometheus server automatically aware of the endpoint where it can scrape metrics
  • these operations should not be intrusive (they should not restart the pod)

In order to meet the requirements listed above we’d have to perform quite a few operations, thus we decided to implement an operator for it. Let’s see how this is implemented.

The Prometheus JMX Exporter, as implemented, can be loaded into Java processes only at JVM startup. Fortunately, only a small change was required to make it loadable into an already running Java process. You can take a look at the change in our jmx_exporter fork.

We need a loader that loads the JMX Exporter Java agent into a running Java process identified by its PID. The loader is a fairly small application and its source code is available here.

The Prometheus JMX Exporter requires a configuration to be passed in. We’ll store the configuration for the exporter in a Kubernetes config map.
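A config map along the following lines will do (a minimal sketch; the single catch-all rule is only illustrative, see the Prometheus JMX Exporter documentation for the full configuration format):

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-jmx-exporter-config
data:
  config.yaml: |
    lowercaseOutputName: true
    rules:
      - pattern: ".*"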

The custom resource for our operator (types.go):

type PrometheusJmxExporter struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata"`
	Spec              PrometheusJmxExporterSpec   `json:"spec"`
	Status            PrometheusJmxExporterStatus `json:"status,omitempty"`
}

type PrometheusJmxExporterSpec struct {
	LabelSelector map[string]string `json:"labelSelector,required"`
	Config        struct {
		ConfigMapName string `json:"configMapName,required"`
		ConfigMapKey  string `json:"configMapKey,required"`
	} `json:"config"`
	Port int `json:"port,required"`
}
  • LabelSelector - specifies the labels by which pods are selected
  • ConfigMapName, ConfigMapKey - identify the config map that contains the configuration for the Prometheus JMX Exporter
  • Port - the port number for the endpoint where metrics will be exposed to the Prometheus server

An example yaml file to create a custom resource object:

apiVersion: "banzaicloud.com/v1alpha1"
kind: "PrometheusJmxExporter"
metadata:
  name: "example-prom-jmx-exp"
spec:
  labelSelector:
    app: dummyapp
  config:
    configMapName: prometheus-jmx-exporter-config
    configMapKey: config.yaml
  port: 9400

The custom resource spec holds the data that tells the operator logic which pods to process, the port to expose the metrics on, and the config map that stores the metrics configuration for the exporter.

The status of a PrometheusJmxExporter custom resource object should list the metrics endpoints that were created based on its spec, thus the structure of the Status field is:

type PrometheusJmxExporterStatus struct {
	MetricsEndpoints []*MetricsEndpoint `json:"metricsEndpoints,omitempty"`
}

type MetricsEndpoint struct {
	Pod  string `json:"pod,required"`
	Port int    `json:"port,required"`
}
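Once pods have been processed, the status section of a PrometheusJmxExporter object could look like this (illustrative, with a made-up pod name):

status:
  metricsEndpoints:
    - pod: dummyapp-5b9fc8765f-x2cqv
      port: 9400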

The operator has to react to events related to PrometheusJmxExporter custom resources and Pods, thus it has to set up watches for these kinds of resources (main.go):

func main() {
    ...
    namespace := os.Getenv("OPERATOR_NAMESPACE")
    // watch both our custom resource and pods in the operator's namespace;
    // the last argument is the resync period (0 disables periodic resyncs)
    sdk.Watch("banzaicloud.com/v1alpha1", "PrometheusJmxExporter", namespace, 0)
    sdk.Watch("v1", "Pod", namespace, 0)
    ...
}

The handler for the events related to PrometheusJmxExporter custom resources and Pods is defined in handler.go:

func (h *Handler) Handle(ctx types.Context, event types.Event) error {
    switch o := event.Object.(type) {
    case *v1alpha1.PrometheusJmxExporter:
        prometheusJmxExporter := o
        // handle create/update events for PrometheusJmxExporter objects
        ...
    case *v1.Pod:
        pod := o
        // handle create/update/delete events for pods
        ...
    }
    ...
}

When a PrometheusJmxExporter custom resource object is created/updated, the operator:

  1. queries all pods in the current namespace whose labels match the labelSelector of the PrometheusJmxExporter custom resource object’s spec
  2. verifies which of the returned pods have already been processed, and skips those
  3. processes the remaining pods
  4. updates the status of the current PrometheusJmxExporter custom resource with the newly created metrics endpoints

When a Pod is created/updated/deleted, the operator:

  1. searches for the PrometheusJmxExporter custom resource object whose labelSelector matches the pod
  2. if a PrometheusJmxExporter custom resource object is found, continues with processing the pod
  3. updates the status of that PrometheusJmxExporter custom resource with the newly created metrics endpoints

In order to query Kubernetes resources we use the query package of the Operator SDK.

e.g.:

// list the pods in the given namespace whose labels match labelSelector
podList := v1.PodList{
    TypeMeta: metav1.TypeMeta{
        Kind:       "Pod",
        APIVersion: "v1",
    },
}

listOptions := query.WithListOptions(&metav1.ListOptions{
    LabelSelector:        labelSelector,
    IncludeUninitialized: false,
})

err := query.List(namespace, &podList, listOptions)
if err != nil {
    logrus.Errorf("Failed to query pods : %v", err)
    return nil, err
}

// list all PrometheusJmxExporter objects in the given namespace
jmxExporterList := v1alpha1.PrometheusJmxExporterList{
    TypeMeta: metav1.TypeMeta{
        Kind:       "PrometheusJmxExporter",
        APIVersion: "banzaicloud.com/v1alpha1",
    },
}

listOptions := query.WithListOptions(&metav1.ListOptions{
    IncludeUninitialized: false,
})

if err := query.List(namespace, &jmxExporterList, listOptions); err != nil {
    logrus.Errorf("Failed to query prometheusjmxexporters : %v", err)
    return nil, err
}

To update Kubernetes resources we use the action package of the Operator SDK. e.g.:

// compute the new status and update it only if it actually changed
newStatus := createPrometheusJmxExporterStatus(podList.Items)

if !prometheusJmxExporter.Status.Equals(newStatus) {
    prometheusJmxExporter.Status = newStatus

    logrus.Infof(
        "PrometheusJmxExporter: '%s/%s' : Update status",
        prometheusJmxExporter.Namespace,
        prometheusJmxExporter.Name)

    if err := action.Update(prometheusJmxExporter); err != nil {
        logrus.Errorf("Failed to update PrometheusJmxExporter status: %v", err)
    }
}

The processing of a pod consists of the following steps:

  1. execute jps inside the containers of the pod to get the PIDs of the Java processes
  2. copy the Prometheus JMX Exporter and the Java agent loader artifacts into the containers where a Java process was found
  3. read the exporter configuration from the config map and copy it into the container as a config file
  4. run the loader inside the container to load the exporter into the running Java process
  5. add the exporter’s port to the container’s exposed port list, so that the Prometheus server will be able to scrape it
  6. annotate the pod with prometheus.io/scrape and prometheus.io/port, as the Prometheus server scrapes pods carrying these annotations
  7. flag the pod with an annotation to mark that it has been successfully processed (see the sketch after this list)
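A minimal sketch of the last two steps, assuming the pod object was fetched via the SDK’s query package; the exporterPort variable and the processed-marker annotation key are made up for illustration:

if pod.Annotations == nil {
    pod.Annotations = map[string]string{}
}
// tell the Prometheus server to scrape this pod on the exporter's port
pod.Annotations["prometheus.io/scrape"] = "true"
pod.Annotations["prometheus.io/port"] = strconv.Itoa(exporterPort)
// hypothetical marker so that already processed pods are skipped
pod.Annotations["prometheus-jmx-exporter/processed"] = "true"

if err := action.Update(pod); err != nil {
    logrus.Errorf("Failed to update pod '%s/%s': %v", pod.Namespace, pod.Name, err)
}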

As the Kubernetes API doesn’t directly support executing a command inside a container, we borrowed the implementation from kubectl exec. The same is true for kubectl cp.

The source code of the Prometheus JMX Exporter operator is available on GitHub.

If you are interested in our technology and open source projects, follow us on GitHub, LinkedIn or Twitter.
