Author Balint Molnar

Kafka security on Kubernetes, automated

Two weeks ago we introduced our Kafka Spotguide for Kubernetes - the easiest way to deploy and operate Apache Kafka on Kubernetes. Since then, it’s been integrated into our application and DevOps container management platform, Pipeline, alongside other spotguides such as Spark on Kubernetes, Zeppelin, NodeJS and Golang, to name a few.

Because we’ve already met our goal of making it easy to set up a Kafka cluster on Kubernetes with just a few clicks, and in less than ten minutes - provisioning and operating its entire infrastructure, both in Kubernetes and Kafka - we’ve shifted our focus to Kafka security.

The Pipeline platform makes enterprise-grade security easy to consume; you can read more about how we tackle security through multiple layers and components here, or read about the CIS Kubernetes benchmark we passed here.

On a default Kafka installation, any user or application can write messages to topics, as well as read data from topics. Because Kafka is usually accessed by multiple applications or teams, and/or the information flying through it is confidential, security is a must. While there are multiple ways of tackling this problem, cloud and Kubernetes-based environments bring an added level of complexity. This is exactly what the Banzai Cloud Pipeline platform makes simple and automates. Keep reading to learn about our method for securing Kafka on Kubernetes.

Kafka security - overview

Kafka security (or general security) can be broken down into three main areas. Documentation pertaining to Kafka security is available on the Apache Kafka site, but these are the high level topics one should go over when considering how best to secure Kafka:

  • Authentication verifies the identity of consumers and producers using SASL or SSL
  • Authorization applies ACLs to authenticated identities, in order to check whether they can read/write from a particular broker
  • Encryption uses TLS to encrypt in-flight data between consumers and producers

The Kafka documentation uses the term SSL when it actually means TLS. For consistency’s sake, we will use the term SSL, as well. However, what we mean to say is TLS.

Kafka security on Kubernetes

This post is not intended to be an exhaustive Kafka security guideline, since there’s already a whole lot of documentation out there. In the following sections, we’ll discuss only those security options made available with the Kafka Spotguide.

Transport layer encryption

Messages routed towards, within, or out of a Kafka cluster are unencrypted by default. By enabling SSL support we can avoid man-in-the-middle attacks and securely transmit data over the network. The Banzai Cloud Pipeline Kafka spotguide allows users to choose between four strategies, then the Kafka spotguide does the rest:

  • Internal - only internal broker communications are secured
  • External - all connections coming from outside the cluster require SSL authentication, internal broker communication is in PLAINTEXT
  • Internal/External - both internal and external communication are secured
  • None - both internal and external communication are in PLAINTEXT
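
Under the hood, these strategies map onto Kafka’s listener configuration. As an illustrative sketch (the listener names, ports, and paths here are our own placeholders, not the Spotguide’s actual values), the Internal/External strategy roughly corresponds to broker settings like:

```properties
# Two named listeners: one for in-cluster traffic, one exposed externally
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9094
listener.security.protocol.map=INTERNAL:SSL,EXTERNAL:SSL
inter.broker.listener.name=INTERNAL

# The External-only strategy would instead map INTERNAL:PLAINTEXT,EXTERNAL:SSL
ssl.keystore.location=/var/run/secrets/kafka/kafka.server.keystore.jks
ssl.truststore.location=/var/run/secrets/kafka/kafka.server.truststore.jks
```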

[Image: Kafka SSL configuration]

In the event someone chooses None, the widely popular (but equally insecure) gRPC and REST proxy for Kafka - Mailgun’s kafka-pixy - is installed. Unfortunately that proxy does not support encryption, thus it’s only available in this case.

The Banzai Cloud Pipeline platform generates the required certificates, but the user can still bring their own. As is usual for Pipeline, the certificates are stored in Vault and managed by our Vault operator for Kubernetes.
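
On the client side, a consumer or producer then needs to trust those certificates. A minimal client properties sketch (the file paths and password are hypothetical; in practice they’d point at the certificates Pipeline pulls from Vault) might look like:

```properties
security.protocol=SSL
ssl.truststore.location=/etc/kafka/certs/client.truststore.jks
ssl.truststore.password=changeit
# Only needed if the brokers require client (mutual) authentication
ssl.keystore.location=/etc/kafka/certs/client.keystore.jks
ssl.keystore.password=changeit
```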

Kafka authentication

Kafka supports multiple auth options; our focus is currently on SASL/SCRAM support or, to be more specific, SCRAM over a SASL_SSL listener. SASL stands for Simple Authentication and Security Layer, but it’s not simple at all. No problem, we’ve automated everything. This approach comes to us from big data’s legacy - the idea being that authentication should be separated from the Kafka protocol, and salted username and password hashes should be stored in Zookeeper.
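
To make the mechanism concrete, here’s a from-scratch Python sketch of the SCRAM-SHA-256 key derivation (per RFC 5802 / RFC 7677) that produces the kind of salted credentials Kafka stores in Zookeeper. This is purely illustrative - it is not code from Kafka or the Spotguide:

```python
import hashlib
import hmac

def scram_sha256_credentials(password: str, salt: bytes, iterations: int = 4096):
    """Derive the SCRAM-SHA-256 values a server stores (RFC 5802 / RFC 7677)."""
    # SaltedPassword := PBKDF2-HMAC-SHA-256(password, salt, iterations)
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # ClientKey := HMAC(SaltedPassword, "Client Key"); StoredKey := H(ClientKey)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    # ServerKey := HMAC(SaltedPassword, "Server Key")
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()
    # The server keeps (salt, iterations, StoredKey, ServerKey) -- never the password
    return {"salt": salt, "iterations": iterations,
            "stored_key": stored_key, "server_key": server_key}
```

Because only the derived keys are stored, a leaked Zookeeper snapshot doesn’t directly reveal user passwords - which is exactly why SCRAM is preferred over SASL/PLAIN.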

[Image: Kafka authentication UI]

  • SASL/SCRAM - a username/password mechanism that uses a salted challenge-response and requires TLS encryption

When choosing this option, the Spotguide performs all the required changes, from configuring the brokers to accept secure connections to generating a JAAS file.
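
For reference, SCRAM users are registered with Kafka’s stock tooling, and clients authenticate via a JAAS file. A sketch of both - the user name, password, and Zookeeper address are placeholders, not values the Spotguide actually uses:

```sh
# Register SCRAM-SHA-256 credentials for user "alice" in Zookeeper
kafka-configs.sh --zookeeper zookeeper:2181 --alter \
  --add-config 'SCRAM-SHA-256=[iterations=4096,password=alice-secret]' \
  --entity-type users --entity-name alice
```

```
KafkaClient {
  org.apache.kafka.common.security.scram.ScramLoginModule required
  username="alice"
  password="alice-secret";
};
```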

Kafka authorization

Once Kafka clients are authenticated, Kafka needs to be able to decide what they can or can’t do. Authorization is our friend in this case, controlled by Access Control Lists (ACLs). The Kafka Spotguide adds a set of ACLs when configuring the brokers. There is an admin user (which works only inside the cluster), declared via super.users=User:admin, with all the rights necessary to create topics and ACLs, and to read/write on all topics. Another user (username) is created to access the spotguide-kafka topic from outside of the cluster.

Note that we are using authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer, however, this can always be changed in the broker config.
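
With SimpleAclAuthorizer in place, additional ACLs can be granted with Kafka’s own CLI. An illustrative example - the principal and Zookeeper address are placeholders:

```sh
# Allow "alice" to read from and write to the spotguide-kafka topic
kafka-acls.sh --authorizer-properties zookeeper.connect=zookeeper:2181 \
  --add --allow-principal User:alice \
  --operation Read --operation Write \
  --topic spotguide-kafka
```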

[Image: Kafka authorization UI]

What’s next?

Our work doesn’t stop here. Some of our Kafka Spotguide users have been asking for additional features, while at the same time, there are limitations we’d like to address. These are the high level changes coming soon:

  • Currently, there is only one admin user with a password for inner broker communications, and one configurable user with a password for external communication. Multiple user support is currently being tested, and will be released soon.
  • Kafka does not support connections to Zookeeper via SSL, but it does support SASL authentication; this feature is coming soon.
  • Support for the widely popular Kafka UI has been added, though it works (by design) with full privileges. This is already available via our Spotguide, but we’re going to work on making it more restrictive.

Happy streaming!

About Pipeline

Banzai Cloud’s Pipeline provides a platform which allows enterprises to develop, deploy and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security measures—multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communications between components using TLS, vulnerability scans, static code analysis, CI/CD, etc.—are a tier zero feature of the Pipeline platform, which we strive to automate and enable for all enterprises.

If you’re interested in our technology and open source projects, follow us on GitHub, LinkedIn or Twitter:

