Supertubes automates the deployment of the ksqlDB event streaming database by introducing a new custom resource called KsqlDB. Supertubes provides two ways to manage ksqlDB instances: imperatively, through the Supertubes CLI, and declaratively, through the KsqlDB custom resource.
Both methods use the KsqlDB Custom Resource Definition under the hood to manage ksqlDB instances.
Imperative management of ksqlDB instances
The Supertubes CLI provides commands to deploy ksqlDB instances with either default or custom settings with ease.
Note: To deploy ksqlDB instances or manage existing ones, run the supertubes cluster ksql create and supertubes cluster ksql update commands.
Declarative management of ksqlDB instances
Managing ksqlDB instances with Supertubes is as simple as creating and updating the KsqlDB custom resource. Supertubes automatically monitors the ksqlDB deployment and the configuration settings specified in the KsqlDB custom resource. For details on the custom resource, see the description of the custom resource.
When the custom resource changes, Supertubes performs the necessary steps to spin up new ksqlDB instances or reconfigure existing ones with the desired configuration.
Introduction to ksqlDB
For a detailed description of how to manage ksqlDB with Supertubes, see our Managing ksqlDB with Supertubes blog post.
Modes of operation
The ksqlDB server has two modes of operation: interactive and non-interactive (or headless) mode. For details, see the official ksqlDB documentation.
Supertubes supports both modes, and uses the interactive mode by default. To enable and configure ksqlDB in headless mode, see Running ksqlDB in headless mode.
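As a rough illustration of headless mode, the sketch below combines the headless and queryConfigMapName fields from the custom resource definition on this page with a ConfigMap that stores the queries under the queries.sql key. The stream names, topic names, and SQL statements are hypothetical, and placing the ConfigMap in the same namespace as the KsqlDB resource is an assumption rather than a documented requirement.

```yaml
# Hypothetical ConfigMap holding the queries for headless mode.
# The name matches the documented default:
# <ksqldb cr name>-ksql-queries-configmap
apiVersion: v1
kind: ConfigMap
metadata:
  name: ksqldb-sample-ksql-queries-configmap
  namespace: kafka   # assumption: same namespace as the KsqlDB resource
data:
  # The queries must be stored under the key queries.sql
  queries.sql: |
    -- Illustrative statements only; topics and columns are made up
    CREATE STREAM pageviews (user_id VARCHAR, page VARCHAR)
      WITH (KAFKA_TOPIC='pageviews', VALUE_FORMAT='JSON');
    CREATE STREAM pageviews_home AS
      SELECT user_id, page FROM pageviews WHERE page = 'home';
---
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KsqlDB
metadata:
  name: ksqldb-sample
  namespace: kafka
spec:
  clusterRef:
    name: kafka
  # Switch from the default interactive mode to headless mode
  headless: true
  # Optional here, because this is the default name for this resource
  queryConfigMapName: ksqldb-sample-ksql-queries-configmap
```

For the authoritative procedure, follow Running ksqlDB in headless mode.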
Scaling by HPA
Supertubes takes care of scaling ksqlDB using a Horizontal Pod Autoscaler (HPA). The twist here is that by default, HPAs only support scaling through basic CPU or memory usage. While that’s generally enough for most workloads, for ksqlDB it’s much better to scale by consumer lag.
When ksqlDB cannot keep up with the rate of messages produced on your Kafka topics, it can fall behind in its processing of incoming data. Scaling by consumer lag helps solve this issue far better than scaling by any traditional metric. In the Supertubes ecosystem, we already track consumer lag in our Prometheus instance.
To enable the HPA to understand the consumer lag metrics, deploy the kube-metrics-adapter Helm chart. An already deployed and configured HPA will do the rest for you.
# Default HPA configuration
scaling:
  prometheusUrl: http://prometheus-operator-prometheus.supertubes-system.svc:9090
  # Name of the ksqlDB streams that the PrometheusMetric will be filtered by
  streams:
  # Minimum number of replicas
  minValue: 1
  # Maximum number of replicas
  maxValue: 5
  # Threshold for the hpa to activate
  threshold: 30
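To show where these settings live, here is a minimal sketch of a KsqlDB resource that overrides the replica bounds and the threshold. It only uses the fields listed in the default configuration above. The values are illustrative and not recommendations, and treating the threshold as a consumer lag value is an assumption based on the surrounding description.

```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KsqlDB
metadata:
  name: ksqldb-sample
  namespace: kafka
spec:
  clusterRef:
    name: kafka
  scaling:
    # Illustrative values only
    minValue: 2      # never run fewer than two replicas
    maxValue: 8      # cap the number of replicas
    threshold: 100   # assumption: interpreted as consumer lag
```

Fields that are omitted, such as prometheusUrl, are expected to keep their default values.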
Supertubes security features (like Kafka ACLs) apply to the ksqlDB deployment as well. The following sections detail the additional options that allow you to configure security for ksqlDB.
You can configure the authorization policy through the authorizations field of the KsqlDB custom resource. Only the listed principals can access the ksqlDB server. You can list an arbitrary number of KafkaUser and ServiceAccount entities in the specification.
Example authorization settings
Here’s an example authorization spec that allows traffic to the ksqlDB server for the user-1 user and the default service account.
Authorizations:
- Principal:
    Kind: KafkaUser
    Namespace: kafka
    Name: user-1
- Principal:
    Kind: ServiceAccount
    Namespace: kafka
    Name: default
Access ksqlDB from outside the service mesh
To access ksqlDB from a CLI instance that runs outside the service mesh, you have to configure the certificates manually.
- Extract the certificates from Istio as described in Client applications outside the Istio mesh.
- Use those certificates to configure the CLI as described in the ksqlDB documentation.
Access control
Supertubes manages ACLs for ksqlDB and lets you fine-tune the configuration through the KsqlDB custom resource. For example:
...
Spec:
  # Input topics to be used in ksql queries for reading
  inputTopics:
  # Output topics to be used in ksql queries for write and create
  outputTopics:
...
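A filled-in version might look like the sketch below. It assumes that inputTopics and outputTopics each take a list of topic names, which the fragment above does not state explicitly. The topic names are hypothetical.

```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KsqlDB
metadata:
  name: ksqldb-sample
  namespace: kafka
spec:
  clusterRef:
    name: kafka
  # Assumption: both fields accept a list of topic names
  inputTopics:
    - pageviews        # hypothetical topic that queries read from
  outputTopics:
    - pageviews_home   # hypothetical topic that queries write to or create
```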
The KsqlDB custom resource definition
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KsqlDB
metadata:
  name: ksqldb-sample
  namespace: kafka
spec:
  # Name of the KafkaCluster custom resource that represents the Kafka cluster this ksqlDB instance to connect to
  clusterRef:
    name: kafka
  # Name of the SchemaRegistry custom resource that represents the Schema registry to be made available for ksqlDB
  schemaRegistryRef:
  # Name of the KafkaConnect custom resource that represents the Kafka Connect to be made available for ksqlDB
  kafkaConnectRef:
  # Controls whether mTLS is enforced between ksqlDB and client applications (default: true)
  MTLS: true
  # Affinity settings for ksqlDB pods
  # see https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#node-affinity
  affinity:
  # Controls the list of principals who are authorized to access the ksqlDB REST API
  authorizations:
  # Settings for exposing ksqlDB REST API outside the Kubernetes cluster when running in interactive mode
  externalEndpoint:
  # Controls whether the ksqlDB is running in headless or interactive mode (default: false)
  headless: false
  # Heap settings for ksqlDB (default: -Xms512M -Xmx2G)
  heapOpts: -Xms512M -Xmx2G
  image:
  # PullPolicy describes a policy for if/when to pull a container image
  imagePullPolicy:
  imagePullSecrets:
  # Input topics to be used in ksql queries for reading
  inputTopics:
  # Output topics to be used in ksql queries for write and create
  outputTopics:
  # JmxExporterSpec defines the configuration for jmx exporter
  jmxExporter:
  # Defines the config values for ksqlDB
  ksqlDBConfig:
  # Node selector setting for ksqlDB pods
  # https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
  nodeSelector:
  # Annotations to be applied to ksqlDB pod
  podAnnotations:
  # Labels to be applied to ksqlDB pod
  podLabels:
  # Controls the name of the configmap which contains the ksqldb queries executed in headless mode.
  # (default: <ksqldb cr name>-ksql-queries-configmap) Inside the configmap the query should be named as `queries.sql`
  queryConfigMapName:
  # Resources describes the compute resource requirements
  # default:
  #   requests:
  #     cpu: 1
  #     memory: 1.5Gi
  #   limits:
  #     cpu: 2
  #     memory: 2.5Gi
  resources:
  # Defines HPA configurations
  scaling:
  # Service account for ksqlDB pod
  serviceAccountName:
  # Annotations to be applied on the service that exposes ksqlDB API on port `ServicePort`
  serviceAnnotations:
  # Labels to be applied to the service that exposes ksqlDB API on port `ServicePort`
  serviceLabels:
  # The port ksqlDB listens for REST API requests
  servicePort:
  # Toleration settings for ksqlDB pods
  # see (https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/)
  tolerations:
  # Volume mounts for ksqlDB pods
  # see (https://kubernetes.io/docs/concepts/storage/volumes/)
  volumeMounts:
  # Volumes for ksqlDB pods
  # see (https://kubernetes.io/docs/concepts/storage/volumes/)
  volumes:
The following ksqlDB configurations are computed and maintained by Supertubes, and cannot be overridden:
- ksql.schema.registry.url (if schemaRegistryRef is set)
- ksql.connect.url (if kafkaConnectRef is set)
The default KsqlDB custom resource
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KsqlDB
metadata:
  name: ksqldb-sample
spec:
  clusterRef:
    name: "kafka"