Author Robbie Blaine

Automating Vault Deployment and Configuration on OKD with Bank-Vaults

The following is a guest blog post from Robbie Blaine, Site Reliability Engineer at EOH Big Data Lab. Contributions from the community are a key factor in driving our products forward. BIG thanks to all of you who have engaged with us by raising issues, giving feedback, or creating pull requests. Keep them coming, we love them!


HashiCorp Vault is an Encryption-as-a-Service tool used to securely store and access secrets. A secret can be anything sensitive, such as API keys, passwords, or certificates. Vault offers a single interface to access secrets, with access control and detailed audit logging.

At zenAptix we use Vault as an internal CA for various OpenShift application services (for example, SearchGuard for ElasticSearch), and we use the dynamic secret store for short-lived access to databases on our Analytics-as-a-Service Platform (Aqueduct - Big Data Flow, Technical Overview), which is built on OKD.

What was your Bank-Vaults Journey like?

We were using CoreOS/vault-operator in a development environment when we found Bank-Vaults. We appreciated that it was under active development, built on the Operator SDK, and cloud agnostic.
We also saw an opportunity to contribute to the Open Source community through Bank-Vaults. We found the maintainers friendly on community issues and the community itself active.

As soon as we started experimenting with Bank-Vaults, we got a proof of concept cluster up and running within seconds and we started playing around with adapting the configuration to our needs.

We started by deploying in an OKD Development cluster (oc cluster up) with some permission restrictions disabled (such as giving the system:authenticated group anyuid permissions).
When we started to reintroduce the permission restrictions to run Bank-Vaults in a more production-ready environment, we hit some permission-related issues, chiefly that Vault had to run as the root user. We aren’t comfortable running any container as root in our OpenShift cluster, but we do allow nonroot privileges when needed.

In K8s, to run pods as a specific UID, you can set the securityContext.runAsUser parameter (an integer), so long as the Service Account has the nonroot Security Context Constraint. However, the Vault Operator couldn’t interpret the securityContext field in the manifest.
This looked quite simple to add in Go, so I opened a feature request.
After some discussion, Nándor, one of the maintainers of Bank-Vaults, asked if I could try to do it myself, as the team was busy with a product release. I took a stab at it and opened a Pull Request. The maintainers were very friendly and quick to respond to my Pull Request, which they approved and merged.

Snippet of the Go code in the commit:

// operator/pkg/apis/vault/v1alpha1/types.go:64
SecurityContext v1.PodSecurityContext `json:"securityContext,omitempty"`

// operator/pkg/stub/handler.go:486
SecurityContext: withSecurityContext(v),
// operator/pkg/stub/handler.go:495
func withSecurityContext(v *v1alpha1.Vault) *v1.PodSecurityContext {
  if v.Spec.SecurityContext.Size() == 0 {
    return nil
  }
  return &v.Spec.SecurityContext
}
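The pattern in the snippet above can be tried in isolation. The following is a minimal sketch, not the operator's actual code: PodSecurityContext, VaultSpec, PodSpec and renderPodSpec are simplified, hypothetical stand-ins for the Kubernetes and Bank-Vaults types, and the emptiness check inspects the two modelled fields instead of calling Size(). It shows why the helper returns nil: the generated pod spec then omits securityContext entirely.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PodSecurityContext is a simplified, hypothetical stand-in for the
// Kubernetes v1.PodSecurityContext type; only two fields are modelled.
type PodSecurityContext struct {
	RunAsUser *int64 `json:"runAsUser,omitempty"`
	FSGroup   *int64 `json:"fsGroup,omitempty"`
}

// VaultSpec stands in for the spec of the operator's Vault custom resource.
type VaultSpec struct {
	SecurityContext PodSecurityContext `json:"securityContext,omitempty"`
}

// PodSpec stands in for the pod template the operator generates.
type PodSpec struct {
	SecurityContext *PodSecurityContext `json:"securityContext,omitempty"`
}

// withSecurityContext returns nil when no field is set, so the generated
// pod spec carries no securityContext and the cluster defaults apply.
func withSecurityContext(spec *VaultSpec) *PodSecurityContext {
	sc := &spec.SecurityContext
	if sc.RunAsUser == nil && sc.FSGroup == nil {
		return nil
	}
	return sc
}

// renderPodSpec marshals the pod spec that would be built from spec.
func renderPodSpec(spec *VaultSpec) string {
	out, err := json.Marshal(PodSpec{SecurityContext: withSecurityContext(spec)})
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	uid := int64(1000) // arbitrary non-root UID chosen for illustration
	fmt.Println(renderPodSpec(&VaultSpec{}))
	fmt.Println(renderPodSpec(&VaultSpec{
		SecurityContext: PodSecurityContext{RunAsUser: &uid},
	}))
}
```

With an empty spec the rendered pod spec is `{}`; with runAsUser set it is `{"securityContext":{"runAsUser":1000}}`.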

The second part of the problem was that the official HashiCorp Vault Docker entrypoint script has a couple of commands that would cause the pod to enter a crash loop.

  • Firstly, Bank-Vaults mounts volumes in the Vault container as emptyDir volumes, which are mounted with 0777 permissions but with root:root user and group ownership. The Vault entrypoint script sees that the owner differs from the Vault user and tries to change ownership of the directories (a chown operation), which fails with a Permission Denied error and sends the pod into a crash loop.

  • Secondly, even after the directory ownership issue is resolved, the entrypoint script tries to set the IPC_LOCK capability on the Vault binary (setcap), another operation that requires root and that the Vault user, in the context of the container, did not have permission to execute.

Excerpt of the relevant section of the Docker Entrypoint script:

if [ "$1" = 'vault' ]; then
  # If the config dir is bind mounted then chown it
  if [ "$(stat -c %u /vault/config)" != "$(id -u vault)" ]; then
    chown -R vault:vault /vault/config || echo "Could not chown /vault/config (may not have appropriate permissions)"
  fi
  # If the logs dir is bind mounted then chown it
  if [ "$(stat -c %u /vault/logs)" != "$(id -u vault)" ]; then
    chown -R vault:vault /vault/logs
  fi
  # If the file dir is bind mounted then chown it
  if [ "$(stat -c %u /vault/file)" != "$(id -u vault)" ]; then
    chown -R vault:vault /vault/file
  fi
  if [ -z "$SKIP_SETCAP" ]; then
    # Allow mlock to avoid swapping Vault memory to disk
    setcap cap_ipc_lock=+ep $(readlink -f $(which vault))
    # In the case vault has been started in a container without IPC_LOCK privileges
    if ! vault -version 1>/dev/null 2>/dev/null; then
      >&2 echo "Couldn't start vault with IPC_LOCK. Disabling IPC_LOCK, please use --privileged or --cap-add IPC_LOCK"
      setcap cap_ipc_lock=-ep $(readlink -f $(which vault))
    fi
  fi
  # (remainder of the block omitted from this excerpt)
fi

The IPC_LOCK capability (CAP_IPC_LOCK) is a Linux kernel capability that allows a process to lock memory, for example with mlock, so that application data cannot be swapped from memory onto disk.

The solution was to build a modified Docker image with the breaking sections of the Docker Entrypoint commented out and the IPC_LOCK capability already set in the image, and to grant the Service Account running the Vault Pod permission to use the IPC_LOCK capability (through a SecurityContextConstraint).

Our Vault image is available as an automated build in our Quay repository. Its Dockerfile:

FROM vault:0.11.0
ENV SKIP_SETCAP true
USER root
RUN sed -i 's/chown -R/printf "running in K8s - no chown needed\\n"\n \#chown -R/g' /usr/local/bin/docker-entrypoint.sh && \
    setcap cap_ipc_lock=+ep $(readlink -f $(which vault))
USER vault
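The effect of the sed substitution in the Dockerfile above can be previewed on a single line of the entrypoint. This is a rough sketch under assumptions: patchEntrypoint is a hypothetical helper, and it assumes the image's sed treats `\\n` in the replacement as a literal backslash-n (left for printf to interpret), `\n` as a newline, and `\#` as a plain `#`.

```go
package main

import (
	"fmt"
	"strings"
)

// patchEntrypoint imitates the intended result of the Dockerfile's sed
// command: every "chown -R" becomes a printf notice, followed on the next
// line by the original command commented out.
func patchEntrypoint(script string) string {
	const replacement = "printf \"running in K8s - no chown needed\\n\"\n #chown -R"
	return strings.ReplaceAll(script, "chown -R", replacement)
}

func main() {
	// Sample line taken from the entrypoint excerpt earlier in the post.
	line := "    chown -R vault:vault /vault/logs"
	fmt.Println(patchEntrypoint(line))
}
```

Each chown call is thereby turned into a harmless message while the original command stays in the script as a comment, which keeps the change against the upstream entrypoint easy to audit.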

To follow Robbie and his company, see @rblaine95 and @zenAptix-lab.
