Balint Molnar

Wed, Apr 25, 2018


Kubernetes persistent volume options

At Banzai Cloud we push different types of workloads to Kubernetes with our open source PaaS, Pipeline. We support many deployments for which we have defined Helm charts; however, Pipeline can deploy applications from any repository. These deployments are pushed on-prem or to the cloud, but most of them share one common need: persistent volumes. Kubernetes provides an abundance of options, and every cloud provider has custom or additional offerings as well. This post sheds some light on the available options and offers some guidance.

Volumes

In nearly every scenario applications require some kind of storage: to store config files, to keep the logs they produce, or simply to share results with other applications. This post gives a brief overview of the volumes Kubernetes supports, focusing on Persistent Volumes with ReadWriteMany requirements.

Docker volumes

By default Docker provides volumes for containers, but they are not managed, so if the container crashes for whatever reason, all the data written to them is lost. Kubernetes addresses this problem by providing various managed volumes whose lifecycle does not depend solely on the container that uses them.

Kubernetes EmptyDir

The simplest one is EmptyDir. Kubernetes creates it when the Pod is assigned to a Node and, as its name says, it starts out empty. This volume survives Container crashes, but it is tied to the Pod's presence on the Node: if the Pod is rescheduled to a different Node, the directory is deleted permanently on the previous Node, all information in it is lost, and a new empty directory is created on the new Node. By default EmptyDir uses the underlying machine’s storage, but it can be configured to use the machine’s memory instead. For details check the EmptyDir documentation.
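
As an illustration, here is a minimal sketch of a Pod that mounts a memory-backed EmptyDir. The Pod name, container image, mount path and file name are made up for this example; only the emptyDir volume type and its medium: Memory setting are taken from Kubernetes itself. The snippet only writes the manifest to a file, and applying it is left as a commented-out step.

```shell
# Write a Pod manifest that mounts an in-memory EmptyDir.
# Assumption: emptydir-demo, busybox and /cache are illustrative names only.
cat > emptydir-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-demo
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir:
        medium: Memory   # omit this line to use the Node's disk instead
EOF

# Apply it against your own cluster when ready:
# kubectl create -f emptydir-pod.yaml
```

Anything the container writes under /cache disappears once the Pod leaves the Node, so this only suits scratch data.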

Kubernetes Persistent Volumes

Persistent Volumes are the most durable storage solution that Kubernetes offers. The lifetime of a PV is tied to the Kubernetes cluster, so as long as the cluster is healthy these volumes can be reached. Different cloud providers support different storage solutions for Persistent Volumes: for example, Azure has AzureDisk and AzureFile, while Google has GCEPersistentDisk. Regardless of the cloud provider, the following settings are mandatory when using a Persistent Volume:

Dynamic or Static Provisioning

Persistent Volumes can be provisioned in two ways:

  • Static: In this case the admin creates the Persistent Volumes in advance, and the application developer binds to them through a Persistent Volume Claim that the Pod references by name in its yaml file.
  • Dynamic: Unlike static provisioning, the volume is not set up in advance by an administrator; instead the user specifies the volume requirements (such as requested storage size and access mode) through a Persistent Volume Claim. Administrators should also create at least one Storage Class, which classifies the underlying storage solution, for example how redundant or how fast the storage is. Cloud providers that offer managed Kubernetes set up a default one (the default Storage Class must have the annotation storageclass.beta.kubernetes.io/is-default-class=true). App developers should specify the StorageClass name in the Persistent Volume Claim, otherwise the default is used; the cluster then tries to dynamically bind a volume for the application.
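
A Persistent Volume Claim for dynamic provisioning could look like the sketch below. The claim name, the 5Gi size, the file name and the class name azurefile are example values chosen for illustration, not requirements. The snippet only writes the manifest to a file.

```shell
# Write a Persistent Volume Claim that asks for dynamically provisioned storage.
# Assumption: demo-claim, azurefile and 5Gi are example values.
cat > demo-claim.yaml <<'EOF'
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: demo-claim
spec:
  storageClassName: azurefile   # leave out to fall back to the default StorageClass
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
EOF

# Create it in your own cluster when ready:
# kubectl create -f demo-claim.yaml
```

A Pod then refers to the claim by name, through persistentVolumeClaim.claimName in its volumes section.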

Provisioner

With static provisioning the Persistent Volume, and with dynamic provisioning the Storage Class, needs to contain information about the provisioner. Provisioners are the Kubernetes plugins that bind the required volume to the Pods. Use a provisioner that your environment supports; for example, AzureFile cannot be used on a GKE cluster. There are cloud-independent solutions as well, such as GlusterFS, but they need significantly more configuration (complex, but it can be automated - welcome Pipeline).

Access Mode

Persistent Volumes can be mounted on the VM in three different ways. Keep in mind that not all modes are supported by all resource providers; this table lists the modes each provider supports.

  • ReadWriteOnce (the volume can be mounted as read-write by a single node)
  • ReadOnlyMany (the volume can be mounted read-only by many nodes)
  • ReadWriteMany (the volume can be mounted as read-write by many nodes)

Let’s try it out

Create the cluster

We are going to create an AKS cluster with a StorageClass that uses Azure File as its provisioner, which makes it possible to acquire a ReadWriteMany volume. To do that, we first need a Kubernetes cluster. Pipeline helps you create a Kubernetes cluster from scratch in minutes on all major cloud providers, or adopt an on-prem cluster. Just to recap, Pipeline can:

  • Create Kubernetes clusters on all major cloud providers
  • Provide an end-to-end language agnostic CI/CD solution
  • Manage application (Helm) repositories
  • Manage cluster profiles
  • Deploy applications using Helm and manage app lifecycle
  • Deploy spotguides
  • Provide out of the box observability (log collection, tracing, monitoring)

Please create an AKS cluster with the help of this Postman collection, using the Cluster Create AKS API call. Modify the body of the request to install Kubernetes 1.7.9 instead of 1.9.2; we will upgrade the cluster in a later step.

If you need help creating a Kubernetes cluster with Pipeline please read the following readme.

We are going to use a mysql chart with a dynamically allocated AzureFile volume with access mode ReadWriteMany.

If you are unfamiliar with AzureFile, please follow this link to get more information about it.

Create the StorageClass

To use AzureFile as a StorageClass, a Storage Account has to be created first, in the same resource group where the cluster is located.

az group list --output table
MC_RGbaluchicken_azclusterbaluchicken0_westeurope    westeurope  Succeeded

Once you have identified your resource group, create the Storage Account.

az storage account create --resource-group MC_RGbaluchicken_azclusterbaluchicken787_westeurope --name banzaicloudtest --location westeurope --sku Standard_LRS

Define the StorageClass and create it:

kubectl create -f - <<EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefile
provisioner: kubernetes.io/azure-file
parameters:
  location: westeurope
  skuName: Standard_LRS
  storageAccount: banzaicloudtest
EOF

To interact with the created cluster, download the Kubernetes config using the Cluster Config call in the Postman collection.

Deploy Mysql

Use the given Postman collection to deploy the mysql chart into the cluster. Look for the Deployment Create API call and replace the body with this:

{
	"name": "stable/mysql",
	"values": {
		"persistence": {
			"accessMode": "ReadWriteMany",
			"storageClass": "azurefile"
		}
	}
}

Check if the pod is up and running:

kubectl get pods -w
NAME                                  READY     STATUS            RESTARTS   AGE
ardent-ferrit-mysql-140208387-f0mhb   0/1       PodInitializing   0          7s
ardent-ferrit-mysql-140208387-f0mhb   0/1       Running   0         20s
ardent-ferrit-mysql-140208387-f0mhb   1/1       Running   0         1m

Everything looks good: the volume was bound successfully and mysql is ready to use. Now upgrade the cluster to a more recent Kubernetes version, 1.8.6.

We are adding programmatic cluster upgrades for AKS as well (as we already do for the other providers); please watch this issue.

az aks upgrade --name azclusterbaluchicken787 --resource-group RGbaluchicken --kubernetes-version 1.8.6

If you want to upgrade to an even higher Kubernetes version, such as 1.9.2, you can do that too, but AKS cannot upgrade directly from 1.7.9 to 1.9.2; you need to upgrade the cluster to 1.8.x first.
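
The constraint can be sketched as a small shell helper that prints the versions to pass through. This is illustrative only: the function name is made up, it assumes versions of the form MAJOR.MINOR.PATCH within the same major version, it generalizes the 1.7 to 1.9 case by assuming that every intermediate minor version must be visited, and it prints a placeholder patch level x for intermediate steps rather than guessing which releases AKS actually offers.

```shell
# Print the upgrade steps between two Kubernetes versions.
# Assumption: upgrades must pass through every intermediate minor version.
upgrade_path() {
  major=$(echo "$1" | cut -d. -f1)
  from_minor=$(echo "$1" | cut -d. -f2)
  to_minor=$(echo "$2" | cut -d. -f2)
  minor=$((from_minor + 1))
  # Emit one placeholder step per skipped minor version.
  while [ "$minor" -lt "$to_minor" ]; do
    echo "$major.$minor.x"
    minor=$((minor + 1))
  done
  # The final step is the requested target version itself.
  echo "$2"
}

upgrade_path 1.7.9 1.9.2
```

For the versions used in this post it prints 1.8.x followed by 1.9.2; each printed step would correspond to one az aks upgrade run with a concrete version number.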

Pipeline’s next release 0.4.0 will support updating the Kubernetes version.

Now check back on the cluster:

kubectl get pods
NAME                                  READY     STATUS             RESTARTS   AGE
ardent-ferrit-mysql-140208387-vhm4q   0/1       CrashLoopBackOff   7          17m

Something went wrong. It turns out that the default directory mode and file mode differ between Kubernetes versions: the default mode is 0777 for Kubernetes v1.6.x and v1.7.x, but 0755 for v1.8.6 and above.

The Helm chart for mysql shows that it uses root to do the setup, so what could be the problem? If we check the Dockerfile, we see that it creates a mysql user, and with mode 0755 that user is not allowed to write anything to the required directory.
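
To see why 0755 is a problem for a non-owner user, the permission bits can be checked with plain shell arithmetic. This sketch is not specific to Kubernetes or Azure File: the helper name is made up, and it assumes the mysql user is neither the owner nor in the group of the mounted directory, so only the "others" write bit (0002) matters.

```shell
# Report whether users other than the owner and group may write,
# based on an octal mode. 0002 is the "others: write" bit.
others_can_write() {
  if [ $(( $1 & 0002 )) -ne 0 ]; then
    echo "yes"
  else
    echo "no"
  fi
}

echo "0755: $(others_can_write 0755)"
echo "0777: $(others_can_write 0777)"
```

With 0755 the answer is no, with 0777 it is yes, which matches the behaviour change between the two Kubernetes versions.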

To solve this, modify the StorageClass created earlier and add the following (this forces the directory and file mode to 0777):

mountOptions:
  - dir_mode=0777
  - file_mode=0777
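
Put together with the StorageClass defined earlier, the complete definition would look roughly like the sketch below. The file name is an arbitrary choice, and the snippet only writes the manifest to a file. How the change is rolled out (for example whether the existing class has to be deleted and recreated, and whether already provisioned volumes pick up the new options) is left open here, since it depends on your cluster.

```shell
# Write the full StorageClass, now including the mount options.
# Values are the ones used earlier in this post; the file name is an assumption.
cat > azurefile-storageclass.yaml <<'EOF'
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefile
provisioner: kubernetes.io/azure-file
mountOptions:
  - dir_mode=0777
  - file_mode=0777
parameters:
  location: westeurope
  skuName: Standard_LRS
  storageAccount: banzaicloudtest
EOF

# Create it in your own cluster when ready:
# kubectl create -f azurefile-storageclass.yaml
```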

You may wonder why we don’t simply put this option in the StorageClass declaration when the Kubernetes version is 1.7.9. The reason is that Kubernetes supports this feature only in 1.8.5 and above.

We hope this was helpful. If you are interested in our technology and open source projects, follow us on GitHub, LinkedIn or Twitter.
