Author Flora Piszker

The challenges (and resolutions) of working with Azure AKS

We are moving relatively quickly, implementing new Pipeline features and releases, with our second major release scheduled for this week. Among other new features we’ve already added a new managed Kubernetes provider, Microsoft’s Azure AKS.

Azure Container Service (AKS) is a preview feature of the Azure Cloud - and we’re proud to be among its earliest adopters. We can provision and deploy apps to Kubernetes on Azure VMs the same way we do on EC2. However, at Banzai Cloud we strongly believe that the future is in managed Kubernetes services; most of our investment in cloud neutrality and provisioning is built on managed Kubernetes services, both in the cloud (GKE, OCI and ACS, in beta or under development) and on-prem.

No official AKS Golang SDK has been made available so far, so we’ve created and open sourced an Azure AKS Golang client/SDK.

We are already successfully pushing Spark on Kubernetes, Zeppelin on Kubernetes, TensorFlow on Kubernetes, TiDB on Kubernetes, Java on Kubernetes and several other spotguides to AKS the cloud native way, and using our CI/CD pipeline to automate the build, deployment, provisioning and monitoring of these applications.

The case for managed Kubernetes (with Pipeline)

  • Cost - there is no direct cost associated with using managed Kubernetes; sometimes even the master nodes come free of charge
  • Ease of use - Kubernetes can be difficult to set up and manage, especially in large-scale environments. Pipeline does this for free on behalf of cloud providers
  • Infrastructure - most managed Kubernetes services include infrastructure along with k8s management, particularly in public clouds
  • Vendor unlocking - we provide a standard interface that allows for the provisioning of managed Kubernetes clusters across multiple providers, even allowing us to build federated clusters across multiple providers

During the AKS integration we came across several issues with, or limitations inherent in, the platform. We’re happy to report that we’ve collected them, fixed what we could and contributed the fixes back to the community to make AKS a little better and speed up its progress towards GA. Just like Pipeline, every product in preview mode needs early adopters: individuals or companies on the cutting edge of software development.

Azure Managed Kubernetes AKS issues and resolutions

  1. Pipeline tries to install Helm (Tiller) inside an AKS cluster (as a pod) after successfully creating the cluster. Even though the cluster creation API call consistently returns successfully, there have been cases when installing Helm failed with net/http: TLS handshake timeout. Apparently this error message has been encountered by a few other people as well.

    Until this is fixed upstream, we work around it by retrying the Helm install steps in a (crash)-loop.

  2. Pipeline deploys an ingress controller into the cluster and exposes it to the user through a LoadBalancer-type Kubernetes service. For a LoadBalancer-type service, Kubernetes asks the cloud provider for an Azure Load Balancer that is reachable from outside. We’ve seen cases in which it takes minutes for the public IP of the Azure LB to be created and assigned to the Kubernetes service. Until the IP is assigned, the service shows Pending... as its PublicIP.

  3. The Pipeline CI/CD workflow uses Persistent Volumes (PV) to pass data between workflow steps. Pipeline gets a PV by issuing a request for storage via a Persistent Volume Claim (PVC). The PVC describes the type of the requested storage (Storage Class) and its size. The Storage Class describes what kind of storage is requested from the cloud provider. By default, Pipeline asks for the Storage Class named default. In this case, AKS provides a Managed/Standard_LRS storage account, which works fine.

    If we create our own Storage Class with the same parameters (Managed/Standard_LRS), the Storage Account is never created, yet the PVC binds successfully.

    If the Storage Class is changed to Shared instead of Managed, the Storage Account is created but the PVC remains in a pending state.

  4. If we want to programmatically create a Storage Account, we need to pass a Resource Group (RG) to the AKS API. The RG must be the one that corresponds to the AKS cluster and not the one the cluster was created in. We couldn’t find any AKS API method that returns the RG of the cluster. The only solution we found was to reference the RG as MC_<RG that contains the cluster>_<cluster name>_<cluster_location> (e.g. MC_SGBanzaiCloud_cluster1_westeurope). The problem with this approach is that the naming format of the RG that corresponds to the cluster may change over time, so we’d prefer that the AKS API support retrieval of the RG.

  5. The AKS API allows specifying Kubernetes versions, but it doesn’t allow specifying what Docker version is to be installed.

  6. There is a discrepancy in cluster name validation between the AKS API and the CLI. The API allows underscores but not hyphens, while the CLI allows hyphens but not underscores.
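
The retry workaround from item 1 can be sketched roughly as follows. This is a minimal illustration and not Pipeline's actual code: retry, retryInstall and flakyInstall are hypothetical names, the attempt count and delays are arbitrary, and the install step is simulated by a function that fails with the TLS handshake timeout a fixed number of times.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retry calls op until it succeeds, the attempts are used up, or ctx is done.
// The delay doubles after every failed attempt (simple exponential backoff).
func retry(ctx context.Context, attempts int, delay time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 1; i <= attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i == attempts {
			break // no point in sleeping after the final failure
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay):
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

// retryInstall wraps retry with illustrative defaults (5 attempts, 10ms initial delay).
// A real cluster would need much longer delays; these values are assumptions.
func retryInstall(install func() error) error {
	return retry(context.Background(), 5, 10*time.Millisecond, install)
}

// flakyInstall is a hypothetical stand-in for the Helm install step: the
// returned function fails `failures` times, then succeeds. The second return
// value points at the number of calls made so far.
func flakyInstall(failures int) (func() error, *int) {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return errors.New("net/http: TLS handshake timeout")
		}
		return nil
	}, &calls
}

func main() {
	install, calls := flakyInstall(2)
	err := retryInstall(install)
	fmt.Println("error:", err, "calls:", *calls)
}
```

Bounding the number of attempts and backing off between them keeps a persistent failure from turning into an endless loop, at the cost of having to pick limits that suit the cluster's real start-up time.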
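
For item 4, deriving the name of the group that corresponds to the cluster could look like the sketch below. The function name clusterGroupName is hypothetical, and the code relies solely on the MC_ naming pattern described above, which, as noted, may change over time.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterGroupName builds the name of the group that AKS creates for the
// cluster, following the MC_<group>_<cluster name>_<location> pattern
// observed in item 4. This is a naming convention, not an API guarantee.
func clusterGroupName(group, clusterName, location string) (string, error) {
	if group == "" || clusterName == "" || location == "" {
		return "", errors.New("group, cluster name and location must all be set")
	}
	return fmt.Sprintf("MC_%s_%s_%s", group, clusterName, location), nil
}

func main() {
	name, err := clusterGroupName("SGBanzaiCloud", "cluster1", "westeurope")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name) // prints "MC_SGBanzaiCloud_cluster1_westeurope"
}
```

Keeping the convention in a single function means only one place has to change if the naming format is altered or the API starts returning the group directly.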
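
Given the discrepancy in item 6, one defensive option is to accept only cluster names that avoid both characters. The sketch below models nothing beyond the hyphen/underscore rule described above; the function names are hypothetical, and the real validators presumably enforce further rules (length, allowed characters) that are not covered here.

```go
package main

import (
	"fmt"
	"strings"
)

// acceptedByAPI models only the rule from item 6: the API rejects hyphens.
func acceptedByAPI(name string) bool {
	return !strings.Contains(name, "-")
}

// acceptedByCLI models only the rule from item 6: the CLI rejects underscores.
func acceptedByCLI(name string) bool {
	return !strings.Contains(name, "_")
}

// portableClusterName reports whether a non-empty name passes both checks,
// i.e. it contains neither a hyphen nor an underscore.
func portableClusterName(name string) bool {
	return name != "" && acceptedByAPI(name) && acceptedByCLI(name)
}

func main() {
	for _, name := range []string{"cluster1", "my-cluster", "my_cluster"} {
		fmt.Printf("%s: api=%t cli=%t portable=%t\n",
			name, acceptedByAPI(name), acceptedByCLI(name), portableClusterName(name))
	}
}
```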

We hope this was helpful - if you’re interested in how we provision AKS clusters and deploy apps, make sure to check back later this week to read about the new AKS release.

If you’re interested in our technology and open source projects, follow us on GitHub, LinkedIn or Twitter.
