
We are moving relatively quickly, implementing new Pipeline features and releases, with our second major release scheduled for this week. Among other new features, we've already added a new managed Kubernetes provider, Microsoft's Azure AKS. Azure Container Service (AKS) is a preview feature of the Azure Cloud, and we're proud to be among its earliest adopters. We can provision and deploy apps to Kubernetes on Azure VMs the same way we do on EC2; however, at Banzai Cloud we strongly believe that the future is in managed Kubernetes services. Most of our investment in cloud neutrality and provisioning is built on managed Kubernetes services, both in the cloud (GKE, OCI and ACS in beta or under development) and on-prem.
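
As a rough, hedged sketch of what provisioning a cluster through a Pipeline-style REST API could look like, the snippet below posts a cluster-create request. The endpoint path, payload fields and the PIPELINE_URL/PIPELINE_TOKEN environment variables are hypothetical, not Pipeline's documented API.

```python
import os
import requests

# Hypothetical Pipeline-style endpoint and payload -- field names are
# illustrative only, not the actual Pipeline API contract.
PIPELINE_URL = os.environ.get("PIPELINE_URL", "https://pipeline.example.com/api/v1")
TOKEN = os.environ["PIPELINE_TOKEN"]

cluster_request = {
    "name": "demo-aks-cluster",
    "cloud": "azure",          # target provider: Azure AKS
    "location": "westeurope",  # Azure region
    "nodeCount": 3,
    "kubernetesVersion": "1.8.2",
}

resp = requests.post(
    f"{PIPELINE_URL}/clusters",
    json=cluster_request,
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
print("cluster create request accepted:", resp.json())
```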

Read more...

Last time we discussed how our Pipeline PaaS deploys and provisions an AWS EFS filesystem on Kubernetes and what the performance benefits are for Spark or TensorFlow. This post gives an introduction to TensorFlow on Kubernetes and covers the benefits of EFS for TensorFlow (image data storage for TensorFlow jobs). Pipeline uses the kubeflow framework to deploy: a JupyterHub to create and manage interactive Jupyter notebooks, a TensorFlow Training Controller that can be configured to use CPUs or GPUs, and a TensorFlow Serving container. Note that Pipeline also has default Spotguides for Spark and Zeppelin to help support your data science experience.
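
To make the training-controller piece concrete, here is a minimal sketch of the kind of job it could run: train a small Keras model and export it as a SavedModel for a TensorFlow Serving container to pick up. The toy data and the /efs/models export path (an EFS-backed volume mounted into the pod) are assumptions for illustration.

```python
# Train a small Keras model and export it in SavedModel format so a
# TensorFlow Serving container can pick it up. The /efs/models path is an
# assumption (e.g. an EFS-backed volume mounted into the pod).
import numpy as np
import tensorflow as tf

# Toy training data standing in for real image data stored on EFS.
x_train = np.random.rand(1024, 784).astype("float32")
y_train = np.random.randint(0, 10, size=(1024,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=64)

# Export a versioned SavedModel; TensorFlow Serving watches the parent directory.
model.save("/efs/models/demo/1")
```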

Read more...

At Banzai Cloud we provision different frameworks and tools like Spark, Zeppelin and, most recently, TensorFlow, all of which run on our Pipeline PaaS (built on Kubernetes). One of Pipeline's early adopters runs a TensorFlow Training Controller using GPUs on AWS EC2, wired into our CI/CD pipeline, which needs significant parallelization for reading training data. We've introduced support for Amazon Elastic File System (EFS) and made it publicly available in the forthcoming release of Pipeline.
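
As a hedged illustration of why a shared filesystem helps here, the sketch below builds a tf.data input pipeline that reads training files in parallel from an EFS-backed mount. The /efs/training-data path and the TFRecord layout are assumptions, not the adopter's actual setup.

```python
# Build a parallel input pipeline reading training data from an EFS-backed
# mount. The point is the concurrent reads that a shared filesystem serves well.
import tensorflow as tf

files = tf.data.Dataset.list_files("/efs/training-data/*.tfrecord")

dataset = (
    tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)
    .shuffle(buffer_size=10_000)
    .batch(64)
    .prefetch(tf.data.AUTOTUNE)
)

# Pull one batch of serialized examples to sanity-check the pipeline.
for batch in dataset.take(1):
    print("first batch of serialized examples:", batch.shape)
```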

Read more...

At Banzai Cloud we provision different applications and frameworks to Pipeline, the PaaS we built on Kubernetes. We practice what we preach: our PaaS's control plane also runs on Kubernetes and requires a layer of data storage. It was therefore necessary to explore how to deploy and run a distributed, scalable and fully SQL-compliant database, covering two different use cases: our clients' needs and our own internal ones.
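
The excerpt doesn't name the database, so purely as a hedged sketch, the snippet below assumes a PostgreSQL-wire-compatible distributed SQL database exposed through a Kubernetes Service called sql-db; the Service name, namespace, credentials and driver choice are all assumptions for illustration.

```python
# Reach a distributed SQL database from inside the cluster through its
# Kubernetes Service DNS name. Everything below (Service name, credentials,
# PostgreSQL-compatible wire protocol) is assumed for illustration.
import psycopg2

conn = psycopg2.connect(
    host="sql-db.default.svc.cluster.local",  # Kubernetes Service DNS name
    port=5432,
    dbname="pipeline",
    user="pipeline",
    password="change-me",
)

with conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone())
conn.close()
```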

Read more...

At Banzai Cloud we run and deploy containerized applications to Pipeline, our PaaS. Those of you who (like us) run Java applications inside Docker have probably already come across the problem of JVMs inaccurately detecting available memory when running inside a container. Instead of detecting the memory limit of the Docker container, the JVM sees the total memory available on the host machine. This can lead to applications running inside containers being killed whenever they try to use more memory than the container's limit allows.
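
A small sketch, outside the JVM, of the two numbers involved: the host's total memory, which an unpatched JVM uses to size its default heap, versus the container's cgroup memory limit. The cgroup v1 path is an assumption; cgroup v2 exposes the limit at /sys/fs/cgroup/memory.max.

```python
# Compare the host's total memory with the container's cgroup memory limit.
# An unpatched JVM derives its default heap size from the former, which is why
# it can overshoot the container limit and get the process killed.
# The cgroup v1 path below is an assumption; cgroup v2 uses /sys/fs/cgroup/memory.max.

def host_total_memory_bytes():
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1]) * 1024  # value is reported in kB
    raise RuntimeError("MemTotal not found")

def cgroup_memory_limit_bytes():
    with open("/sys/fs/cgroup/memory/memory.limit_in_bytes") as f:
        return int(f.read().strip())

if __name__ == "__main__":
    print("host MemTotal:       ", host_total_memory_bytes())
    print("cgroup memory limit: ", cgroup_memory_limit_bytes())
```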

Read more...

Apache Spark on Kubernetes series:
- Introduction to Spark on Kubernetes
- Scaling Spark made simple on Kubernetes
- The anatomy of Spark applications on Kubernetes
- Monitoring Apache Spark with Prometheus
- Apache Spark CI/CD workflow howto
- Spark History Server on Kubernetes
- Spark scheduling on Kubernetes demystified
- Spark Streaming Checkpointing on Kubernetes
- Deep dive into monitoring Spark and Zeppelin with Prometheus
- Apache Spark application resilience on Kubernetes

Apache Zeppelin on Kubernetes series:
- Running Zeppelin Spark notebooks on Kubernetes
- Running Zeppelin Spark notebooks on Kubernetes - deep dive
- CI/CD flow for Zeppelin notebooks

Read more...

Modern applications and services usually expose their functionality via REST; moreover, their modules and components also make use of external services that are exposed as REST APIs. Thus, developers often need to design RESTful services and write REST service clients. It's a given in this kind of work that these services will be called thousands of times during the development process (developers need to understand the API, as well as the messages and the resources involved), and even after it, to make sure everything works as desired.
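
As a minimal sketch of the kind of exploratory client call this involves (the base URL and resource shape below are hypothetical):

```python
# The throwaway client calls developers end up making over and over while
# exploring a REST API. Base URL and resource fields are hypothetical.
import requests

BASE_URL = "https://api.example.com/v1"

# Create a resource...
resp = requests.post(f"{BASE_URL}/widgets",
                     json={"name": "demo", "size": 3},
                     timeout=10)
resp.raise_for_status()
widget = resp.json()

# ...then read it back to inspect the representation the service returns.
resp = requests.get(f"{BASE_URL}/widgets/{widget['id']}", timeout=10)
resp.raise_for_status()
print(resp.json())
```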

Read more...


As 2017 comes to an end, we’re looking back at the three blog posts that were most popular with our readers. We can’t go too far back (though we’ve had 13 posts and one release already), since we founded our startup just a little over one month ago (on November 20, 2017, to be precise), but during this short period we’ve achieved a whole lot, and laid the foundation for some exciting new projects we plan to ship out early next year.

Read more...

This post is part of the Debug 101 series. If you missed the previous post in this series, check it out here: Nodes successfully joined, not! We're in the middle of deploying Apache Kafka to Kubernetes the cloud-native way: by completely removing the ZooKeeper dependency and using etcd instead. This means that service registry/discovery and other internal Kafka-to-ZooKeeper operations will be dispatched to a pre-existing etcd cluster.
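
As a conceptual sketch only, not the actual implementation: the snippet below shows the kind of key/value registration that moves from ZooKeeper znodes to etcd keys, using the python-etcd3 client. The key layout, etcd endpoint and lease TTL are assumptions for illustration.

```python
# Conceptual illustration of broker registration against etcd instead of
# ZooKeeper. The key layout and endpoint below are assumptions; ephemeral
# znodes map naturally to etcd keys attached to a lease.
import json
import etcd3

etcd = etcd3.client(host="etcd-cluster.default.svc.cluster.local", port=2379)

# A lease plays the role of an ephemeral znode: the key disappears if the
# broker stops refreshing it.
lease = etcd.lease(ttl=10)

broker_info = {"host": "kafka-0.kafka-headless", "port": 9092}
etcd.put("/kafka/brokers/ids/0", json.dumps(broker_info), lease=lease)

value, _meta = etcd.get("/kafka/brokers/ids/0")
print("registered broker:", value.decode())
```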

Read more...