Creating Kubernetes based UberCloud HPC Application Clusters using Containers (2020-12-16)

This article was originally published on UberCloud's blog.

UberCloud provides all necessary automation for integrating cloud-based self-service HPC application portals into enterprise environments. Because the IT landscapes of large organizations differ significantly, we are continuously challenged to provide the necessary flexibility within our own solution stack. Hence we continuously evaluate newly adopted tools and technologies for their readiness to interact with UberCloud’s technology.

The recent adoption of Kubernetes, not just for enterprise workloads but for all sorts of applications, be it on the edge, for AI, or for HPC, is a strong focus for us. We have created hundreds of Kubernetes clusters on various cloud providers hosting HPC applications like Ansys, Comsol, OpenFOAM, and many more. We can deploy fully configured HPC clusters which are dedicated to a single engineer on GKE or AKS within minutes. We can also use EKS, but deploying an EKS cluster is currently significantly slower than on the other platforms (around 3x). While GKE is excellent and has been my favorite service (due to its deployment speed and its good APIs), AKS has become really strong in recent months. Many features which are relevant for us (like spot instances and placement groups) have been implemented on Azure, and AKS cluster allocation has become fast: it is now almost one minute faster than GKE, at 3:30 min. from zero to a fully configured AKS cluster. Great!

When managing HPC applications in dedicated Kubernetes clusters, one challenge remains: how to manage fleets of clusters distributed across multiple clouds? At UberCloud we are building simple tools which take HPC application start requests and turn them into fully automated cluster creation and configuration jobs. One very popular way is to put this logic behind self-service portals where users select the application they want to use. Another way is to create those HPC applications based on events in workflows, CI/CD, and GitOps pipelines. Use cases include automated application testing, running automated compute tasks, cloud bursting, infrastructure-as-code integrations, and more. To support those tasks we’ve developed a container which turns an application and infrastructure description into a managed Kubernetes cluster, independent of where the job runs and of the cloud provider and region in which the cluster is created.

Thanks to the flexibility of containers, UberCloud’s cluster creation container can be used in almost all modern environments which support containers. We are using it as a Kubernetes job and as a CI/CD task. When the job is finished, the engineer has access to a fully configured HPC desktop with an HPC cluster attached.
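To illustrate what running the cluster creation container as a Kubernetes job could look like, here is a minimal Python sketch that wraps an application and infrastructure description in a standard batch/v1 Job manifest. It is not UberCloud's actual interface: the ClusterRequest fields, the image name, and the environment variable names are hypothetical placeholders chosen for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class ClusterRequest:
    """Hypothetical application and infrastructure description."""
    application: str  # e.g. "openfoam"
    provider: str     # e.g. "aks" or "gke"
    region: str       # cloud region in which the cluster is created
    engineer: str     # engineer the cluster is dedicated to


def build_job_manifest(req: ClusterRequest, image: str) -> dict:
    """Wrap a cluster creation request in a Kubernetes batch/v1 Job manifest."""
    # Environment variable names are assumptions made for this sketch.
    env = [
        {"name": "HPC_APPLICATION", "value": req.application},
        {"name": "CLOUD_PROVIDER", "value": req.provider},
        {"name": "CLOUD_REGION", "value": req.region},
        {"name": "CLUSTER_OWNER", "value": req.engineer},
    ]
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"generateName": f"create-{req.application}-"},
        "spec": {
            "backoffLimit": 0,  # no automatic retries of a failed creation
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "cluster-creator", "image": image, "env": env}
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    request = ClusterRequest("openfoam", "aks", "westeurope", "alice")
    # The image reference is a placeholder, not a real UberCloud image.
    manifest = build_job_manifest(request, "example.invalid/cluster-creator:latest")
    print(json.dumps(manifest, indent=2))
```

Because the manifest uses generateName, it would be submitted with kubectl create rather than kubectl apply; every request then gets its own uniquely named job.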

Another integration we just tested is Argo. Argo is a popular workflow engine designed to run on top of Kubernetes. We have a test installation running on GKE. As the UberCloud HPC cluster creation is fully wrapped inside a container running a single binary, the configuration required to integrate it into an Argo workflow is minimal.
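As a rough idea of how small such an integration can be, the following Python sketch generates a single-step Argo Workflow manifest that runs one container. The apiVersion, kind, entrypoint, and templates fields follow Argo's Workflow resource; the image reference and the environment variable names are assumptions made for illustration and do not describe UberCloud's real container.

```python
import json


def build_argo_workflow(image: str, application: str, provider: str, region: str) -> dict:
    """Build a single-step Argo Workflow that runs a cluster creation container."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "hpc-cluster-"},
        "spec": {
            "entrypoint": "create-cluster",
            "templates": [
                {
                    "name": "create-cluster",
                    "container": {
                        "image": image,
                        # Hypothetical variable names; the real container may
                        # expect a different kind of configuration.
                        "env": [
                            {"name": "HPC_APPLICATION", "value": application},
                            {"name": "CLOUD_PROVIDER", "value": provider},
                            {"name": "CLOUD_REGION", "value": region},
                        ],
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder image reference, used only for this example.
    workflow = build_argo_workflow(
        "example.invalid/cluster-creator:latest", "ansys", "aks", "westeurope"
    )
    print(json.dumps(workflow, indent=2))
```

The printed JSON can be saved to a file and submitted like any other workflow manifest. The whole workflow consists of one template with one container, which matches the point that a single-binary container needs very little workflow configuration.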

After the workflow (task) is finished, the engineer automatically gets access to the freshly created remote visualization application running on a newly allocated AKS cluster spanning two node pools, with GUI-based remote Linux desktop access set up.

The overall AKS cluster creation, configuration, and deployment of our services and HPC application containers took just a couple of minutes. This is a solution targeted at IT organizations that are challenged with rolling out HPC applications for their engineers but are required to work with modern cloud-based technologies.