If you follow CSC on social media, you might have noticed a recent announcement about a new service based on OKD/Kubernetes called Rahti. This new service allows you to run your own software packaged in Docker containers on a shared computing platform. The most typical use case is web applications of all sorts. In this blog post I will provide additional context for the announcement, along with more detail and examples of what Rahti is and why it’s useful.
CSC has been running cloud computing services for a while. The first pilot systems were built in 2010 so the tenth anniversary of cloud computing at CSC is coming up next year. All of CSC’s previous offerings in this area – cPouta, ePouta and their predecessors – have been Infrastructure as a Service (IaaS) clouds. In this model, users can create their own virtual servers, virtual networks to connect those servers and virtual disks to store persistent data on the servers. This gives you a lot of flexibility as you get to choose your own operating system and what software to run on that operating system and how. The flip side is that after you get your virtual servers, you are on your own in terms of managing their configuration.
Rahti takes a different approach. Instead of a virtual machine, the central concept is an application. The platform itself provides many of the things that you would need to manage yourself in more flexible IaaS environments. For example:
- Scaling up applications by adding replicas
- Automatic recovery in case of hardware failures
- Rolling updates for a set of application replicas
- Load balancing of traffic to multiple application replicas
Not having to manage these yourself means you can get your applications up and running faster and don’t have to spend as much time maintaining them. What enables this is standardization of the application container and the application lifecycle. In IaaS clouds you have a lot of choice in terms of how you want to make your application fault tolerant and scalable. There are many software products available that you can install and configure yourself to achieve this. With Rahti and other Kubernetes platforms, there is one standard way. This simplifies things greatly while still providing enough flexibility for most use cases.
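To make the list above a bit more concrete, here is a toy model in Python of two of those features: load balancing across replicas and a rolling update. This is only a sketch of the idea and not how Kubernetes or Rahti implements it. The `Replica` class, the `web-N` names and the version labels are all made up for illustration.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Replica:
    name: str
    version: str
    ready: bool = True


def round_robin(replicas, n_requests):
    """Spread requests over the ready replicas in turn; return who served each one."""
    ready = [r for r in replicas if r.ready]
    targets = cycle(ready)
    return [next(targets).name for _ in range(n_requests)]


def rolling_update(replicas, new_version):
    """Update replicas one at a time; yield how many were still serving at each step."""
    for replica in replicas:
        replica.ready = False            # take one replica out of rotation
        serving = sum(r.ready for r in replicas)
        replica.version = new_version    # swap in the new version
        replica.ready = True             # put it back into rotation
        yield serving


# Three hypothetical replicas of a web application.
replicas = [Replica(f"web-{i}", "v1") for i in range(3)]

served_by = round_robin(replicas, 5)
serving_during_update = list(rolling_update(replicas, "v2"))
versions_after = [r.version for r in replicas]
```

The point of the sketch is that only one replica is out of rotation at any time, so the application keeps answering requests while it is being updated. On the platform you get this behaviour without writing any of it yourself.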
Based on the description above you might think that Rahti fits into the Platform as a Service (PaaS) service model. While there are many similarities, traditional PaaS platforms have typically been limited in terms of what programming languages, library versions and tools are supported. It says so right in the NIST Definition of Cloud Computing: “The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider.” These limitations are largely not there in Rahti or other Kubernetes platforms: if it runs in a Docker container, it most likely also runs (or can be made to run) in Rahti. You are free to choose your own programming language, libraries and tooling.
Setting up Spark in Rahti
One of the big benefits of Rahti is that complex distributed applications that would be difficult to install and configure on your own on virtual machines can be packaged into templates and made available for a large number of users. This means figuring out how to run the application has to be done only once – end users can simply take the template, make a few small customizations and quickly get their own instance running. You are of course also free to create your own templates and run your own software.
One example of a distributed application that can be difficult to install and manage is Apache Spark, cluster software meant for processing large datasets. While it is relatively simple to install on a single machine, using it that way would defeat the point of running Spark in the first place: it is meant for tasks that are too big for a single machine to handle. Clustered installations, on the other hand, bring a lot of additional complications: you need to get the servers to communicate with each other, you need to make sure the configuration of the cluster workers is (and stays) identical, and you need some way to scale the cluster up and down depending on the size of your problem. And the list goes on.
Let’s see how one can run Spark in Rahti. The template that we use in Rahti is available on GitHub and the credit for it goes to my colleagues Apurva Nandan and Juha Hulkkonen. And yes, I know that is actually the Hadoop logo.
First select “Apache Spark” from a catalog of applications:
You can also find other useful tools in the catalog such as databases and web servers. After selecting Apache Spark, you’ll get this dialog:
Click next and enter a few basic configuration options. There are many more that you can customize if you scroll down, but most can be left with their default values:
After filling in a name for the cluster, a username and a password, click “Create” and go to the overview page to see the cluster spinning up. After a short wait you’ll see a view like this:
The overview page shows the different components of the Spark cluster: one master, four workers and a Jupyter Notebook that serves as a frontend to the cluster. These run in so-called “pods”. A pod is a collection of one or more containers that share the same IP address. Each worker in the Spark cluster is its own pod, and Rahti distributes the pods across separate servers.
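If the pod concept is new to you, a small mental model in Python may help. It is only an illustration: the `Pod` class, the IP addresses and the node names are invented, and the real scheduler takes many more things into account (available resources, for example) than the simple round-robin placement shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    ip: str                                   # one address shared by every container in the pod
    containers: list = field(default_factory=list)
    node: str = ""                            # the server the pod has been placed on


def spread(pods, nodes):
    """Place pods on nodes in turn, so no node gets a second pod before every node has one."""
    for i, pod in enumerate(pods):
        pod.node = nodes[i % len(nodes)]
    return pods


# Four hypothetical Spark worker pods, each with a single container.
workers = [Pod(f"spark-worker-{i}", f"10.1.0.{i + 10}", ["worker"]) for i in range(4)]
spread(workers, ["node-a", "node-b", "node-c", "node-d"])

nodes_used = {pod.node for pod in workers}
```

With four workers and four nodes, every worker ends up on a different server, which is the situation described above.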
From the overview page you can get information about the status of the cluster, monitor resource usage and add more workers if needed. You can also find a URL to the Jupyter Notebook web interface at the top, and if you expand the master pod view you can find a URL to the Spark master web UI. Both use the username and password you specified when creating the cluster.
If you need a more powerful cluster you can scale it up by adding more workers. Expand the worker pod view and click the up arrow next to the number of pods a few times:
You can then follow the link from the overview page to Jupyter Notebook which acts as a frontend for the Spark cluster.
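Once you are in the notebook, a first test could look something like the sketch below. Treat it as an assumption-laden example rather than the template's documented usage: the service name `my-cluster-spark-master` is hypothetical, and the notebook created by the template may already come with a Spark connection configured, in which case you would use that instead. The only things taken as given are standard Spark conventions: a standalone master is addressed as `spark://host:port`, and 7077 is the default master port.

```python
def master_url(service_name, port=7077):
    """Build a Spark standalone master URL (7077 is Spark's default master port)."""
    return f"spark://{service_name}:{port}"


def sum_of_squares(n, url):
    """Distribute a small computation over the workers of the cluster at `url`."""
    # Imported here so that master_url() can be used without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master(url).appName("rahti-demo").getOrCreate()
    try:
        # Split the numbers into 8 partitions that the workers process in parallel.
        numbers = spark.sparkContext.parallelize(range(n), numSlices=8)
        return numbers.map(lambda x: x * x).sum()
    finally:
        spark.stop()


# Hypothetical service name; replace it with the one shown for your own cluster.
url = master_url("my-cluster-spark-master")

# In a notebook cell you would then run, for example:
# sum_of_squares(1000, url)
```

While a job like this is running, the Spark master web UI mentioned earlier is a good place to check that the work really is being spread over the workers.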
And that’s all there is to it! The process for launching other applications from templates is very similar to the Spark example above. The plan for the future is to add more of these templates to Rahti for various types of software in addition to the ones that are already there.
If you’re interested in learning more about Rahti, you can find info at the Rahti website or you can contact email@example.com.
Photo: Adobe Stock