Here’s the second part of our HPC technology predictions for next year. Missed the first part? No worries, you can still read it here: HPC Predictions for 2016 - Part 1
Olli-Pekka Lehto: Containers make custom environments easy
Containers can provide a mechanism that’s more lightweight than traditional virtualization for supporting custom compute environments in a platform-independent way. For example, users can run their own containers on HPC systems, and HPC centers can provide their software in containers for people to run on their laptops. There are also many other potential use cases to explore, on both the user application side and the systems management side.
During the last year the notion of running HPC workloads in containers (especially Docker) has gained increasing traction. Container support has been emerging, including IBM’s container support in LSF and the User Defined Images service at NERSC, which is based on their Shifter software. The latter has also led Cray to announce that container support on their XC systems is coming soon.
There’s still a lot of work to do and a lot of open questions. Thus I expect that next year will see a boom in pilot projects testing and refining containerization in an HPC context. We at CSC will definitely be testing these features in the upcoming year. I also consider it almost certain that containers will become a standard feature of HPC systems within a couple of years, occupying a middle ground of use cases between full-on virtualization and “traditional” bare-metal HPC.
I recently also wrote a more detailed article about this topic.
Pekka Manninen: Increased application performance via burst I/O buffering
In HPC systems of today, there is a horrendous performance gap between DRAM-based main memory and the storage system, which is typically based on Lustre or a similar parallel file system. On Lustre, the I/O bandwidth is very high because of its highly parallel design, but the downside is pronounced I/O latency. This is due to the access going over an interconnect plus the separate metadata server. In practice, this means that small and frequent I/O accesses are poisonous for application performance - and there are several algorithms and codes in scientific computing that require small but frequent non-contiguous reads and writes.
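To make the latency argument concrete, here is a rough back-of-the-envelope model in Python. It assumes that every request pays a fixed latency and that the data then streams at a fixed bandwidth. The latency and bandwidth figures are illustrative assumptions, not measurements from Lustre or any particular system.

```python
def io_time(total_bytes, request_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimate seconds needed to move total_bytes in requests of request_bytes.

    Simplified model: each request pays a fixed latency, and all bytes
    stream at a fixed bandwidth. Real file systems are more complicated.
    """
    n_requests = -(-total_bytes // request_bytes)  # ceiling division
    return n_requests * latency_s + total_bytes / bandwidth_bytes_per_s


GIB = 1024 ** 3
LATENCY_S = 1e-3   # assumed per-request latency (illustrative only)
BANDWIDTH = 1e9    # assumed per-client bandwidth in bytes/s (illustrative only)

# The same 1 GiB of data, written in 4 KiB requests versus 64 MiB requests.
small_requests = io_time(GIB, 4 * 1024, LATENCY_S, BANDWIDTH)
large_requests = io_time(GIB, 64 * 1024 ** 2, LATENCY_S, BANDWIDTH)

print(f"4 KiB requests:  {small_requests:8.1f} s")
print(f"64 MiB requests: {large_requests:8.1f} s")
```

Under these assumed numbers the latency term dominates the small-request case almost entirely, which is why codes with many tiny, non-contiguous accesses see so little of the file system's nominal bandwidth.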
These bottlenecks have been alleviated in the current generation of HPC systems by introducing some kind of local storage on the compute nodes; earlier with spinning disks and more recently with SSD (solid state disk) based technologies. Typically these options have been available only on mid-range cluster systems, since designs aimed at the Top-100 systems cannot host them due to packaging density considerations. Furthermore, the local disk solutions appear as separate mount points, i.e. separate directories. It has been up to the user or the application code to use the proper storage location.
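As a minimal sketch of what "using the proper storage location" can mean in application code, the Python snippet below writes many small records to node-local scratch and then copies the result to its final destination in one large sequential transfer. The `LOCAL_SCRATCH` environment variable and the function name are hypothetical; actual sites expose local storage under their own paths and variables.

```python
import os
import shutil
import tempfile


def write_via_local_scratch(records, final_path, scratch_dir=None):
    """Write small records to local scratch, then copy once to final_path.

    scratch_dir defaults to the (hypothetical) LOCAL_SCRATCH environment
    variable, falling back to the system temporary directory.
    """
    scratch_dir = (
        scratch_dir
        or os.environ.get("LOCAL_SCRATCH")  # hypothetical variable name
        or tempfile.gettempdir()
    )
    fd, tmp_path = tempfile.mkstemp(dir=scratch_dir)
    try:
        # Many small writes land on the local disk, not the parallel file system.
        with os.fdopen(fd, "wb") as scratch_file:
            for record in records:
                scratch_file.write(record)
        # One large sequential copy to the final (e.g. Lustre) location.
        shutil.copyfile(tmp_path, final_path)
    finally:
        os.remove(tmp_path)
    return final_path


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "result.bin")
    write_via_local_scratch((b"x" * 64 for _ in range(1000)), target)
    print(os.path.getsize(target), "bytes written to", target)
```

The point of the sketch is the burden it places on the user: the staging logic and the choice of directory live in the application rather than in the storage system.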
Next year will bring some improvements to the situation by introducing more levels to the storage hierarchy, between the main memory and the Lustre-type file system. For instance, Cray is introducing their DataWarp technology to the current XC family of supercomputers (to which adding compute node disks of any kind has not been possible). DataWarp is a flash-based burst I/O buffer and a cache level for Lustre. The final version of the technology will be fully transparent to the application: DataWarp will absorb all the bursty I/O, drain it to Lustre in the background, and also be capable of prefetching data. DDN will hit the market with their Infinite Memory Engine solution, which aims to provide a similar outcome but with a different philosophy.
Early versions of these technologies have shown very promising performance benefits for real-world applications. CSC’s Cray XC “Sisu” will get two DataWarp blades in mid-January in order to experiment with the burst buffer technology. The prediction? These technologies will become more or less an auto-include in systems deployed in 2016 that are aimed at hosting real-world, heterogeneous workloads!
Olli-Pekka Lehto: Alternative processors inching to the mainstream
Let’s face it: Intel is completely dominating the processor scene in the HPC server market. However, serious challengers are on the horizon, with multiple ARM vendors (Broadcom, Cavium, etc.) building increasingly capable processors. Increasingly capable HPC prototype systems are also being developed, most prominently within the pan-European Mont Blanc project. Furthermore, IBM has opened up their POWER processor ecosystem via the very active OpenPOWER consortium and is closely partnering with NVIDIA to tightly integrate GPUs. The time seems ripe for some of these challengers to really step up.
On the flipside, there is so much inertia built around Intel and the related ecosystem that I do not expect a huge surge of HPC systems featuring such alternative processor architectures in the near future.
That said, these challengers to Intel’s throne will continue to build up momentum in the datacenters of web-scale IT companies like Facebook, Rackspace and Google. The massive amounts of hardware that these giants consume will surely drive economies of scale and probably make these chips very attractive from a price/performance standpoint for the HPC market as well.