Our previous blog post was about the hits and misses we had in our 2016 HPC predictions. For better or worse, let us give it a new try.
Here is what we think the HPC landscape will look like next year. Enjoy!
Next year will be an interesting year for Intel. Its two main HPC-focused server processors will go head to head, to fight for the hearts of the supercomputing crowd.
In one corner will be the new Xeon processor, codenamed Skylake, featuring 20 or more very fast cores and AVX-512 vector instructions. In the other corner one will find the new Xeon Phi processor, codenamed Knights Landing, which is only now being installed in large supercomputers. This processor features up to 72 thinner cores, each sporting an HPC-tuned version of AVX-512, as well as 16 GB of on-package high-bandwidth memory. The fight will be fair in the sense that the peak performance per node is approximately 3 TFlops for both of them, but the balance in core counts, cache subsystems and memory bandwidth will be different.
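The "approximately 3 TFlops per node" figure can be sanity-checked with the usual back-of-the-envelope formula: sockets × cores × clock × floating point operations per cycle. The sketch below is only an illustration. The core count of 72 and the use of AVX-512 come from the text above, but the clock speeds, the dual-socket Skylake node with 24 cores per socket, and the two FMA units per core are our assumptions, not published specifications.

```python
def peak_tflops(sockets, cores, ghz, flops_per_cycle):
    """Theoretical double-precision peak of one node, in TFlops."""
    return sockets * cores * ghz * flops_per_cycle / 1000.0

# AVX-512 holds 8 doubles per vector; a fused multiply-add counts as 2 flops.
# Two FMA units per core is an assumption made for both processors.
AVX512_DP_FLOPS_PER_CYCLE = 8 * 2 * 2

# Hypothetical node configurations (clock speeds and socket counts are assumed).
knl = peak_tflops(sockets=1, cores=72, ghz=1.5,
                  flops_per_cycle=AVX512_DP_FLOPS_PER_CYCLE)
skl = peak_tflops(sockets=2, cores=24, ghz=2.0,
                  flops_per_cycle=AVX512_DP_FLOPS_PER_CYCLE)

print(f"Knights Landing node: {knl:.2f} TFlops")
print(f"Skylake node:         {skl:.2f} TFlops")
```

Under these assumptions both nodes land in the same ballpark, which is the point: the peak numbers alone do not separate the two, so the differences in core counts, caches and memory bandwidth decide the match.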
Skylake vs. Knights Landing – who will win the crowd?
We predict that Xeon Phi will find some traction in the high-end HPC market, with large installations featuring this architecture. Key applications will be modernized so that they can extract good performance from the processor, and we will see impressive performance numbers, especially in terms of Flops per watt.
Skylake, on the other hand, will be a huge success across the spectrum, from small clusters to very large installations. Most codes will perform very well even without modernization. Those tuned for Xeon Phi will also fly on this architecture, even if they will most likely be less efficient in terms of performance per watt. Memory-bandwidth-sensitive applications may prove to work much better on the Xeon Phi, if they are carefully modernized to fully utilize the high-bandwidth memory.
In 2017 we are going to see signs that, in the years to come, Intel will face stiffer competition on the server processor side as well. Here one could mention Qualcomm's 48-core ARM server chip, OpenPOWER-based processors, and possibly even AMD's resurrection with the upcoming Zen-based Naples architecture.
The power density of modern racks filled with these new CPUs will be very difficult to handle with traditional air-cooling technologies, driving a trend towards fully liquid-cooled systems. This is something we at CSC will also need to take into careful consideration when upgrading our compute infrastructure.
We would like to re-cast our former colleague Olli-Pekka Lehto's prediction from last year: he was just ahead of his time. First, Bull's BXI still looks like a capable and interesting piece of technology, and since they have already installed a system featuring it, there should not be any manufacturing issues preventing its rise during 2017. Then again, the new 200 Gb/s HDR InfiniBand technology by Mellanox looks so great that Mellanox is unlikely to lose much ground during 2017; quite possibly the contrary.
With HDR one can double the number of ports on a switch by halving the bandwidth per port (100 Gb/s is indeed sufficient in HPC; with today's node technologies, many other interconnect characteristics become a bottleneck before the raw bandwidth does). This, together with in-network computing capabilities, makes it a viable alternative for capability systems as well. One should not forget about Intel's Omni-Path either, as the early user experiences have been mostly positive and its pricing is competitive. At the high end, Cray's Aries network will still sell a number of systems, even though the technology is several years old.
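To see why doubling the port count matters, here is a rough sketch of how switch radix limits the size of a non-blocking fat tree, a common HPC topology. The 40-port switch used as the starting point is an assumption for illustration only, and the fat-tree formula is the textbook one, not a statement about any particular vendor's product.

```python
def split_ports(ports, gbps, factor=2):
    """Split every port into `factor` narrower ports.

    The total switch bandwidth stays the same; only the radix changes.
    """
    return ports * factor, gbps / factor

def fat_tree_max_nodes(radix, levels=2):
    """Maximum number of end nodes in a non-blocking fat tree.

    Textbook result for switches with `radix` ports:
    2 * (radix / 2) ** levels, i.e. radix**2 / 2 for two levels.
    """
    return 2 * (radix // 2) ** levels

# Hypothetical HDR switch: 40 ports at 200 Gb/s (the port count is assumed).
full_ports, full_gbps = 40, 200
half_ports, half_gbps = split_ports(full_ports, full_gbps)

for ports, gbps in [(full_ports, full_gbps), (half_ports, half_gbps)]:
    nodes = fat_tree_max_nodes(ports, levels=2)
    print(f"{ports} ports at {gbps:g} Gb/s: up to {nodes} nodes in two levels")
```

Because the node count grows with the square of the radix in a two-level tree, doubling the ports quadruples the number of nodes that can be connected without adding a third switch level, which saves both switches and latency.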
Is the western HPC market ready to embrace Chinese interconnect technology?
We also wonder whether the western HPC market is ready to embrace Chinese interconnect technology, undeniably demonstrated in the unrivalled No. 1 system in the world, Sunway TaihuLight. Our hunch is "no": no Top-20 systems featuring proprietary technology of the Chinese vendors will be installed in the US or Europe.
Contrary to what Olli-Pekka wrote, we do not believe that Ethernet (as it is) will have any role in top-end HPC, at least in 2017.
In the last few years, Artificial Intelligence (AI), especially in the form of deep learning, has been an extremely hot topic. Neural networks have been around for decades, but in the last few years highly parallel hardware in the form of GPUs, together with huge amounts of training data, has enabled much deeper networks to be trained. Practical applications are cropping up in a number of fields, for example image recognition, text translation, and self-driving cars. This is becoming a key market for vendors who have earlier focused on traditional HPC.
At the moment Nvidia, with its line of GPUs, is in a strong position, with good software support and the new Pascal generation GPU, which supports half-precision floating point numbers to further improve the performance of deep learning. Intel, on the other hand, is moving fast to get a greater piece of this market, bringing out a deep-learning-optimized Xeon Phi processor (Knights Mill) together with more specialized ASIC-based hardware for training workloads from the Nervana acquisition.
These are forces which will have a large impact on more traditional HPC as well in the coming years.
First, the new market for highly parallel processors will help to fund and drive their development. The computational needs of deep learning are a close match to those of many more traditional computational sciences, so this will mostly have a positive outcome. Even the new half-precision floating point representation may find use, at least once it is adopted by numerical libraries that support mixed precision. On the other hand, it is also clear that there is an increasing interest in developing very specialized hardware for both training and inference tasks, which may not support other tasks well.
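A small experiment shows why half precision is mostly interesting in a mixed-precision setting. The sketch below is our own illustration, not how any particular library works: it emulates IEEE 754 half precision in plain Python with the `struct` module's `"e"` format, stores the inputs in half precision, and compares a half-precision accumulator with a wider one.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def sum_half(values):
    """Accumulate in half precision: round after every addition."""
    total = 0.0
    for v in values:
        total = to_half(total + v)
    return total

def sum_mixed(values):
    """Half-precision inputs, but a wider (double-precision) accumulator."""
    total = 0.0
    for v in values:
        total += v
    return total

# Inputs are stored in half precision in both cases.
values = [to_half(0.1)] * 4096

naive = sum_half(values)
mixed = sum_mixed(values)
print("half-precision accumulator: ", naive)
print("wider accumulator:          ", mixed)
```

The half-precision accumulator stalls once the running sum is so large that each new term is smaller than half the spacing between neighbouring representable numbers, while the wider accumulator keeps the full sum. Keeping data compact and doing the sensitive arithmetic in higher precision is the basic idea behind mixed-precision numerical methods.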
Second, we predict that deep learning and HPC will become more integrated. Deep learning needs large-scale computational resources to train multiple networks in parallel, and improved training algorithms scale better across multiple nodes on HPC clusters. At the same time, a growing number of traditional simulations will use deep learning as a pre- or post-processing tool, either to tune model parameters so that the models accurately capture phenomena, or to use a large number of simulations to train models with true predictive power.
The surge of cognitive computing and the whole data-centric paradigm in HPC moves the storage solution to the heart of an HPC environment, a place that has belonged to the supercomputer from day one. Many vendors offer seamless solutions for the computing and storage environment with several more or less proprietary components. Building an advanced data-centric environment today most likely requires, or is at least made much more straightforward by, purchasing a holistic solution from a single vendor or from a partnership of vendors that jointly design the solution.
We doubt this is big news or a big change for many sites, but for us at CSC it means an all-new approach, since traditionally we have been vendor-agnostic, with one generation of our HPC environment containing components from several vendors procured separately. As the trend seems to be towards incremental development of the environment, signing one deal may lead to vendor lock-in lasting for years. We are not saying this would be a bad thing by default. Vendor reps reading this: we are still open to everyone's suggestions concerning our next-generation systems!
Like it or not, we live in interesting times!