WHY THIS MATTERS IN BRIEF
The first ARM-powered supercomputer has taken a long time to materialise, and the first unit to arrive could threaten Intel’s dominance in the space.
In a move that will have Intel peering over its shoulder even more than it was before, US supercomputer giant Cray has announced that it is going to build the world’s first ARM-based supercomputer. The system, known as “Isambard,” will be the basis of a new UK-based High Performance Computing (HPC) service that will offer the machine as a platform to support scientific research and to evaluate ARM technologies for HPC. Installation of Isambard is scheduled to begin this March, and the system is expected to be up and running before the end of the year.
Professor Simon McIntosh-Smith, leader of the project and Professor of HPC at the University of Bristol, presented the upcoming system at the Mont-Blanc ARM event held at the Barcelona Supercomputing Centre (BSC) a few weeks ago.
“I think this is really exciting for a number of reasons,” said McIntosh-Smith. “It’s one of, if not the first serious, large-scale ARMv8 64-bit production machine. And it’s the first time Cray has explicitly announced an ARMv8 product meant for more than just prototyping.”
Whether this actually turns into a commercial offering remains to be seen, though. The announcement in Barcelona did not coincide with any product announcement from Cray headquarters on the other side of the world.
According to McIntosh-Smith though, Isambard will be based on Cray’s CS400 platform, an existing product the company offers to provide mid-sized x86 InfiniBand clusters. They are currently outfitted with Intel Xeons, along with optional Xeon Phi processors or Nvidia GPUs as accelerators. If Cray intends to add an ARM CPU option to the CS400 product, that announcement has been deferred to another day.
Product or not, Isambard looks to be a formidable machine – probably on the order of tens of teraflops. Isambard will include over 10,000 64-bit ARMv8 cores, in addition to a smattering of x86 CPUs, Intel Knights Landing Xeon Phi processors, and Nvidia P100 GPUs.
The project’s rationale for this architectural diversity is to compare application performance across a range of processors on the same machine, and from Cray’s perspective, such diversity fits neatly into its vision of a heterogeneous computing future.
“Scientists have a growing choice of potential computer architectures to choose from, including new 64-bit ARM CPUs, graphics processors, and manycore CPUs from Intel,” said McIntosh-Smith.
It was not revealed what type of ARM processor or SoC would be used, but given that Cray was working on Cavium-based HPC systems as far back as 2014, it’s a good bet that Isambard will be outfitted with ThunderX or ThunderX2 chips. The latter is the second-generation ARM server SoC under development at Cavium, which is supposed to be generally available sometime this year. ThunderX2 also happens to be the processor that the future Mont-Blanc prototype will use. That system, which was also revealed at Barcelona, will be built by Bull, which is now owned by Atos. It is intended strictly as a proof-of-concept machine for developing ARM-based exascale technology.
By contrast, Isambard will be a production machine, and it will be used to run a new national “Tier 2 service” by the UK’s Great Western 4 (GW4) consortium, which comprises universities in Bristol, Bath, Cardiff and Exeter. The consortium’s mission is to strengthen the regional economy via scientific research with industry partners.
Procurement of the system is the result of a £3 million award from the Engineering and Physical Sciences Research Council (EPSRC), the UK’s principal public-sector agency for funding technology and engineering R&D. An additional £1.7 million is expected to be allocated to operate the system over its projected three-year lifetime.
The UK’s Met Office is also a partner in the effort, since it wants to evaluate Isambard’s ability to run its own weather and climate simulations. The rationale here is to see if these compute-heavy workloads can be supported on a more energy-efficient platform. These workloads are currently being run on its in-house 8-petaflop (peak) Cray XC40 supercomputer powered by x86-based Intel CPUs, specifically the 18-core Xeon E5-2695 v4 processors. Like many public agencies with petascale supercomputing infrastructure, the Met Office is looking to reduce the considerable cost involved in running and cooling these large beasts. Comparing Isambard with its Cray XC40 will be made easier by the fact that the Met Office will be hosting the ARM-based machine on behalf of the GW4 consortium.
The GW4 machine will also be compared against Archer, the UK’s primary academic research supercomputer run by EPSRC. Archer is a Cray XC30 system, again powered by Intel Xeon CPUs. However, unlike the Met Office supercomputer, Archer runs a wide variety of scientific workloads in areas such as CFD, materials science, molecular dynamics, quantum chemistry, and earth science.
“We chose about 10,000 cores for Isambard because most Archer science runs use no more than 8,000 to 9,000 cores,” explained McIntosh-Smith, “so with a system this size we’ll get a good idea how well ARMv8 could perform for real UK science jobs, and by extension, as the CPU in a future UK national HPC service.”
Testing the waters for a much wider deployment of the technology seems to be the real driver here. McIntosh-Smith thinks it will be interesting to see if and when more of these ARM-powered HPC systems start to show up.
“I don’t expect a huge flood yet because most of the community has no hard data on how well HPC-optimized ARMv8 CPUs might perform for their codes,” he explained. “My intention is that Isambard can help start to address this lack of rigorous data. If Isambard performs well, it may be the start of more ARMv8-based systems being used in production.”