HPC Storage Solutions

Solutions for High Performance Parallel Environments


Overview

What is Parallel HPC Storage and What Makes It Different From Enterprise Storage?

Parallel high-performance computing (HPC) storage is specialized storage that enables large numbers of clustered compute nodes (hundreds or even thousands) to access petabytes of data in file format at very high speeds, measured in dozens or hundreds of gigabytes per second.
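To put these orders of magnitude in perspective, the following back-of-the-envelope sketch estimates how long it takes to move a data set at a given aggregate bandwidth, and how much bandwidth each node receives if the load is spread evenly. The figures (1 PB, 100 GB/s, 1,000 nodes) are illustrative assumptions picked from the ranges mentioned above, not measurements of any product, and the function names are hypothetical.

```python
# Illustrative arithmetic only: all figures below are assumptions,
# chosen from the ranges quoted in the text (decimal units).
PB = 10**15  # bytes in one petabyte
GB = 10**9   # bytes in one gigabyte


def transfer_time_seconds(data_bytes, bandwidth_bytes_per_s):
    """Time to move data_bytes at a sustained aggregate bandwidth."""
    return data_bytes / bandwidth_bytes_per_s


def per_node_bandwidth(aggregate_bytes_per_s, nodes):
    """Average bandwidth per node if the load is spread evenly."""
    return aggregate_bytes_per_s / nodes


if __name__ == "__main__":
    seconds = transfer_time_seconds(1 * PB, 100 * GB)
    share = per_node_bandwidth(100 * GB, 1000)
    print(f"1 PB at 100 GB/s: {seconds:.0f} s ({seconds / 3600:.1f} h)")
    print(f"Even share across 1000 nodes: {share / GB:.1f} GB/s per node")
```

Under these assumptions, reading one petabyte takes 10,000 seconds, or just under three hours, which is why aggregate bandwidth rather than per-request latency tends to dominate sizing discussions.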

Parallel HPC storage uses specialized file systems, called parallel file systems, that enable large numbers of compute nodes (CPU nodes and GPU-accelerated nodes) to read and write data simultaneously within a single namespace.
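The access pattern described here, many writers working on one shared file at the same time, can be sketched on a single machine. The example below is a minimal illustration, not a Lustre or ClusterStor API: threads stand in for compute nodes, a local temporary file stands in for the shared namespace, and `CHUNK`, `WORKERS`, `write_chunk`, and `parallel_write` are hypothetical names. It relies on `os.pwrite`, which is available on Unix-like systems only.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024   # bytes written by each worker (arbitrary, for illustration)
WORKERS = 8    # threads standing in for compute nodes


def write_chunk(path, rank):
    """Write this worker's chunk at its own, non-overlapping offset."""
    data = bytes([rank]) * CHUNK
    fd = os.open(path, os.O_WRONLY)
    try:
        # pwrite takes an explicit offset, so writers never share a file cursor.
        return os.pwrite(fd, data, rank * CHUNK)
    finally:
        os.close(fd)


def parallel_write(path, workers=WORKERS):
    """Pre-size the file, then let all workers write concurrently."""
    with open(path, "wb") as f:
        f.truncate(workers * CHUNK)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda rank: write_chunk(path, rank), range(workers)))


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        print("bytes written:", parallel_write(path))
    finally:
        os.remove(path)
```

Because every worker owns a disjoint byte range, no locking between writers is needed in this sketch. A real parallel file system adds what a local file cannot: it spreads those byte ranges over many storage servers so that the writes proceed at the combined speed of all of them.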

Classic enterprise storage is not a good fit for this use case because:

• Compute nodes need to access their data in file format, a capability that enterprise (block) storage systems like HPE Primera or HPE Nimble Storage neither provide nor need.

• Compute nodes need to access their data at very high speeds. In most cases the speed requirements exceed the capabilities of enterprise storage systems that support file system protocols, such as HPE StoreEasy, Qumulo on HPE Apollo, HPE Elastic Platform for Big Data Analytics, Scality on HPE servers, and others.

• Buyers of HPC storage are willing to pay only a fraction of the price per petabyte that the enterprise storage market supports. Performance and price are the dominant purchase criteria, while classic enterprise storage features like resilience and system availability rank much lower than they do in the enterprise storage market.

According to Hyperion Research, the open-source parallel file system Lustre is the most widely deployed parallel file system in HPC, followed by IBM Spectrum Scale.

Key differences between HPC Storage and Enterprise Storage

Attribute                Enterprise Storage    HPC Storage
Typical workload         Transactional         Job (Data Set)
Performance requirement  I/O                   Bandwidth
Typical file size        Small                 Large
Typical file access      Random                Sequential
Data movement            Minimal               Extensive

Cray ClusterStor E1000

The Cray ClusterStor E1000 storage system embeds the open-source parallel Lustre® file system for its performance, scalability, and cost-effectiveness. Around two-thirds of the global top 100 supercomputers use Lustre in production. A recent multiclient study by Hyperion Research of the file system landscape in the HPC ecosystem found that Lustre is the most widely used parallel file system and the only one that has shown consistent growth in recent years.

The Cray ClusterStor E1000 storage system is a unique new HPC storage system for the new HPC era. The E1000 attaches directly to any supercomputer or HPC cluster from any vendor, as long as the compute nodes support a modern high-speed network such as EDR/HDR InfiniBand, 100/200 Gigabit Ethernet, or Cray Slingshot. Legacy interconnects (for example, Intel® Omni-Path) can be connected via LNet routers.

Apollo 4200 Gen10

Purpose-built Systems for Big Data and HPC Storage Workloads

With the HPE Apollo 4500 Systems, HPE challenges the notion that one size fits all for Big Data infrastructure by creating purpose-built systems that specifically address storage and analytics workloads. For object storage, the ultra-dense HPE Apollo 4510 includes one server and up to 68 LFF drives in a 4U chassis for a maximum of 544 TB per system. For clustered storage environments, the HPE Apollo 4520 offers two servers with built-in failover capability. For Hadoop® and other Big Data solutions, the HPE Apollo 4530 uniquely offers three servers per chassis, ideal for housing three copies of data in a single system.
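The capacity figures quoted for the HPE Apollo 4510 can be checked with simple arithmetic: 544 TB across 68 drives implies 8 TB per drive. The sketch below works that out and estimates how many such chassis a target raw capacity would need. It considers raw capacity only; overhead from RAID, erasure coding, or replication is deliberately ignored, the 5,000 TB target is an arbitrary example, and the function names are hypothetical.

```python
import math

DRIVES_PER_CHASSIS = 68      # from the text: up to 68 LFF drives per 4U chassis
MAX_TB_PER_CHASSIS = 544     # from the text: maximum raw capacity per system


def drive_capacity_tb(total_tb, drives):
    """Per-drive capacity implied by a chassis total."""
    return total_tb / drives


def chassis_needed(target_raw_tb, tb_per_chassis=MAX_TB_PER_CHASSIS):
    """Smallest number of fully populated chassis covering the raw target.

    Assumption: raw capacity only, with no allowance for data protection.
    """
    return math.ceil(target_raw_tb / tb_per_chassis)


if __name__ == "__main__":
    print("TB per drive:", drive_capacity_tb(MAX_TB_PER_CHASSIS, DRIVES_PER_CHASSIS))
    print("Chassis for 5000 TB raw:", chassis_needed(5000))
```

For the example target of 5,000 TB raw, this gives ten chassis, or 40U of rack space at 4U each. Usable capacity would be lower once a protection scheme is applied.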

The HPE Apollo 4500 series allows you to realize all the value of your data at the right cost and in the least amount of space. And with HPE software tools to help you deploy, operate, and optimize your valuable data center resources, you can grow your data with confidence at any scale.