Posts Tagged ‘HPC Advisory Council’

HPC Advisory Council Showcases World’s First FDR 56Gb/s InfiniBand Demonstration at ISC’11

July 1st, 2011

The HPC Advisory Council, together with ISC’11, showcased the world’s first demonstration of FDR 56Gb/s InfiniBand in Hamburg, Germany, June 20-22. The HPC Advisory Council hosts and organizes new technology demonstrations at leading HPC conferences around the world to highlight new solutions that will influence future HPC systems in terms of performance, scalability and utilization.

The 56Gb/s InfiniBand demonstration connected participating exhibitors on the ISC’11 showroom floor as part of the HPC Advisory Council ISCnet network. ISCnet provided organizations with a fast interconnect between their booths.

The FDR InfiniBand network included dedicated and distributed clusters, as well as a Lustre-based storage system. Multiple applications were demonstrated, including high-speed visualization applications using car models courtesy of Peugeot Citroën.

The installation of the fiber cables (we used 20 and 50 meter cables) was completed a few days before the show opened, and we placed the cables on the floor, protecting them with wooden bridges. The clusters, Lustre and the applications were set up the day before, and everything ran perfectly.

You can see the network architecture of the ISCnet FDR InfiniBand demo below. We have combined both MPI traffic and storage traffic (Lustre) on the same fabric, utilizing the new bandwidth capabilities to provide a high performance, consolidated fabric for the high speed rendering and visualization application demonstration.


The following HPC Advisory Council member organizations contributed to the FDR 56Gb/s InfiniBand demo, and I would like to personally thank each of them: AMD, Corning Cable Systems, Dell, Fujitsu, HP, MEGWARE, Mellanox Technologies, Microsoft, OFS, Scalable Graphics, Supermicro and Xyratex.


Gilad Shainer

Member of the IBTA and chairman of the HPC Advisory Council

NVIDIA GPUDirect Technology – InfiniBand RDMA for Accelerating GPU-Based Systems

May 11th, 2011

As a member of the IBTA and chairman of the HPC Advisory Council, I wanted to share with you some information on the important role of InfiniBand in the emerging hybrid (CPU-GPU) clustering architectures.

The rapid increase in the performance of graphics hardware, coupled with recent improvements in its programmability, has made graphics accelerators a compelling platform for computationally demanding tasks in a wide variety of application domains. Due to the great computational power of the GPU, the GPGPU method has proven valuable in various areas of science and technology and the hybrid CPU-GPU architecture is seeing increased adoption.

GPU-based clusters are being used to perform compute-intensive tasks such as finite element computations, computational fluid dynamics and Monte Carlo simulations. Several of the world-leading InfiniBand supercomputers use GPUs in order to achieve the desired performance. Since GPUs provide very high core counts and floating point capability, a high-speed networking interconnect such as InfiniBand is required to provide the needed throughput and the lowest latency for GPU-to-GPU communications. As such, InfiniBand has become the preferred interconnect solution for hybrid GPU-CPU systems.

While GPUs have been shown to provide worthwhile performance acceleration, yielding benefits to price/performance and power/performance, several areas of GPU-based clusters could be improved in order to provide higher performance and efficiency. One issue with deploying clusters consisting of multi-GPU nodes involves the interaction between the GPU and the high-speed InfiniBand network - in particular, the way GPUs use the network to transfer data between them. Before the NVIDIA GPUDirect technology, a performance issue existed between the user-mode DMA mechanisms used by GPU devices and the InfiniBand RDMA technology. The issue involved the lack of a software/hardware mechanism for “pinning” pages of virtual memory to physical pages that can be shared by both the GPU devices and the networking devices.

The new hardware/software mechanism called GPUDirect eliminates the need for the CPU to be involved in the data movement. It not only enables higher GPU-based cluster efficiency, but also paves the way for the creation of “floating point services.” GPUDirect is based on a new interface between the GPU and the InfiniBand device that enables both devices to share pinned memory buffers. Therefore, data written by a GPU to the host memory can be sent immediately by the InfiniBand device (using RDMA semantics), so it reaches a remote GPU much faster.
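To make the difference concrete, here is a toy Python model of the send path with and without a shared pinned buffer. It is only an illustrative sketch, not GPUDirect or driver code: the function and class names are hypothetical, byte arrays stand in for pinned host memory, and the extra CPU copy between two separate pinned buffers in the "before" case is my simplified assumption about how the pre-GPUDirect path behaves.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TransferLog:
    """Records the steps of one simulated send (hypothetical helper)."""
    steps: List[str] = field(default_factory=list)
    host_copies: int = 0  # CPU-driven copies between host buffers


def send_with_separate_buffers(payload: bytes) -> Tuple[bytes, TransferLog]:
    """Assumed pre-GPUDirect path: GPU and InfiniBand device each pin their own buffer."""
    log = TransferLog()
    gpu_pinned = bytearray(payload)    # GPU DMA lands in the GPU's pinned host buffer
    log.steps.append("GPU DMA -> GPU pinned host buffer")
    ib_pinned = bytearray(gpu_pinned)  # CPU copies into the buffer the network device can use
    log.host_copies += 1
    log.steps.append("CPU copy -> InfiniBand pinned host buffer")
    wire = bytes(ib_pinned)            # network device sends from its own buffer
    log.steps.append("InfiniBand RDMA -> remote node")
    return wire, log


def send_with_shared_buffer(payload: bytes) -> Tuple[bytes, TransferLog]:
    """Shared-pinning path: both devices use the same pinned host buffer."""
    log = TransferLog()
    shared = bytearray(payload)        # GPU DMA lands in the shared pinned buffer
    log.steps.append("GPU DMA -> shared pinned host buffer")
    wire = bytes(shared)               # network device sends straight from that buffer
    log.steps.append("InfiniBand RDMA -> remote node")
    return wire, log


if __name__ == "__main__":
    for send in (send_with_separate_buffers, send_with_shared_buffer):
        _, log = send(b"gpu-result")
        print(send.__name__, "host copies:", log.host_copies, log.steps)
```

In this model the only thing that changes is the intermediate CPU copy; on real hardware the benefit also depends on message size, the driver stack and the platform.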

As a result, GPU communication can now utilize the low-latency and zero-copy advantages of the InfiniBand RDMA transport for higher application performance and efficiency. InfiniBand RDMA enables you to connect remote GPUs with latency characteristics that make it seem as though all of the GPUs are on the same platform. Examples of the performance benefits and more info on GPUDirect can be found at -


Gilad Shainer

Member of the IBTA and chairman of the HPC Advisory Council