Posts Tagged ‘Exascale’

Enabling Exascale with Co-Design Architecture, Intelligent Networks and InfiniBand Routing

June 14th, 2017


For those working in the High Performance Computing (HPC) industry, achieving Exascale performance has long been both an ongoing challenge and a significant milestone. Recently, experts have started to take a holistic, system-level approach to performance improvements by examining how hardware and software components interact within data centers. This approach, known as co-design architecture, recognizes that the CPU has reached the limits of its scalability, and offers an intelligent network as the new “co-processor” to share the responsibility for handling and accelerating application workloads.

Next week, IBTA representatives will join other industry experts in Frankfurt, Germany for ISC High Performance, an annual event focused on HPC technological developments and applications. In a Birds of a Feather (BoF) session titled A Holistic Approach to Exascale - Co-Design Architecture, Intelligent Networks and InfiniBand Routing, Scott Atchley from Oak Ridge National Laboratory, as well as Richard Graham, Gerald Lotto and Gilad Shainer from member company Mellanox Technologies, will discuss the advantages of a co-design architecture in depth. Additionally, the group will cover the role that InfiniBand routers play in reaching Exascale performance and share insights behind these recent HPC developments.

Attending ISC High Performance? The BoF session will be held in the Kontrast room starting at 4:00 p.m. on Monday, June 19. For more information on the BoF session and its participants, visit the event site here.

Bill Lee

Race to Exascale – Nations Vie to Build Fastest Supercomputer

September 28th, 2015

“Discover Supercomputer 3” by NASA Goddard Space Flight Center is licensed under CC BY 2.0


The race between countries to build the fastest, biggest or first of anything is nothing new – think of the race to the moon. One of the current global competitions is focused on supercomputing, specifically the race to Exascale computing, or a billion billion calculations per second. Governments are now allocating significant resources toward Exascale initiatives (see President Obama’s Executive Order and China’s current lead in supercomputing) as they begin to understand the technology’s vast potential for a variety of industries, including healthcare, defense and space exploration.

The TOP500 list ranking the top supercomputers in the world will continue to be the scorecard. Currently, the U.S. leads with 233 of the top 500 supercomputers, followed by Europe with 141 and China with 37. However, China’s smaller portfolio does not make it any less of a competitor in the supercomputing space: China has had the #1 supercomputer on the TOP500 list for the fifth consecutive time.

When looking to build the supercomputers of the future, there are a number of factors which need to be taken into consideration, including superior application performance, compute scalability and resource efficiency. InfiniBand’s compute offloads and scalability make it extremely attractive to supercomputer architects. Proof of this performance and scalability can be found in resources such as the HPC Advisory Council’s library of case studies. InfiniBand makes it possible to achieve near-linear performance improvement as more nodes are added to a cluster. Since observers of this space expect Exascale systems to require a massive amount of compute hardware, InfiniBand’s scalability looks to be a requirement for achieving this goal.
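The value of near-linear scaling can be made concrete with a toy model. The sketch below is purely illustrative, with hypothetical efficiency figures rather than IBTA or HPC Advisory Council measurements; it shows how quickly a fabric that loses even a modest fraction of efficiency per doubling falls behind one that stays near-linear:

```python
import math

# Illustrative model only (hypothetical retention figures, not benchmark
# data): compare a fabric that retains nearly all of its efficiency as
# nodes are added against one that loses efficiency to congestion.

def cluster_speedup(nodes, retained_per_doubling):
    """Speedup over one node if a fixed fraction of efficiency is
    retained each time the node count doubles."""
    doublings = math.log2(nodes)
    return nodes * (retained_per_doubling ** doublings)

for nodes in (2, 64, 1024):
    near_linear = cluster_speedup(nodes, 0.99)  # near-linear fabric
    lossy = cluster_speedup(nodes, 0.90)        # congestion-prone fabric
    print(f"{nodes:5d} nodes: {near_linear:7.1f}x vs {lossy:7.1f}x")
```

At 1,024 nodes the near-linear fabric delivers roughly 926x while the lossier one manages only about 357x, which is why interconnect efficiency dominates at the node counts Exascale implies.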

As the race to supercomputing speeds up, we expect to see a number of exciting advances in technology as we shift from petaflops to exaflops. To give you an idea of how far we have come and where we are heading, here is a comparison of the speed of the computers that powered the race to space with the goals for Exascale.

Speeds Then vs. Now – Race to Space vs. Race to Supercomputing

  • Computers in 1960s (Speed of the Race to Space): Hectoscale (hundreds of floating-point operations per second)
  • Goal for Computers in 2025 (Speed of the Race to Supercomputing): Exascale (quintillions of floating-point operations per second)
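The gap in that comparison is easy to quantify. A quick sketch, using order-of-magnitude scales rather than measurements of any particular machine:

```python
# Order-of-magnitude comparison from the list above; both figures are
# illustrative scales, not measurements of any specific computer.

HECTOSCALE = 10**2   # 1960s-era: hundreds of floating-point ops per second
EXASCALE = 10**18    # Exascale goal: a quintillion (billion billion) per second

ratio = EXASCALE // HECTOSCALE
print(f"Exascale / hectoscale = 10^{len(str(ratio)) - 1}")  # 10^16

# One second of Exascale work, executed at hectoscale pace:
seconds_per_year = 3600 * 24 * 365
print(f"~{ratio // seconds_per_year:,} years")
```

In other words, a hectoscale machine would need roughly 300 million years to do one second of an Exascale system's work.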

Advances in supercomputing will continue to dominate the news with these two nations making the development of the fastest supercomputer a priority. As November approaches and the new TOP500 list is released, it will be very interesting to see where the rankings lie and what interconnects the respective architects will pick.

Bill Lee

InfiniBand Volume 1, Release 1.3 – The Industry Sounds Off

May 14th, 2015


On March 10, 2015, IBTA announced the availability of Release 1.3 of Volume 1 of the InfiniBand Architecture Specification and it’s creating a lot of buzz in the industry. IBTA members recognized that as compute clusters and data centers grew larger and more complex, the network equipment architecture would have difficulty keeping pace with the need for more processing power. With that in mind, the new release included improvements to scalability and management for both high performance computing and enterprise data centers.

Here’s a snapshot of what industry experts and media have said about the new specification:

“Release 1.3 of the Volume 1 InfiniBand Architecture Specification provides several improvements, including deeper visibility into switch hierarchy, improved diagnostics allowing for faster response times to connectivity problems, enhanced network statistics, and added counters for Enhanced Data Rate (EDR) to improve network management. These features will allow network administrators to more easily install, maintain, and optimize very large InfiniBand clusters.” - Kurt Yamamoto, Tom’s IT PRO

“It’s worth keeping up with [InfiniBand], as it clearly shows where the broader networking market is capable of going… Maybe geeky stuff, but it allows [InfiniBand] to keep up with “exascales” of data and lead the way large scale-out computer networking gets done. This is particularly important as the 1000 node clusters of today grow towards the 10,000 node clusters of tomorrow.” - Mike Matchett, Taneja Group, Inc.

“Indeed, a rising tide lifts all boats, and the InfiniBand community does not intend to get caught in the shallows of the Big Data surge. The InfiniBand Trade Association recently issued Release 1.3 of Volume I of the format’s reference architecture, designed to incorporate increased scalability, efficiency, availability and other functions that are becoming central to modern data infrastructure.” - Arthur Cole, Enterprise Networking Planet

“The InfiniBand Trade Association (IBTA) hopes to ward off the risk of an Ethernet invasion in the ranks of HPC users with a renewed focus on manageability and visibility. Such features have just appeared in release 1.3 of the Volume 1 standard. The IBTA’s Bill Lee told The Register that as HPC clusters grow, ‘you want to be able to see every level of switch interconnect, so you can identify choke-points and work around them.’” - Richard Chirgwin, The Register

To read more industry coverage of the new release, visit the InfiniBand in the News page.

For additional information about the InfiniBand specification, check out the InfiniBand specification FAQ or access the InfiniBand specification here.

Bill Lee

Visit the IBTA and OFA at SC13!

November 13th, 2013

Attending SC13? The IBTA will be teaming up once again with the OpenFabrics Alliance (OFA) to participate in a number of conference activities. The organizations will be exhibiting together at booth #4132 – stop by for access to:

  • Hands-on computing cluster demonstrations
  • IBTA cable compliance demonstration
  • IBTA & OFA member company exhibition map and SC13 news
  • Current and prospective member information
  • Information regarding OFA’s 2014 User Day and Developer Workshop

IBTA and OFA will also lead the discussion on the future of I/O architectures for improved application performance and efficiency during several technical sessions:

  • “RDMA: Scaling the I/O Architecture for Future Applications,” an IBTA-moderated session, will discuss what new approaches to I/O architecture could be used to meet Exascale requirements. The session will be moderated by IBTA’s Bill Boas and will feature a discussion between top users of RDMA. The panel session will take place on Wednesday, November 20 from 1:30 p.m. to 3:00 p.m. in room 301/302/303.

  • “Accelerating Improvements in HPC Application I/O Performance and Efficiency,” an OFA Emerging Technologies exhibit, will present to attendees ideas on how incorporating a new framework of I/O APIs may increase performance and efficiency for applications. This Emerging Technologies exhibit will take place at booth #3547. The OFA will also be giving a short talk on this subject in the Emerging Technologies theatre at booth #3947 on Tuesday, November 19 at 2:50 p.m.

  • OFA member company representatives will further develop ideas discussed in its Emerging Technologies exhibit during the Birds of a Feather (BoF) session entitled, “Discussing an I/O Framework to Accelerate Improvements in Application I/O Performance.” Moderators Paul Grun of Cray and Sean Hefty of Intel will lead the discussion on how developers and end-users can enhance and further encourage the growth of open source I/O software.

Bill Lee

Chair, Marketing Working Group (MWG)

InfiniBand Trade Association

IBTA & OFA Join Forces at SC12

November 7th, 2012

Attending SC12? Check out OFA’s Exascale and Big Data I/O panel discussion and stop by the IBTA/OFA booth to meet our industry experts

The IBTA is gearing up for the annual SC12 conference taking place November 10-16 at the Salt Palace Convention Center in Salt Lake City, Utah. We will be joining forces with the OpenFabrics Alliance (OFA) on a number of conference activities and will be exhibiting together at SC12 booth #3630.

IBTA members will participate in the OFA-moderated panel, Exascale and Big Data I/O, which we highly recommend attending if you’re at the conference.  The panel session, moderated by IBTA and OFA member Bill Boas, takes place Wednesday, November 14 at 1:30 p.m. Mountain Time and will discuss drivers for future I/O architectures.

Also be sure to stop by the IBTA and OFA booth #3630 to chat with industry experts regarding a wide range of industry topics, including:

  • Behind the IBTA integrators list
  • High speed optical connectivity
  • Building and validating OFA software
  • Achieving low latency with RDMA in virtualized cloud environments
  • UNH-IOL hardware testing and interoperability capabilities
  • Utilizing high-speed interconnects for HPC
  • Release 1.3 of IBA Vol2
  • Peering into a live OFS cluster
  • RoCE in Wide Area Networks
  • OpenFabrics for high speed SAN and NAS

Experts including Katharine Schmidtke (Finisar), Alan Brenner (IBM), Todd Wilde (Mellanox), Rupert Dance (Software Forge), Bill Boas and Kevin Moran (System Fabric Works), and Josh Simons (VMware) will be in the booth to answer your questions and discuss topics currently affecting the HPC community.

Be sure to check the SC12 website to learn more about Supercomputing 2012, and stay tuned to the IBTA website and Twitter to follow IBTA’s plans and activities at SC12.

See you there!

InfiniBand on the Road to Exascale Computing

January 21st, 2011

(Note: This article appears with reprint permission of The Exascale Report™)

InfiniBand has been making remarkable progress in HPC, as evidenced by its growth in the Top500 rankings of the highest performing computers. In the November 2010 update to these rankings, InfiniBand’s use increased another 18 percent, to help power 43 percent of all listed systems, including 57 percent of all high-end “Petascale” systems.

The march toward ever-higher performance levels continues. Today, computation is a critical part of science, where it complements observation, experiment and theory. The computational performance of high-end computers has been increasing by a factor of 1,000X every 11 years.
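That 1,000X-per-11-years rule implies a steady compound annual rate, which is easy to check. The sketch below uses the article's 2008 Petascale milestone as a baseline; the forward projection is illustrative arithmetic, not a prediction by the IBTA:

```python
# The growth rule cited above: 1000x every 11 years implies a steady
# compound annual rate. The 2008 Petascale baseline comes from the
# article; the forward projection is illustrative only.

annual = 1000 ** (1 / 11)
print(f"Implied annual growth: {annual:.3f}x per year")  # ~1.874x, i.e. +87%

petascale_year = 2008
print(f"1 PetaFLOPS ({petascale_year}) -> 1 ExaFLOPS around {petascale_year + 11}")
```

An 87 percent annual improvement, sustained for another 11 years from the 1 PetaFLOPS mark, lands the next factor of 1,000 at roughly the end of the decade, consistent with the industry expectations discussed below.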

InfiniBand has demonstrated that it plays an important role in the current Petascale level of computing driven by its bandwidth, low latency implementations and fabric efficiency. This article will explore how InfiniBand will continue to pace high-end computing as it moves towards the Exascale level of computing.

Figure 1 - The Golden Age of Cluster Computing


InfiniBand Today

Figure 1 illustrates how the high end of HPC crossed the 1 Terascale mark in 1997 (10^12 floating-point operations per second) and increased three orders of magnitude to the 1 Petascale mark in 2008 (10^15 floating-point operations per second). As you can see, the underlying system architectures changed dramatically during this time. The growth of the cluster computing model, based on commodity server processors, has come to dominate much of high-end HPC. More recently, this model has been augmented by the emergence of GPUs.

Figure 2 - Emergence of InfiniBand in the Top500


Figure 2 shows how interconnects track with changes in the underlying system architectures. The appearance first of 1 GbE, followed by the growth of InfiniBand interconnects, was a key enabler of the cluster computing model. The industry-standard InfiniBand and Ethernet interconnects have largely displaced earlier proprietary interconnects. InfiniBand continues to gain share relative to Ethernet, largely driven by performance factors such as low latency and high bandwidth, the ability to support high bisectional-bandwidth fabrics, and overall cost-effectiveness.

Getting to Exascale
What we know today is that Exascale computing will require enormously larger computer systems than are available today. What we don’t know is what those computers will look like. We have been in the golden age of cluster computing for much of the past decade, and the model appears to scale well going forward. However, there is as yet no clear consensus regarding the system architecture for Exascale. What we can do is map the evolution of InfiniBand to the evolution of Exascale.

Given historical growth rates, the industry anticipates that Exascale computing will be reached around 2018. However, three orders of magnitude beyond where we are today is too great a change to make in a single leap. In addition, the industry is still assessing what system structures will comprise systems of that size.

Figure 3 - Steps from Petascale to Exascale


Figure 3 provides guidance as to the key capabilities of the interconnect as computer systems increase in power by each order of magnitude from current high-end systems with 1 PetaFLOPS performance, to 10 PF, 100 PF and finally 1,000 PF = 1 ExaFLOPS. Over time, computational nodes will provide increasing performance with advances in processor and system architecture. This performance increase must be matched by a corresponding increase in network bandwidth to each node. However, the increased performance per node also tends to hold down the increase in the total number of nodes required to reach a given level of system performance.

Today, 4x QDR InfiniBand (40 Gbps) is the interconnect of choice for many large-scale clusters. Current InfiniBand technology readily supports systems with performance on the order of 1 PetaFLOPS. Deployments on the order of 10,000 nodes have been achieved, and 4x QDR link bandwidths are offered by multiple vendors. InfiniBand interconnects are used in 57 percent of the current Petascale systems on the Top500 list.

Moving from 1 PetaFLOPS to 10 PetaFLOPS is well within the reach of the current InfiniBand roadmap. Reaching 35,000 nodes is within the currently defined InfiniBand address space. The required 12 GB/s links can be achieved either by 12x QDR or, more likely, by 4x EDR data rates (104 Gbps) now being defined according to the InfiniBand industry bandwidth roadmap. Such data rates also assume PCIe Gen3 host connections, which are expected in the forthcoming processor generation.
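The link-rate arithmetic above can be sketched directly. The lane rates and encodings below follow the InfiniBand conventions of the era (QDR signaling 10 Gbps per lane with 8b/10b encoding; EDR signaling about 25.78 Gbps per lane with 64b/66b encoding); the helper function itself is just illustrative unit conversion:

```python
# Sketch of the link-rate arithmetic above. QDR signals 10 Gbps/lane
# with 8b/10b encoding; EDR signals 25.78125 Gbps/lane with 64b/66b
# encoding (4 lanes -> ~104 Gbps raw signaling rate).

def link_rate_GBps(lane_gbps, lanes, encoding):
    """Usable link data rate in gigabytes per second."""
    return lane_gbps * lanes * encoding / 8  # 8 bits per byte

qdr_12x = link_rate_GBps(10.0, 12, 8 / 10)      # 12x QDR
edr_4x = link_rate_GBps(25.78125, 4, 64 / 66)   # 4x EDR

print(f"12x QDR: {qdr_12x:.1f} GB/s usable")    # 12.0 GB/s
print(f"4x EDR:  {edr_4x:.1f} GB/s usable")     # 12.5 GB/s
```

Both options clear the 12 GB/s target; 4x EDR does it with a third of the lanes, which is why it is the more likely path.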

The next order of magnitude increase in system performance, from 10 PetaFLOPS to 100 PetaFLOPS, will require additional evolution of the InfiniBand standards to permit hundreds of thousands of nodes to be addressed. The InfiniBand industry is already initiating discussions as to what evolved capabilities are needed for systems of such scale. As in the prior step, the required link bandwidths can be achieved by 12x EDR (currently being defined) or perhaps 4x HDR (identified on the InfiniBand industry roadmap). Systems of such scale may also exploit topologies such as mesh/torus or hypercube, for which there are already large-scale InfiniBand deployments.
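To see why a hypercube suits node counts in the hundreds of thousands, here is a small sketch. This is standard hypercube addressing math, not any particular vendor's subnet manager or routing implementation:

```python
# In a d-dimensional hypercube of 2**d nodes, each node's neighbors are
# the IDs that differ from it by exactly one bit, so every node needs
# only d links, and any route takes at most d hops.

def neighbors(node, dim):
    """IDs of a node's hypercube neighbors (flip one address bit)."""
    return [node ^ (1 << bit) for bit in range(dim)]

def min_hops(a, b):
    """Minimum hop count between two nodes = Hamming distance of IDs."""
    return bin(a ^ b).count("1")

dim = 17                           # 2**17 = 131,072 nodes
print(neighbors(0, 3))             # [1, 2, 4]
print(min_hops(0, 2**dim - 1))     # worst case across 131,072 nodes: 17 hops
```

The appeal at scale is that both the per-node link count and the network diameter grow only logarithmically with the number of nodes.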

The remaining order of magnitude increase in system performance, from 100 PetaFLOPS to 1 ExaFLOPS, requires link bandwidths to increase once again. Either 12x HDR or 4x NDR links will need to be defined. It is also expected that optical technology will play a greater role in systems of such scale.

The Meaning of Exascale

Reaching Exascale computing levels involves much more than just the interconnect. Pending further developments in computer systems design and technology, such systems are expected to occupy many hundreds of racks and consume perhaps 20 MWatts of power. Just as many of the high-end systems today are purpose-built with unique packaging, power distribution, cooling and interconnect architectures, we should expect Exascale systems to be predominantly purpose-built. However, before we conclude that the golden age of cluster computing has ended with its reliance on effective industry-standard interconnects such as InfiniBand, let’s look further at the data.

Figure 4 - Top500 Performance Trends


Figure 4 is the trends chart from Top500. At first glance, it shows the tremendous growth over the past two decades of high-end HPC, as well as projecting these trends to continue for the next decade. However, it also shows that the performance of the #1 ranked system is about two orders of magnitude greater than the #500 ranked system.

Figure 5 - Top500 below 1 PetaFLOPS (November 2010)


This is further illustrated in Figure 5, which shows performance vs. rank from the November 2010 Top500 list – the seven systems above 1 PetaFLOPS have been omitted so as not to stretch the vertical axis too much. We see that only the 72 highest-ranked systems come within an order of magnitude of 1 PetaFLOPS (1,000 TeraFLOPS). This trend is expected to continue, with the implication that once the highest-end HPC systems reach the 1 ExaFLOPS threshold, the majority of Top500 systems will be on the order of 100 PetaFLOPS at most, with the #500 ranked system on the order of 10 PetaFLOPS.

Although we often use the Top500 rankings as an indicator of high-end HPC, the vast majority of HPC deployments occur below the Top500.

InfiniBand Evolution
InfiniBand has been an extraordinarily effective interconnect for HPC, with demonstrated scaling up to the Petascale level. InfiniBand architecture permits low latency implementations and has a bandwidth roadmap matching the capabilities of host processor technology. InfiniBand’s fabric architecture permits implementation and deployment of highly efficient fabrics, in a range of topologies, with congestion management and resiliency capabilities.

The InfiniBand community has demonstrated that the architecture can evolve to remain vibrant, as it has done before. The Technical Working Group is currently assessing architectural evolution to permit InfiniBand to continue to meet the needs of increasing system scale.
As we move towards an Exascale HPC environment with possibly purpose-built systems, the cluster computing model enabled by InfiniBand interconnects will remain a vital communications model capable of extending well into the Top500.

Lloyd Dickman
Technical Working Group, IBTA

