IBTA Integrators’ List for Plugfest 18 is Now Available

February 21st, 2011

The IBTA Integrators’ List for Plugfest 18 is now available online: www.infinibandta.org/integratorslist. The number of products on the Integrators’ List has steadily increased and now includes 21 devices, 209 DDR cables and 174 QDR cables. We had 19 vendors attend the October 2010 event.

IBTA Plugfests provide an opportunity for InfiniBand device and cable vendors to test their products for compliance with the InfiniBand architecture specification, as well as for interoperability with other InfiniBand products. Successfully passing the compliance requirements permits a product to be placed on the Integrators’ List. Both cables and devices were put through exhaustive interoperability testing using the Open MPI test suite.
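
For readers curious what this kind of test looks like in practice, below is a minimal MPI ping-pong sketch in the spirit of those interoperability runs. It assumes an Open MPI installation on two nodes of the fabric; the message size, iteration count, host names and build/run commands are illustrative assumptions, not taken from the actual Plugfest suite.

    /* Minimal MPI ping-pong bandwidth check (illustrative; not the Plugfest suite).
     * Build and run (assumption): mpicc pingpong.c -o pingpong
     *                             mpirun -np 2 --host node1,node2 pingpong
     */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define MSG_BYTES (1 << 20)   /* 1 MiB per message */
    #define ITERS     100

    int main(int argc, char **argv)
    {
        int rank, i;
        char *buf = malloc(MSG_BYTES);
        double t0, elapsed;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < ITERS; i++) {
            if (rank == 0) {        /* send, then wait for the echo */
                MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) { /* echo everything back */
                MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        elapsed = MPI_Wtime() - t0;

        if (rank == 0)              /* two transfers per iteration: out and back */
            printf("average bandwidth: %.1f MB/s\n",
                   2.0 * ITERS * MSG_BYTES / elapsed / 1e6);

        free(buf);
        MPI_Finalize();
        return 0;
    }

Real Plugfest interoperability testing is far more exhaustive than this, but a run of this kind quickly shows whether two vendors’ devices and cables move traffic correctly at speed.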

The IBTA Plugfest events and the resulting Integrators’ List help the IBTA identify and address problems with the specification, help vendors identify areas for potential product updates, and improve the overall end user experience with InfiniBand-related products. All of this paves the way for the IBTA and its members to confidently move forward to higher speeds, greater bandwidth and lower latencies in new products.

Coming soon: Plugfest 19, April 4-8, 2011

Please note that dates for IBTA Plugfest 19 have been set: April 4-8, 2011 at UNH-IOL. Plugfest 19 will include the first alpha testing of FDR cables and devices, which are designed to run at 14 Gb/s per lane. We are also working to define a complete test suite to validate the specifications for both FDR and EDR data rates (see the IBTA Roadmap for more details). Visit the Plugfest web pages for more information and to register.

Congratulations to all of the Plugfest 18 participants!

Rupert Dance

rsdance@soft-forge.com

Co-chair, IBTA’s Compliance and Interoperability Working Group


InfiniBand on the Road to Exascale Computing

January 21st, 2011

(Note: This article appears with reprint permission of The Exascale Report™)

InfiniBand has been making remarkable progress in HPC, as evidenced by its growth in the Top500 rankings of the highest performing computers. In the November 2010 update to these rankings, InfiniBand’s use increased another 18 percent; it now helps power 43 percent of all listed systems, including 57 percent of all high-end “Petascale” systems.

The march toward higher and higher performance levels continues. Today, computation is a critical part of science, complementing observation, experiment and theory. The computational performance of high-end computers has been increasing by a factor of 1,000 every 11 years.

InfiniBand has demonstrated that it plays an important role at the current Petascale level of computing, driven by its bandwidth, low-latency implementations and fabric efficiency. This article explores how InfiniBand will continue to pace high-end computing as it moves toward the Exascale level.

Figure 1 - The Golden Age of Cluster Computing

InfiniBand Today

Figure 1 illustrates how the high end of HPC crossed the 1 Terascale mark in 1997 (10^12 floating-point operations per second) and increased three orders of magnitude to the 1 Petascale mark in 2008 (10^15 floating-point operations per second). As you can see, the underlying system architectures changed dramatically during this time. The cluster computing model, based on commodity server processors, has grown to dominate much of high-end HPC. More recently, this model has been augmented by the emergence of GPUs.

Figure 2 - Emergence of InfiniBand in the Top500

Figure 2 shows how interconnects track with changes in the underlying system architectures. The appearance first of 1 GbE, followed by the growth of InfiniBand interconnects, was a key enabler of the cluster computing model. The industry-standard InfiniBand and Ethernet interconnects have largely displaced earlier proprietary interconnects. InfiniBand continues to gain share relative to Ethernet, largely driven by performance factors such as low latency and high bandwidth, the ability to support high-bisectional-bandwidth fabrics, and overall cost-effectiveness.

Getting to Exascale

What we know today is that Exascale computing will require enormously larger computer systems than are available today. What we don’t know is what those computers will look like. We have been in the golden age of cluster computing for much of the past decade, and the model appears to scale well going forward. However, there is as yet no clear consensus on the system architecture for Exascale. What we can do is map the evolution of InfiniBand to the evolution of Exascale.

Given historical growth rates, the industry anticipates that Exascale computing will be reached around 2018. However, three orders of magnitude beyond where we are today represents too great a change to make in a single leap. In addition, the industry is still assessing what system structures will comprise systems of that size.

Figure 3 - Steps from Petascale to Exascale

Figure 3 provides guidance as to the key capabilities of the interconnect as computer systems increase in power by each order of magnitude from current high-end systems with 1 PetaFLOPS performance, to 10 PF, 100 PF and finally 1000PF = 1 ExaFLOPS. Over time, computational nodes will provide increasing performance with advances in processor and system architecture. This performance increase must be matched by a corresponding increase in network bandwidth to each node. However, the increased performance per node also tends to hold down the increase in the total number of nodes required to reach a given level of system performance.

Today, 4x QDR InfiniBand (40 Gbps) is the interconnect of choice for many large-scale clusters. Current InfiniBand technology readily supports systems with performance on the order of 1 PetaFLOPS. Deployments on the order of 10,000 nodes have been achieved, and 4x QDR link bandwidths are offered by multiple vendors. InfiniBand interconnects are used in 57 percent of the current Petascale systems on the Top500 list.

Moving from 1 PetaFLOPS to 10 PetaFLOPS is well within the reach of the current InfiniBand roadmap. Reaching 35,000 nodes is within the currently defined InfiniBand address space. The required 12 GB/s links can be achieved either by 12x QDR or, more likely, by the 4x EDR data rate (104 Gbps) now being defined according to the InfiniBand industry bandwidth roadmap. Such data rates also assume PCIe Gen3 host connections, which are anticipated in the forthcoming processor generation.
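
As a rough sanity check on those numbers, the arithmetic works out as sketched below. The per-lane signaling rates and encoding overheads are the commonly quoted figures and should be treated as illustrative assumptions rather than specification text.

    /* Back-of-the-envelope link data rates for the options mentioned above. */
    #include <stdio.h>

    static double link_GBps(int lanes, double gbaud_per_lane, double encoding_eff)
    {
        return lanes * gbaud_per_lane * encoding_eff / 8.0;  /* Gb/s on the wire -> GB/s of data */
    }

    int main(void)
    {
        /* QDR: 10 Gb/s per lane, 8b/10b encoding (80 percent efficient). */
        printf("12x QDR: %.1f GB/s\n", link_GBps(12, 10.0, 0.8));

        /* EDR: ~25.78 Gb/s per lane, 64b/66b encoding (~97 percent efficient). */
        printf(" 4x EDR: %.1f GB/s\n", link_GBps(4, 25.78125, 64.0 / 66.0));
        return 0;
    }

Both options land at or just above the 12 GB/s target, which is why either path can serve the 10 PetaFLOPS step.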

The next order of magnitude increase in system performance, from 10 PetaFLOPS to 100 PetaFLOPS, will require additional evolution of the InfiniBand standards to permit hundreds of thousands of nodes to be addressed. The InfiniBand industry is already initiating discussions about what evolved capabilities are needed for systems of such scale. As in the prior step up in performance, the required link bandwidths can be achieved by 12x EDR (currently being defined) or perhaps by 4x HDR (identified on the InfiniBand industry roadmap). Systems of such scale may also exploit topologies such as mesh/torus or hypercube, for which there are already large-scale InfiniBand deployments.

The remaining order of magnitude increase in system performance, from 100 PetaFLOPS to 1 ExaFLOPS, requires link bandwidths to increase once again. Either 12x HDR or 4x NDR links will need to be defined. It is also expected that optical technology will play a greater role in systems of such scale.

The Meaning of Exascale

Reaching Exascale computing levels involves much more than just the interconnect. Pending further developments in computer system design and technology, such systems are expected to occupy many hundreds of racks and consume perhaps 20 megawatts of power. Just as many of the high-end systems today are purpose-built with unique packaging, power distribution, cooling and interconnect architectures, we should expect Exascale systems to be predominantly purpose-built. However, before we conclude that the golden age of cluster computing, with its reliance on effective industry-standard interconnects such as InfiniBand, has ended, let’s look further at the data.

Figure 4 - Top500 Performance Trends

Figure 4 is the trends chart from Top500. At first glance, it shows the tremendous growth of high-end HPC over the past two decades, and it projects these trends to continue for the next decade. However, it also shows that the performance of the #1 ranked system is about two orders of magnitude greater than that of the #500 ranked system.

Figure 5 - Top500 below 1 PetaFLOPS (November 2010)

This is further illustrated in Figure 5, which shows performance vs. rank from the November 2010 Top500 list; the seven systems above 1 PetaFLOPS have been omitted so as not to stretch the vertical axis too much. We see that only the 72 highest-ranked systems come within an order of magnitude of 1 PetaFLOPS (1000 TeraFLOPS). This trend is expected to continue, with the implication that once the highest-end HPC systems reach the Exascale threshold, the majority of Top500 systems will be at most on the order of 100 PetaFLOPS, with the #500 ranked system on the order of 10 PetaFLOPS.

Although we often use the Top500 rankings as an indicator of high-end HPC, the vast majority of HPC deployments occur below the Top500.

InfiniBand Evolution

InfiniBand has been an extraordinarily effective interconnect for HPC, with demonstrated scaling up to the Petascale level. The InfiniBand architecture permits low-latency implementations and has a bandwidth roadmap matching the capabilities of host processor technology. InfiniBand’s fabric architecture permits implementation and deployment of highly efficient fabrics, in a range of topologies, with congestion management and resiliency capabilities.

The InfiniBand community has evolved the architecture before to keep it vibrant, and the Technical Working Group is currently assessing further architectural evolution to permit InfiniBand to continue to meet the needs of increasing system scale.

As we move toward an Exascale HPC environment with possibly purpose-built systems, the cluster computing model enabled by InfiniBand interconnects will remain a vital communications model, capable of extending well into the Top500.

Lloyd Dickman
Technical Working Group, IBTA

(Note: This article appears with reprint permission of The Exascale Report™)


January Course: Writing Application Programs for RDMA using OFA Software

December 16th, 2010

As part of its new training initiative, the OpenFabrics Alliance (OFA) is holding a “Writing Application Programs for RDMA using OFA Software” class January 19-20, 2011 at the University of New Hampshire’s InterOperability Lab (UNH-IOL). If you are an application developer skilled in C programming and familiar with sockets, but with little or no experience programming with OpenFabrics Software, this class is the perfect opportunity to develop your RDMA expertise.
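
To give a flavor of this style of programming, here is a minimal client-side sketch using the librdmacm convenience calls that ship with OFA software. The server address and port are hypothetical, a matching server with a posted receive is assumed, and error handling is trimmed for brevity; the build command is an assumption as well.

    /* Minimal RDMA send sketch using librdmacm's convenience API.
     * Build (assumption): gcc rdma_send.c -o rdma_send -lrdmacm -libverbs
     */
    #include <stdio.h>
    #include <rdma/rdma_cma.h>
    #include <rdma/rdma_verbs.h>

    int main(void)
    {
        struct rdma_addrinfo hints = { 0 }, *res;
        struct ibv_qp_init_attr attr = { 0 };
        struct rdma_cm_id *id;
        struct ibv_mr *mr;
        struct ibv_wc wc;
        char msg[64] = "hello over RDMA";

        hints.ai_port_space = RDMA_PS_TCP;
        attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;
        attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
        attr.sq_sig_all = 1;                 /* generate a completion for every send */

        /* Resolve the (hypothetical) server and create a connected endpoint. */
        if (rdma_getaddrinfo("192.0.2.10", "7471", &hints, &res) ||
            rdma_create_ep(&id, res, NULL, &attr)) {
            perror("rdma setup");
            return 1;
        }

        /* Register the buffer so the adapter can DMA it directly (kernel bypass). */
        mr = rdma_reg_msgs(id, msg, sizeof(msg));
        if (!mr)
            return 1;

        if (rdma_connect(id, NULL)) {
            perror("rdma_connect");
            return 1;
        }

        /* Post the send and wait for its completion; no data copies in between. */
        rdma_post_send(id, NULL, msg, sizeof(msg), mr, 0);
        rdma_get_send_comp(id, &wc);

        rdma_disconnect(id);
        rdma_dereg_mr(mr);
        rdma_destroy_ep(id);
        rdma_freeaddrinfo(res);
        return 0;
    }

Even this small example shows the two steps that make RDMA different from sockets programming: registering memory with the adapter and working with explicit completions.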

“Writing Application Programs for RDMA using OFA Software” immediately prepares you for writing application programs using RDMA. The class includes 8 hours of classroom work and 8 hours in the lab on Wednesday and Thursday, January 19 and 20. Attendees enrolled by Dec. 24 will receive a FREE pass and rentals to Loon Mountain for skiing on Friday, January 21.

Software Forge is a member of the IBTA and is helping drive this very first RDMA class. More information is available at www.openfabrics.org/training. Feel free to contact me with questions as well.


Regards,

Rupert Dance

rsdance@soft-forge.com

Member, IBTA’s Compliance and Interoperability Working Group

Last days of SC10

November 19th, 2010

I’ve heard there were more than 10,000 people in New Orleans this week for the SC10 conference and from what I saw on the show floor, in sessions and in the restaurants around the French Quarter, I believe it. The SCinet team has had several tiring yet productive days ensuring the network ran smoothly for the more than 342 exhibitors at the show.

One very popular demo was the real-time 3D flight simulator (see photo below) that was displayed in multiple booths at the show. The flight simulator provided a virtual high-resolution rendering of Salt Lake City, Utah, from the air, running over SCinet’s high-speed, low-latency InfiniBand/RDMA network.

Real time 3D flight simulator

This year, SCinet introduced the SCinet Research Sandbox. Sandbox participants were able to utilize the network infrastructure to demonstrate 100G networks for a wide variety of applications, including petascale computing, next-generation approaches to wide area file transfer, security analysis tools, and data-intensive computing.

This is the tenth Supercomputing show I’ve attended and I’ve made a few observations. Years ago, I used to see a lot of proprietary processors, interconnects, and storage. Now we’re seeing much more standardization around technologies such as InfiniBand. There has also been a lot of interest this year around 100G connectivity and the need for faster data rates.

Several members of the SCinet team. Thank you to all of the volunteers who helped make SCinet a success this week!

The first couple of shows I attended were very scientific and academic in nature. Now as I walk the show floor, it’s exciting to see more commercial HPC applications for financial services, automotive/aviation, and oil & gas.

I had a great time in New Orleans, and I look forward to my next ten SC conferences. See you next year at SC11 in Seattle, WA!

Eric Dube

SCinet/InfiniBand Co-Chair


SCinet Update November 13 – Conference Begins!

November 13th, 2010

Today is Saturday, November 13, and sessions for SC10 have begun. We’re in the home stretch to get SCinet installed. We’ve been working feverishly to get everything running before the start of the conference. In addition, the network demonstrations should all be live in time for the Exhibition Press Tour on Monday night from 6-7 pm.

I’ve included more photos to show you the network in progress. If you’re going to New Orleans and have any questions about SCinet, be sure to stop by our help desk.

All the SCinet DNOCs located throughout the show floor are now finished and ready to go.

The show floor is busy as ever and exhibitor booth construction is well underway.

SCinet's main NOC network equipment racks provide connectivity for network subscribers to the world's fastest network.

All the power distribution units needed to supply power to all the network equipment in the SCinet main NOC.


SCinet has more than 100 volunteers working behind the scenes to bring up the world’s fastest network. Planning began more than a year ago and all of our hard work is about to pay off as we connect SC10 exhibitors and attendees to leading research and commercial networks around the world, including the Department of Energy’s ESnet, Internet2, National LambdaRail and LONI (Louisiana Optical Network Initiative).

I will blog again as the show gets rolling and provide updates on our demos in action.

See you soon!

Eric Dube

SCinet/InfiniBand Co-Chair


SCinet Update November 11 – Two Days Until the Show!

November 11th, 2010

SCinet Network Operations Center (NOC) stage

For those heading to New Orleans for the SC10 conference, the weather this week is upper 70s and clear - although the SCinet team hasn’t had much of a chance to soak up the sun. We’re busy building the world’s fastest network - to be up and running this Sunday, November 14, for one week only. It’s going to be a busy couple of days… let me give you an update on progress to date.

The main SCinet Network Operations Center (NOC) stage is in the process of being built. I’ve included a photo of the convention center and our initial framing, followed by a picture after the power and aerial fiber cable drops have been installed.

Our SCinet team includes volunteers from educational institutions, high performance computing centers, network equipment vendors, U.S. national laboratories, research institutions, and research networks and telecommunication carriers that work together to design and deliver the SCinet infrastructure.

The picture below shows SCinet team members Mary Ellen Dube and Parks Fields receiving the aerial fiber cables that are being lowered from the catwalks scattered throughout the convention center floor.

Aerial fiber cables being lowered from catwalks

Below is a photo of Cary Whitney (my fellow SCinet/InfiniBand Co-Chair) and Parks Fields testing aerial InfiniBand active optical cables going between the distributed/remote NOCs.

Cary Whitney and Parks Fields

Aerial InfiniBand active optical cables

I’ve also included a picture of myself and team member DA Fye running a lift to install the aerial fiber cables going between the main NOC and distributed/remote NOCs throughout the show floor. Next to that is a photo of some of the many InfiniBand active optical cables going to the main SCinet NOC.

Running a lift with DA Fye

InfiniBand active optical cables going to the main SCinet NOC

This year’s SC10 exhibitors and attendees are anticipated to push SCinet’s capacity and capabilities to the extreme. I’ll keep updating this blog to show you how we’re preparing for the show and expected demands on the network.

Eric Dube

SCinet/InfiniBand Co-Chair


Documenting the World’s Fastest Network Installation

November 5th, 2010

Greetings InfiniBand Community,

As many of you know, every fall before Supercomputing, over 100 volunteers - including scientists, engineers, and students - come together to build the world’s fastest network: SCinet. This year, over 168 miles of fiber will be used to form the data backbone. The network takes months to build and is only active during the SC10 conference. As a member of the SCinet team, I’d like to use this blog on the InfiniBand Trade Association’s web site to give you an inside look at the network as it’s being built from the ground up.

This year, SCinet includes a 100 Gbps circuit alongside other infrastructure capable of delivering 260 gigabits per second of aggregate data bandwidth for conference attendees and exhibitors - that’s enough to transfer the entire collection of books at the Library of Congress in well under a minute. However, my main focus will be on building out SCinet’s InfiniBand network in support of distributed HPC application demonstrations.

For SC10, the InfiniBand fabric will consist of Quad Data Rate (QDR) 40-, 80-, and 120-gigabit-per-second (Gbps) circuits linking together various organizations and vendors, with high-speed 120 Gbps circuits providing backbone connectivity throughout the SCinet InfiniBand switching infrastructure.

Here are some of the InfiniBand network specifics that we have planned for SC10:

  • 12x InfiniBand QDR (120 Gbps) connectivity throughout the entire backbone network
  • 12 SCinet InfiniBand network participants
  • Approximately 11 equipment and software vendors working together to provide all the resources to build the IB network
  • Approximately 23 InfiniBand switches used for all the connections to the IB network
  • Approximately 5.39 miles (8.67 km) of fiber cable used to build the IB network

The photos on this page show all the IB cabling that we need to sort through and label prior to installation, as well as the numerous SCinet Network Operations Center (NOC) systems and distributed/remote NOC (dNOC) equipment racks getting installed and configured.

In future blog posts, I’ll update you on the status of the SCinet installation and provide more details on the InfiniBand demonstrations that you’ll be able to see at the show, including a flight simulator in 3D and Remote Desktop over InfiniBand (RDI).

Stay tuned!

Eric Dube

SCinet/InfiniBand Co-Chair


IBTA Announces RoCE Specification, Bringing the Power of the RDMA I/O Architecture to Ethernet-Based Business Solutions

April 22nd, 2010

As you may have already heard, earlier this week at HPC Financial Markets in New York, the IBTA officially announced the release of RDMA over Converged Ethernet - i.e. RoCE. The new specification, pronounced “Rocky,” provides the best of both worlds: InfiniBand efficiency and Ethernet ubiquity.

RoCE utilizes Remote Direct Memory Access (RDMA) to enable ultra-low-latency communication - roughly one tenth that of other standards-based solutions. RDMA moves data from one node to another without requiring much help from the CPU or operating system. The specification applies to 10GigE, 40GigE and higher speed adapters.
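
Because RoCE keeps the verbs interface and changes only the wire underneath, existing RDMA software sees a RoCE adapter just like any other RDMA device. As a small illustration (a sketch assuming libibverbs is installed, not part of the RoCE announcement itself), the following lists the local RDMA devices and reports whether each one’s first port runs over InfiniBand or Ethernet:

    /* List RDMA devices and report the link layer of port 1.
     * Build (assumption): gcc list_rdma.c -o list_rdma -libverbs
     */
    #include <stdio.h>
    #include <infiniband/verbs.h>

    int main(void)
    {
        int num, i;
        struct ibv_device **list = ibv_get_device_list(&num);

        if (!list)
            return 1;

        for (i = 0; i < num; i++) {
            struct ibv_context *ctx = ibv_open_device(list[i]);
            struct ibv_port_attr port;

            if (!ctx || ibv_query_port(ctx, 1, &port))   /* look at port 1 only */
                continue;

            printf("%s: link layer %s\n", ibv_get_device_name(list[i]),
                   port.link_layer == IBV_LINK_LAYER_ETHERNET ? "Ethernet (RoCE)"
                                                              : "InfiniBand");
            ibv_close_device(ctx);
        }

        ibv_free_device_list(list);
        return 0;
    }

The same verbs-level send/receive and RDMA read/write operations then work unchanged whichever link layer is reported.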

For people locked into an Ethernet infrastructure who are not currently using RDMA but would like to, RoCE lowers the barriers to deployment. In addition to low latency, RoCE end user benefits include improved application performance, efficiency, and cost and power savings.

RoCE delivers compelling benefits to high-growth markets and applications, including financial services, data warehousing and clustered cloud computing. Products based on RoCE will be available over the coming year.

Since our April 19 launch, we have seen great news coverage.

Be sure to watch for another announcement from the IBTA next week at Interop. I hope to connect with several of you at the show.

Brian Sparks

IBTA Marketing Working Group Co-Chair

InfiniBand Leads List of Russian Top50 Supercomputers; Connects 74 Percent, Including Seven of the Top10 Supercomputers

April 14th, 2010

Last week, the 12th edition of Russia’s Top50 list of the most powerful high performance computing systems was released at the annual Parallel Computing Technologies international conference. The list is ranked according to Linpack benchmark results and provides an important tool for tracking usage trends in HPC in Russia.

The fastest supercomputer on the Top50 is enabled by 40 Gb/s InfiniBand and has a peak performance of 414 teraflops. More importantly, it is clear that InfiniBand is dominating the list as the most-used interconnect solution, connecting 37 systems, including the top three and seven of the Top10.

According to the Linpack benchmark results, InfiniBand demonstrates up to 92 percent efficiency; InfiniBand’s high system efficiency and utilization allow users to maximize their return on investment for their HPC server and storage infrastructure. Nearly three quarters of the list - represented by leading research laboratories, universities, industrial companies and banks in Russia - rely on industry-leading InfiniBand solutions to provide the highest bandwidth, efficiency, scalability and application performance.
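
For context, Linpack efficiency is simply the measured result divided by the theoretical peak (Rmax/Rpeak). A trivial illustration, using the 414 teraflops peak quoted above and a placeholder Rmax rather than the system’s actual score:

    /* Linpack efficiency = Rmax / Rpeak. */
    #include <stdio.h>

    int main(void)
    {
        double rpeak_tflops = 414.0;    /* theoretical peak cited in this post */
        double rmax_tflops  = 380.0;    /* hypothetical measured Linpack result */

        printf("efficiency: %.0f%%\n", 100.0 * rmax_tflops / rpeak_tflops);
        return 0;
    }

The closer that ratio is to 100 percent, the less performance is being lost to communication and other overheads.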

Highlights of InfiniBand usage on the April 2010 Russian TOP50 list include:

  • InfiniBand connects 74 percent of the Top50, including seven of the Top10 most prestigious positions (#1, #2, #3, #6, #8, #9 and #10)
  • InfiniBand provides world-leading system utilization, up to 92 percent efficiency as measured by the Linpack benchmark
  • The list showed a sharp increase in aggregate performance - the total peak performance exceeded 1 PFlops, reaching 1,152.9 TFlops, an increase of 120 percent compared to the September 2009 list - highlighting the increasing demand for higher performance
  • Ethernet connects only 14 percent of the list (seven systems) and there were no 10GigE clusters
  • Proprietary clustering interconnects declined 40 percent to connect only three systems on the list

I look forward to seeing the results of the Top500 in June at the International Supercomputing Conference. I will be attending the conference, as will many of our IBTA colleagues, and I look forward to seeing all of our HPC friends in Germany.

Brian Sparks

IBTA Marketing Working Group Co-Chair

Foundation for the Converging Data Center: InfiniBand with Intelligent Fabric Management

March 26th, 2010

Most data centers are starting to realize the benefits of virtualization, cloud computing and automation. However, the heavy I/O requirements and intense need for better visibility and control quickly become key challenges that create inefficiencies and inhibit wider adoption of these advancements.

InfiniBand coupled with intelligent fabric management software can address many of these challenges, specifically those related to connectivity and I/O.

Technology analysis firm The Taneja Group recently took an in-depth look at this topic and published an interesting whitepaper, “Foundation for the Converging Data Center: Intelligent Fabric Management.” The paper lays out the requirements for intelligent fabric management and highlights how the right software can harness the traffic analysis capabilities that are inherent in and unique to InfiniBand to make data centers run a lot more efficiently. You can download the paper for free here.

Another interesting InfiniBand data point they shared: “in a recent Taneja Group survey of 359 virtual server administrators, those with InfiniBand infrastructures considered storage provisioning and capacity/performance management twice as easy as users of some of the other fabrics” (Taneja Group 2009 Survey of Storage Best Practices and Server Virtualization).

The High Performance Computing Center Stuttgart (HLRS) is one organization that has benefited from using 40 Gb/s InfiniBand and Voltaire’s Unified Fabric Manager™ software (UFM™ software) on its 700-node multi-tenant cluster, which essentially operates as a cloud delivering HPC services to its customers. You can read more about it here.

I’d also like to invite you to tune into a recent webinar, “How to Optimize and Accelerate Application Performance with Intelligent Fabric Management,” co-hosted by Voltaire, The Taneja Group and Adaptive Computing. In it, we explore this topic further and give an overview of some of the key capabilities of Voltaire’s UFM software.

Look forward to seeing many of you at Interop in Las Vegas next month!


Christy Lynch

Director, Corporate Communications, Voltaire

Member, IBTA Marketing Working Group