Observations from SC12

December 3rd, 2012

The week of Supercomputing went by quickly and resulted in many interesting discussions around supercomputing and its role in both HPC environments and enterprise data centers. Now that we’re back to work, we’d like to reflect on the successful event. The conference this year drew a diverse international audience, with major participation from top universities, many of which seem to be on the leading edge of Remote Direct Memory Access (RDMA) and InfiniBand deployments.

Overall, we saw InfiniBand and Open Fabrics technologies continue their strong presence at the conference. InfiniBand dominated the Top500 list and is still the #1 interconnect of choice for the world’s fastest supercomputers. The Top500 list also demonstrated that InfiniBand is leading the way to efficient computing, which not only benefits high performance computing, but enterprise data center environments as well.

We also engaged in several discussions around RDMA. Attendees, and analysts in particular, were interested in new products using RDMA over Converged Ethernet (RoCE) and their availability, and were impressed that Microsoft Windows Server 2012 natively supports all three RDMA transports, including InfiniBand and RoCE. Another interesting development is InfiniBand customer Microsoft Windows Azure, whose increased efficiency placed it at #165 on the Top500 list.

IBTA & OFA Booth at SC12

IBTA’s Electro-Mechanical Working Group Chair, Alan Benner discussing the new InfiniBand specification with attendees at the IBTA & OFA SC12 booth

IBTA’s release of the new InfiniBand Architecture Specification 1.3 generated a lot of buzz among attendees, press and analysts. IBTA’s Electro-Mechanical Working Group Chair, Alan Benner, was one of our experts at the booth and drew a large crowd of people interested in the InfiniBand roadmap and his projections around the availability of the next specification, which is expected to include EDR and become available in draft form in April 2013.

SC12 provides a great opportunity for those in high performance computing to connect in person and engage in discussions around hot industry topics; this year the focus was on Software Defined Networking (SDN), OpenSM, and the pioneering efforts of both the IBTA and OFA. We enjoyed conversations with the exhibitors and attendees who visited our booth, and a special thank you goes to all of the RDMA experts who participated in our booth sessions: Bill Boas, Cray; Katharine Schmidtke, Finisar; Alan Benner, IBM; Todd Wilde, Mellanox; Rupert Dance, Software Forge; Kevin Moran, System Fabric Works; and Josh Simons, VMware.

Rupert Dance
Software Forge

IBTA & OFA Join Forces at SC12

November 7th, 2012

Attending SC12? Check out OFA’s Exascale and Big Data I/O panel discussion and stop by the IBTA/OFA booth to meet our industry experts

The IBTA is gearing up for the annual SC12 conference taking place November 10-16 at the Salt Palace Convention Center in Salt Lake City, Utah. We will be joining forces with the OpenFabrics Alliance (OFA) on a number of conference activities and will be exhibiting together at SC12 booth #3630.

IBTA members will participate in the OFA-moderated panel, Exascale and Big Data I/O, which we highly recommend attending if you’re at the conference.  The panel session, moderated by IBTA and OFA member Bill Boas, takes place Wednesday, November 14 at 1:30 p.m. Mountain Time and will discuss drivers for future I/O architectures.

Also be sure to stop by the IBTA and OFA booth #3630 to chat with industry experts regarding a wide range of industry topics, including:

• Behind the IBTA integrators list
• High speed optical connectivity
• Building and validating OFA software
• Achieving low latency with RDMA in virtualized cloud environments
• UNH-IOL hardware testing and interoperability capabilities
• Utilizing high-speed interconnects for HPC
• Release 1.3 of IBA Vol2
• Peering into a live OFS cluster
• RoCE in Wide Area Networks
• OpenFabrics for high speed SAN and NAS

Experts including Katharine Schmidtke, Finisar; Alan Benner, IBM; Todd Wilde, Mellanox; Rupert Dance, Software Forge; Bill Boas and Kevin Moran, System Fabric Works; and Josh Simons, VMware, will be in the booth to answer your questions and discuss topics currently affecting the HPC community.

Be sure to check the SC12 website to learn more about Supercomputing 2012, and stay tuned to the IBTA website and Twitter to follow IBTA’s plans and activities at SC12.

See you there!

New InfiniBand Architecture Specification Open for Comments

October 15th, 2012

After an extensive review process, Release 1.3 of Volume 2 of the InfiniBand Architecture Specification has been approved by our Electro-Mechanical Working Group (EWG). The specification is undergoing final review by the full InfiniBand Trade Association (IBTA) membership and will be available for vendors at Plugfest 22, taking place October 15-26, 2012 at the University of New Hampshire InterOperability Laboratory (UNH-IOL) in Durham, New Hampshire.

All IBTA working groups and individual members have had several weeks to review and comment on the specification. We are encouraged by the feedback we’ve received and are looking forward to the official release at SC12, taking place November 10-16 in Salt Lake City, Utah.

Release 1.3 is a major overhaul of the InfiniBand Architecture Specification and features important new architectural elements:
• FDR and EDR signal specification methodologies
• Analog signal specifications for FDR, which have been verified through Plugfest compliance and interoperability measurements
• A more efficient 64b/66b encoding method
• Forward Error Correction coding
• Improved specifications for QSFP-4x and CXP-12x connectors, ports and management interfaces

The new specification also includes significant copy editing and a reorganization into sub-volumes to improve overall readability. The previous release, 1.2.1, was published in November 2007. As Chair of the EWG, I’m pleased with the technical progress made on the InfiniBand Architecture Specification. More importantly, I’m excited about the impact that this new release will have for users and developers of InfiniBand technology.

Alan Benner
EWG Chair

InfiniBand as Data Center Communication Virtualization

August 1st, 2012

Last month, the Taneja Group released a report on InfiniBand’s role in the data center, “InfiniBand’s Data Center March,” confirming what members of the IBTA have known for a while: InfiniBand is expanding its role in the enterprise.

Mike Matchett, Sr. Analyst at Taneja Group, recently posted a blog on the industry report. In his post, Mike summarizes the growing InfiniBand market and the benefits he sees in adopting InfiniBand. Here is an excerpt from his blog:

Recently we posted a new market assessment of InfiniBand and its growing role in enterprise data centers, so I’ve been thinking a lot about low-latency switched fabrics and what they imply for IT organizations. I’d like to add a more philosophical thought about the optimized design of InfiniBand and its role as data center communication virtualization.

From the start, InfiniBand’s design goal was to provide a high performing messaging service for applications even if they existed in entirely separate address spaces across servers. By architecting from the “top down” rather than layering up from something like Ethernet’s byte stream transport, InfiniBand is able to deliver highly efficient and effective messaging between applications. In fact, the resulting messaging service can be thought of as “virtual channel IO” (I’m sure much to the delight of mainframers).
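To make that “virtual channel IO” idea a bit more concrete, here is a minimal sketch (our illustration, not from Mike’s post) of how an application might create such a channel - a queue pair - using the standard libibverbs API from the OpenFabrics software stack; the device choice, queue sizes and omitted error handling are simplifying assumptions.

```c
/* Sketch: creating an InfiniBand "virtual channel" (a queue pair) with libibverbs.
 * Illustrative only; device choice, queue sizes and error handling are simplified.
 * Build with something like: gcc qp_sketch.c -libverbs */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) {
        fprintf(stderr, "no RDMA-capable devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);          /* first HCA */
    struct ibv_pd *pd = ibv_alloc_pd(ctx);                       /* protection domain */
    struct ibv_cq *cq = ibv_create_cq(ctx, 16, NULL, NULL, 0);   /* completion queue */

    struct ibv_qp_init_attr attr = {
        .send_cq = cq,
        .recv_cq = cq,
        .cap     = { .max_send_wr = 16, .max_recv_wr = 16,
                     .max_send_sge = 1, .max_recv_sge = 1 },
        .qp_type = IBV_QPT_RC,            /* a reliable, connected "channel" */
    };
    struct ibv_qp *qp = ibv_create_qp(pd, &attr);
    if (!qp) {
        fprintf(stderr, "ibv_create_qp failed\n");
        return 1;
    }

    printf("created queue pair %u on %s\n",
           qp->qp_num, ibv_get_device_name(devs[0]));

    /* A real application would now exchange addressing information with its
     * peer and transition the QP to the connected state before messaging. */
    ibv_destroy_qp(qp);
    ibv_destroy_cq(cq);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```

The queue pair is the endpoint of the messaging service Mike describes: once two of them are connected, applications in entirely separate address spaces exchange messages over that channel without involving the remote CPU in the data path.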

To read the full blog post, check out the Taneja blog. And be sure to read the full analyst report from Taneja Group, published in July.

Bill Lee
IBTA Marketing Working Group Chair


InfiniBand’s Data Center March

July 18th, 2012

Today’s enterprise data center is challenged with managing growing data, hosting denser computing clusters, and meeting increasing performance demands. As IT architects work to design efficient solutions for Big Data processing, web-scale applications, elastic clouds, and the virtualized hosting of mission-critical applications, they are realizing that key infrastructure design “patterns” include scale-out compute and storage clusters, switched fabrics, and low-latency I/O.

This looks a lot like what the HPC community has been pioneering for years - leveraging scale-out compute and storage clusters with high-speed low-latency interconnects like InfiniBand. In fact, InfiniBand has now become the most widely used interconnect among the top 500 supercomputers (according to www.TOP500.org).  It has taken a lot of effort to challenge the entrenched ubiquity of Ethernet, but InfiniBand has not just survived for over a decade, it has consistently delivered on an aggressive roadmap - and it has an even more competitive future. 

The adoption of InfiniBand in a data center core environment not only supercharges network communications but, by simplifying and converging cabling and switching, also reduces operational risk and can even reduce overall cost. Bolstered by technologies that should ease migration concerns, like RoCE and virtualized protocol adapters, we expect to see InfiniBand further expand into mainstream data center architectures, not only as a back-end interconnect in high-end storage systems but also as the main interconnect across the core.

For more details be sure to check out Taneja Group’s latest report “InfiniBand’s Data Center March” - available here.

Mike Matchett
Sr. Analyst, Taneja Group 


InfiniBand Most Used Interconnect on the TOP500

June 21st, 2012
The TOP500 list was released this week, ranking the fastest supercomputers in the world. We at the IBTA were excited to see that InfiniBand’s presence on the list grew, making it the most used interconnect for the first time. Clearly, InfiniBand is the interconnect of choice for today’s compute-intensive systems! As the chart below demonstrates, InfiniBand’s adoption rate has grown significantly, outpacing all of the other options.


We were also pleased to see that since the last report six months ago, FDR has increased the number of systems it connects tenfold, making it the fastest growing interconnect on the list.

The TOP500 list notes that InfiniBand connects eight of the 20 Petascale systems on the list, and that InfiniBand-connected systems boast the highest performance growth rates on the list.  Petascale systems on the TOP500 list favored InfiniBand because of its scalability and the resulting computing efficiency. The graph below illustrates the performance trends showing how supercomputing depends on InfiniBand to achieve the highest performance.


The TOP500 list demonstrates what the IBTA and the InfiniBand community have already known - InfiniBand is a technology that has changed the face of HPC and, we believe, is having the same effect on the enterprise data center. Below are some additional stats from the TOP500 list.

  • InfiniBand connects 25 of the 30 most compute-efficient systems, including the top 2
  • InfiniBand-based system performance grew 69% from June ‘11 to June ‘12

Want to learn more about the TOP500, or how InfiniBand fared? Check out the IBTA’s press release on the news.


Bill Lee
IBTA Marketing Working Group Co-Chair


InfiniBand at Interop

May 29th, 2012

This month, IBTA member companies attended the Interop 2012 conference in Las Vegas. As news from the event streamed in and demos began on-site, we were excited to see that InfiniBand and RDMA were making headlines at this traditionally datacenter-focused event. Microsoft, Fusion-io and Mellanox demoed a setup with Windows Server 2012 Beta and SMB 3.0 that illustrated remarkable remote file performance using SMB Direct (SMB over RDMA), delivering 5.8 Gbytes per second from a single network port. The demo combined Intel Romley motherboards, each with two 8-core CPUs, the faster PCIe Gen3 bus, four Fusion-io ioDrive2 drives rated at 1.5 Gbytes/sec each, and the latest Mellanox InfiniBand ConnectX-3 network adapters. Microsoft’s Jose Barreto noted in his TechNet coverage of the demo, referring to the results table that accompanied it:

“You can’t miss how RDMA improves the numbers for % Privileged CPU utilization, fulfilling the promise of low CPU utilization and low number of cycles per byte. The comparison between traditional, non-RDMA 10GbE and InfiniBand FDR for the first workload shows the most impressive contrast: over 5 times the throughput with about half the CPU utilization.”


If you’re interested in learning more about the demo, check out Jose’s presentation from the conference; slides are available on his blog. Also, if you want to learn more about RDMA, don’t forget to swing by the RDMA over Converged Ethernet (RoCE) section of the IBTA website.

We’re happy to see RDMA getting the recognition it deserves, and we look forward to seeing more coverage resulting from Interop in the coming days, as well as future discussions around RoCE and how InfiniBand solutions can be deployed in the enterprise.

Bill Lee


VMware: InfiniBand and RDMA Better Than Ethernet

April 25th, 2012

Last month, VMware’s Josh Simons attended the OpenFabrics Alliance User and Developer Workshop in Monterey, CA. While at the event, Josh sat down with InsideHPC to discuss VMware’s current play in the HPC space, big data, and the company’s interest in InfiniBand and RDMA. Josh believes that by adopting RDMA, VMware can better manage low-latency issues in the enterprise.

Josh noted that RDMA over an InfiniBand device aids virtualization. VMware is seeing live-migration times shrink in its virtualization platforms, such as vMotion. The company is also seeing CPU savings thanks to the efficient RDMA applications it has deployed.

Greg Ferro also posted on the EtherealMind blog about how VMware believes InfiniBand and RDMA are better than Ethernet alone. According to Greg:

“Good InfiniBand networks have latency measured in hundreds of nanoseconds and much lower impact on system CPU because InfiniBand uses RDMA to transfer data. RDMA (Remote Direct Memory Access) means that data is transferred from memory location to memory location thus removing the encapsulation overhead of Ethernet and IP (that’s as short as I can make that description).”
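To illustrate the memory-to-memory transfer Greg describes, here is a hedged libibverbs sketch (our addition, not from either blog post) of posting a one-sided RDMA WRITE. It assumes an already-connected queue pair and that the remote buffer’s address and rkey have been exchanged out of band; the names, buffer size and flags are illustrative.

```c
/* Sketch: a one-sided RDMA WRITE with libibverbs (illustrative only).
 * Assumes `pd` and a connected RC queue pair `qp` already exist, and that the
 * remote buffer's address and rkey were exchanged out of band (e.g. over TCP). */
#include <stdint.h>
#include <string.h>
#include <infiniband/verbs.h>

int rdma_write_example(struct ibv_pd *pd, struct ibv_qp *qp,
                       uint64_t remote_addr, uint32_t rkey)
{
    static char buf[4096] = "hello, remote memory";

    /* Register the local buffer so the adapter can read it directly. */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, sizeof(buf), IBV_ACCESS_LOCAL_WRITE);
    if (!mr)
        return -1;

    struct ibv_sge sge;
    memset(&sge, 0, sizeof(sge));
    sge.addr   = (uintptr_t)buf;
    sge.length = sizeof(buf);
    sge.lkey   = mr->lkey;

    struct ibv_send_wr wr, *bad_wr = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.wr_id               = 1;
    wr.sg_list             = &sge;
    wr.num_sge             = 1;
    wr.opcode              = IBV_WR_RDMA_WRITE;   /* data lands directly in remote memory */
    wr.send_flags          = IBV_SEND_SIGNALED;
    wr.wr.rdma.remote_addr = remote_addr;
    wr.wr.rdma.rkey        = rkey;

    /* The CPU's work ends here; the adapter moves the bytes, with no TCP/IP
     * encapsulation on the data path. Completion would be harvested from the
     * send completion queue with ibv_poll_cq(). */
    return ibv_post_send(qp, &wr, &bad_wr);
}
```

Once the work request is posted, the adapter carries the payload from the registered local memory into the remote memory region, which is exactly the removal of encapsulation and CPU overhead Greg is pointing at.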

To read the full text of Greg Ferro’s blog post, click here.

To watch Josh’s interview with InsideHPC, or to check out the other presentations from the OFA workshop, head over to the InsideHPC workshop page. Presentations are also available for download on the OFA website.

Brian Sparks
IBTA Marketing Working Group Co-Chair


High Performance Computing by Any Other Name Smells So Sweet

March 29th, 2012
Taken from the ISC HPC Blog, by Peter Ffoulkes

This month, the annual HPC Advisory Council meeting took place in Switzerland, stirring many discussions about the future of big data and the available and emerging technologies poised to help enterprises cope with the inundation of data.

On the ISC blog, Peter Ffoulkes from TheInfoPro writes that the enterprise is finally starting to discover that HPC is the pathway to success, especially when it comes to Big Data Analytics.

“Going back to fundamentals, HPC is frequently defined as either compute intensive or data intensive computing or both. Welcome to today’s hottest commercial computing workload, “Total Data” and business analytics. As described by 451 Research, “Total Data” involves processing any data that might be applicable to the query at hand, whether that data is structured or unstructured, and whether it resides in the data warehouse, or a distributed Hadoop file system, or archived systems, or any operational data source – SQL or NoSQL – and whether it is on-premises or in the cloud.”

According to Ffoulkes, the answer to the total data question will remain in HPC. This week, many of us attended the OpenFabrics Alliance User and Developer Workshop and discussed these same topics: enterprise data processing needs, cloud computing and big data. While the event has ended, I hope the discussions continue as we look to the future of big data.

In the meantime, be sure to check out Peter’s thoughts in his full blog post.

Jim Ryan


RoCE and InfiniBand: Which should I choose?

February 13th, 2012

The IBTA wrapped up its four-part fall webinar series in December, and if you didn’t have the opportunity to attend these events live, recorded versions are available on the IBTA’s website. In the webinar series, we suggested that it makes sense to take a fresh look at I/O in light of recent developments in I/O and data center architecture. We took a high-level look at two RDMA technologies: InfiniBand and a relative newcomer called RoCE (RDMA over Converged Ethernet).

RDMA is an interesting network technology that has been dominant in the HPC marketplace for quite a while and is now finding increasing application in modern commercial data centers, especially in performance-sensitive environments or environments that depend on an agile, cost-constrained approach to computing, for example almost any form of cloud computing. So it’s no surprise that several questions arose during the webinar series about the differences between a “native” InfiniBand RDMA fabric and one based on RoCE. In a nutshell, the questions boiled down to this: What can InfiniBand do that RoCE cannot? If I start down the path of deploying RoCE, why not simply stick with it, or should I plan to migrate to InfiniBand?

As a quick review, RoCE is a new technology that is best thought of as a network that delivers many of the advantages of RDMA, such as lower latency and improved CPU utilization, but over an Ethernet switched fabric instead of InfiniBand adapters and switches. This is illustrated in the diagram below. Conceptually, RoCE is simple enough, but there is a subtlety that is easy to overlook. Many of us, when we think of Ethernet, naturally envision the complete IP architecture consisting of TCP, IP and Ethernet. But the truth is that RoCE bears no relationship to traditional TCP/IP/Ethernet, even though it uses an Ethernet link layer. The diagram also compares the two RDMA technologies to traditional TCP/IP/Ethernet. As the drawing makes clear, RoCE and InfiniBand are sibling technologies, but are only distant cousins to TCP/IP/Ethernet. Indeed, RoCE’s heritage is found in the basic InfiniBand architecture, and it is fully supported by the open source software stacks provided by the OpenFabrics Alliance. So if it’s possible to use Ethernet and still harvest the benefits of RDMA, what’s to choose between the two? Naturally, there are trade-offs to be made.
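One small way to see the sibling relationship in practice: the verbs API is identical for both, and an application can simply query a port’s link layer to find out whether it is running over native InfiniBand or Ethernet (RoCE). The sketch below is our illustration, assuming a libibverbs version recent enough to report the link_layer field.

```c
/* Sketch: the same verbs API reports whether each port runs over native
 * InfiniBand or Ethernet (RoCE). Illustrative; assumes libibverbs with RoCE
 * support (the link_layer field). Build with: gcc link_layer.c -libverbs */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs)
        return 1;

    for (int i = 0; i < num; i++) {
        struct ibv_context *ctx = ibv_open_device(devs[i]);
        struct ibv_port_attr port;

        if (ctx && ibv_query_port(ctx, 1, &port) == 0)    /* look at port 1 */
            printf("%s: link layer = %s\n",
                   ibv_get_device_name(devs[i]),
                   port.link_layer == IBV_LINK_LAYER_ETHERNET
                       ? "Ethernet (RoCE)" : "InfiniBand");
        if (ctx)
            ibv_close_device(ctx);
    }

    ibv_free_device_list(devs);
    return 0;
}
```

Everything above the wire - queue pairs, memory registration, the verbs calls themselves - is the same code either way; only the underlying link layer differs.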



During the webinar we presented the following chart as a way to illustrate some of the trade-offs that one might encounter in choosing an I/O architecture.  The first column shows a pure Ethernet approach, as is common in most data centers today.  In this scenario, the data center rides the wave of improvements in Ethernet speeds.  Naturally, using traditional TCP/IP/Ethernet, you don’t get any of the RDMA advantages.   For this blog, our interest is mainly in the middle and right hand columns which focus on the two alternate implementations of RDMA technology.  


From the application perspective both RoCE and native InfiniBand present the same API and provide about the same sets of services.  So what are the differences between them?  They really break down into four distinct areas. 

  • Wire speed and the bandwidth roadmap. The roadmap for Ethernet is maintained by the IEEE and is designed to suit the needs of a broad range of applications, ranging from home networks to corporate LANs to data center interconnects and even wide area networking. Naturally, each type of application has unique requirements, including different speed requirements; client networking, for example, does not have the speed requirements that are typical of a data center application. Across this wide range of applications, the Ethernet roadmap naturally tends to reflect the bulk of its intended market, even though speed grades more representative of data center needs (40 and 100GbE) have recently been introduced. The InfiniBand roadmap, on the other hand, is maintained by the InfiniBand Trade Association and has one focus: to be the highest performance data center interconnect possible. Commodity InfiniBand components (NICs and switches) at 40Gb/s have been in wide distribution for several years now, and a new 56Gb/s speed grade has recently been announced. Although the InfiniBand and Ethernet roadmaps are slowly converging, it is still true that the InfiniBand bandwidth roadmap leads the Ethernet roadmap. So if bandwidth is a serious concern, you would probably want to think about deploying an InfiniBand fabric.

                      InfiniBand Speed Roadmap

  •  Adoption curve. Historically, next generation Ethernet has been deployed first as a backbone (switch-to-switch) technology and eventually trickled down to the end nodes. 10GbE was ratified in 2002, but until 2007 almost all servers connected to the Ethernet fabric using 1GbE, with 10GbE reserved for the backbone. The same appears to be true for 40 and 100GbE; although the specs were ratified by the IEEE in 2010, an online search for 40GbE NICs reveals only one 40GbE NIC product in the marketplace today. Server adapters for InfiniBand on the other hand, are ordinarily available coincident with the next announced speed bump allowing servers to connect to an InfiniBand network at the very latest speed grades right away. 40Gb/s InfiniBand HCAs, known as QDR, have been available for a number of years now, and new adapter products matching the next roadmap speed bump, known as FDR, were announced at SC11 this past fall. The important point here is that one trade-off to be made in deciding between RoCE and native InfiniBand is that RoCE allows you to preserve your familiar Ethernet switched fabric, but at the price of a slower adoption curve compared to native InfiniBand.
  • Fabric management. RoCE and InfiniBand both offer many of the features of RDMA, but there is a fundamental difference between an RDMA fabric built on Ethernet using RoCE and one built on top of native InfiniBand wires. The InfiniBand specification describes a complete management architecture based on a central fabric management scheme which is very much in contrast to traditional Ethernet switched fabrics, which are generally managed autonomously. InfiniBand’s centralized management architecture, which gives its fabric manager a broad view of the entire layer 2 fabric, allows it to provide advanced fabric features such as support for arbitrary layer 2 topologies, partitioning, QoS and so forth. These may or may not be important in any particular environment, but by avoiding the limitations of the traditional spanning tree protocol, InfiniBand fabrics can maximize bi-sectional bandwidth and thereby take full advantage of the fabric capacity. That’s not to say that there are not proprietary solutions in the Ethernet space, or that there is no work underway to improve Ethernet management schemes, but again, if these features are important in your environment, that may impact your choice of native InfiniBand compared to an Ethernet-based RoCE solution. So when choosing between an InfiniBand fabric and a RoCE fabric, it makes sense to consider the management implications.
  • Link level flow control vs. DCB. RDMA, whether native InfiniBand or RoCE, works best when the underlying wires implement a so-called lossless fabric. A lossless fabric is one where packets on the wire are not routinely dropped. By comparison, traditional Ethernet is considered a lossy fabric since it frequently drops packets, relying on the TCP transport layer to notice these lost packets and to adjust for them. InfiniBand, on the other hand, uses a technique known as link level flow control, which ensures that packets are not dropped in the fabric except in the case of serious errors. This technique helps explain much of InfiniBand’s traditionally high bandwidth utilization efficiency. In other words, you get all the bandwidth for which you’ve paid. When using RoCE, you can accomplish almost the same thing by deploying the latest version of Ethernet sometimes known as Data Center Bridging, or DCB. DCB comprises five new specifications from the IEEE which taken together provide almost the same lossless characteristic as InfiniBand’s link level flow control. But there’s a catch; to get the full benefit of DCB requires that your switches and NICs implement the important parts of these new IEEE specifications. I would be very interested to hear from anybody who has experience with these new features in terms of how complex they are to implement in products, how well they work in practice, and if there are any special management challenges.

As we pointed out in the webinars, there are many practical routes to follow on the path to an RDMA fabric.  In some environments, it is entirely likely that RoCE will be the ultimate destination, providing many of the benefits of RDMA technology while preserving major investments in existing Ethernet.  In some other cases, RoCE presents a great opportunity to become familiar with RDMA on the way toward implementing the highest performance solution based on InfiniBand.  Either way, it makes sense to understand some of these key differences in order to make the best decision going forward.

If you didn’t get a chance to attend any of the webinars or missed one of the parts, be sure to check out the recording here on the IBTA website.  Or, if you have any lingering questions about the webinars or InfiniBand and RoCE, email me at pgrun@systemfabricworks.com.

Paul Grun
System Fabric Works
