Archive

Archive for the ‘Uncategorized’ Category

How RDMA is Solving AI’s Scalability Problem

October 25th, 2017


Artificial Intelligence (AI) is already impacting many aspects of our day-to-day lives. Through the use of AI, we have been introduced to autonomous vehicles, real-time fraud detection, advances in public safety, advanced drug discovery, cancer research and much more. AI has already enabled scientific achievements once thought impossible, while also delivering on the promise of improving humanity.

Today, AI and machine learning are becoming completely intertwined with society and the way we interact with computers, but the real barrier to tackling the even bigger challenges of tomorrow is scalable performance. As future research, development and simulations require the processing of larger data sets, the key to unlocking the performance barrier of highly parallelized computing, and the communication overhead that comes with it, will undoubtedly be the interconnect. The more parallel processes we add to solve a complex problem, the more communication and data movement is needed. Remote Direct Memory Access (RDMA) fabrics, such as InfiniBand and RDMA over Converged Ethernet (RoCE), are key to unlocking scalable performance for the most demanding AI applications being developed and deployed today.
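
To make that scaling pressure concrete, here is a small illustrative model (an assumption-laden sketch, not from the original post) of the data each worker must move per training step when gradients are summed across workers, comparing a naive all-to-all exchange with the ring allreduce pattern used by libraries like NCCL; the 100 MB gradient size and worker counts are hypothetical.

```python
# Illustrative model of per-worker communication in data-parallel training.
# Assumption (hypothetical): each worker holds a 100 MB gradient buffer that
# must be summed across all workers after every training step.

GRADIENT_MB = 100

def naive_all_to_all(n_workers):
    """Each worker ships its full buffer to every other worker."""
    return GRADIENT_MB * (n_workers - 1)

def ring_allreduce(n_workers):
    """Bandwidth-optimal ring allreduce: each worker sends 2*(N-1)/N buffers."""
    return GRADIENT_MB * 2 * (n_workers - 1) / n_workers

for n in (4, 16, 64, 256):
    print(f"{n:>4} workers: naive {naive_all_to_all(n):>8} MB/step, "
          f"ring {ring_allreduce(n):>6.1f} MB/step")
```

Even with an optimal collective, every worker still moves on the order of the full gradient each step, which is why the fabric's bandwidth and latency set the scaling limit.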

The InfiniBand Trade Association’s (IBTA) InfiniBand Roadmap lays out a clear and attainable path for performance gains, detailing 1x, 4x and 12x port widths with bandwidths reaching 600Gb/s this year and further outlining plans for future speed increases. For those already deploying InfiniBand in their HPC and AI systems, the roadmap provides specific milestones around expected performance improvements, ensuring their investment is protected and that backward and forward compatibility is maintained across generations. While high bandwidth is very important, the low latency benefits of RDMA are equally essential for the advancement of machine learning and AI. The ultra-low latency provided by RDMA minimizes processing overhead and greatly accelerates overall application performance, which AI requires when moving massive amounts of data, exchanging messages and computing results. InfiniBand’s low latency and high bandwidth characteristics will undoubtedly address AI scalability and efficiency needs as systems tackle challenges involving even larger and more complex data sets.
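
A simple transfer-time model (illustrative, not from the original post) shows why both numbers matter: latency dominates small message exchanges, while bandwidth dominates bulk data movement. The latency and link-speed figures below are assumed round numbers, not measurements.

```python
def transfer_time_us(size_bytes, latency_us, bandwidth_gbps):
    """Rough one-way transfer time: fixed latency plus serialization time."""
    return latency_us + (size_bytes * 8) / (bandwidth_gbps * 1e3)

# Assumed round numbers: ~1 us end-to-end across an RDMA fabric versus
# tens of microseconds through a traditional kernel TCP stack.
for size in (64, 4096, 1_000_000):
    rdma = transfer_time_us(size, latency_us=1, bandwidth_gbps=100)
    tcp = transfer_time_us(size, latency_us=50, bandwidth_gbps=100)
    print(f"{size:>9} B: RDMA ~{rdma:8.2f} us, kernel TCP ~{tcp:8.2f} us")
```

For the small messages that dominate synchronization in distributed training, the fixed per-message latency is nearly the whole cost, which is where RDMA's kernel bypass pays off.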

The InfiniBand Architecture Specification is an open standard developed in a vendor-neutral, community-centric manner. The IBTA has a long history of addressing HPC and enterprise application requirements for I/O performance and scalability – providing a reliable ecosystem for end users through promotion of open standards and roadmaps, compliant and interoperable products, as well as success stories and educational resources. Furthermore, many institutions advancing AI research and development leverage InfiniBand and RoCE, as they satisfy both performance needs and requirements for non-proprietary, open technologies.

One of the most critical elements when creating a cognitive computing application is deep learning, and training a data model to the highest degree of accuracy takes a considerable amount of time. While training could be done over a traditional network such as Ethernet, the time required to train the models makes it impractical. Today, all major frameworks (i.e., TensorFlow, Microsoft Cognitive Toolkit, Baidu’s PaddlePaddle and others), and even communication libraries such as NVIDIA’s NCCL, are natively enabled to take advantage of the low-level verbs implementation of the InfiniBand standard. This not only improves the overall accuracy in training, but also considerably reduces the amount of time needed to deploy the solution (as highlighted in a recent IBM PowerAI DDL demonstration).
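
As a rough sketch of what "natively enabled" looks like in practice, the snippet below assumes a multi-node TensorFlow 2.x cluster with NCCL installed and a Mellanox-style adapter named mlx5_0 (an assumption; check yours with `ibv_devices`). It selects NCCL for the collectives and points NCCL at the InfiniBand adapter; NCCL then uses its verbs transport on its own.

```python
import os

# NCCL environment hints. NCCL_IB_DISABLE=0 keeps the InfiniBand verbs
# transport enabled so collectives bypass the TCP sockets path; the HCA
# name "mlx5_0" is an assumption about the local hardware.
os.environ["NCCL_IB_DISABLE"] = "0"
os.environ["NCCL_IB_HCA"] = "mlx5_0"

import tensorflow as tf

# Ask the multi-worker strategy to use NCCL for its collective operations.
options = tf.distribute.experimental.CommunicationOptions(
    implementation=tf.distribute.experimental.CommunicationImplementation.NCCL)
strategy = tf.distribute.MultiWorkerMirroredStrategy(
    communication_options=options)

with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
    model.compile(optimizer="sgd", loss="mse")
# model.fit(...) would then shard data across workers and allreduce
# gradients over the RDMA fabric each step.
```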

The supercomputing industry has been aggressively marching towards Exascale, and RDMA is the core offload technology able to solve the scalability issues hindering the advancement of HPC. Since machine learning shares the same underlying hardware and interconnect needs as HPC, RDMA is unlocking the power of AI through the use of InfiniBand. As machine learning demands advance even further, InfiniBand will continue to lead and drive the industries that rely on it.

Be sure to check back in on the IBTA blog for future posts on RDMA’s role in AI and machine learning.

Scot Schultz, Sr. Director of HPC/AI & Technical Computing at Mellanox


InfiniBand Roadmap – Charting Speeds for Future Needs

December 14th, 2015

[Image: InfiniBand Roadmap]

Defining the InfiniBand ecosystem to accommodate future performance increases is similar to city planners preparing for urban growth. Both require a collaborative effort between experts and the community they serve.

The High Performance Computing (HPC) community continues to call for faster interconnects to transfer massive amounts of data between its servers and clusters. Today, the industry’s fastest supercomputers are delivering performance measured in petaflops, and experts expect that they will reach Exascale computing by 2025.

IBTA’s working groups are always looking ahead to meet the HPC community’s future performance demands. We are constantly updating the InfiniBand Roadmap, a visual representation of InfiniBand speed increases, to keep our work in line with expected industry trends and systems-level performance gains.

The roadmap itself is dotted with data rates, which are defined by transfer speeds and release dates. Each data rate has a designated moniker and is measured in three port widths: 1x, 4x and 12x. The number refers to the number of lanes per port, with each additional lane allowing for greater bandwidth.

Current defined InfiniBand Data Rates include the following:

Data Rate   4x Link Bandwidth   12x Link Bandwidth
SDR         8 Gb/s              24 Gb/s
DDR         16 Gb/s             48 Gb/s
QDR         32 Gb/s             96 Gb/s
FDR         56 Gb/s             168 Gb/s
EDR         100 Gb/s            300 Gb/s
HDR         200 Gb/s            600 Gb/s

The evolution of InfiniBand can be easily tracked by its data rates as demonstrated in the table above. A typical server or storage interconnect uses 4x links or 4 lanes per port. However, clusters and supercomputers can leverage 12x link bandwidth interconnects for even greater performance. Looking ahead, we expect to see a number of technical advances as the race to Exascale heats up.
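
The arithmetic behind the table is simply the per-lane data rate multiplied by the lane count. The short sketch below reproduces it, using the commonly cited effective per-lane data rates for each generation.

```python
# Commonly cited effective per-lane data rates (Gb/s) per generation.
PER_LANE_GBPS = {"SDR": 2, "DDR": 4, "QDR": 8, "FDR": 14, "EDR": 25, "HDR": 50}

def link_bandwidth(generation, lanes):
    """Aggregate link bandwidth in Gb/s for a port that is `lanes` wide."""
    return PER_LANE_GBPS[generation] * lanes

for gen in PER_LANE_GBPS:
    print(f"{gen}: 4x = {link_bandwidth(gen, 4)} Gb/s, "
          f"12x = {link_bandwidth(gen, 12)} Gb/s")
```

Running it reproduces the table above, including HDR's 600 Gb/s at 12x.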

As the roadmap demonstrates, planning for future data rates starts years in advance of their expected availability. In the latest edition, you will find two data rates scheduled beyond HDR: NDR and the newly christened XDR. Stay tuned as the IBTA specifies NDR and XDR’s release dates and bandwidths.

Bill Lee

IBTA Tests Compliance & Interoperability with Top Vendors at Plugfest #26

February 17th, 2015

In preparation for the IBTA’s upcoming Integrators’ List and April Plugfest, we wanted to give a quick recap of our last Plugfest, which included some great participants.

Every year, the IBTA hosts two Compliance and Interoperability Plugfests, one in April and one in October, at the University of New Hampshire (UNH) Interoperability Lab (IOL) in Durham, New Hampshire. The Plugfest’s purpose is to provide an opportunity for participants to measure their products for compliance with the InfiniBand Architecture Specification as well as interoperability with other InfiniBand products.

This past October, we hosted our 26th Plugfest in New Hampshire. A total of 16 cable vendors participated, while our device vendors included Intel, Mellanox and NetApp. Test equipment vendors included Anritsu, Keysight (formerly Agilent) and Tektronix. Overall, 136 cables and 13 devices were tested, and the data is broken out below:

[Chart: Plugfest #26 cable and device test results]

The Integrators’ List, a compilation of all the products tested and accepted to be compliant with the InfiniBand architecture specification, will go live in about a month, so stay tuned!

Plugfest #27 will take place from April 13 to April 24. The cable and device registration deadline is Wednesday, March 16, while the shipping deadline is Wednesday, April 1. Check out the IBTA website for additional information on the upcoming Plugfest.


Rupert Dance, IBTA CIWG


The IBTA Celebrates Its 15th Anniversary

December 15th, 2014

Since 1999, the IBTA has worked to further the InfiniBand specification in order to provide the IT industry with an advanced fabric architecture that transmits large amounts of data between data centers around the globe. This year, the IBTA is celebrating 15 years of growth and success.

In its mission to unite the IT industry, the IBTA has welcomed an array of distinguished members including Cray, Microsoft, Oracle and QLogic. The IBTA now boasts over 50 member companies all dedicated to furthering the InfiniBand specification.

The continued growth of the IBTA reflects the IT industry’s dedication to the advancement of InfiniBand. Many IBTA member companies are developing products incorporating InfiniBand technology, including FDR, which has proven to be the fastest growing generation of InfiniBand technology: FDR adoption grew 76 percent year over year, from 80 systems in 2013 to 141 systems in 2014. Most recently, the Top500 list showed that 225 of the world’s most powerful computers chose InfiniBand as their interconnect in 2014.

2014 also marked the release of RoCEv2. RoCEv2 is an extension of the original RoCE specification announced in 2010 that brought benefits of Remote Direct Memory Access (RDMA) I/O architecture to Ethernet-based networks. The updated specification addresses the needs of today’s evolving enterprise data centers by enabling routing across Layer 3 networks. By extending RoCE to allow Layer 3 routing, the specification can provide better traffic isolation and enables hyperscale data center deployments.
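
To illustrate why RoCEv2 routes where the original RoCE could not: RoCEv1 frames ride directly on Ethernet under a dedicated EtherType (0x8915), while RoCEv2 carries the same RDMA transport headers inside ordinary UDP/IP with destination port 4791, so any Layer 3 router can forward them. The sketch below builds both framings with Python's scapy; the addresses are placeholders, and the 12 zero bytes merely stand in for a real Base Transport Header.

```python
from scapy.all import Ether, IP, UDP, Raw

# RoCEv1: RDMA headers directly over Ethernet (EtherType 0x8915).
# There is no IP layer, so these frames cannot cross an IP router.
roce_v1 = Ether(type=0x8915) / Raw(b"\x00" * 12)  # placeholder for BTH

# RoCEv2: the same transport headers inside UDP/IP, destination port 4791.
# Because this is ordinary IP traffic, it routes across Layer 3 networks.
roce_v2 = (Ether()
           / IP(src="10.0.0.1", dst="10.1.0.2")   # placeholder addresses
           / UDP(dport=4791)
           / Raw(b"\x00" * 12))                   # placeholder for BTH

roce_v2.show()  # inspect the constructed packet layers
```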

Below is a timeline that further illustrates the IBTA’s advancements over the past 15 years that have helped to bring InfiniBand technology to the forefront of the interconnect industry.

[Timeline: 15 years of IBTA milestones, including releases of Volume 1 (General Specification) and Volume 2 (Physical Specification)]

Bill Lee

Visit the IBTA and OFA at SC13!

November 13th, 2013

Attending SC13? The IBTA will be teaming up once again with the OpenFabrics Alliance (OFA) to participate in a number of conference activities. The organizations will be exhibiting together at booth #4132 – stop by for access to:

  • Hands-on computing cluster demonstrations
  • IBTA cable compliance demonstration
  • IBTA & OFA member company exhibition map and SC13 news
  • Current and prospective member information
  • Information regarding OFA’s 2014 User Day and Developer Workshop

IBTA and OFA will also lead the discussion on the future of I/O architectures for improved application performance and efficiency during several technical sessions:

  • “RDMA: Scaling the I/O Architecture for Future Applications,” an IBTA-moderated session, will discuss what new approaches to I/O architecture could be used to meet Exascale requirements. The session will be moderated by IBTA’s Bill Boas and will feature a discussion between top users of RDMA. The panel session will take place on Wednesday, November 20 from 1:30 p.m. to 3:00 p.m. in room 301/302/303.

  • “Accelerating Improvements in HPC Application I/O Performance and Efficiency,” an OFA Emerging Technologies exhibit, will present attendees with ideas on how incorporating a new framework of I/O APIs may increase performance and efficiency for applications. This Emerging Technologies exhibit will take place at booth #3547. The OFA will also be giving a short talk on this subject in the Emerging Technologies theatre at booth #3947 on Tuesday, November 19 at 2:50 p.m.

  • OFA member company representatives will further develop ideas discussed in its Emerging Technologies exhibit during the Birds of a Feather (BoF) session entitled, “Discussing an I/O Framework to Accelerate Improvements in Application I/O Performance.” Moderators Paul Grun of Cray and Sean Hefty of Intel will lead the discussion on how developers and end-users can enhance and further encourage the growth of open source I/O software.

Bill Lee

Chair, Marketing Working Group (MWG)

InfiniBand Trade Association

Plugfest #23: InfiniBand Compliance & Interoperability Testing

March 28th, 2013

The IBTA is ramping up for our 23rd Plugfest (Plugfest 23), taking place April 1-12, 2013 at the University of New Hampshire’s Interoperability Laboratory (UNH-IOL). The bi-annual Plugfest event provides cable and device compliance testing with the current InfiniBand Architecture specifications as well as interoperability with other InfiniBand products.

The IBTA ElectroMechanical Working Group (EWG) has prepared a draft of the next InfiniBand Architecture Specification, Release 1.3.1 of Volume 2, which specifies the parameters for Enhanced Data Rate (EDR) InfiniBand products. The InfiniBand™ specification defines the interconnect technology for servers and storage that changes the way data centers are built, deployed and managed. Many vendors have submitted EDR cables for testing, and we expect Plugfest 23 to be the preview of the new EDR generation of products. The IBTA added FDR Receiver testing to this year’s Plugfest, which depends on Agilent, Anritsu and Tektronix test equipment and will lead the way for EDR device testing moving forward.

“The IBTA Plugfest has long been a critical event for InfiniBand vendors and users, and Plugfest 23 will be particularly important as the technology continues to advance,” said Dr. Alan Benner, system & network design engineer at IBM Corporation. “I’m looking forward to working with IBTA and our testing equipment partners to further the testing of InfiniBand-compliant products.”

Vendor devices and cables successfully passing all required Integrators’ List Compliance Tests and Interoperability procedures at Plugfest 23 will be listed on the IBTA Integrators’ List, updated twice per year, and will be granted the IBTA Integrators’ List Logo. The IBTA’s compliance and interoperability program provides resources for end-users who are building InfiniBand clusters, drives adoption of RDMA technology and provides assurance to participating vendors’ customers and end-users.

To learn more about Plugfest 23, or to find out more about IBTA membership benefits, please visit the IBTA website or contact ibta_plugfest@soft-forge.com.


The IBTA wishes to thank Agilent, Anritsu, Molex and Tektronix for providing test equipment for the IBTA Plugfest. Testing equipment is provided free of charge for the benefit of the InfiniBand community. IBTA Plugfests would not be possible without this testing equipment.


Rupert Dance, Software Forge


Observations from SC12

December 3rd, 2012

The week of Supercomputing went by quickly and resulted in many interesting discussions around supercomputing and its role in both HPC environments and enterprise data centers. Now that we’re back to work, we’d like to reflect on the successful event. The conference this year saw a diverse set of attendees from many countries, with major participation from top universities, which seemed to be on the leading edge of Remote Direct Memory Access (RDMA) and InfiniBand deployments.

Overall, we saw InfiniBand and Open Fabrics technologies continue their strong presence at the conference. InfiniBand dominated the Top500 list and is still the #1 interconnect of choice for the world’s fastest supercomputers. The Top500 list also demonstrated that InfiniBand is leading the way to efficient computing, which not only benefits high performance computing, but enterprise data center environments as well.

We also engaged in several discussions around RDMA. Attendees, analysts in particular, were interested in new products using RDMA over Converged Ethernet (RoCE) and their availability, and were impressed that Microsoft Server 2012 natively supports all three RDMA transports: InfiniBand, RoCE and iWARP. Another interesting development is InfiniBand customer Microsoft Windows Azure, whose increased efficiency placed it at #165 on the Top500 list.

[Photo: IBTA’s Electro-Mechanical Working Group Chair, Alan Benner, discussing the new InfiniBand specification with attendees at the IBTA & OFA SC12 booth]

IBTA’s release of the new InfiniBand Architecture Specification 1.3 generated a lot of buzz among attendees, press and analysts. IBTA’s Electro-Mechanical Working Group Chair, Alan Benner, was one of our experts at the booth and drew a large crowd of people interested in the InfiniBand roadmap and his projections around the availability of the next specification, which is expected to include EDR and become available in draft form in April 2013.

SC12 provided a great opportunity for those in high performance computing to connect in person and engage in discussions around hot industry topics; this year the focus was on Software Defined Networking (SDN), OpenSM, and the pioneering efforts of both the IBTA and OFA. We enjoyed conversations with the exhibitors and attendees who visited our booth, and a special thank you goes to all of the RDMA experts who participated in our booth session: Katharine Schmidtke, Finisar; Alan Benner, IBM; Todd Wilde, Mellanox; Rupert Dance, Software Forge; Bill Boas and Kevin Moran, System Fabric Works; and Josh Simons, VMware.

Rupert Dance, Software Forge

IBTA & OFA Join Forces at SC12

November 7th, 2012

Attending SC12? Check out OFA’s Exascale and Big Data I/O panel discussion and stop by the IBTA/OFA booth to meet our industry experts

The IBTA is gearing up for the annual SC12 conference taking place November 10-16 at the Salt Palace Convention Center in Salt Lake City, Utah. We will be joining forces with the OpenFabrics Alliance (OFA) on a number of conference activities and will be exhibiting together at SC12 booth #3630.

IBTA members will participate in the OFA-moderated panel, Exascale and Big Data I/O, which we highly recommend attending if you’re at the conference. The panel session, moderated by IBTA and OFA member Bill Boas, takes place Wednesday, November 14 at 1:30 p.m. Mountain Time and will discuss drivers for future I/O architectures.

Also be sure to stop by the IBTA and OFA booth #3630 to chat with industry experts regarding a wide range of industry topics, including:

  • Behind the IBTA Integrators’ List
  • High speed optical connectivity
  • Building and validating OFA software
  • Achieving low latency with RDMA in virtualized cloud environments
  • UNH-IOL hardware testing and interoperability capabilities
  • Utilizing high-speed interconnects for HPC
  • Release 1.3 of IBA Vol2
  • Peering into a live OFS cluster
  • RoCE in Wide Area Networks
  • OpenFabrics for high speed SAN and NAS

Experts including Katharine Schmidtke, Finisar; Alan Benner, IBM; Todd Wilde, Mellanox; Rupert Dance, Software Forge; Bill Boas and Kevin Moran, System Fabric Works; and Josh Simons, VMware will be in the booth to answer your questions and discuss topics currently affecting the HPC community.

Be sure to check the SC12 website to learn more about Supercomputing 2012, and stay tuned to the IBTA website and Twitter to follow IBTA’s plans and activities at SC12.

See you there!

New InfiniBand Architecture Specification Open for Comments

October 15th, 2012

After an extensive review process, Release 1.3 of Volume 2 of the InfiniBand Architecture Specification has been approved by our Electro-Mechanical Working Group (EWG). The specification is undergoing final review by the full InfiniBand Trade Association (IBTA) membership and will be available for vendors at Plugfest 22, taking place October 15-26, 2012 at the University of New Hampshire Interoperability Lab in Durham, New Hampshire.

All IBTA working groups and individual members have had several weeks to review and comment on the specification. We are encouraged by the feedback we’ve received and are looking forward to the official release at SC12, taking place November 10-16 in Salt Lake City, Utah.

Release 1.3 is a major overhaul of the InfiniBand Architecture Specification and features important new architectural elements:
• FDR and EDR signal specification methodologies
• Analog signal specifications for FDR, which have been verified through Plugfest compliance and interoperability measurements
• A more efficient 64b/66b encoding method (illustrated in the sketch after this list)
• Forward Error Correction coding
• Improved specification of QSFP-4x and CXP-12x connectors, ports and management interfaces
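
For a sense of why the 64b/66b line code matters, here is an illustrative calculation (not from the original post): the 8b/10b code used by earlier InfiniBand generations spends 2 of every 10 transmitted bits on encoding, while 64b/66b spends only 2 of every 66.

```python
# Line-coding efficiency: fraction of transmitted bits that carry data.
def efficiency(data_bits, coded_bits):
    return data_bits / coded_bits

codes = {"8b/10b (SDR/DDR/QDR)": (8, 10), "64b/66b (FDR and later)": (64, 66)}

for name, (data, coded) in codes.items():
    print(f"{name}: {efficiency(data, coded):.1%} efficient "
          f"({coded - data} overhead bits per {coded} transmitted)")

# e.g. a 10 Gbaud QDR lane delivers 10 * 0.8 = 8 Gb/s of data, while
# 64b/66b retains roughly 97% of the raw signalling rate.
```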

The new specification also includes significant copy editing and reorganization, introducing sub-volumes and improving overall readability. The previous release, 1.2.1, was published in November 2007. As Chair of the EWG, I’m pleased with the technical progress made on the InfiniBand Architecture Specification. More importantly, I’m excited about the impact that this new release will have for users and developers of InfiniBand technology.

Alan Benner
EWG Chair

InfiniBand as Data Center Communication Virtualization

August 1st, 2012

Last month, the Taneja Group released a report on InfiniBand’s role in the data center, “InfiniBand’s Data Center March,” confirming what members of the IBTA have known for a while: InfiniBand is expanding its role in the enterprise.

Mike Matchett, Sr. Analyst at Taneja Group, recently posted a blog on the industry report. In his post, Mike summarizes the growing InfiniBand market and the benefits he sees in adopting InfiniBand. Here is an excerpt from his blog:

Recently we posted a new market assessment of InfiniBand and its growing role in enterprise data centers, so I’ve been thinking a lot about low-latency switched fabrics and what they imply for IT organizations. I’d like to add a more philosophical thought about the optimized design of InfiniBand and its role as data center communication virtualization.

From the start, InfiniBand’s design goal was to provide a high performing messaging service for applications even if they existed in entirely separate address spaces across servers. By architecting from the “top down” rather than layering up from something like Ethernet’s byte stream transport, InfiniBand is able to deliver highly efficient and effective messaging between applications. In fact, the resulting messaging service can be thought of as “virtual channel IO” (I’m sure much to the delight of mainframers).

To read the full blog post, check out the Taneja blog. And be sure to read the full analyst report from Taneja Group, published in July.

Bill Lee
IBTA Marketing Working Group Chair
