
Archive for the ‘InfiniBand’ Category

InfiniBand Experts Discuss Latest Trends and Opportunities at OFA Workshop 2016

May 24th, 2016


Each year, OpenFabrics Software (OFS) users and developers gather at the OpenFabrics Alliance (OFA) Workshop to discuss and tackle the most recent challenges facing the high performance storage and networking industry. OFS is open-source software that enables maximum application efficiency and performance, agnostic of the underlying RDMA fabric, including InfiniBand and RDMA over Converged Ethernet (RoCE). The work of the OFA supports mission critical applications in High Performance Computing (HPC) and enterprise data centers, and is also quickly becoming significant in the cloud and hyper-converged markets.

In our previous blog post, we showcased an IBTA-sponsored session that provided an update on InfiniBand virtualization support. Beyond that update, several other sessions highlighted the latest InfiniBand developments, case studies and tutorials. Below is a collection of notable InfiniBand-focused sessions that we recommend you check out:

InfiniBand as Core Network in an Exchange Application
Ralph Barth, Deutsche Börse AG; Joachim Stenzel, Deutsche Börse AG

Group Deutsche Boerse is a global financial services organization covering the entire value chain from trading, market data, clearing and settlement to custody. While reliability has been a fundamental requirement for exchanges since the introduction of electronic trading systems in the 1990s, low and predictable latency of the entire system has also become a major design objective over roughly the last decade. Both were important architectural considerations when Deutsche Boerse began developing T7, an entirely new derivatives trading system, for its US options market (ISE) in 2008. The combination of InfiniBand with IBM® WebSphere® MQ Low Latency Messaging (WLLM) as the messaging solution was determined to be the best fit at the time. Since then, the same system has been adopted for EUREX, one of the largest derivatives exchanges in the world, and is now being extended to cover cash markets as well. The session presents the design of the application and its interdependence with the combination of InfiniBand and WLLM, and reflects on practical experiences with InfiniBand over the last couple of years.

Download: Slides / Video


Experiences in Writing OFED Software for a New InfiniBand HCA
Knut Omang, Oracle

This talk presents the experiences, challenges and opportunities encountered as lead developer initiating and developing OFED stack support (kernel and user space drivers) for Oracle's InfiniBand HCA, which is integrated into the new SPARC Sonoma SoC CPU. In addition to the physical HCA function, SR-IOV is supported, with vHCAs visible to the interconnect as if connected to virtual switches. Individual driver instances for the vHCAs maintain page tables set up for the HCA's MMU, covering the memory accessible from the HCA. The HCA is designed to scale to a large number of QPs. For minimal overhead and maximal flexibility, administrative operations such as memory invalidations also use an asynchronous work request model similar to that of normal InfiniBand traffic.

Download: Slides / Video

Fabrics and Topologies for Directly Attached Parallel File Systems and Storage Networks
Susan Coulter, Los Alamos National Laboratory

InfiniBand fabrics supporting directly attached storage systems are designed to handle unique traffic patterns, and they contain different stress points than other fabrics. These SAN fabrics are often expected to be extensible in order to allow for expansion of existing file systems and addition of new file systems. The character and lifetime of these fabrics is distinct from those of internal compute fabrics, or multi-purpose fabrics. This presentation covers the approach to InfiniBand SAN design and deployment as experienced by the High Performance Computing effort at Los Alamos National Laboratory.

Download: Slides / Video


InfiniBand Topologies and Routing in the Real World
Susan Coulter, Los Alamos National Laboratory; Jesse Martinez, Los Alamos National Laboratory

As with all sophisticated and multifaceted technologies, designing, deploying and maintaining high-speed networks and topologies in a production environment and/or at larger scales can be unwieldy, and the resulting fabrics can be surprising in their behavior. This presentation illustrates that fact via a case study of an actual fabric deployed at Los Alamos National Laboratory.

Download: Slides / Video


InfiniBand Routers Premier
Mark Bloch, Mellanox Technologies; Liran Liss, Mellanox Technologies

InfiniBand has come a long way in providing efficient large-scale high performance connectivity. InfiniBand subnets have been shown to scale to tens of thousands of nodes, both in raw capacity and in management. As demand for computing capacity increases, future cluster sizes might exceed the number of addressable endpoints in a single IB subnet (around 40K nodes). To accommodate such clusters, a routing layer with the same latency and bandwidth characteristics as switches is required.
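The endpoint ceiling mentioned in the abstract stems from InfiniBand's 16-bit Local Identifier (LID) addressing within a subnet. The sketch below shows the arithmetic; the unicast/multicast ranges follow the InfiniBand Architecture Specification, while the ~40K practical figure quoted in the session is lower than the raw count once reserved LIDs and multi-path (LMC) usage are accounted for:

```python
# Sketch of the InfiniBand subnet address-space arithmetic.
# LIDs are 16-bit values; 0x0001..0xBFFF is the unicast range,
# 0xC000..0xFFFE is multicast, and 0xFFFF is the permissive LID.
UNICAST_MIN, UNICAST_MAX = 0x0001, 0xBFFF

def unicast_lid_count(lmc: int = 0) -> int:
    """Addressable endpoints when each port consumes 2**lmc
    consecutive LIDs (LMC = LID Mask Control, used for multipathing)."""
    total = UNICAST_MAX - UNICAST_MIN + 1
    return total // (2 ** lmc)

print(unicast_lid_count())   # 49151 raw unicast LIDs
print(unicast_lid_count(1))  # 24575 endpoints when each port takes 2 LIDs
```

Routers sidestep this ceiling by forwarding between subnets based on the Global Identifier (GID) in the GRH, so each subnet keeps its own LID space.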

In addition, as data center deployments evolve, it becomes beneficial to consolidate resources across multiple clusters. For example, several compute clusters might require access to a common storage infrastructure. Routers can enable such connectivity while reducing management complexity and isolating intra-subnet faults. The bandwidth capacity to storage may be provisioned as needed.

This session reviews InfiniBand routing operation and how it can be used in the future. Specifically, we will cover topology considerations, subnet management issues, name resolution and addressing, and potential implications for the host software stack and applications.

Download: Slides

Bill Lee


InfiniBand-based Supercomputer to Power New Discoveries at the Library of Alexandria

May 17th, 2016
“Bibliotheca Alexandrina” by Ting Chen is licensed under CC BY 2.0


Thriving civilizations, both past and present, tend to have one important characteristic in common – a vast, dynamic knowledge base. Whether it be the latest advancements in agriculture, civil engineering or battlefield tactics, technological innovation frequently determined the level and reach of a nation’s influence. One of the most prominent examples of knowledge driven supremacy stems from ancient Egypt’s Library of Alexandria.

Built in the third century BCE, the Library of Alexandria was considered the greatest collection of scholarly works and papers of its era. In addition to gifts from distinguished intellectuals and monarchs, the library built up its massive archive by copying documents and scrolls brought into Alexandria via merchants and traders. Its destruction a few centuries later is thought to be one of the most significant losses of cultural knowledge in world history.

In 2002, Bibliotheca Alexandrina was constructed to commemorate the library’s remarkable history and lasting notoriety. With a mission “to recapture the spirit of openness and scholarship of the original,” the modern library acts as a global center of knowledge and learning. It contains over a million books in six separate libraries and boasts four museums and 13 academic research centers. Furthermore, Bibliotheca Alexandrina acts as a mirrored backup for the Internet Archive, a non-profit digital library that offers free access to millions of books, media and software around the world.

In addition to preserving existing knowledge, the library pursues new insights and understanding as well. Bibliotheca Alexandrina is currently building a new supercomputer with that exact goal in mind. Supercomputing is considered by many to be the standard-bearer of knowledge creation, with many countries committing significant resources to build the world’s most powerful systems (see our blog “Race to Exascale – Nations Vie to Build Fastest Supercomputer”). Supercomputing allows companies, researchers and institutions to process massive data sets and produce useful results rapidly.

According to a recent announcement from Huawei, the new supercomputer will feature high-density FusionServer X6800 servers powered by high-performance InfiniBand interconnects. The system will be capable of a theoretical peak speed of 118 TFLOPS and a storage capacity of 288 TB. Its design enables an expansion of up to 4.5 PB, ensuring future storage scalability. Once completed, the supercomputer will support a variety of research fields, including bioinformatics, data mining, physics, weather forecasting, resource exploration/extraction and cloud computing.

We look forward to seeing what type of breakthroughs originate from Bibliotheca Alexandrina’s new InfiniBand-based supercomputer. It may even be a discovery that will have the same lasting effect as the original Library of Alexandria, which people will talk about thousands of years from now.

Bill Lee

OpenFabrics Software Users and Developers Receive InfiniBand Virtualization Update at the 2016 OFA Workshop

April 26th, 2016


The InfiniBand architecture is a proven network interconnect standard that provides benefits for bandwidth, efficiency and latency, while also boasting an extensive roadmap of future performance increases. Initially adopted by the High Performance Computing industry, a growing number of enterprise data centers are demanding the performance capabilities that InfiniBand has to offer. InfiniBand data center use cases vary widely, ranging from physical network foundations transporting compute and storage traffic to enabling Platform-as-a-Service (PaaS) in cloud service providers.

Today’s enterprise data center and cloud environments are also seeing an increased use of virtualized workloads. Using virtualized servers allows data center managers to create a common shared pool of resources from a single host. Virtualization support in the Channel Adapter enables different software entities to interact independently with the fabric. This effectively creates an efficient service-centric computing model capable of dynamic resource utilization and scalable performance, while reducing overhead costs.

Earlier this month at the OpenFabrics Alliance (OFA) Workshop 2016 in Monterey, CA, Liran Liss of member company Mellanox Technologies provided an update on the IBTA’s ongoing work to standardize InfiniBand virtualization support. He explained that the IBTA Management Working Group’s goals include making the InfiniBand Virtualization Annex scalable, explicit, backward compatible and, above all, simple in both implementation and management. Liss specifically covered the concepts of InfiniBand Virtualization, and its manifestation in the host software stack, subnet management and monitoring tools.

The IBTA effort to support virtualization is nearing completion as the annex enters its final review period from other working groups. If you were unable to attend the OFA Workshop 2016 and would like to learn more about InfiniBand virtualization, download the official slides or watch a video of the presentation via insideHPC.

Bill Lee

Plugfest 28 Results Highlight Expanding InfiniBand EDR 100 Gb/s & RoCE Ecosystems

March 21st, 2016

We are excited to announce the availability of our latest InfiniBand Integrators’ List and RoCE Interoperability List. The two lists make up the backbone of our Integrators’ Program and are designed to support data center managers, CIOs and other IT decision makers with their planned InfiniBand and RoCE deployments for enterprise and high performance computing systems. To keep data up to date and as useful as possible, both documents are refreshed twice a year following our bi-annual plugfests, which are held at the University of New Hampshire InterOperability Lab (UNH-IOL).

Having recently finalized the results from Plugfest 28, we can report a significant increase in InfiniBand EDR 100 Gb/s submissions compared to the last Integrators’ List. This trend demonstrates continued industry demand for InfiniBand-based systems capable of higher bandwidth and faster performance. The updated list features a variety of InfiniBand devices, including Host Channel Adapters (HCAs), Switches, SCSI RDMA Protocol (SRP) targets and cables (QDR, FDR and EDR).

Additionally, we held our second RoCE interoperability event at Plugfest 28, testing 10, 25 and 40 GbE RNICs, Switches and SFP+, SFP28, QSFP and QSFP28 cables. Although a full spec compliance program for RoCE is still under development, the existing interoperability testing offers solid insight into the ecosystem’s robustness and viability. We plan to continue building a comprehensive RoCE compliance program at Plugfest 29, where RoCE testing will cover more than 16 different 10, 25, 40, 50 and 100 GbE RNICs and Switches, along with the various cables that support these devices. This will be the most comprehensive interoperability testing yet of RoCE products, which use Ethernet physical and link layers.

As always, we’d like to thank the leading vendors that contributed test equipment to IBTA Plugfest 28. These invaluable members include Anritsu, Keysight Technologies, Matlab, Molex, Tektronix, Total Phase and Wilder Technologies.

The next opportunity for members to test InfiniBand and RoCE products is Plugfest 29, scheduled for April 4-15, 2016 at UNH-IOL. Event details and registration information are available here.

Rupert Dance, IBTA CIWG


IBTA Wants You – Guide the Future of InfiniBand, RoCE and Performance-Driven Data Centers

January 25th, 2016

For any organization, the New Year provides a great opportunity to reflect on the past and set a healthier course for the future. Companies can take a variety of internal actions to prepare for impending market changes, but rarely do they have the power to influence the course of an entire industry on their own. For those devoted to improving clustered server and data center performance, joining an industry alliance such as the InfiniBand Trade Association (IBTA) offers a chance to contribute to the foundational work that sets the path for technological advances one, five and ten years into the future.

The IBTA is the organization that maintains and furthers the InfiniBand specification, which is used by cloud service providers and high-performing enterprise data centers and is the interconnect of choice for the world’s fastest supercomputers. Additionally, the IBTA defines the specification for RDMA over Converged Ethernet (RoCE), which leverages the advantages of RDMA technology for Ethernet-based environments.

Leading enterprise IT vendors and HPC research facilities make up the coalition of more than 50 members that all have a shared interest in the advancement of InfiniBand and/or RoCE technology. Each member company contributes specialized expertise to IBTA’s various technical working groups, which shape and guide the progression of InfiniBand and RoCE capabilities.

Joining the IBTA comes with a variety of membership benefits, including:

Access to:

  • The InfiniBand and RoCE architecture specifications as they are being developed
  • Meeting minutes and notices of proposed and actual changes to IBTA-controlled documents
  • IBTA-sponsored Compliance and Interoperability Plugfests and workshops

Participation in:

  • The maintenance of the InfiniBand Roadmap, which defines future speeds and lane widths for InfiniBand-based technologies
  • IBTA-sponsored activities at tradeshows, including the annual Supercomputing Conference in November
  • IBTA speaking and demo opportunities

Opportunity to:

  • Influence and contribute to the ongoing development and maintenance of the InfiniBand and RoCE architecture specifications
  • Add approved products to the IBTA Integrators’ List, which provides a centralized listing of products that have passed a suite of compliance and interoperability testing
  • Post InfiniBand and RoCE related whitepapers, webinars, podcasts and press releases on the IBTA and RoCE Initiative web sites
  • Submit and obtain access to information regarding licensing policies posted by member patent holders on specific InfiniBand and RoCE architecture specifications
  • Network with the world’s foremost developers of InfiniBand and RoCE hardware and software

Make 2016 the year your company defines the future of the HPC industry! Visit our Membership page to learn how to join or contact membership@infinibandta.org for more information.

Bill Lee

InfiniBand Roadmap – Charting Speeds for Future Needs

December 14th, 2015

ib-roadmap

Defining the InfiniBand ecosystem to accommodate future performance increases is similar to city planners preparing for urban growth. Both require a collaborative effort between experts and the community they serve.

The High Performance Computing (HPC) community continues to call for faster interconnects to transfer massive amounts of data between its servers and clusters. Today, the industry’s fastest supercomputers are processing data in petaflops and experts expect that they will reach Exascale computing by 2025.

IBTA’s working groups are always looking ahead to meet the HPC community’s future performance demands. We are constantly updating the InfiniBand Roadmap, a visual representation of InfiniBand speed increases, to keep our work in line with expected industry trends and systems-level performance gains.

The roadmap itself is dotted with data rates, which are defined by transfer speeds and release dates. Each data rate has a designated moniker and is measured in three ways: 1x, 4x and 12x. The number refers to the number of lanes per port, with each additional lane allowing for greater bandwidth.

Current defined InfiniBand Data Rates include the following:

Data Rate    4x Link Bandwidth    12x Link Bandwidth
SDR          8 Gb/s               24 Gb/s
DDR          16 Gb/s              48 Gb/s
QDR          32 Gb/s              96 Gb/s
FDR          56 Gb/s              168 Gb/s
EDR          100 Gb/s             300 Gb/s
HDR          200 Gb/s             600 Gb/s

The evolution of InfiniBand can be easily tracked by its data rates as demonstrated in the table above. A typical server or storage interconnect uses 4x links or 4 lanes per port. However, clusters and supercomputers can leverage 12x link bandwidth interconnects for even greater performance. Looking ahead, we expect to see a number of technical advances as the race to Exascale heats up.
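The link bandwidths in the table are simply the nominal per-lane data rate multiplied by the lane count, which a short sketch makes explicit (the per-lane figures below are the commonly quoted nominal rates for each generation):

```python
# Nominal per-lane data rates in Gb/s for each InfiniBand generation.
PER_LANE_GBPS = {"SDR": 2, "DDR": 4, "QDR": 8, "FDR": 14, "EDR": 25, "HDR": 50}

def link_bandwidth(rate: str, lanes: int) -> int:
    """Aggregate link bandwidth in Gb/s for a data rate and lane count."""
    return PER_LANE_GBPS[rate] * lanes

for rate in PER_LANE_GBPS:
    print(rate, link_bandwidth(rate, 4), link_bandwidth(rate, 12))
# e.g. EDR yields 100 Gb/s over a 4x link and 300 Gb/s over a 12x link
```

This is why a typical 4x EDR server link is marketed as 100 Gb/s while a 12x switch-to-switch EDR link reaches 300 Gb/s.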

As the roadmap demonstrates, planning for future data rates starts years in advance of their expected availability. In the latest edition, you will find two data rates scheduled beyond HDR: NDR and the newly christened XDR. Stay tuned as the IBTA specifies NDR and XDR’s release dates and bandwidths.

Bill Lee

Changes to the Modern Data Center – Recap from SDC 15

October 19th, 2015

The InfiniBand Trade Association recently had the opportunity to speak on RDMA technology at the 2015 Storage Developer Conference. For the first time, SDC15 introduced Pre-conference Primer Sessions covering topics such as Persistent Memory, Cloud and Interop, and Data Center Infrastructure. Intel’s David Cohen, System Architect, and Brian Hausauer, Hardware Architect, spoke on behalf of the IBTA in a pre-conference session, discussing “Nonvolatile Memory (NVM), four trends in the modern data center and implications for the design of next generation distributed storage systems.”

Below is a high level overview of their presentation:

The modern data center continues to transform as applications and uses change and develop. Most recently, we have seen users abandon traditional storage architectures for the cloud. Cloud storage is founded on data-center-wide connectivity and scale-out storage, which delivers significant increases in capacity and performance, enabling application deployment anytime, anywhere. Additionally, job scheduling and system balance capabilities are boosting overall efficiency and optimizing a variety of essential data center functions.

Trends in the modern data center are appearing as cloud architecture takes hold. First, the performance of network bandwidth and storage media is growing rapidly. Furthermore, operating system vendors (OSV) are optimizing the code path of their network and storage stacks. All of these speed and efficiency gains to network bandwidth and storage are occurring while single processor/core performance remains relatively flat.

Data comes in a variety of flavors: some of it is accessed frequently by application I/O requests, while other data is rarely retrieved. To enable higher performance and resource efficiency, cloud storage uses a tiering model that places data according to how often it is accessed. Data that is regularly accessed is stored on expensive, high performance media (solid-state drives). Data that is rarely or never retrieved is relegated to less expensive media with the lowest $/GB (rotational drives). This model follows a hot, warm and cold data pattern and gives you faster access to what you use the most.
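The hot/warm/cold placement described above can be sketched as a simple policy function. The thresholds, tier names and media labels here are illustrative assumptions for the example; real systems derive them from access telemetry and media cost models:

```python
# Toy hot/warm/cold tiering: place data on media by observed access frequency.
# Thresholds and tier labels are invented for illustration, not a real policy.
def choose_tier(accesses_per_day: float) -> str:
    if accesses_per_day >= 100:
        return "hot (SSD)"       # frequently read: low-latency, high $/GB media
    if accesses_per_day >= 1:
        return "warm (HDD)"      # occasionally read: rotational drives
    return "cold (archive)"      # rarely read: lowest $/GB media

print(choose_tier(500))   # hot (SSD)
print(choose_tier(10))    # warm (HDD)
print(choose_tier(0.01))  # cold (archive)
```

In practice the policy also weighs object size, age and migration cost, but the core idea is the same: match each object's access pattern to the cheapest media that still meets its latency needs.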

The growth of high performance storage media is driving the need for innovation in the network, primarily addressing application latency. This is where Remote Direct Memory Access (RDMA) comes into play. RDMA is an advanced, reliable transport protocol that enhances the efficiency of workload processing. Essentially, it increases data center application performance by offloading the movement of data from the CPU. This lowers overhead and allows the CPU to focus its processing power on running applications, which in turn reduces latency.

Demand for cloud storage is increasing and the need for RDMA and high performance storage networking grows as well. With this in mind, the InfiniBand Trade Association is continuing its work developing the RDMA architecture for InfiniBand and Ethernet (via RDMA over Converged Ethernet or RoCE) topologies.

Bill Lee

Race to Exascale – Nations Vie to Build Fastest Supercomputer

September 28th, 2015

“Discover Supercomputer 3” by NASA Goddard Space Flight Center is licensed under CC BY 2.0


The race between countries to build the fastest, biggest or first of anything is nothing new – think of the race to the moon. One of the current global competitions is focused on supercomputing, specifically the race to Exascale computing, or a billion billion calculations per second. Recently, governments have been allocating significant resources toward Exascale initiatives (see President Obama’s Executive Order and China’s current lead in supercomputing) as they come to understand its vast potential for a variety of industries, including healthcare, defense and space exploration.

The TOP500 list ranking the top supercomputers in the world will continue to be the scorecard. Currently, the U.S. leads with 233 of the top 500 supercomputers, Europe has 141 and China has 37. However, China’s small portfolio of supercomputers does not mean it is not a significant competitor in the supercomputing space: China has held the #1 spot on the TOP500 list for the fifth consecutive time.

When looking to build the supercomputers of the future, there are a number of factors that need to be taken into consideration, including superior application performance, compute scalability and resource efficiency. InfiniBand’s compute offloads and scalability make it extremely attractive to supercomputer architects. Proof of this performance and scalability can be found in places such as the HPC Advisory Council’s library of case studies. InfiniBand makes it possible to achieve near-linear performance improvement as more computers are connected to the array. Since observers of this space expect Exascale systems to require a massive amount of compute hardware, InfiniBand’s scalability looks to be a requirement for achieving this goal.

As the race to supercomputing speeds up, we expect to see a number of exciting advances in technology as we shift from petaflops to exaflops. To give you an idea of how far we have come and where we are heading, here is a comparison of the speed of the computers that powered the race to space with the goals for Exascale.

Speeds Then vs. Now – Race to Space vs. Race to Supercomputing

  • Computers in the 1960s (speed of the race to space): Hectoscale (hundreds of FLOPS)
  • Goal for computers in 2025 (speed of the race to supercomputing): Exascale (quintillions of FLOPS)

Advances in supercomputing will continue to dominate the news with these two nations making the development of the fastest supercomputer a priority. As November approaches and the new TOP500 list is released, it will be very interesting to see where the rankings lie and what interconnects the respective architects will pick.


Bill Lee

EDR Hits Primetime! Newly Published IBTA Integrators’ List Highlights Growth of EDR

August 20th, 2015

The highly anticipated IBTA April 2015 Combined Cable and Device Integrators’ List is now available for download. The list highlights the results of the IBTA Plugfest 27 held at the University of New Hampshire’s Interoperability Lab earlier this year. The updated list consists of newly verified products that are compliant to the InfiniBand specification as well as details on solution interoperability.

Of particular note was the rise of EDR submissions. At IBTA Plugfest 27, eight companies provided 32 EDR cables for testing, up from three companies and 12 EDR cables at IBTA Plugfest 26. The increase in EDR cable solutions indicates that the technology is beginning to hit its stride. At Plugfest 28 we anticipate even more EDR solutions.

The IBTA is known in the industry for its rigorous testing procedures and subsequent Integrators’ List. The Integrators’ List provides IT professionals with peace of mind when purchasing new components to incorporate into new and existing infrastructure. To ensure the most reliable results, the IBTA uses industry-leading test equipment from Anritsu, Keysight, Molex, Tektronix, Total Phase and Wilder Technologies. We appreciate their commitment to our compliance program; we couldn’t do it without them.

The IBTA hosts its Plugfest twice a year to give members a chance to test new configurations or form factors. Although many technical associations require substantial attendance fees for testing events, the IBTA covers the bulk of Plugfest costs through membership dues.

The companies participating in Plugfest 27 included 3M Company, Advanced Photonics, Inc., Amphenol, FCI, Finisar, Fujikura, Ltd., Fujitsu Component Limited, Lorom America, Luxshare-ICT, Mellanox Technologies, Molex Incorporated, SAE Magnetics, Samtec, Shanghai Net Miles Fiber Technology Co. Ltd, Siemon, Sumitomo, and Volex.

We’ve already begun planning for IBTA Plugfest 28, which will be held October 12-23, 2015. For questions about Plugfest, contact ibta_plugfest@soft-forge.com or visit the Plugfest page for additional information.

Rupert Dance, IBTA CIWG


InfiniBand leads the TOP500 powering more than 50 percent of the world’s supercomputing systems

August 4th, 2015

TOP500 Interconnect Trends


TOP500.org released its list of the 500 most powerful commercially available computer systems in the world, reporting that InfiniBand powers 257 systems, or 51.4 percent of the list. This marks 15.8 percent year-over-year growth from June 2014.

Demand for higher bandwidth, lower latency and higher message rates, along with the need for application acceleration, is driving continued adoption of InfiniBand in traditional High Performance Computing (HPC) as well as commercial HPC, cloud and enterprise data centers. InfiniBand is the only open-standard I/O that provides the capability required to handle supercomputing’s high demand for CPU cycles without time wasted on I/O transactions.

  • InfiniBand powers the most efficient system on the list, with 98.8% efficiency.
  • EDR (Enhanced Data Rate) InfiniBand delivers 100 Gb/s and enters the TOP500 for the first time, powering three systems.
  • FDR (Fourteen Data Rate) InfiniBand at 56 Gb/s continues to be the most used technology on the TOP500, connecting 156 systems.
  • InfiniBand connects the most powerful clusters, 33 of the Petascale-performance systems, up from 24 in June 2014.
  • InfiniBand is the leading interconnect for accelerator-based systems, covering 77% of the list.

Not only is InfiniBand the most used interconnect solution in the world’s 500 most powerful supercomputers, it is also the leader in the TOP100, which encompasses the top 100 supercomputing systems as ranked in the TOP500. InfiniBand is the natural choice for world-leading supercomputers because of its performance, efficiency and scalability.

The full TOP500 list is available at www.top500.org.

Bill Lee
