AMD, Broadcom, Cisco, Google, Hewlett Packard Enterprise (HPE), Intel, Meta and Microsoft agreed to develop an open industry standard for interconnecting AI accelerators.
The Ultra Accelerator Link (UALink) group will develop a specification to define a high-speed, low-latency interconnect for scale-up communications between accelerators and switches in AI computing pods.
The 1.0 specification will enable the connection of up to 1,024 accelerators within an AI computing pod and allow for direct loads and stores between the memory attached to accelerators, such as GPUs, in the pod. The UALink Promoter Group has formed the UALink Consortium, which it expects to be incorporated in Q3 of 2024. The 1.0 specification is expected to be available in the same time frame to companies that join the UALink Consortium.
Ultra Accelerator Link Overview
- The group plans to launch the organization and release the 1.0 spec for members in 3Q24
- An update to increase the bandwidth will be released in 4Q24
- The interconnect is for GPU-to-GPU communication: direct load, store, and atomic operations between AI accelerators (i.e., GPUs)
- Low-latency, high-bandwidth fabric for hundreds of accelerators
- The initial UALink spec taps into the experience of the Promoters developing and deploying a broad range of accelerators and leverages the proven Infinity Fabric protocol
“The work being done by the companies in UALink to create an open, high performance and scalable accelerator fabric is critical for the future of AI. Together, we bring extensive experience in creating large scale AI and high-performance computing solutions that are based on open-standards, efficiency and robust ecosystem support. AMD is committed to contributing our expertise, technologies and capabilities to the group as well as other open industry efforts to advance all aspects of AI technology and solidify an open AI ecosystem.” – Forrest Norrod, executive vice president and general manager, Data Center Solutions Group, AMD
“Broadcom is proud to be one of the founding members of the UALink Consortium, building upon our long-term commitment to increase large-scale AI technology implementation into data centers. It is critical to support an open ecosystem collaboration to enable scale-up networks with a variety of high-speed and low-latency solutions.” – Jas Tremblay, vice president and general manager of the Data Center Solutions Group, Broadcom
“Ultra-high performance interconnects are becoming increasingly important as AI workloads continue to grow in size and scope. Together, we are committed to developing the UALink which will be a scalable and open solution available to help overcome some of the challenges with building AI supercomputers.” – Martin Lund, Executive Vice President, Common Hardware Group, Cisco
“Open standards are important to HPE as we innovate in supercomputing and increase access to systems. As a founding member of the UALink industry consortium, we look forward to contributing our expertise in high performance networking and systems, and collaborating to develop a new open standard for accelerator interconnects for the next generation of supercomputing.” – Trish Damkroger, senior vice president and general manager, HPC & AI Infrastructure Solutions, HPE
“UALink is an important milestone for the advancement of Artificial Intelligence computing. Intel is proud to co-lead this new technology and bring our expertise in creating an open, dynamic AI ecosystem. As a founding member of this new consortium, we look forward to a new wave of industry innovation and customer value delivered through the UALink standard. This initiative extends Intel’s commitment to AI connectivity innovation that includes leadership roles in the Ultra Ethernet Consortium and other standards bodies.” – Sachin Katti, SVP & GM, Network and Edge Group, Intel Corporation
“In a very short period of time, the technology industry has embraced challenges that AI and HPC have uncovered. Interconnecting accelerators like GPUs requires a holistic perspective when seeking to improve efficiencies and performance. At UEC, we believe that UALink’s scale-up approach to solving pod cluster issues complements our own scale-out protocol, and we are looking forward to collaborating together on creating an open, ecosystem-friendly, industry-wide solution that addresses both kinds of needs in the future.” – J Metz, Ph.D., Chair, Ultra Ethernet Consortium
About the Infinity Fabric Protocol
The Infinity Fabric Protocol is a high-speed interconnect technology developed by AMD (Advanced Micro Devices). It is designed to improve communication between different components within a computer system, such as CPUs, GPUs, memory, and other peripherals. Here are some key aspects of the Infinity Fabric Protocol:
1. Scalability: Infinity Fabric is highly scalable, allowing it to connect multiple processors and devices within a system. This scalability is essential for creating high-performance computing environments, such as those used in data centers and supercomputers.
2. Low Latency and High Bandwidth: The protocol is designed to provide low latency and high bandwidth communication, which is crucial for applications that require rapid data exchange between components.
3. Coherent and Non-Coherent Traffic: Infinity Fabric supports both coherent and non-coherent traffic. Coherent traffic ensures that multiple processors can share and access the same memory space without data inconsistencies, which is important for multi-threaded applications. Non-coherent traffic, by contrast, bypasses the coherence machinery and is used for transfers that do not need a hardware-maintained shared view of memory, such as bulk data movement between devices.
4. Power Efficiency: AMD has designed Infinity Fabric to be power-efficient, making it suitable for use in both high-performance and power-sensitive applications.
5. Versatility: Infinity Fabric can be used in various types of computing devices, including desktop computers, servers, and integrated systems. It provides a unified communication framework that simplifies the design and integration of different components.
Infinity Fabric plays a crucial role in AMD’s Ryzen and EPYC processors, enabling efficient and high-performance communication between cores, memory, and other system components.