Converge Digest


OFC 2025 Panel: Million-GPU Clusters Push Networks to the Breaking Point

April 1, 2025
in Optical

San Francisco – April 1, 2025 – As AI clusters balloon toward the million-GPU mark, the optical and electrical interconnect industry faces immense pressure to keep up with surging bandwidth, power, and reliability demands. At a standing-room-only OFC 2025 Executive Forum panel titled “AI’s Optical Bottleneck: Scaling Networks for the Next Generation of AI Workloads,” industry leaders from NVIDIA, Credo, Marvell, TE Connectivity, and the Optical Internetworking Forum (OIF) broke down the challenges—and innovations—shaping tomorrow’s AI infrastructure.

Moderated by Alan Weckel of the 650 Group, the conversation centered on network architectures, co-packaged optics (CPO), pluggables, copper’s surprising resurgence, and the need for tighter industry collaboration. While optics continues to be essential for scale, new thermal and spatial dynamics are bringing copper back into play for short-reach links.


🔍 Key Takeaways from the Panel Discussion

🔗 AI Is Reshaping Data Center Network Topology

• AI clusters require four distinct network planes:

  • Scale-up (intra-rack) – copper-dense connections within nodes.

  • Scale-out (east-west) – high-bandwidth inter-node fabrics.

  • Front-end (north-south) – user and storage traffic.

  • DCI (inter-site) – multi-building and multi-region links.
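The four planes above amount to a small lookup table. A minimal illustrative sketch follows; the direction and medium strings simply restate the bullet points, and are not panel-supplied specifications:

```python
# Minimal sketch of the four AI network planes described above.
# The direction/medium values restate the article's bullet points;
# nothing here is a panel-supplied specification.
from dataclasses import dataclass

@dataclass(frozen=True)
class NetworkPlane:
    name: str
    direction: str       # traffic orientation
    typical_medium: str  # dominant interconnect type (assumed)

PLANES = {
    "scale-up":  NetworkPlane("scale-up",  "intra-rack",  "copper"),
    "scale-out": NetworkPlane("scale-out", "east-west",   "optics, with short copper links"),
    "front-end": NetworkPlane("front-end", "north-south", "optics"),
    "dci":       NetworkPlane("dci",       "inter-site",  "coherent/ZR optics"),
}

def medium_for(plane: str) -> str:
    """Look up the dominant interconnect medium for a plane."""
    return PLANES[plane].typical_medium
```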

⚡ Copper’s Resurgence in Short-Reach Interconnects

• Don Barnetson (Credo): AI density and liquid cooling now enable short copper links even in scale-out fabrics.

• “For the first time, we’re converting optics back into copper,” Barnetson said.

• Credo’s active electrical cables (AECs) reduce soft link errors and offer “zero-flap” operation, improving reliability.

🧠 Optics Still Critical for Scaling AI

• Craig Thompson (NVIDIA): CPO is key for east-west fabric efficiency, reducing power and increasing cluster uptime.

• “We’re simplifying switch-to-optic interfaces and reducing component count with CPO,” Thompson said.

• Front-pluggable optics still have value and will coexist with CPO.

🌐 Data Center Interconnect (DCI) Undergoes 100X Growth

• Josef Berger (Marvell): DCI bandwidth demand has increased 100-fold due to:

1. Growing number of AI training clusters.

2. Need to monetize via inference workloads.

3. Power delivery constraints requiring multi-site clustering.

• Marvell emphasized the importance of ZR optics and silicon photonics.

🧵 Massive Volumes of Interconnect Cable

• Nathan Tracy (TE/OIF):

• Some AI deployments use 60,000–100,000 km of Twinax per year.

• That’s up to 1.5× around the globe—just in backplane cable.

• “Next-gen is 400G electrical—get ready for a wild ride.”
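As a quick arithmetic check on the globe comparison, assuming an equatorial circumference of roughly 40,075 km:

```python
# Sanity-check the "around the globe" comparison quoted above.
EARTH_CIRCUMFERENCE_KM = 40_075  # equatorial circumference, approximate

def laps_around_earth(cable_km: float) -> float:
    """How many Earth circumferences a given cable length covers."""
    return cable_km / EARTH_CIRCUMFERENCE_KM

# 60,000 km is about 1.5 laps; 100,000 km is about 2.5 laps.
print(f"{laps_around_earth(60_000):.1f}x to {laps_around_earth(100_000):.1f}x")
```

The lower end of the quoted 60,000–100,000 km range is what matches the “1.5× around the globe” figure.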


🛠️ Standards and Collaboration

🧩 OIF’s Role in Powering the Future

• Tracy described OIF as the “sports car” standards body, laser-focused on hyperscale needs.

• Key initiatives include:

• 400G electrical (single-lane 400G, enabling 3.2T–6.4T modules).

• CMIS (the Common Management Interface Specification), the protocol “glue” for all interconnects.

• Advancements in CPO, LPO, and coherent optics.
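The module capacities above follow from simple lane arithmetic; a sketch, where the 8- and 16-lane counts are common module configurations assumed here for illustration:

```python
# Lane arithmetic behind the 3.2T-6.4T module capacities mentioned above.
def module_capacity_tbps(lane_gbps: int, lanes: int) -> float:
    """Aggregate module capacity in Tb/s from per-lane rate and lane count."""
    return lane_gbps * lanes / 1000

print(module_capacity_tbps(400, 8))   # 8 lanes of 400G -> 3.2 Tb/s
print(module_capacity_tbps(400, 16))  # 16 lanes of 400G -> 6.4 Tb/s
```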

🤝 Cross-Industry Co-Design Is Essential

• Panelists stressed the importance of early engagement:

• NVIDIA provides reference designs; hyperscalers modify them.

• Credo and Marvell work closely with customers and silicon partners to anticipate needs.

• TE adjusts manufacturing and interconnect tech preemptively to meet new form factors and thermal specs.


📈 Speed of Innovation: Faster than Ever

• Development must target N+2 generations in advance.

• Craig Thompson: “GPU bandwidth doubles every two years, and so must the network.”

• Don Barnetson: Long silicon lead times mean design teams must predict customer needs before they even know them.

• Josef Berger: Innovation pace is so fast that new players may struggle to break in.
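Thompson's doubling rule can be put to numbers. A sketch, assuming an illustrative 400 Gb/s per-GPU starting point (not a quoted figure) and a two-year generation cadence:

```python
# Projecting the "doubles every two years" cadence the panel described.
def projected_bandwidth(current_gbps: float, generations_ahead: int) -> float:
    """Per the panel's rule of thumb, each generation doubles bandwidth."""
    return current_gbps * (2 ** generations_ahead)

# Designing for N+2 means targeting 4x today's per-GPU bandwidth:
print(projected_bandwidth(400.0, 2))  # 1600.0 Gb/s
```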


🔮 Final Thoughts from the Panelists

• Craig Thompson (NVIDIA):

“The network defines how fast we can build computers. We’re headed for million-GPU clusters.”

• Don Barnetson (Credo):

“Copper’s not dead—it’s thriving again. And diversity in interconnects is here to stay.”

• Nathan Tracy (TE/OIF):

“Interconnect is now a first-class citizen in AI infrastructure. It’s our day.”

• Josef Berger (Marvell):

“We’re hitting <5 pJ/bit in short-reach optics. Innovation isn’t slowing down—it’s accelerating.”
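Berger's efficiency figure translates directly into link power. A back-of-envelope sketch, where the 1.6 Tb/s rate is an illustrative assumption rather than a panel figure:

```python
# Link power from energy-per-bit: pJ/bit x Tb/s gives watts directly,
# since the 1e-12 (pico) and 1e12 (tera) factors cancel.
def link_power_watts(pj_per_bit: float, rate_tbps: float) -> float:
    return pj_per_bit * rate_tbps

print(link_power_watts(5, 1.6))  # 8.0 W at 5 pJ/bit over a 1.6 Tb/s link
```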


Tags: OFC

© 2025 Converge Digest - A private dossier for networking and telecoms.