FuriosaAI Taps Broadcom for Rack-Scale AI Inference Architecture

FuriosaAI announced a strategic partnership with Broadcom to develop a third-generation AI accelerator platform built around a multi-die chiplet architecture optimized for large-scale inference workloads. The collaboration combines FuriosaAI’s Tensor Contraction Processor (TCP) architecture with Broadcom’s AI networking technologies, high-bandwidth Ethernet switching, PCIe connectivity, and advanced packaging capabilities to create a rack-scale inference platform for hyperscale AI deployments.

The new platform extends beyond FuriosaAI’s existing RNGD inference accelerator, which is currently in mass production and fabricated on TSMC’s 5nm process technology. RNGD is a 180W PCIe accelerator designed for large language model inference and agentic AI workloads operating in standard air-cooled data center environments. FuriosaAI said RNGD has already been validated in production deployments by organizations including Samsung SDS and LG AI Research. The third-generation architecture advances the design into a chiplet-based system featuring a 2nm compute die and dual-layer HBM4/4E memory.

Broadcom will contribute Ethernet scale-up technologies, PCIe interconnect IP, packaging integration, and AI fabric switching infrastructure intended to scale inference clusters across thousands of nodes. FuriosaAI said the architecture emphasizes high-bandwidth data movement and communication efficiency rather than conventional GPU thread management. The company also highlighted its software stack, which includes a compiler-driven SDK that maps high-level PyTorch workloads directly onto silicon. A Virtual ISA abstraction layer exposes low-level hardware control, so developers do not have to manage traditional GPU kernel optimization workflows.

Third-generation FuriosaAI accelerator uses a chiplet-based multi-die architecture
Platform combines a 2nm compute die with dual-layer HBM4/4E memory
Broadcom contributes Ethernet networking, PCIe technologies, AI fabrics, and advanced packaging
RNGD inference accelerator remains in mass production on TSMC 5nm
RNGD operates at 180W in standard PCIe server configurations
Architecture targets large-scale inference clusters supporting high-volume token generation
Design focuses on communication efficiency, memory bandwidth, and rack-scale scalability
FuriosaAI’s compiler-based SDK aims to reduce reliance on hand-tuned kernels

“Inference performance is no longer defined solely by raw compute. It is increasingly a function of data reuse and communication efficiency across servers and racks,” said Charlie Kawwas, president of Broadcom’s Semiconductor Solutions Group. “By pairing Furiosa’s TCP architecture with Broadcom’s market-leading XPU Technology and IP Platform, Ethernet scale-up and fabric switches, we are building a platform that addresses the key bottlenecks of large-scale agentic AI.”

“Bringing together Broadcom’s infrastructure capabilities and Furiosa’s Tensor Contraction Processor architecture allows us to move beyond the chip level and deliver a comprehensive solution for the token factory era,” said June Paik, founder and CEO of FuriosaAI. “Having proven the performance and efficiency of our architecture with RNGD, our second-generation chip now in mass production with TSMC, we will deliver a third-generation inference solution that offers industry-leading performance per watt for even the largest, most complex frontier AI models.”

Category	Details
Company	FuriosaAI
Founded	2017
Headquarters	Seoul, South Korea, with operations in Silicon Valley
Founder & CEO	June Paik
Leadership Background	Founded by semiconductor engineers from AMD, Qualcomm, and Samsung
Core Architecture	Tensor Contraction Processor (TCP)
Primary Focus	High-efficiency AI inference accelerators for LLMs and agentic AI
Current Flagship Product	RNGD AI inference accelerator
Process Technology	TSMC 5nm for RNGD; planned 2nm compute die for third-generation platform
Memory Technology	HBM4/4E planned for next-generation chiplet platform
Power Envelope	180W for current RNGD accelerator
Software Stack	Compiler-driven SDK with PyTorch mapping and Virtual ISA programming model
Production Deployments	Samsung SDS and LG AI Research
Funding	Approximately $246 million raised through the Series C bridge round (July 2025)
2026 Milestone	Partnership with Broadcom to develop chiplet-based rack-scale AI inference platform

🌐 Analysis: The FuriosaAI-Broadcom partnership highlights the growing transition from monolithic AI accelerators toward chiplet-based inference architectures tightly coupled with high-bandwidth Ethernet fabrics and advanced packaging. As hyperscalers push toward sustained token generation for agentic AI systems, networking efficiency, memory bandwidth, and rack-scale communication increasingly determine deployment economics alongside raw compute performance.

🌐 The collaboration also reinforces Broadcom’s expanding role in AI infrastructure beyond switching silicon and custom XPUs. Broadcom has steadily broadened its AI platform strategy across networking fabrics, advanced packaging, interconnect IP, and scale-out infrastructure. FuriosaAI joins a growing field of inference-focused challengers seeking to differentiate against GPU-centric architectures through specialized silicon optimized for inference efficiency and lower power consumption.

🌐 Analysis: FuriosaAI’s partnership with Broadcom follows a significant financing milestone for the company. In July 2025, FuriosaAI closed a $125 million Series C bridge round, bringing total funding to approximately $246 million. The round included backing from Korea Development Bank, Industrial Bank of Korea, Keistone Partners, PI Partners, and Kakao Investment, alongside participation from a broad group of institutional investors. FuriosaAI said the capital is being used to scale production of its RNGD inference processor while accelerating development of its next-generation architecture. The size and composition of the round underscored strong support from both strategic and financial investors in South Korea, while giving FuriosaAI additional runway to remain independent as it expands globally in the increasingly competitive AI inference silicon market.