d-Matrix Ramps Corsair AI Inference Platform

Jim Carroll

15 hours ago

d-Matrix announced that its Corsair AI inference accelerator platform has entered full production, with volume shipments scheduled to begin this summer for select hyperscalers, neocloud providers, and frontier AI laboratories. The company said demand has accelerated as enterprises deploy increasingly latency-sensitive agentic AI applications, including coding assistants, voice agents, and interactive AI systems that require rapid token generation and low response times.

The company positions Corsair as a complementary accelerator to GPUs rather than a replacement. In heterogeneous AI clusters, GPUs handle compute-intensive model prefill operations while Corsair accelerators execute the decode phase of inference. d-Matrix cited independent testing showing that pairing Corsair accelerators with GPUs reduced response times from approximately 24 seconds to less than two seconds in speculative decoding workloads. The platform uses an SRAM-based in-memory compute architecture combined with LPDDR5 memory, avoiding HBM and the CoWoS packaging it depends on, a design the company says improves supply chain predictability and reduces manufacturing complexity.
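The division of labor described above, prefill on GPUs and decode on a separate accelerator, can be illustrated with a toy sketch. This is not d-Matrix's software or API: the names `Request`, `prefill_on_gpu`, `decode_on_accelerator`, and `serve` are hypothetical, the "model" emits placeholder tokens, and the idea that attention state (a KV cache) is handed from one phase to the next is an assumption based on how disaggregated inference is commonly described, not something stated in the announcement.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    prompt_tokens: list[str]
    max_new_tokens: int
    # Stand-in for the attention state built up while processing tokens (assumption).
    kv_cache: list[str] = field(default_factory=list)
    output_tokens: list[str] = field(default_factory=list)


def prefill_on_gpu(req: Request) -> None:
    """Phase 1 (hypothetical GPU pool): process the entire prompt in one pass."""
    req.kv_cache = list(req.prompt_tokens)


def decode_on_accelerator(req: Request) -> None:
    """Phase 2 (hypothetical decode pool): generate one token per step."""
    for step in range(req.max_new_tokens):
        token = f"tok{step}"  # placeholder for a real model's next token
        req.output_tokens.append(token)
        req.kv_cache.append(token)  # each step extends the accumulated state


def serve(req: Request) -> list[str]:
    prefill_on_gpu(req)
    # In a real cluster the state would be transferred between devices here.
    decode_on_accelerator(req)
    return req.output_tokens


result = serve(Request(prompt_tokens=["write", "a", "haiku"], max_new_tokens=4))
print(result)  # ['tok0', 'tok1', 'tok2', 'tok3']
```

The sketch shows why the two phases suit different hardware: prefill touches every prompt token at once, while decode is a sequential loop whose per-step speed determines how quickly a user sees each new token.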

To support large-scale deployments, d-Matrix is offering its SquadRack reference architecture, developed in collaboration with Arista, Broadcom, and Supermicro. The company also highlighted its April acquisition of GigaIO’s data center business, which added rack-scale systems expertise and field deployment capabilities. Corsair is manufactured using TSMC’s N6 process technology through a partnership with Alchip Technologies, and the company said supply agreements are in place to support production ramp requirements.

Corsair AI inference accelerator platform enters full production
Volume shipments begin this summer
Targets hyperscalers, neoclouds, and frontier AI labs
Designed for heterogeneous AI clusters combining GPUs and inference accelerators
Independent testing cited by d-Matrix showed a greater than 10x improvement in inference response time
Built on TSMC N6 process technology
Uses SRAM-based in-memory compute architecture
Avoids HBM and CoWoS packaging dependencies
SquadRack integrates Corsair, JetStream networking, and Aviator software
April acquisition of GigaIO data center business expands rack-scale deployment capabilities
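As a quick sanity check, the "greater than 10x" figure in the list can be compared against the response times reported in the article (approximately 24 seconds down to less than two seconds). The short calculation below is only a sketch: the helper `speedup` is a hypothetical name, the 24-second baseline is approximate, and treating it as the time without Corsair is an assumption, so the result should be read as indicative.

```python
def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Return how many times faster the accelerated response is."""
    if baseline_s <= 0 or accelerated_s <= 0:
        raise ValueError("response times must be positive")
    return baseline_s / accelerated_s


BASELINE_S = 24.0      # "approximately 24 seconds" as reported
PAIRED_UPPER_S = 2.0   # "less than two seconds" is an upper bound

# Because the paired time is below 2 s, the true speedup is at least this value.
min_speedup = speedup(BASELINE_S, PAIRED_UPPER_S)
print(f"speedup is at least {min_speedup:.0f}x")  # speedup is at least 12x
```

A lower bound of roughly 12x is consistent with the "greater than 10x" summary.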

“We built Corsair specifically for this moment, the Age of AI Inference,” said Sid Sheth, founder and CEO of d-Matrix. “The applications that matter most today — agentic AI, interactive coding, real-time voice agents — live or die on latency.”

🌐 Analysis

The announcement reflects a broader shift occurring across AI infrastructure. While much of the industry’s attention has focused on training clusters built around increasingly powerful GPUs, inference is emerging as the dominant long-term workload. As AI models become embedded into enterprise applications and consumer services, operators are seeking ways to reduce latency and operating costs while scaling inference capacity. This has created opportunities for specialized inference accelerators designed to work alongside GPUs rather than compete directly with them.

The production ramp also follows d-Matrix’s April acquisition of GigaIO’s data center business. GigaIO built its reputation around composable and disaggregated infrastructure architectures that allow compute, memory, and accelerators to be dynamically pooled and allocated. The acquisition gives d-Matrix experienced rack-scale systems engineers, deployment expertise, and integration capabilities that complement its silicon portfolio. Combined with SquadRack, the move positions d-Matrix to deliver complete rack-level inference systems rather than standalone accelerator cards, aligning with a broader industry trend toward integrated AI infrastructure platforms.

Company Profile: d-Matrix
Headquarters	Santa Clara, California, USA
Founded	2019
Founders	Sid Sheth (CEO) & Sudeep Bhoja
Core Focus	Low-latency, high-efficiency Generative AI (GenAI) & LLM inference acceleration for data centers.
Flagship Product	Corsair™ AI Inference Platform (Entering full volume production as of June 2026)
Architecture	Digital In-Memory Computing (DIMC), chiplet-based scaling, and native Block Floating Point formats (OCP Microscaling / MX formats).
Memory Specs	SRAM-based integrated Performance Memory (~150 TB/s bandwidth per card) paired with LPDDR5 Capacity Memory (up to 256GB).
Networking Tech	JetStream™ Ultra-fast I/O Accelerator (Transparent NIC solution for accelerator-to-accelerator connectivity).
Reference Design	SquadRack™ (Disaggregated, standards-based rack-scale solution utilizing standard air cooling).
Software Stack	Aviator™ (Enterprise-grade compiler/software stack integrated with PyTorch and Triton DSL).
Manufacturing	TSMC N6 process, with Alchip Technologies providing design/packaging services.
Ecosystem Partners	Arista, Broadcom, Supermicro, Liqid, TSMC, and Alchip.
Recent Acquisition	GigaIO’s Data Center Business (April 2026). Brought in the SuperNODE architecture and FabreX PCIe memory fabric to solve system-level scaling issues.

Strategic Impact: GigaIO Acquisition
Timeline	Announced & Closed: April 2026
Business Acquired	GigaIO Data Center Assets (Excluding legacy defense/high-performance computing business segments).
Core Technology	FabreX™ open software-defined memory fabric (PCIe Gen 5 / CXL) and SuperNODE™ disaggregated cluster architectures.
Primary Driver	Addresses the multi-node scaling bottleneck: intended to let d-Matrix dynamically link dozens of Corsair accelerators with uniform, ultra-low memory latency while avoiding InfiniBand/Ethernet overhead.
System Integration	Directly accelerates the deployment of SquadRack™ clusters, allowing the company to ship pre-configured, multi-petabyte LLM inference infrastructure.
Macro Trend	A shift from selling standalone AI hardware components to fully integrated, turnkey rack-scale systems intended to support multi-trillion-parameter agentic workflows.