Converge Digest

d-Matrix Ramps Corsair AI Inference Platform

d-Matrix announced that its Corsair AI inference accelerator platform has entered full production, with volume shipments scheduled to begin this summer for select hyperscalers, neocloud providers, and frontier AI laboratories. The company said demand has accelerated as enterprises deploy increasingly latency-sensitive agentic AI applications, including coding assistants, voice agents, and interactive AI systems that require rapid token generation and low response times.

The company positions Corsair as a complementary accelerator to GPUs rather than a replacement. In heterogeneous AI clusters, GPUs handle compute-intensive model prefill operations while Corsair accelerators execute the decode phase of inference. d-Matrix cited independent testing showing that pairing Corsair accelerators with GPUs reduced response times from approximately 24 seconds to less than two seconds in speculative decoding workloads. The platform uses an SRAM-based in-memory compute architecture combined with LPDDR5 memory rather than HBM-based packaging, a design the company says improves supply chain predictability and reduces manufacturing complexity.
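The latency figures cited above imply a lower bound on the speedup. A minimal back-of-the-envelope sketch in Python follows; the function name is illustrative, and reading "less than two seconds" as an upper bound of 2.0 s is an assumption about the claim, not a published measurement:

```python
def min_speedup(baseline_s: float, improved_upper_bound_s: float) -> float:
    """Lower bound on speedup when the improved latency is only known
    to be below some upper bound."""
    if baseline_s <= 0 or improved_upper_bound_s <= 0:
        raise ValueError("latencies must be positive")
    return baseline_s / improved_upper_bound_s


# Figures as cited by d-Matrix: roughly 24 s before, under 2 s after.
# Treating 2.0 s as the worst case is an assumption made for illustration.
speedup = min_speedup(24.0, 2.0)
print(f"at least {speedup:.0f}x faster")
```

Because the improved latency is described only as "less than two seconds," the true ratio in that test could be higher than this bound, and it applies only to the speculative decoding workloads the company cited.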

To support large-scale deployments, d-Matrix is offering its SquadRack reference architecture, developed in collaboration with Arista, Broadcom, and Supermicro. The company also highlighted its April acquisition of GigaIO’s data center business, which added rack-scale systems expertise and field deployment capabilities. Corsair is manufactured using TSMC’s N6 process technology through a partnership with Alchip Technologies, and the company said supply agreements are in place to support production ramp requirements.

“We built Corsair specifically for this moment, the Age of AI Inference,” said Sid Sheth, founder and CEO of d-Matrix. “The applications that matter most today — agentic AI, interactive coding, real-time voice agents — live or die on latency.”

🌐 Analysis

The announcement reflects a broader shift occurring across AI infrastructure. While much of the industry’s attention has focused on training clusters built around increasingly powerful GPUs, inference is emerging as the dominant long-term workload. As AI models become embedded into enterprise applications and consumer services, operators are seeking ways to reduce latency and operating costs while scaling inference capacity. This has created opportunities for specialized inference accelerators designed to work alongside GPUs rather than compete directly with them.

The production ramp also follows d-Matrix’s April acquisition of GigaIO’s data center business. GigaIO built its reputation around composable and disaggregated infrastructure architectures that allow compute, memory, and accelerators to be dynamically pooled and allocated. The acquisition gives d-Matrix experienced rack-scale systems engineers, deployment expertise, and integration capabilities that complement its silicon portfolio. Combined with SquadRack, the move positions d-Matrix to deliver complete rack-level inference systems rather than standalone accelerator cards, aligning with a broader industry trend toward integrated AI infrastructure platforms.

Company Profile: d-Matrix
Headquarters: Santa Clara, California, USA
Founded: 2019
Founders: Sid Sheth (CEO) and Sudeep Bhoja
Core Focus: Low-latency, high-efficiency generative AI (GenAI) and LLM inference acceleration for data centers.
Flagship Product: Corsair™ AI Inference Platform (entering full volume production as of June 2026).
Architecture: Digital In-Memory Computing (DIMC), chiplet-based scaling, and native block floating point formats (OCP Microscaling / MX formats).
Memory Specs: SRAM-based integrated Performance Memory (~150 TB/s bandwidth per card) paired with LPDDR5 Capacity Memory (up to 256 GB).
Networking Tech: JetStream™ ultra-fast I/O accelerator (transparent NIC solution for accelerator-to-accelerator connectivity).
Reference Design: SquadRack™ (disaggregated, standards-based rack-scale solution using standard air cooling).
Software Stack: Aviator™ (enterprise-grade compiler and software stack integrated with PyTorch and the Triton DSL).
Manufacturing: TSMC N6 process, with Alchip Technologies providing design and packaging services.
Ecosystem Partners: Arista, Broadcom, Supermicro, Liqid, TSMC, and Alchip.
Recent Acquisition: GigaIO’s data center business (April 2026), which brought in the SuperNODE architecture and the FabreX PCIe memory fabric to address system-level scaling issues.

Strategic Impact: GigaIO Acquisition
Timeline: Announced and closed in April 2026.
Business Acquired: GigaIO data center assets (excluding legacy defense and high-performance computing business segments).
Core Technology: FabreX™ open software-defined memory fabric (PCIe Gen 5 / CXL) and SuperNODE™ disaggregated cluster architectures.
Primary Driver: Addresses the multi-node scaling bottleneck. Enables d-Matrix to dynamically link dozens of Corsair accelerators with uniform, ultra-low memory latency without hitting InfiniBand/Ethernet overhead limits.
System Integration: Directly accelerates the deployment of SquadRack™ clusters, allowing the company to ship pre-configured, multi-petabyte LLM inference infrastructure.
Macro Trend: The pivot from siloed AI hardware component sales to fully integrated, turnkey rack-scale computing architectures capable of supporting multi-trillion-parameter agentic workflows.