d-Matrix announced that its Corsair AI inference accelerator platform has entered full production, with volume shipments scheduled to begin this summer for select hyperscalers, neocloud providers, and frontier AI laboratories. The company said demand has accelerated as enterprises deploy increasingly latency-sensitive agentic AI applications, including coding assistants, voice agents, and interactive AI systems that require rapid token generation and low response times.
The company positions Corsair as a complement to GPUs rather than a replacement. In heterogeneous AI clusters, GPUs handle the compute-intensive prefill phase, in which the model processes the input prompt, while Corsair accelerators execute the decode phase, in which output tokens are generated one at a time. d-Matrix cited independent testing showing that pairing Corsair accelerators with GPUs reduced response times from approximately 24 seconds to less than 2 seconds in speculative decoding workloads. The platform uses an SRAM-based in-memory compute architecture combined with LPDDR5 memory rather than HBM and CoWoS packaging, a design the company says improves supply chain predictability and reduces manufacturing complexity.
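
To make the prefill/decode split concrete, the following is a minimal back-of-the-envelope sketch, not d-Matrix's methodology. It models response time as one prefill pass plus a sequence of decode steps. The `InferenceProfile` class, its field names, and the two example profiles are hypothetical placeholders, not measurements; only the 24-second and 2-second figures come from the reported testing.

```python
from dataclasses import dataclass


@dataclass
class InferenceProfile:
    """Illustrative timing parameters; all values used below are assumptions."""
    prefill_s: float      # time to process the input prompt (prefill phase)
    per_token_s: float    # time to generate one output token (decode phase)
    output_tokens: int    # number of tokens in the response


def response_time(p: InferenceProfile) -> float:
    """One prefill pass followed by sequential decode steps."""
    return p.prefill_s + p.output_tokens * p.per_token_s


def speedup(before_s: float, after_s: float) -> float:
    """Ratio of the slower response time to the faster one."""
    return before_s / after_s


# Figures quoted in the article: about 24 s before, under 2 s after.
# Using 2 s for "after" gives a lower bound on the improvement.
reported_speedup = speedup(24.0, 2.0)

# Hypothetical profiles: same prefill cost, different per-token decode cost.
gpu_only = InferenceProfile(prefill_s=0.5, per_token_s=0.04, output_tokens=500)
paired = InferenceProfile(prefill_s=0.5, per_token_s=0.002, output_tokens=500)

print(f"reported speedup: at least {reported_speedup:.0f}x")
print(f"hypothetical GPU-only response: {response_time(gpu_only):.1f} s")
print(f"hypothetical paired response:   {response_time(paired):.1f} s")
```

In this toy model the decode term dominates long responses, which is why offloading decode to a lower-latency device can shrink total response time even when prefill is unchanged.
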
To support large-scale deployments, d-Matrix is offering its SquadRack reference architecture, developed in collaboration with Arista, Broadcom, and Supermicro. The company also highlighted its April acquisition of GigaIO’s data center business, which added rack-scale systems expertise and field deployment capabilities. Corsair is manufactured using TSMC’s N6 process technology through a partnership with Alchip Technologies, and the company said supply agreements are in place to support production ramp requirements.
- Corsair AI inference accelerator platform enters full production
- Volume shipments begin this summer
- Targets hyperscalers, neoclouds, and frontier AI labs
- Designed for heterogeneous AI clusters combining GPUs and inference accelerators
- Independent testing showed a greater than 10x reduction in inference response time
- Built on TSMC N6 process technology
- Uses SRAM-based in-memory compute architecture
- Avoids HBM and CoWoS packaging dependencies
- SquadRack integrates Corsair, JetStream networking, and Aviator software
- April acquisition of GigaIO data center business expands rack-scale deployment capabilities

“We built Corsair specifically for this moment, the Age of AI Inference,” said Sid Sheth, founder and CEO of d-Matrix. “The applications that matter most today — agentic AI, interactive coding, real-time voice agents — live or die on latency.”
🌐 Analysis
The announcement reflects a broader shift occurring across AI infrastructure. While much of the industry’s attention has focused on training clusters built around increasingly powerful GPUs, inference is emerging as the dominant long-term workload. As AI models become embedded into enterprise applications and consumer services, operators are seeking ways to reduce latency and operating costs while scaling inference capacity. This has created opportunities for specialized inference accelerators designed to work alongside GPUs rather than compete directly with them.
The production ramp also follows d-Matrix’s April acquisition of GigaIO’s data center business. GigaIO built its reputation around composable and disaggregated infrastructure architectures that allow compute, memory, and accelerators to be dynamically pooled and allocated. The acquisition gives d-Matrix experienced rack-scale systems engineers, deployment expertise, and integration capabilities that complement its silicon portfolio. Combined with SquadRack, the move positions d-Matrix to deliver complete rack-level inference systems rather than standalone accelerator cards, aligning with a broader industry trend toward integrated AI infrastructure platforms.
| Company Profile: d-Matrix | |
|---|---|
| Headquarters | Santa Clara, California, USA |
| Founded | 2019 |
| Founders | Sid Sheth (CEO) & Sudeep Bhoja |
| Core Focus | Low-latency, high-efficiency Generative AI (GenAI) & LLM inference acceleration for data centers. |
| Flagship Product | Corsair™ AI Inference Platform (Entering full volume production as of June 2026) |
| Architecture | Digital In-Memory Computing (DIMC), chiplet-based scaling, and native Block Floating Point formats (OCP Microscaling / MX formats). |
| Memory Specs | SRAM-based integrated Performance Memory (~150 TB/s bandwidth per card) paired with LPDDR5 Capacity Memory (up to 256 GB). |
| Networking Tech | JetStream™ Ultra-fast I/O Accelerator (Transparent NIC solution for accelerator-to-accelerator connectivity). |
| Reference Design | SquadRack™ (Disaggregated, standards-based rack-scale solution utilizing standard air cooling). |
| Software Stack | Aviator™ (Enterprise-grade compiler/software stack integrated with PyTorch and Triton DSL). |
| Manufacturing | TSMC N6 process utilizing Alchip Technologies for design/packaging services. |
| Ecosystem Partners | Arista, Broadcom, Supermicro, Liqid, TSMC, and Alchip. |
| Recent Acquisition | GigaIO’s Data Center Business (April 2026). Brought in the SuperNODE architecture and FabreX PCIe memory fabric to solve system-level scaling issues. |
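
One rough way to read the ~150 TB/s Performance Memory figure in the profile above is through a standard bandwidth-bound estimate: if each generated token requires reading the model weights once, the single-stream decode rate cannot exceed bandwidth divided by bytes read per token. The sketch below applies that estimate. The function name and the 8 GB model size are hypothetical, and the calculation assumes the weights are fully resident in memory that delivers this bandwidth, which the source does not confirm for any particular model or card count.

```python
def decode_tokens_per_second_bound(bandwidth_bytes_per_s: float,
                                   bytes_read_per_token: float) -> float:
    """Upper bound on single-stream decode rate when memory bandwidth is the limit."""
    if bandwidth_bytes_per_s <= 0 or bytes_read_per_token <= 0:
        raise ValueError("both arguments must be positive")
    return bandwidth_bytes_per_s / bytes_read_per_token


# ~150 TB/s per card, taken from the company profile table.
PERFORMANCE_MEMORY_BW = 150e12

# Hypothetical model: 8 billion parameters stored at 1 byte each (an assumption).
HYPOTHETICAL_WEIGHT_BYTES = 8e9

bound = decode_tokens_per_second_bound(PERFORMANCE_MEMORY_BW,
                                       HYPOTHETICAL_WEIGHT_BYTES)
print(f"bandwidth-bound ceiling: {bound:,.0f} tokens/s per stream")
```

This is a ceiling, not a prediction: it ignores compute time, KV-cache reads, interconnect hops, and batching, all of which lower the achievable rate.
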

| Strategic Impact: GigaIO Acquisition | |
|---|---|
| Timeline | Announced & Closed: April 2026 |
| Business Acquired | GigaIO Data Center Assets (Excluding legacy defense/high-performance computing business segments). |
| Core Technology | FabreX™ open software-defined memory fabric (PCIe Gen 5 / CXL) and SuperNODE™ disaggregated cluster architectures. |
| Primary Driver | Solves the multi-node scaling bottleneck. Enables d-Matrix to dynamically link dozens of Corsair accelerators with uniform, ultra-low memory latency without hitting InfiniBand/Ethernet overhead limits. |
| System Integration | Directly accelerates the deployment of SquadRack™ clusters, allowing the company to ship pre-configured, multi-petabyte LLM inference infrastructure. |
| Macro Trend | The definitive pivot from siloed AI hardware component sales to fully integrated, turnkey rack-scale computing architectures capable of supporting multi-trillion parameter agentic workflows. |
