Etched emerged from stealth mode, announcing a working AI inference chip, more than $1 billion in signed customer contracts, and $800 million raised across multiple previously undisclosed financing rounds. The latest financing closed in December 2025, raising $500 million at a $5 billion post-money valuation. Investors include VentureTech Alliance, Peter Thiel, Jane Street, Hudson River Trading, Jump Trading, Two Sigma, Stripes, Ribbit Capital, Radical Ventures, Primary VC, Positive Sum, and several prominent AI researchers and entrepreneurs. The company said it achieved first-pass (A0) silicon success on the TSMC N4P process and plans to begin shipping its first rack-scale inference systems this summer.
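The round arithmetic follows directly from the disclosed figures. The sketch below is illustrative only: the function name `round_terms` is hypothetical, and the implied stake assumes the full $500 million was newly issued primary equity, which the announcement does not specify.

```python
def round_terms(raised_musd: int, post_money_musd: int) -> tuple[int, float]:
    """Return (pre-money valuation in $M, fraction of the company sold).

    Assumes the whole round is newly issued primary equity (an assumption,
    not something the announcement states).
    """
    pre_money_musd = post_money_musd - raised_musd
    stake_sold = raised_musd / post_money_musd
    return pre_money_musd, stake_sold


# Figures reported in the announcement, in millions of US dollars.
TOTAL_RAISED = 800
LATEST_ROUND = 500
POST_MONEY = 5_000

pre_money, stake = round_terms(LATEST_ROUND, POST_MONEY)
earlier_rounds = TOTAL_RAISED - LATEST_ROUND

print(f"Implied pre-money valuation: ${pre_money}M")
print(f"Implied stake sold in latest round: {stake:.0%}")
print(f"Raised in earlier rounds: ${earlier_rounds}M")
```

Under those assumptions, the December 2025 round implies a $4.5 billion pre-money valuation, roughly a 10% stake sold, and about $300 million raised in the earlier rounds.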
Founded in 2022, Etched is developing vertically integrated AI inference infrastructure that combines custom silicon, racks, networking, cooling, software, and manufacturing. The company says its systems are already running production AI models including DeepSeek, Qwen, Mamba, and Llama, and are designed to support models ranging from dense architectures to large mixture-of-experts (MoE) systems with arbitrarily large parameter counts. To support manufacturing, Etched has opened a Taiwan factory while building a 2 MW data center, test house, and new product introduction (NPI) prototyping lab at its San Jose headquarters, with a stated goal of enabling gigawatt-scale deployments beginning in 2027.
Etched also disclosed two architectural technologies intended to differentiate its platform. The first, Low Voltage Inference (LVI), operates the chip’s compute arrays at less than half the voltage of conventional AI accelerators, which the company says enables sustained utilization above 80% of peak FLOPs without thermal throttling during trillion-parameter sparse MoE inference. The second, Cluster Scale Memory (CSM), combines HBM with a shared low-latency memory architecture connected through a proprietary interconnect to reduce inference latency while maintaining high throughput. According to Etched, early customer testing has demonstrated state-of-the-art throughput, latency, and power efficiency on inference workloads, with additional performance data expected later this summer.
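Etched has not published operating voltages, clock rates, or power figures, so the following is only a first-order sketch of why a lower supply voltage can ease thermal limits. It uses the standard CMOS dynamic-power relation, P ≈ α·C·V²·f, and ignores leakage. The function name and the example ratios are assumptions for illustration, not Etched data.

```python
def dynamic_power_ratio(voltage_ratio: float, frequency_ratio: float = 1.0) -> float:
    """Relative CMOS switching power under P ∝ C * V^2 * f.

    Capacitance and activity factor are held fixed; leakage is ignored.
    Both arguments are ratios against a baseline design (1.0 = unchanged).
    """
    return voltage_ratio ** 2 * frequency_ratio


# Hypothetical case 1: half the supply voltage at the same clock frequency.
same_clock = dynamic_power_ratio(0.5)

# Hypothetical case 2: half the voltage with the clock also reduced to 60%.
slower_clock = dynamic_power_ratio(0.5, 0.6)

print(f"Half voltage, same clock: {same_clock:.2f}x baseline switching power")
print(f"Half voltage, 60% clock:  {slower_clock:.2f}x baseline switching power")
```

Because switching power falls with the square of voltage, halving the voltage alone cuts it to a quarter of the baseline. In practice, lower voltage usually also means lower clock speed, so a design of this kind would generally need more parallel compute to hold throughput constant. How Etched handles that trade-off has not been disclosed.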
• Raised $800 million across multiple previously undisclosed financing rounds.
• Latest financing: $500 million in December 2025 at a $5 billion post-money valuation.
• More than $1 billion in signed customer contracts.
• Achieved first-pass (A0) silicon success on TSMC’s N4P process.
• First rack-scale inference systems scheduled to ship in summer 2026.
• Systems currently running DeepSeek, Qwen, Mamba, and Llama models.
• Built a 2 MW data center, test facility, and NPI prototyping lab in San Jose.
• Opened a Taiwan engineering and manufacturing facility.
• Team of more than 400 engineers from NVIDIA, Google TPU, Broadcom, SK hynix, TSMC, and quantitative trading firms.
• Introduced Low Voltage Inference (LVI) architecture for sustained compute utilization.
• Introduced Cluster Scale Memory (CSM) architecture combining HBM with shared low-latency memory.
• Targeting gigawatt-scale AI inference infrastructure by 2027.
“We recognized early on that frontier AI would become one of the most economically significant technologies ever created, but that the infrastructure needed to serve those models in a sustainable and economically viable way simply did not exist,” said Gavin Uberti, co-founder and CEO of Etched.
🌐 Analysis: Etched enters an increasingly competitive market for AI inference infrastructure, joining companies such as Groq, Cerebras Systems, SambaNova Systems, d-Matrix, and established GPU suppliers led by NVIDIA. Unlike vendors focused primarily on accelerator silicon, Etched is positioning itself as a supplier of complete rack-scale inference systems, combining custom chips, interconnects, cooling, software, manufacturing, and deployment into a vertically integrated platform.
Several of the disclosed figures are noteworthy. First-pass (A0) silicon success on TSMC N4P is a meaningful semiconductor milestone, while $1 billion in signed customer contracts suggests substantial commercial interest before general availability. At the same time, many of the company’s performance claims, including state-of-the-art throughput, latency, and power efficiency, rest on early customer testing that Etched itself has reported. Broader industry evaluation will likely follow once production systems ship and customers publish independent benchmarking results.
Etched emerged from stealth mode, revealing that it has raised $800 million across four previously undisclosed financing rounds, secured more than $1 billion in customer contracts, and completed a successful A0 tapeout of its first inference chip on the TSMC N4P process. The company said it has already built its first rack-scale systems, is validating them with early customers, and plans to begin shipping production racks this summer. Etched also disclosed that it employs more than 400 engineers with experience from NVIDIA, Google TPU teams, Broadcom, SK hynix, TSMC, and other semiconductor companies.
The startup positions itself as a vertically integrated supplier of AI inference infrastructure, co-designing silicon, racks, networking, cooling, software, and manufacturing. Rather than focusing solely on accelerator silicon, Etched says it is developing complete “frontier inference clusters” optimized for large-scale mixture-of-experts (MoE), long-context, and agentic AI workloads. To support production, the company has opened a manufacturing facility in Taiwan while building a 2 MW data center, test facility, and new product introduction (NPI) prototyping lab at its San Jose headquarters.
Etched also disclosed two architectural technologies it says differentiate its platform. The first, Low Voltage Inference (LVI), operates the chip’s compute arrays at less than half the voltage of conventional AI accelerators, enabling sustained utilization above 80% of peak FLOPs without thermal throttling during trillion-parameter sparse MoE inference. The second, Cluster Scale Memory (CSM), combines HBM with a shared SRAM-like memory architecture connected through a proprietary ultra-low-latency interconnect. According to the company, this architecture improves inference latency while maintaining high throughput without relying on SRAM-only designs, 3D DRAM approaches, or optical memory interconnects. Etched said early customer evaluations demonstrate state-of-the-art throughput, latency, and power efficiency, with additional performance data scheduled for release later this summer.
• Raised $800 million across four previously undisclosed financing rounds.
• Secured more than $1 billion in customer contracts.
• Completed successful A0 silicon tapeout on TSMC N4P.
• First production racks scheduled to ship in summer 2026.
• Built a 2 MW AI data center and prototyping facility in San Jose.
• Opened a Taiwan engineering and manufacturing facility.
• Team exceeds 400 engineers from NVIDIA, Google TPU, Broadcom, SK hynix, TSMC and others.
• Introduced Low Voltage Inference (LVI) architecture for higher sustained compute utilization.
• Introduced Cluster Scale Memory (CSM) architecture combining HBM and shared low-latency memory.
• Targets frontier inference workloads including trillion-parameter MoE models, long-context AI, and agentic systems.
“We’re coming out of stealth. We’ve built our first racks after a successful A0 tapeout, $1B+ in customer contracts, and $800m raised. Early customer tests show us achieving SOTA throughput, latency, and power efficiency on inference workloads. Our first racks ship this summer.”
🌐 Analysis: Etched joins a growing wave of AI infrastructure startups moving beyond accelerator chips toward tightly integrated rack-scale systems. Competitors including Cerebras Systems, Groq, SambaNova Systems, Furiosa, and d-Matrix have similarly emphasized inference performance rather than AI training. Etched’s announcement stands out because it combines silicon, system integration, manufacturing, and data center deployment while claiming substantial commercial traction before publicly launching.
The technical disclosures are notable because they address two of the primary constraints on next-generation inference infrastructure: sustained compute utilization and memory latency. Large mixture-of-experts models increasingly shift bottlenecks away from raw FLOPs toward memory movement, interconnect efficiency, and power delivery. Etched’s Low Voltage Inference and Cluster Scale Memory architectures represent a different design philosophy from conventional GPU scaling. However, the company has not yet published independently verified benchmark data or detailed technical specifications, so performance claims remain based on internal testing until broader customer deployments and peer-reviewed measurements become available.
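The shift from compute-bound to memory-bound behavior described above can be illustrated with a simple roofline estimate. This is a generic sketch, not a model of Etched’s hardware: the peak-FLOPs and bandwidth numbers are hypothetical placeholders, routing is assumed to be perfectly uniform across experts, and only expert weight traffic is counted (no KV cache, activations, or interconnect).

```python
def tokens_per_expert(batch_tokens: int, top_k: int, num_experts: int) -> float:
    """Average tokens each expert sees per step, assuming uniform routing."""
    return batch_tokens * top_k / num_experts


def arithmetic_intensity(tokens: float, bytes_per_param: float) -> float:
    """FLOPs per byte of weight traffic: each weight is read once and used
    for one multiply-add (2 FLOPs) per token."""
    return 2.0 * tokens / bytes_per_param


def attainable_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Roofline bound: limited either by compute or by memory bandwidth."""
    return min(peak_flops, mem_bandwidth * intensity)


# Hypothetical accelerator and workload, chosen only for illustration.
PEAK_FLOPS = 1e15        # 1 PFLOP/s peak compute
MEM_BANDWIDTH = 4e12     # 4 TB/s memory bandwidth
BATCH_TOKENS = 64        # tokens decoded per step
TOP_K = 2                # experts activated per token
NUM_EXPERTS = 16         # experts per MoE layer
BYTES_PER_PARAM = 1.0    # 8-bit weights

tokens = tokens_per_expert(BATCH_TOKENS, TOP_K, NUM_EXPERTS)
intensity = arithmetic_intensity(tokens, BYTES_PER_PARAM)
bound = attainable_flops(PEAK_FLOPS, MEM_BANDWIDTH, intensity)

print(f"Tokens per expert: {tokens:.1f}")
print(f"Arithmetic intensity: {intensity:.1f} FLOPs/byte")
print(f"Attainable fraction of peak: {bound / PEAK_FLOPS:.1%}")
```

With these placeholder values, each expert sees only 8 tokens per step, so the expert layers are bandwidth-bound at about 6% of peak compute. That is the general reason sparse MoE inference puts pressure on memory and interconnect rather than raw FLOPs. It says nothing about how Cluster Scale Memory actually performs.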
🏢 Etched AI: Transformer-Specific AI Inference Silicon (Updated: June 30, 2026)
| Overview | Etched AI is a private semiconductor startup building vertically integrated AI inference systems. Its rack-scale platform combines custom transformer-specific silicon, networking, memory, cooling and software to accelerate large language model inference with higher throughput, lower latency and improved power efficiency than conventional GPU-based systems. |
| Why It Matters | Rather than selling standalone AI accelerators, Etched is designing complete inference clusters optimized for frontier AI models. The company’s vertically integrated approach reflects a broader industry shift toward co-designing chips, interconnects, memory, cooling and software as AI infrastructure scales toward gigawatt-class deployments. |
| Founded | 2022 |
| Headquarters | San Jose, California, USA |
| Leadership | Gavin Uberti, Co-founder & CEO • Rob Wachen, Co-founder |
| Company Type | Private semiconductor startup |
| Funding | Reported total funding of approximately $800 million, including a $500 million financing completed in December 2025 at a reported $5 billion post-money valuation. |
| Commercial Traction | Etched reports more than $1 billion in signed customer contracts, with rack-scale inference systems currently undergoing customer validation ahead of production shipments. |
| Technology Status | First-pass (A0) silicon on TSMC’s N4P process is operational. Etched says its rack-scale systems are being validated with customers and are running DeepSeek, Qwen, Mamba and Llama models ahead of production shipments planned for summer 2026. |
| Key Technology | Sohu Transformer ASIC • Low Voltage Inference (LVI) • Cluster Scale Memory (CSM) • Rack-Scale AI Inference • Custom Interconnect • Transformer-Specific Architecture |
| Roadmap | First production racks are scheduled to ship in summer 2026. The company is targeting gigawatt-scale AI inference deployments beginning in 2027 while expanding manufacturing and validation capabilities. |
| Team | More than 400 engineers with experience from NVIDIA, Google TPU, Broadcom, SK hynix, TSMC, quantitative trading firms and other AI infrastructure organizations. |
| Editorial Coverage | Converge Digest tracks Etched across AI infrastructure, custom silicon, inference acceleration, rack-scale AI systems, semiconductor startups, transformer architectures and next-generation AI data center platforms. |



