OpenAI and Broadcom unveiled “Jalapeño,” a custom AI inference processor designed specifically for large language model (LLM) workloads. The chip marks OpenAI’s first internally architected accelerator and represents a significant step in the company’s strategy to control more of the AI infrastructure stack, spanning models, software, systems, and now silicon. The companies said engineering samples are already running production-class workloads, including GPT-5.3-Codex-Spark, at target frequency and power levels.
OpenAI designed Jalapeño around the inference characteristics of current and future LLMs, focusing on minimizing data movement and balancing compute, memory, and networking resources to improve hardware utilization. Broadcom contributed silicon implementation, manufacturing, and networking technologies, including its Tomahawk Ethernet switching portfolio, while Celestica provided board, rack, and system-level engineering. The companies said early testing indicates performance-per-watt improvements over current state-of-the-art AI accelerators, although detailed benchmark data has not yet been released.
The announcement signals OpenAI’s intention to become a full-stack infrastructure provider rather than relying exclusively on merchant GPUs. The companies said Jalapeño was developed from initial design through tape-out in approximately nine months and will serve as the first member of a multi-generation accelerator roadmap. Deployments are expected to begin by the end of 2026 in large-scale AI data centers, with OpenAI and Broadcom targeting gigawatt-scale infrastructure deployments alongside partners including Microsoft.
Profile: Jalapeño AI Accelerator
Updated: June 24, 2026

| Attribute | Details |
| --- | --- |
| Developer | OpenAI with Broadcom and Celestica |
| Chip Type | Custom AI inference accelerator (ASIC) |
| Primary Workload | Large Language Model inference |
| Architecture Goal | Optimize compute, memory and networking utilization for frontier AI models |
| Current Status | Engineering samples operational at target frequency and power |
| Demonstrated Workload | GPT-5.3-Codex-Spark |
| Networking | Broadcom Tomahawk Ethernet switching technology |
| Development Cycle | Approximately 9 months from design to tape-out |
| Performance Claim | Higher performance-per-watt than current state-of-the-art accelerators (early testing) |
| Deployment Timeline | Initial deployments planned by end of 2026 |
| Scale Target | Gigawatt-scale AI data centers over multiple generations |

“The world is moving to a compute-powered economy,” said Greg Brockman, President and Co-Founder of OpenAI. “Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems.”
🌐 Analysis
The Jalapeño announcement confirms industry speculation that leading AI model developers are moving beyond software and into custom silicon. OpenAI joins a list of hyperscalers and AI platform providers, including Google Cloud (TPU), Amazon Web Services (Trainium and Inferentia), Microsoft Azure (Maia), and Meta (MTIA), that are developing workload-specific AI processors to reduce dependence on merchant accelerators and improve infrastructure economics.
The partnership also highlights Broadcom’s growing influence in AI infrastructure. Beyond networking silicon, the company has become a major supplier of custom AI ASICs for hyperscalers and large cloud providers. If Jalapeño achieves its stated performance-per-watt objectives, it could strengthen Broadcom’s position as a preferred partner for organizations pursuing vertically integrated AI infrastructure strategies. The emphasis on inference rather than training reflects an industry-wide shift as AI deployments increasingly focus on serving production workloads efficiently at scale.
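Because no benchmark data has been released, the performance-per-watt claim cannot yet be checked. As a rough illustration of why the metric matters for inference at data center scale, the sketch below expresses it as tokens per joule and shows how a fixed power budget translates into aggregate throughput. Every number and function name here is a hypothetical placeholder chosen for easy arithmetic, not a figure from OpenAI or Broadcom, and the model ignores cooling, networking, and other facility overhead.

```python
def tokens_per_joule(tokens_per_second: float, watts: float) -> float:
    """Inference efficiency: (tokens/s) / W is equivalent to tokens per joule."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tokens_per_second / watts


def fleet_throughput(power_budget_watts: float, efficiency: float) -> float:
    """Aggregate tokens/s that a fixed accelerator power budget can sustain."""
    return power_budget_watts * efficiency


# Hypothetical placeholder figures, NOT measured or published values.
baseline = tokens_per_joule(tokens_per_second=10_000, watts=1_000)
candidate = tokens_per_joule(tokens_per_second=12_000, watts=800)

# Relative efficiency gain of the hypothetical candidate over the baseline.
gain = candidate / baseline

# Assumed power budget of 1 GW devoted entirely to accelerators.
budget_watts = 1e9
baseline_fleet = fleet_throughput(budget_watts, baseline)
candidate_fleet = fleet_throughput(budget_watts, candidate)

print(f"efficiency gain: {gain:.2f}x")
print(f"fleet throughput: {baseline_fleet:.3e} -> {candidate_fleet:.3e} tokens/s")
```

Under a fixed power envelope, throughput scales linearly with tokens per joule, which is why operators planning gigawatt-scale sites tend to weigh efficiency as heavily as peak per-chip performance.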
OpenAI AI Infrastructure Stack & Ecosystem
Updated: June 24, 2026

| Category | Details |
| --- | --- |
| Applications | ChatGPT, Codex, API Services, Enterprise AI, Agentic AI Platforms |
| Foundation Models | GPT family, reasoning models, multimodal systems, coding agents |
| Serving Software | OpenAI-developed kernels, orchestration software, schedulers, serving infrastructure and inference optimization |
| Custom AI Silicon | Jalapeño Intelligence Processor — purpose-built ASIC optimized for large-scale LLM inference |
| ASIC Development Partner | Broadcom — silicon implementation, packaging, custom ASIC development and AI infrastructure roadmap |
| Networking Fabric | Broadcom Tomahawk Ethernet switching architecture for AI cluster connectivity |
| System Integration | Celestica — board design, rack integration, manufacturing and deployment engineering |
| Cloud Partner | Microsoft Azure — strategic cloud and data center deployment platform |
| Current Training Infrastructure | NVIDIA GPUs remain the primary platform for frontier AI model training and much of current inference deployment |
| Alternative Accelerator Ecosystem | AMD accelerators deployed through selected hyperscale cloud environments |
| Potential Network OEM Layer | Arista, Cisco, Juniper and others may participate in broader AI infrastructure deployments but were not identified in this announcement |
| Potential Server OEM Layer | Dell, HPE, Supermicro and ODM partners may support deployment infrastructure but were not identified in this announcement |
| Development Speed | Approximately 9 months from initial design to tape-out |
| Current Validation Status | Engineering samples running GPT-5.3-Codex-Spark at production target frequency and power |
| Roadmap Scale | 10 GW deployment target through 2029 |
| Infrastructure Strategy | Vertical integration across models, software, networking, systems and custom silicon |
| Industry Significance | OpenAI is evolving from an AI model developer into a full-stack AI infrastructure company with its own silicon roadmap and gigawatt-scale deployment ambitions |
