OpenAI and Broadcom Unveil “Jalapeño” Inference Accelerator

Jim Carroll

OpenAI and Broadcom unveiled “Jalapeño,” a custom AI inference processor designed specifically for large language model (LLM) workloads. The chip marks OpenAI’s first internally architected accelerator and represents a significant step in the company’s strategy to control more of the AI infrastructure stack, spanning models, software, systems, and now silicon. The companies said engineering samples are already running production-class workloads, including GPT-5.3-Codex-Spark, at target frequency and power levels.

OpenAI designed Jalapeño around the inference characteristics of current and future LLMs, focusing on minimizing data movement and balancing compute, memory, and networking resources to improve hardware utilization. Broadcom contributed silicon implementation, manufacturing, and networking technologies, including its Tomahawk Ethernet switching portfolio, while Celestica provided board, rack, and system-level engineering. The companies said early testing indicates performance-per-watt improvements over current state-of-the-art AI accelerators, although detailed benchmark data has not yet been released.

The announcement signals OpenAI’s intention to become a full-stack infrastructure provider rather than relying exclusively on merchant GPUs. The companies said Jalapeño was developed from initial design through tape-out in approximately nine months and will serve as the first member of a multi-generation accelerator roadmap. Deployments are expected to begin by the end of 2026 in large-scale AI data centers, with OpenAI and Broadcom targeting gigawatt-scale infrastructure deployments alongside partners including Microsoft.

Profile: Jalapeño AI Accelerator (Updated: June 24, 2026)
Developer	OpenAI with Broadcom and Celestica
Chip Type	Custom AI inference accelerator (ASIC)
Primary Workload	Large Language Model inference
Architecture Goal	Optimize compute, memory and networking utilization for frontier AI models
Current Status	Engineering samples operational at target frequency and power
Demonstrated Workload	GPT-5.3-Codex-Spark
Networking	Broadcom Tomahawk Ethernet switching technology
Development Cycle	Approximately 9 months from design to tape-out
Performance Claim	Higher performance-per-watt than current state-of-the-art accelerators (early testing)
Deployment Timeline	Initial deployments planned by end of 2026
Scale Target	Gigawatt-scale AI data centers over multiple generations

“The world is moving to a compute-powered economy,” said Greg Brockman, President and Co-Founder of OpenAI. “Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems.”

🌐 Analysis

The Jalapeño announcement confirms industry speculation that leading AI model developers are moving beyond software and into custom silicon. OpenAI joins a growing list of hyperscalers and AI platform providers, including Google Cloud (TPU), Amazon Web Services (Trainium and Inferentia), Microsoft Azure (Maia), and Meta (MTIA), that are developing workload-specific AI processors to reduce dependence on merchant accelerators and optimize infrastructure economics.

The partnership also highlights Broadcom’s growing influence in AI infrastructure. Beyond networking silicon, the company has become a major supplier of custom AI ASICs for hyperscalers and large cloud providers. If Jalapeño achieves its stated performance-per-watt objectives, it could strengthen Broadcom’s position as a preferred partner for organizations pursuing vertically integrated AI infrastructure strategies. The emphasis on inference rather than training reflects an industry-wide shift as AI deployments increasingly focus on serving production workloads efficiently at scale.

OpenAI AI Infrastructure Stack & Ecosystem (Updated: June 24, 2026)
Applications	ChatGPT, Codex, API Services, Enterprise AI, Agentic AI Platforms
Foundation Models	GPT family, reasoning models, multimodal systems, coding agents
Serving Software	OpenAI-developed kernels, orchestration software, schedulers, serving infrastructure and inference optimization
Custom AI Silicon	Jalapeño Intelligence Processor — purpose-built ASIC optimized for large-scale LLM inference
ASIC Development Partner	Broadcom — silicon implementation, packaging, custom ASIC development and AI infrastructure roadmap
Networking Fabric	Broadcom Tomahawk Ethernet switching architecture for AI cluster connectivity
System Integration	Celestica — board design, rack integration, manufacturing and deployment engineering
Cloud Partner	Microsoft Azure — strategic cloud and data center deployment platform
Current Training Infrastructure	NVIDIA GPUs remain the primary platform for frontier AI model training and much of current inference deployment
Alternative Accelerator Ecosystem	AMD accelerators deployed through selected hyperscale cloud environments
Potential Network OEM Layer	Arista, Cisco, Juniper and others may participate in broader AI infrastructure deployments but were not identified in this announcement
Potential Server OEM Layer	Dell, HPE, Supermicro and ODM partners may support deployment infrastructure but were not identified in this announcement
Development Speed	Approximately 9 months from initial design through tape-out
Current Validation Status	Engineering samples running GPT-5.3-Codex-Spark at production target frequency and power
Roadmap Scale	10 GW deployment target through 2029
Infrastructure Strategy	Vertical integration across models, software, networking, systems and custom silicon
Industry Significance	OpenAI is evolving from an AI model developer into a full-stack AI infrastructure company with its own silicon roadmap and gigawatt-scale deployment ambitions