Converge Digest

OpenAI and Broadcom Unveil “Jalapeño” Inference Accelerator

OpenAI and Broadcom unveiled “Jalapeño,” a custom AI inference processor designed specifically for large language model (LLM) workloads. The chip marks OpenAI’s first internally architected accelerator and represents a significant step in the company’s strategy to control more of the AI infrastructure stack, spanning models, software, systems, and now silicon. The companies said engineering samples are already running production-class workloads, including GPT-5.3-Codex-Spark, at target frequency and power levels.

OpenAI designed Jalapeño around the inference characteristics of current and future LLMs, focusing on minimizing data movement and balancing compute, memory, and networking resources to improve hardware utilization. Broadcom contributed silicon implementation, manufacturing, and networking technologies, including its Tomahawk Ethernet switching portfolio, while Celestica provided board, rack, and system-level engineering. The companies said early testing indicates performance-per-watt improvements over current state-of-the-art AI accelerators, although detailed benchmark data has not yet been released.

The announcement signals OpenAI’s intention to become a full-stack infrastructure provider rather than relying exclusively on merchant GPUs. The companies said Jalapeño was developed from initial design through tape-out in approximately nine months and will serve as the first member of a multi-generation accelerator roadmap. Deployments are expected to begin by the end of 2026 in large-scale AI data centers, with OpenAI and Broadcom targeting gigawatt-scale infrastructure deployments alongside partners including Microsoft.

Profile: Jalapeño AI Accelerator
Updated: June 24, 2026
Developer: OpenAI with Broadcom and Celestica
Chip Type: Custom AI inference accelerator (ASIC)
Primary Workload: Large language model inference
Architecture Goal: Optimize compute, memory and networking utilization for frontier AI models
Current Status: Engineering samples operational at target frequency and power
Demonstrated Workload: GPT-5.3-Codex-Spark
Networking: Broadcom Tomahawk Ethernet switching technology
Development Cycle: Approximately 9 months from design to tape-out
Performance Claim: Higher performance-per-watt than current state-of-the-art accelerators (early testing)
Deployment Timeline: Initial deployments planned by end of 2026
Scale Target: Gigawatt-scale AI data centers over multiple generations

“The world is moving to a compute-powered economy,” said Greg Brockman, President and Co-Founder of OpenAI. “Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems.”

🌐 Analysis

The Jalapeño announcement confirms industry speculation that leading AI model developers are moving beyond software and into custom silicon. OpenAI joins a growing list of hyperscalers and AI platform providers, including Google Cloud (TPU), Amazon Web Services (Trainium and Inferentia), Microsoft Azure (Maia), and Meta (MTIA), that are developing workload-specific AI processors to reduce dependence on merchant accelerators and optimize infrastructure economics.

The partnership also highlights Broadcom’s growing influence in AI infrastructure. Beyond networking silicon, the company has become a major supplier of custom AI ASICs for hyperscalers and large cloud providers. If Jalapeño achieves its stated performance-per-watt objectives, it could strengthen Broadcom’s position as a preferred partner for organizations pursuing vertically integrated AI infrastructure strategies. The emphasis on inference rather than training reflects an industry-wide shift as AI deployments increasingly focus on serving production workloads efficiently at scale.

OpenAI AI Infrastructure Stack & Ecosystem
Updated: June 24, 2026
Applications: ChatGPT, Codex, API Services, Enterprise AI, Agentic AI Platforms
Foundation Models: GPT family, reasoning models, multimodal systems, coding agents
Serving Software: OpenAI-developed kernels, orchestration software, schedulers, serving infrastructure and inference optimization
Custom AI Silicon: Jalapeño Intelligence Processor, a purpose-built ASIC optimized for large-scale LLM inference
ASIC Development Partner: Broadcom (silicon implementation, packaging, custom ASIC development and AI infrastructure roadmap)
Networking Fabric: Broadcom Tomahawk Ethernet switching architecture for AI cluster connectivity
System Integration: Celestica (board design, rack integration, manufacturing and deployment engineering)
Cloud Partner: Microsoft Azure (strategic cloud and data center deployment platform)
Current Training Infrastructure: NVIDIA GPUs remain the primary platform for frontier AI model training and much of current inference deployment
Alternative Accelerator Ecosystem: AMD accelerators deployed through selected hyperscale cloud environments
Potential Network OEM Layer: Arista, Cisco, Juniper and others may participate in broader AI infrastructure deployments but were not identified in this announcement
Potential Server OEM Layer: Dell, HPE, Supermicro and ODM partners may support deployment infrastructure but were not identified in this announcement
Development Speed: Approximately 9 months from architecture definition to tape-out
Current Validation Status: Engineering samples running GPT-5.3-Codex-Spark at production target frequency and power
Roadmap Scale: 10 GW deployment target through 2029
Infrastructure Strategy: Vertical integration across models, software, networking, systems and custom silicon
Industry Significance: OpenAI is evolving from an AI model developer into a full-stack AI infrastructure company with its own silicon roadmap and gigawatt-scale deployment ambitions
