Converge Digest

Cerebras Posts Q1 Revenue, Expands OpenAI and AWS Partnerships

Cerebras reported Q1 2026 core revenue of $191.3 million, up 92% year-over-year, while GAAP revenue reached $193.4 million, an increase of 94% compared to the same period last year. The company said growth was driven by rising demand for its wafer-scale AI systems and cloud-based inference services, which are increasingly being deployed for large language model inference workloads.

A major highlight was the announcement of a multi-year OpenAI agreement valued at more than $20 billion, under which OpenAI plans to deploy 750 MW of Cerebras inference compute capacity over the coming years. The companies also introduced Codex-Spark, a low-latency coding model optimized for interactive AI applications and capable of delivering more than 1,000 tokens per second. Cerebras also launched a strategic partnership with Amazon to bring its inference technology to AWS customers through a disaggregated architecture that combines AWS Trainium 3 processors for prefill operations with Cerebras CS-3 systems for high-speed decoding.

The company continued expanding its model ecosystem, launching enterprise trials for Kimi K2.6, the first trillion-parameter model served on Cerebras infrastructure, and Gemma 4 31B, part of Google’s latest open-weight model family. Cerebras said Kimi K2.6 achieved performance approaching 1,000 tokens per second, while Gemma 4 runs significantly faster than competing implementations according to independent benchmarks from Artificial Analysis.

Financially, Cerebras reported a GAAP operating loss of $15.0 million, an improvement from a $28.5 million loss a year earlier. On a non-GAAP basis, the company reduced its operating loss to $3.5 million. Cloud and services revenue grew 178% year-over-year to $82.8 million, highlighting the increasing importance of AI inference services alongside hardware sales. Cash, cash equivalents, restricted cash, and short-term investments totaled $3.3 billion at quarter end.

The company also completed a historic financing cycle. Following a $1 billion Series H financing and a $1 billion working-capital loan from OpenAI, Cerebras raised $6.4 billion through its IPO, which it described as the largest semiconductor IPO on record. In April, it added an $850 million revolving credit facility to support continued expansion of AI data center capacity.

For Q2 2026, Cerebras expects approximately $194 million in core revenue, representing about 88% year-over-year growth. For the full fiscal year, the company projects core revenue between $855 million and $865 million, up approximately 69% year-over-year at the midpoint.
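
As a back-of-the-envelope check on the outlook figures above, the sketch below computes the full-year guidance midpoint and the prior-year baselines implied by the quoted growth rates. The helper name `implied_prior` is illustrative, and the implied baselines are approximations derived from rounded percentages, not reported figures.

```python
def implied_prior(current: float, yoy_growth_pct: float) -> float:
    """Back out the prior-period value implied by a year-over-year growth rate."""
    return current / (1 + yoy_growth_pct / 100)

# Outlook figures from the article (USD millions); growth rates are the rounded ones quoted.
fy_low, fy_high = 855.0, 865.0
fy_midpoint = (fy_low + fy_high) / 2

q2_guidance = 194.0
q2_prior = implied_prior(q2_guidance, 88)   # assumption: "about 88%" taken as exactly 88
fy_prior = implied_prior(fy_midpoint, 69)   # assumption: "approximately 69%" taken as exactly 69

print(f"FY2026 guidance midpoint: ${fy_midpoint:.0f}M")
print(f"Implied Q2 2025 core revenue: ~${q2_prior:.0f}M")
print(f"Implied FY2025 core revenue: ~${fy_prior:.0f}M")
```

Because the percentages are rounded, the implied baselines should be read as rough estimates only.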

Analysis

🌐 Cerebras is increasingly positioning itself not simply as a semiconductor company but as a vertically integrated AI infrastructure provider. The OpenAI agreement is notable not only for its reported value of more than $20 billion but also for the scale of the planned deployment. A commitment for 750 MW of inference capacity places the discussion squarely in the realm of hyperscale AI infrastructure, comparable to the power footprints being discussed for next-generation AI campuses.

🌐 The AWS partnership is strategically significant because it embraces a disaggregated AI architecture rather than requiring customers to choose a single accelerator platform. By combining Trainium 3 for prefill and Cerebras CS-3 systems for decode, the companies are targeting one of the fastest-growing segments of AI infrastructure: inference optimization. This reflects a broader industry trend toward specialized architectures for different stages of AI workloads.
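
The prefill/decode split described above can be illustrated with a toy sketch. Everything in it is hypothetical: the `Pool` class, the pool names, and the placeholder token logic are assumptions for illustration, not AWS or Cerebras APIs. It shows only the shape of the idea: prefill handles the whole prompt in one pass and hands off cached state, while decode generates tokens one at a time from that state.

```python
from dataclasses import dataclass, field

@dataclass
class Pool:
    """A hypothetical accelerator pool that handles one inference stage."""
    name: str
    handled: list = field(default_factory=list)

def prefill(pool: Pool, prompt_tokens: list[str]) -> dict:
    """Process the full prompt in a single pass and return stand-in cache state."""
    pool.handled.append(("prefill", len(prompt_tokens)))
    return {"kv_cache": list(prompt_tokens)}  # placeholder for a real KV cache

def decode(pool: Pool, state: dict, max_new_tokens: int) -> list[str]:
    """Generate tokens sequentially; each step extends the shared cache."""
    out = []
    for step in range(max_new_tokens):
        token = f"tok{step}"             # placeholder for a real model step
        state["kv_cache"].append(token)  # cache grows with every generated token
        out.append(token)
    pool.handled.append(("decode", max_new_tokens))
    return out

# Two separate pools, one per stage (names are illustrative only).
prefill_pool = Pool("prefill-pool")
decode_pool = Pool("decode-pool")

state = prefill(prefill_pool, ["Explain", "wafer", "scale"])
tokens = decode(decode_pool, state, max_new_tokens=4)
print(tokens, len(state["kv_cache"]))
```

In a real deployment the handoff between stages would involve moving cache state between machines; the sketch skips that and keeps the state in a single dictionary.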

🌐 The revenue mix is also shifting. Hardware revenue grew 59% year-over-year, while cloud and services revenue expanded 178%, suggesting that recurring inference services are becoming a larger contributor to the business. For investors and infrastructure operators, this transition may prove as important as Cerebras’ wafer-scale processor technology itself.

Key Q1 2026 Metrics

• GAAP Revenue: $193.4 million (+94% YoY)
• Core Revenue: $191.3 million (+92% YoY)
• Hardware Revenue: $110.6 million (+59% YoY)
• Cloud & Services Revenue: $82.8 million (+178% YoY)
• GAAP Gross Margin: 45%
• Core Gross Margin: 47%
• GAAP Operating Loss: $15.0 million
• Core Operating Loss: $3.5 million
• Cash & Investments: $3.3 billion
• OpenAI Agreement: 750 MW deployment; value exceeds $20 billion
• IPO Proceeds: $6.4 billion
• FY2026 Revenue Outlook: $855–865 million
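
The segment figures in the list above can be cross-checked with a few lines of arithmetic. The sketch below is a rough illustration: it confirms that the two segments sum to GAAP revenue, computes the cloud and services share, and estimates the share a year earlier from the quoted growth rates. The prior-year values are implied approximations based on rounded percentages, not reported numbers.

```python
# Q1 2026 figures from the metrics list (USD millions).
hardware, cloud_services = 110.6, 82.8
gaap_revenue = 193.4

# The two segments should sum to GAAP revenue, allowing for rounding.
segment_total = hardware + cloud_services
assert abs(segment_total - gaap_revenue) < 0.1

cloud_share = cloud_services / gaap_revenue

# Prior-year segment values implied by the quoted growth rates (+59%, +178%).
hardware_prior = hardware / 1.59
cloud_prior = cloud_services / 2.78
cloud_share_prior = cloud_prior / (hardware_prior + cloud_prior)

print(f"Cloud & services share, Q1 2026: {cloud_share:.1%}")
print(f"Implied share a year earlier: ~{cloud_share_prior:.1%}")
```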

Addendum: Key Points from the Quarterly Conference Call

• Cerebras framed fast inference as a core AI infrastructure requirement, arguing that latency now directly affects AI productivity, user adoption, agent performance, and the viability of interactive frontier-model applications.

• Management emphasized that inference demand is shifting from simple token generation toward workloads involving text, images, video, agents, robotics, and other modalities that require faster response times and larger compute footprints.

• Cerebras said speed also affects AI safety because guardrails add compute overhead. The company argued that faster inference can support safety layers without creating unacceptable user-experience delays.

• Wafer-scale architecture was presented as a structural infrastructure advantage because it reduces chip-to-chip communication, which management described as a fundamental limitation in large terrestrial data centers and a major unresolved issue for future space-based data centers.

• Cerebras highlighted SRAM-based memory as a supply-chain differentiator, noting that it avoids dependence on HBM, which management characterized as expensive, constrained, and one of the key bottlenecks in the accelerator market.

• The company said it also avoids other major AI accelerator supply constraints, including TSMC CoWoS packaging and 3nm capacity, because its current systems use 5nm manufacturing and do not require CoWoS.

• Data center capacity, rather than wafer supply or demand, was identified as the primary constraint on growth. Management said it is actively pursuing capacity across North America, Europe, the Middle East, and Asia-Pacific.

• Cerebras said it is adding data centers across the U.S., Canada, France, and the Nordics, and is in early discussions for sites in Israel, the UAE, Australia, Singapore, India, and Indonesia.

• The company described AI infrastructure expansion as a “dog fight” for data center capacity, reinforcing that power availability and site access are becoming central competitive factors in AI compute.

• Management said future growth will be increasingly tied to cloud capacity deployment rather than only hardware shipments, as more systems are directed into Cerebras Cloud to support large contracted demand.

• Cerebras expects hardware revenue to decline sequentially over the next few quarters as existing purchase orders are fulfilled and more production is allocated to cloud deployments.

• The company expects OpenAI-related cloud revenue to be back-end loaded in 2026 as new data center capacity comes online.

• Cerebras said AWS impact is expected mainly in 2027, indicating that the AWS partnership is still in the technical collaboration and deployment-preparation phase.

• Management positioned disaggregated inference as a major emerging architecture, separating prefill and decode across different processors to optimize each stage of inference.

• Cerebras said decode is a particularly strong fit for its architecture because it is a sequential workload, a profile that management said GPU architectures struggle with.

• The company suggested that decode acceleration could become a broader partnership opportunity beyond AWS, including with companies that already operate large GPU fleets.

• Management said disaggregated inference is especially attractive for hyperscalers because they can route workloads based on traffic shape, improving utilization and avoiding stranded compute.

• Cerebras indicated that pricing for fast inference remains premium because demand exceeds supply and customers value faster tokens for productivity-sensitive use cases.

• Management said higher market pricing for inference has supported cloud gross margin, while broader accelerator-market cost pressures, including HBM costs, have lifted the pricing floor.

• Cerebras said it is temporarily renting back systems from an existing customer to bring capacity online faster while it builds out its own data center footprint.

• The company expects this rental strategy to pressure cloud services margins temporarily, but said the costs are already reflected in its Q2 and full-year outlook.

• Cerebras said its long-term margin model depends on scale economies, higher system utilization, manufacturing efficiency, product throughput improvements, and performance-based pricing.

• R&D remains the company’s largest operating investment area, with management emphasizing work across silicon, systems, software, models, cloud infrastructure, and disaggregated inference.

• Management said future product work includes disaggregated inference solutions with multiple hardware partners, expected to begin delivery in the second half of 2026.

• Cerebras said its work with frontier model providers gives it visibility into model direction and creates a feedback loop for infrastructure design.

• The company described its OpenAI collaboration as a technology-validation milestone, saying frontier-model deployment demonstrates that its architecture can support large-scale models, not just small or medium-sized ones.

• Cerebras pushed back against the idea that its architecture is limited to smaller models, saying it can support large models, large KV caches, and frontier-model workloads.

• Management said there are currently only two hardware vendors serving OpenAI models, positioning Cerebras as one of a very small group of infrastructure platforms validated at that level.

Strategic AI Infrastructure Highlights
OpenAI, AWS, model trials, and capital expansion
Updated: June 23, 2026
• OpenAI Agreement: Multi-year agreement valued at more than $20 billion
• Deployment Scale: 750 MW of Cerebras high-speed inference compute planned for OpenAI
• AWS Partnership: Multi-year partnership to bring Cerebras fast inference to AWS customers globally
• Disaggregated Inference: AWS Trainium 3 for prefill; Cerebras CS-3 for high-speed decode
• Codex-Spark: Near-instant coding model optimized for latency-sensitive interactive work; more than 1,000 tokens per second
• Model Trials: Enterprise trials of Kimi K2.6 and Gemma 4 31B
• IPO: $6.4 billion in gross proceeds raised in Q2 IPO
• Credit Facility: Up to $850 million revolving credit facility to support data center acquisition strategy