Converge Digest
Monday, June 22, 2026

Baseten Raises $1.5 Billion to Expand AI Inference Infrastructure

June 22, 2026
in AI Infrastructure

Baseten raised $1.5 billion in Series F financing as demand accelerates for AI inference infrastructure that supports custom and post-trained models in production environments. The funding round was led by Altimeter Capital, Conviction, and Spark Capital, with participation from Sands Capital, Wellington Management, IVP, Greylock, Battery Ventures, D. E. Shaw Ventures, Blackbird, Durable Capital Partners, Verified Capital, 01A, and existing investors. The financing was completed in two tranches at valuations of $11 billion and $13 billion.

The company said enterprises increasingly deploy specialized models trained on proprietary data rather than relying exclusively on frontier foundation models. Baseten reported approximately 20x year-over-year revenue growth and said its platform now handles more than one billion inference calls per day. The company operates 87 clusters across 18 cloud environments, providing a multi-cloud architecture designed to support large-scale AI application deployments.
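For a sense of scale, the reported daily volume can be converted into average rates. The sketch below is illustrative arithmetic only: it uses the lower bound of one billion calls per day and assumes traffic is spread evenly over time and across all 87 clusters, which the company has not stated. The function name `average_rates` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rates(calls_per_day: int, clusters: int) -> dict[str, float]:
    """Convert a daily call count into average per-second rates.

    Assumes perfectly uniform traffic over the day and across clusters,
    an illustrative simplification rather than a reported figure.
    """
    per_second = calls_per_day / SECONDS_PER_DAY
    return {
        "calls_per_second": per_second,
        "calls_per_second_per_cluster": per_second / clusters,
    }


# Figures reported in the article: more than 1 billion calls/day, 87 clusters.
rates = average_rates(calls_per_day=1_000_000_000, clusters=87)
print(f"{rates['calls_per_second']:,.0f} calls/s overall")
print(f"{rates['calls_per_second_per_cluster']:,.0f} calls/s per cluster")
```

Under those assumptions the average works out to roughly 11,600 calls per second overall, or about 133 per cluster. Real inference traffic is bursty and unevenly distributed, so peak load on any one cluster is likely to be considerably higher.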

Founded in 2019 and headquartered in San Francisco, Baseten provides infrastructure software that manages AI workloads, including GPU orchestration, autoscaling, observability, billing, and developer tooling. The company said it will use the new capital to expand engineering, research, operations, and go-to-market teams while increasing investments in compute capacity. Baseten plans to triple headcount during 2026 as it scales to meet growing enterprise demand.

• Raised $1.5 billion in Series F financing

• Funding completed at valuations of $11 billion and $13 billion

• Reports approximately 20x year-over-year revenue growth

• Processes more than 1 billion inference calls daily

• Operates 87 AI clusters globally

• Supports deployments across 18 cloud environments

• Plans to triple employee headcount during 2026

• Has raised more than $2 billion since founding

• Customers include Abridge, Clay, Cursor, Lovable, Mercor, and OpenEvidence

“The future of AI will be built on millions of specialized models, and the companies building the best ones know that post-training has become existential. It’s how they build intelligence they own, on data that’s theirs, optimized for the customers they serve,” said Tuhin Srivastava, CEO and Co-Founder of Baseten.

🌐 Analysis

Baseten’s financing highlights the rapid emergence of inference as one of the fastest-growing segments of the AI infrastructure market. While much of the industry’s attention has focused on training ever-larger foundation models, enterprises are increasingly investing in the infrastructure required to serve AI applications in production. That shift has fueled growth for a new generation of inference-focused providers that optimize deployment, scaling, observability, and cost efficiency across distributed GPU environments.

The announcement also reflects the growing importance of post-training and model customization. Enterprises increasingly seek AI systems trained on proprietary datasets and optimized for specific workflows, creating demand for infrastructure platforms capable of managing large fleets of specialized models. As open-source models continue to improve, inference platforms may become a key layer in the AI stack alongside GPUs, networking, storage, and cloud infrastructure.

Profile: Baseten
AI inference infrastructure platform • Updated June 2026
Headquarters: San Francisco, California
Founded: 2019
Leadership: Tuhin Srivastava, CEO & Co-Founder
Latest Funding: $1.5 billion Series F
Valuation: $11B–$13B across two funding tranches
Capital Raised: More than $2 billion since founding
Business Focus: Software infrastructure for deploying, scaling, and operating production AI inference workloads rather than manufacturing GPUs, servers, or network hardware.
Infrastructure Model: Multi-cloud inference platform that handles containerization, GPU scheduling across clouds, autoscaling, observability, and engine-level optimization.
Scale: 87 clusters across 18 cloud environments
Inference Volume: More than 1 billion inference calls per day
Developer Framework: Truss, Baseten’s open-source framework for packaging models into deployable containers, including config-based deployment and custom Python model logic.
Serving Stack: Production API endpoints with autoscaling, observability, optimized serving infrastructure, versioning, deployment workflows, and OpenAI-compatible APIs for supported models.
Inference Engines: Baseten says its deployments use inference engines tuned for model architecture, with optimization across quantization, tensor parallelism, KV-cache management, and batching.
NVIDIA Stack: NVIDIA case-study material identifies Baseten’s use of NVIDIA GPUs and TensorRT-LLM for cloud-based generative AI and LLM inference.
Optimization Features: Baseten describes its inference stack as combining infrastructure and runtime optimizations, including custom kernels, speculation, optional quantization, KV-cache optimization, topology-aware parallelism, request prioritization, continuous batching, geo-aware routing, LoRA-aware routing, and fast cold starts.
Deployment Modes: Baseten describes support for deployment in Baseten Cloud, customer environments, or hybrid configurations.
Confirmed Cloud Partners: Public materials describe Baseten deployments or partnerships involving Google Cloud, Nebius, Vultr, and AWS GPU instances referenced in NVIDIA’s Baseten case study.
Workload Types: LLMs, embeddings, image generation, text-to-speech, video generation, custom models, post-trained models, and multi-step AI workflows.
AI Architecture Role: Baseten sits above raw GPU infrastructure as an application-facing inference operations layer: model packaging, serving, routing, scaling, monitoring, and cost/performance optimization.
Networking Relevance: Inference workloads emphasize latency, API delivery, multi-region routing, load balancing, service reliability, and cloud capacity management rather than the tightly coupled east-west GPU traffic patterns associated with large training clusters.
Customers Named: Abridge, Clay, Cursor, Lovable, Mercor, OpenEvidence
Growth Metric: Approximately 20× year-over-year revenue growth reported in 2026
Tags: Baseten, Inference

Jim Carroll

Editor and Publisher, Converge! Network Digest, Optical Networks Daily - Covering the full stack of network convergence from Silicon Valley

© 2026 Converge Digest - A private dossier for networking and telecoms.
