Converge Digest

VAST Data Reworks Inference Architecture for Agentic AI with NVIDIA

January 8, 2026

VAST Data has redesigned its AI inference architecture to support long-lived, agentic workloads by integrating its AI Operating System directly with NVIDIA’s data-center networking stack. The company announced that VAST AI OS now runs natively on NVIDIA BlueField-4 DPUs as part of the NVIDIA Inference Context Memory Storage Platform. The architecture targets large-scale inference environments where models operate across long sessions, multiple turns, and multiple agents, shifting the performance focus from raw GPU compute to how efficiently inference context is stored, shared, and reused.

As inference evolves beyond stateless prompts, VAST argues that keeping context local to GPUs no longer scales. The updated design embeds storage and data services directly inside GPU servers as well as dedicated data nodes, removing traditional client-server contention and reducing data copies that increase time-to-first-token as concurrency rises. Using VAST’s Disaggregated Shared-Everything (DASE) architecture with NVIDIA Spectrum-X Ethernet, the system exposes a shared, globally coherent key-value cache across nodes with deterministic access characteristics.
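The idea of reusing shared context instead of recomputing it can be illustrated with a toy prefix cache. This is only a minimal sketch under stated assumptions: the class and function names (SharedContextCache, block_keys), the block size, and the in-memory dict are all hypothetical stand-ins, not VAST or NVIDIA APIs. It shows how a node could find the longest already-cached prefix of a conversation so that only the new tokens need prefill.

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK = 4  # tokens per cache block (illustrative value only)


def block_keys(tokens: List[int]) -> List[str]:
    """Return one chained hash per full block, so each key identifies a whole prefix."""
    keys: List[str] = []
    h = hashlib.sha256()
    full = len(tokens) - len(tokens) % BLOCK
    for i in range(0, full, BLOCK):
        h.update(repr(tokens[i:i + BLOCK]).encode())
        keys.append(h.copy().hexdigest())
    return keys


class SharedContextCache:
    """Hypothetical stand-in for a cache visible to every node (a dict in this sketch)."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def put(self, tokens: List[int], kv_blocks: List[bytes]) -> None:
        """Store one opaque KV blob per full block of the token sequence."""
        for key, blob in zip(block_keys(tokens), kv_blocks):
            self._store.setdefault(key, blob)

    def longest_prefix(self, tokens: List[int]) -> Tuple[int, List[bytes]]:
        """Return (tokens covered, cached blobs) for the longest cached prefix."""
        hits: List[bytes] = []
        for key in block_keys(tokens):
            if key not in self._store:
                break
            hits.append(self._store[key])
        return len(hits) * BLOCK, hits


# Turn 1 is computed once and published; turn 2 extends the same conversation.
cache = SharedContextCache()
turn1 = [1, 2, 3, 4, 5, 6, 7, 8]
cache.put(turn1, [b"kv0", b"kv1"])

turn2 = turn1 + [9, 10, 11, 12, 13]
covered, blobs = cache.longest_prefix(turn2)
to_prefill = len(turn2) - covered  # only the uncached tail needs fresh compute
```

In this sketch the second turn reuses the two blocks written by the first, so any node with access to the cache would only prefill the remaining tokens. A real system would replace the dict with networked storage and would need to handle eviction and consistency, which are out of scope here.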

VAST positions the platform as a foundation for production inference as AI services move into regulated and revenue-generating deployments. By treating inference context as shared infrastructure, the AI OS adds policy controls, isolation, auditability, and lifecycle management while maintaining high-speed access to the KV cache. The company says this approach reduces idle GPU time and improves infrastructure efficiency as context sizes and concurrent sessions increase.
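To make the governance terms concrete, the following sketch layers three of them onto a toy cache: isolation (keys are namespaced per tenant), lifecycle management (entries expire after a time-to-live), and auditability (every access is logged). The GovernedCache class, its fields, and the log format are hypothetical and chosen for illustration; the article does not describe how VAST implements these controls.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    blob: bytes
    expires_at: float


@dataclass
class GovernedCache:
    """Hypothetical cache wrapper with tenant isolation, TTL expiry, and an audit log."""

    ttl_seconds: float = 3600.0
    store: Dict[Tuple[str, str], Entry] = field(default_factory=dict)
    audit_log: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def put(self, tenant: str, key: str, blob: bytes,
            now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        # Isolation: the tenant ID is part of the storage key.
        self.store[(tenant, key)] = Entry(blob, now + self.ttl_seconds)
        self.audit_log.append((tenant, "put", key, "stored"))

    def get(self, tenant: str, key: str,
            now: Optional[float] = None) -> Optional[bytes]:
        now = time.time() if now is None else now
        entry = self.store.get((tenant, key))
        # Lifecycle: an expired entry is dropped and treated as a miss.
        if entry is not None and entry.expires_at <= now:
            del self.store[(tenant, key)]
            entry = None
        # Auditability: record every access and its outcome.
        result = "hit" if entry is not None else "miss"
        self.audit_log.append((tenant, "get", key, result))
        return entry.blob if entry is not None else None


governed = GovernedCache(ttl_seconds=10)
governed.put("tenant-a", "session-1", b"kv", now=0)
own_read = governed.get("tenant-a", "session-1", now=5)      # same tenant, within TTL
other_read = governed.get("tenant-b", "session-1", now=5)    # different tenant
expired_read = governed.get("tenant-a", "session-1", now=10) # TTL has elapsed
```

Here only the owning tenant's read inside the TTL window returns data; the cross-tenant read and the post-expiry read both return None, and all four operations appear in audit_log. The explicit now parameter exists only to make the example deterministic.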

  • Runs VAST AI Operating System natively on NVIDIA BlueField-4 DPUs
  • Collapses traditional storage tiers into a shared, pod-scale KV cache
  • Enables direct GPU-to-NVMe access over RDMA Ethernet fabrics
  • Targets predictable latency for long-context, multi-turn, and multi-agent inference
  • Adds policy, security, and lifecycle controls for production environments

“Inference is becoming a memory system, not a compute job,” said John Mao, Vice President of Global Technology Alliances at VAST Data. “If context isn’t available on demand, GPUs idle and economics collapse. With the VAST AI Operating System on NVIDIA BlueField-4, we’re turning context into shared infrastructure built to stay predictable as agentic AI scales.”

🌐  Analysis

The announcement highlights a broader shift in AI infrastructure, where memory systems and data movement increasingly define inference performance. It also aligns with NVIDIA’s strategy of extending its platform beyond GPUs into DPUs and Ethernet fabrics that support scalable, multi-tenant AI factories, where efficient context sharing becomes central to system design.

VAST Data is a privately held, remote-first data infrastructure company. Its corporate headquarters is listed in New York City, and it maintains a significant engineering presence in Israel along with distributed teams across North America and Europe. The company was founded in 2016 by Renen Hallak (CEO), Jeff Denworth (President), and Shachar Fienblit (CTO). The founders came from the enterprise storage industry: Hallak previously led engineering at all-flash array vendor XtremIO, and Fienblit was CTO of all-flash storage vendor Kaminario. VAST does not publicly disclose headcount, but industry estimates commonly cited as of 2025 place it between 400 and 700 staff. The company has raised over $380 million in private funding from investors including Tiger Global, Norwest Venture Partners, Goldman Sachs, and Next47, reached unicorn valuation in later rounds, and remains independent as it focuses on data infrastructure for AI-scale training and inference.

Tags: Nvidia
Jim Carroll

Editor and Publisher, Converge! Network Digest, Optical Networks Daily - Covering the full stack of network convergence from Silicon Valley
