Arrcus reported 3x bookings growth in 2025 and introduced a new AI-policy aware network fabric designed to accelerate distributed inference workloads across data centers and edge environments. The San Jose-based networking software provider said growth spanned data center, telco, and enterprise deployments, with ArcOS and its ACE platform running across thousands of production nodes worldwide.
The company unveiled its Arrcus Inference Network Fabric (AINF), a distributed routing fabric engineered to improve AI inference performance by steering traffic between inference nodes, caches, and data centers. AINF targets higher Tokens per Second (TPS), lower Time to First Token (TTFT), and reduced End-to-End Latency (E2EL) for real-time and agentic AI applications. Arrcus said research points to potential gains of more than a 60% reduction in TTFT, a 15% improvement in TPS, a 40% reduction in E2EL, and up to a 30% reduction in cost per inference.
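For readers unfamiliar with the three metrics, the sketch below shows one common way to derive them from per-token timestamps of a single streamed response. It is illustrative only: the `RequestTrace` type and the function names are hypothetical, and TPS is assumed here to mean output tokens divided by end-to-end latency, which may differ from how Arrcus measures it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Hypothetical record of one streamed inference request."""
    sent_at: float            # time the request was sent, in seconds
    token_times: List[float]  # arrival time of each output token, in seconds


def ttft(trace: RequestTrace) -> float:
    # Time to First Token: delay until the first output token arrives.
    return trace.token_times[0] - trace.sent_at


def e2el(trace: RequestTrace) -> float:
    # End-to-End Latency: delay until the last output token arrives.
    return trace.token_times[-1] - trace.sent_at


def tps(trace: RequestTrace) -> float:
    # Tokens per Second (assumed definition): output tokens over E2EL.
    return len(trace.token_times) / e2el(trace)


# Made-up example: four tokens arriving every half second.
trace = RequestTrace(sent_at=0.0, token_times=[0.5, 1.0, 1.5, 2.0])
print(ttft(trace), e2el(trace), tps(trace))
```

Under these definitions, a network that shortens the path to the first responding node mainly affects TTFT, while steady-state throughput shows up in TPS.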
AINF introduces a policy abstraction layer that translates application intent—such as latency thresholds, data sovereignty requirements, model selection, and power constraints—into dynamic routing decisions. The platform integrates with inference frameworks including vLLM, SGLang, and Triton, and supports Kubernetes-based orchestration. Arrcus positions AINF as complementary to its existing ACE-AI fabric, enabling distributed AI across data centers, edge, and hybrid cloud environments while remaining hardware-agnostic across xPUs and Ethernet silicon.
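The idea of turning application intent into a routing decision can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not AINF's implementation or API: the `Intent` and `Node` types, their fields, and `select_node` are all hypothetical names invented here. It filters candidate inference nodes by sovereignty, model, latency, and power constraints, then picks the lowest-latency survivor.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Intent:
    """Hypothetical application intent; not an AINF data structure."""
    max_latency_ms: float
    allowed_regions: FrozenSet[str]  # data sovereignty constraint
    model: str                       # required model
    max_power_w: float               # power budget per node


@dataclass(frozen=True)
class Node:
    """Hypothetical inference node as seen by the routing layer."""
    name: str
    region: str
    models: FrozenSet[str]
    est_latency_ms: float
    power_w: float


def select_node(intent: Intent, nodes: List[Node]) -> Optional[Node]:
    # Keep only nodes that satisfy every constraint in the intent.
    eligible = [
        n for n in nodes
        if n.region in intent.allowed_regions
        and intent.model in n.models
        and n.est_latency_ms <= intent.max_latency_ms
        and n.power_w <= intent.max_power_w
    ]
    # Among eligible nodes, prefer the lowest estimated latency.
    return min(eligible, key=lambda n: n.est_latency_ms, default=None)


# Made-up topology for illustration.
nodes = [
    Node("edge-fra", "eu", frozenset({"llm-small"}), 20.0, 300.0),
    Node("dc-ams", "eu", frozenset({"llm-small", "llm-large"}), 45.0, 700.0),
    Node("dc-sjc", "us", frozenset({"llm-large"}), 15.0, 700.0),
]
intent = Intent(50.0, frozenset({"eu"}), "llm-large", 800.0)
chosen = select_node(intent, nodes)
print(chosen.name if chosen else "no eligible node")
```

In this made-up topology the fastest node sits in the US, but the EU-only sovereignty constraint excludes it, so the request goes to the Amsterdam data center. A production fabric would presumably re-evaluate such decisions continuously as latency and load change.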
- 3x bookings growth in 2025 across data center, telco, and enterprise markets
- Arrcus Inference Network Fabric (AINF) for distributed AI inference
- Policy-aware routing based on latency, sovereignty, model, and power parameters
- Claimed improvements:
  - More than 60% reduction in TTFT
  - 15% improvement in TPS
  - 40% reduction in E2EL
  - Up to 30% lower cost per inference
- Integration with vLLM, SGLang, Triton, and Kubernetes orchestration
- Designed to operate across multi-vendor networking silicon and open hardware platforms
“To enhance agentic AI adoption by improving response times, networks need to become AI-aware,” said Shekar Ayyar, Chairman and CEO of Arrcus. “AINF extends Arrcus’ leadership in distributed networking by delivering the first fabric designed to meet the latency, sovereignty, and power constraints of large-scale AI inferencing.”
🌐 Analysis: As AI infrastructure shifts from centralized training clusters to geographically distributed inference, the network layer becomes a performance and economic control point. Arrcus’ focus on policy-driven traffic steering aligns with broader industry moves toward workload-aware fabrics, particularly as operators contend with power limits, regional data residency rules, and multi-model deployments.
The announcement also reflects growing emphasis on inference optimization across the stack. While hyperscalers invest in custom silicon and scale-out fabrics, software-defined routing layers such as AINF aim to extract incremental efficiency from existing infrastructure. If enterprises and service providers pursue inference-as-a-service models at scale, intelligent traffic steering tied to SLOs could become a differentiator in AI network architecture.






