At the Cisco AI Summit, Matt Garman, CEO of Amazon Web Services, joined Jeetu Patel to discuss why many enterprises stall between AI pilots and production—and what that means for infrastructure at scale. Garman said the most common gap is the absence of clear success metrics at the start of AI projects, which leaves organizations unable to decide which proofs of concept deserve global deployment.
Garman said production AI forces infrastructure decisions well beyond model selection. As inference becomes embedded in every application, he said, AWS is integrating AI natively across storage, networking, VPCs, identity, and security controls. He described this as a shift toward treating inference as a first-class infrastructure primitive rather than a standalone workload. That approach, he said, drives demand for predictable scaling, secure agent identity, and tighter integration across compute, network, and data planes, all areas where early pilots often break down.
On silicon and economics, Garman said AWS’s long-term investment in custom chips has become central to both performance and cost control. While NVIDIA GPUs remain critical, he said AWS silicon allows the company to offer customers better price-performance and architectural choice, especially for inference. He emphasized that infrastructure constraints today are as much physical as financial, citing power availability, construction timelines, and permitting as limiting factors. He also dismissed the near-term feasibility of space-based data centers, citing launch economics and logistics.
- AWS sees inference becoming embedded in every application, not a separate AI tier
- Scaling AI requires tight integration across compute, networking, storage, identity, and security
- Custom AWS silicon improves price-performance versus GPU-only architectures, particularly for inference
- Network and VPC integration is critical as agents operate across distributed systems
- Data center growth is constrained by power, permitting, and construction timelines, not capital alone
- Space-based data centers remain economically impractical due to launch and logistics costs
“When you have guardrails—security, identity, and operational controls—teams can move fast in production,” Garman said. “Our job is to give customers safe places to run fast, all the way from silicon up through the application.”