At Cisco’s AI Summit, Google detailed how closer collaboration among infrastructure, silicon, and frontier model teams, along with faster paths from lab to scale, now shapes its AI platform strategy.
Amin Vahdat, VP and GM of Systems and Infrastructure at Google, joined Cisco’s Jeetu Patel to explain how Google’s vertically integrated stack—spanning accelerators, systems, networking, power, and data centers—co-evolves with its most demanding AI workloads.
Vahdat described Google’s core advantage as organizational co-design rather than any single technology. Infrastructure and silicon teams work directly with Google’s frontier model builders, aligning TPUs, systems architecture, and data center design with the evolving needs of large-scale training and inference. That tight loop, informed by production workloads such as Search and YouTube, allows Google to plan across multi-year horizons while continuously stress-testing assumptions about where models and agents are headed.
A central constraint, Vahdat argued, is time: today’s two-to-three-year cycle from hardware concept to fleet-wide deployment limits how far specialization can go. Compressing that cycle, even incrementally, would enable more workload-specific accelerators and step-function improvements in power efficiency and cost. Google continues to deploy TPUs internally and through Google Cloud Platform (GCP), and it also offers GPUs through deep partnerships, including with NVIDIA. Which silicon it uses for a given workload depends on customer requirements rather than ideology.
- Google emphasizes co-design across models, software, silicon, networking, power, and facilities
- Infrastructure and silicon teams work closely with frontier model builders to guide design choices
- TPUs remain central to Google’s strategy, alongside broad GPU availability on GCP
- Customer workloads determine accelerator choice, not a single preferred architecture
- Hardware lead times of 2–3 years constrain specialization and efficiency gains
- Shorter lab-to-scale cycles would unlock more targeted accelerators and higher power efficiency
- Data center scaling now forces changes in deployment, maintenance, and automation
- Efficiency improvements continue, but expanding AI capabilities rapidly absorb those gains
- Competition among leading models—including Google’s Gemini and offerings from OpenAI—accelerates progress across the ecosystem
“We have the opportunity to write the book on how this revolution gets supported from a technical perspective,” Vahdat said, pointing to infrastructure velocity as the critical factor that will determine how fast AI capabilities reach production at scale.







