2025-11-26
11/26/2025, 12:00:00 AM ~ 11/27/2025, 12:00:00 AM (UTC)

Recent Announcements

SageMaker HyperPod now supports Managed Tiered KV Cache and Intelligent Routing

Amazon SageMaker HyperPod now supports Managed Tiered KV Cache and Intelligent Routing for large language model (LLM) inference, enabling customers to optimize inference performance for long-context prompts and multi-turn conversations. Customers deploying production LLM applications need fast response times while processing lengthy documents or maintaining conversation context, but traditional inference approaches recompute attention over all previous tokens for each newly generated token, creating computational overhead and escalating costs....
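
To illustrate the overhead the announcement refers to, the minimal sketch below (not SageMaker HyperPod code; the projection matrices, head dimension, and loop are illustrative assumptions) shows how a KV cache lets each decode step compute keys and values only for the newest token while reusing the cached prefix, instead of recomputing them for the whole context every step.

```python
# Illustrative KV-cache sketch (assumed toy model, not HyperPod's implementation).
# Without a cache, every decode step would recompute K and V for the entire prefix;
# with a cache, only the new token's key/value are computed and appended.
import numpy as np

d = 8  # head dimension (arbitrary for the example)
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

def attend(q, K, V):
    """Single-query scaled dot-product attention over cached keys/values."""
    scores = q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

K_cache = np.empty((0, d))
V_cache = np.empty((0, d))

for step, x in enumerate(rng.standard_normal((5, d))):  # 5 decode steps
    # Only the new token's projections are computed; the cached prefix is reused.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    K_cache = np.vstack([K_cache, k])
    V_cache = np.vstack([V_cache, v])
    out = attend(q, K_cache, V_cache)
    print(f"step {step}: attended over {len(K_cache)} cached tokens")
```

Tiering, as described in the announcement, would extend this idea by keeping hot cache entries in fast memory and spilling colder ones to cheaper storage, so long prompts and multi-turn conversations avoid the per-token recomputation cost without exhausting accelerator memory.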