11/26/2025, 12:00:00 AM ~ 11/27/2025, 12:00:00 AM (UTC)
Recent Announcements
SageMaker HyperPod now supports Managed tiered KV cache and intelligent routing
Amazon SageMaker HyperPod now supports Managed Tiered KV Cache and Intelligent Routing for large language model (LLM) inference, enabling customers to optimize inference performance for long-context prompts and multi-turn conversations. Customers deploying production LLM applications need fast response times while processing lengthy documents or maintaining conversation context, but traditional inference approaches require recalculating attention mechanisms for all previous tokens with each new token generation, creating computational overhead and escalating costs. Managed Tiered KV Cache addresses this challenge by intelligently caching and reusing computed values, while Intelligent Routing directs requests to optimal instances.

These capabilities deliver up to 40% latency reduction, 25% throughput improvement, and 25% cost savings compared to baseline configurations. The Managed Tiered KV Cache feature uses a two-tier architecture combining local CPU memory (L1) with disaggregated cluster-wide storage (L2). AWS-native disaggregated tiered storage is the recommended backend, providing scalable terabyte-scale capacity and automatic tiering from CPU memory to local SSD for optimal memory and storage utilization. We also offer Redis as an alternative L2 cache option. The architecture enables efficient reuse of previously computed key-value pairs across requests. The newly introduced Intelligent Routing maximizes cache utilization through three configurable strategies: prefix-aware routing for common prompt patterns, KV-aware routing for maximum cache efficiency with real-time cache tracking, and round-robin for stateless workloads. These features work seamlessly together. Intelligent routing directs requests to instances with relevant cached data, reducing time to first token in document analysis and maintaining natural conversation flow in multi-turn dialogues.
Built-in observability integration with Amazon Managed Grafana provides metrics for monitoring performance. You can enable these features through InferenceEndpointConfig or SageMaker JumpStart when deploying models via the HyperPod Inference Operator on EKS-orchestrated clusters. These features are available in all regions where SageMaker HyperPod is available. To learn more, see the user guide.
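To make the two-tier cache and routing options concrete, here is a minimal sketch of what an InferenceEndpointConfig-style specification enabling both features might look like. The field names below are illustrative assumptions, not the documented schema; consult the SageMaker HyperPod user guide for the actual configuration fields.

```python
# Hypothetical sketch of an InferenceEndpointConfig-style spec enabling
# Managed Tiered KV Cache and Intelligent Routing. Field names are
# assumptions for illustration, not the published schema.
import json

inference_endpoint_config = {
    "modelName": "my-llm",                    # hypothetical model name
    "kvCache": {
        "enabled": True,
        "l1": {"backend": "cpu-memory"},      # local CPU memory tier
        "l2": {"backend": "tiered-storage"},  # AWS-native disaggregated storage
                                              # (Redis is the alternative L2 option)
    },
    "routing": {
        # one of: "prefix-aware", "kv-aware", "round-robin"
        "strategy": "kv-aware",
    },
}

print(json.dumps(inference_endpoint_config, indent=2))
```

The routing strategy is the main tuning knob: prefix-aware suits workloads with shared prompt templates, KV-aware maximizes cache hits with real-time tracking, and round-robin fits stateless traffic.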
Amazon SageMaker HyperPod now supports custom Kubernetes labels and taints
Amazon SageMaker HyperPod now supports custom Kubernetes labels and taints, enabling customers to control pod scheduling and integrate seamlessly with existing Kubernetes infrastructure. Customers deploying AI workloads on HyperPod clusters orchestrated with EKS need precise control over workload placement to prevent expensive GPU resources from being consumed by system pods and non-AI workloads, while ensuring compatibility with custom device plugins such as EFA and NVIDIA GPU operators. Previously, customers had to manually apply labels and taints using kubectl and reapply them after every node replacement, scaling, or patching operation, creating significant operational overhead.

This capability allows you to configure labels and taints at the instance group level through the CreateCluster and UpdateCluster APIs, providing a managed approach to defining and maintaining scheduling policies across the entire node lifecycle. Using the new KubernetesConfig parameter, you can specify up to 50 labels and 50 taints per instance group. Labels enable resource organization and pod targeting through node selectors, while taints repel pods without matching tolerations to protect specialized nodes. For example, you can apply NoSchedule taints to GPU instance groups to ensure only AI training jobs with explicit tolerations consume high-cost compute resources, or add custom labels that enable device plugin pods to schedule correctly. HyperPod automatically applies these configurations during node creation and maintains them across replacement, scaling, and patching operations, eliminating manual intervention and reducing operational overhead. This feature is available in all AWS Regions where Amazon SageMaker HyperPod is available. To learn more about custom labels and taints, see the user guide.
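A sketch of the instance-group configuration described above. CreateCluster and UpdateCluster are existing SageMaker APIs; the exact shape of the new KubernetesConfig parameter is inferred from the announcement and may differ from the published API schema.

```python
# Sketch of attaching labels and taints to a HyperPod instance group.
# The KubernetesConfig shape is inferred from the announcement (up to
# 50 labels and 50 taints per instance group); field names are assumptions.
instance_group = {
    "InstanceGroupName": "gpu-workers",          # hypothetical group name
    "KubernetesConfig": {
        "Labels": {"workload-type": "ai-training"},
        "Taints": [
            {
                "Key": "dedicated",
                "Value": "gpu",
                "Effect": "NoSchedule",  # repel pods lacking a matching toleration
            }
        ],
    },
}

# With boto3, this would be passed alongside the group's existing fields, e.g.:
# boto3.client("sagemaker").update_cluster(
#     ClusterName="my-hyperpod-cluster",
#     InstanceGroups=[instance_group],   # merged with existing group settings
# )
print(instance_group["KubernetesConfig"]["Taints"][0]["Effect"])
```

Training pods would then declare a matching toleration for `dedicated=gpu:NoSchedule` so only they land on the GPU group.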
Amazon Kinesis Video Streams now supports a new cost-effective warm storage tier
AWS announces a new warm storage tier for Amazon Kinesis Video Streams (Amazon KVS), delivering cost-effective storage for extended media retention. The standard Amazon KVS storage tier, now designated as the hot tier, remains optimized for real-time data access and short-term storage. The new warm tier enables long-term media retention with sub-second access latency at reduced storage costs.

The warm storage tier enables developers of home security and enterprise video monitoring solutions to cost-effectively stream data from devices, cameras, and mobile phones while maintaining extended retention periods for video analytics and regulatory compliance. Moreover, developers now have the flexibility to configure fragment sizes based on their specific requirements — selecting smaller fragments for lower latency use cases or larger fragments to reduce ingestion costs. Both hot and warm storage tiers integrate seamlessly with Amazon Rekognition Video and Amazon SageMaker, enabling continuous data processing to support the creation of computer vision and video analytics applications. Amazon Kinesis Video Streams with the new warm storage tier is available in all regions where Amazon Kinesis Video Streams is available, except the AWS GovCloud (US) Regions. To learn more, refer to the getting started guide.
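As a rough illustration of extended retention, here is the shape of a stream-creation request. CreateStream and DataRetentionInHours are existing KVS API elements; how the warm tier and fragment size are selected is an assumption here — see the getting started guide for the actual fields.

```python
# Illustrative request shape for a KVS stream with extended retention.
# CreateStream and DataRetentionInHours exist in the KVS API today;
# warm-tier selection details are an assumption, noted below.
create_stream_request = {
    "StreamName": "front-door-camera",   # hypothetical stream name
    "DataRetentionInHours": 24 * 90,     # 90 days of retention
    # Assumption: with the warm tier enabled on the stream, older
    # fragments would age out of hot storage into warm storage.
}

# boto3.client("kinesisvideo").create_stream(**create_stream_request)
print(create_stream_request["DataRetentionInHours"])
```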
Amazon S3 Block Public Access now supports organization-level enforcement
Amazon S3 Block Public Access (BPA) now supports organization-level control through AWS Organizations, allowing you to standardize and enforce S3 public access settings across all accounts in your AWS organization through a single policy configuration.

S3 Block Public Access at the organization level uses a single configuration that controls all public access settings across accounts within your organization. When you attach the policy at the root or Organizational Unit (OU) level of your organization, it propagates to all sub-accounts within that scope, and new member accounts automatically inherit the policy. Alternatively, you can choose to apply the policy to specific accounts for more granular control. To get started, navigate to the AWS Organizations console and use the “Block all public access” checkbox or JSON editor. Additionally, you can use AWS CloudTrail to audit or keep track of policy attachment as well as enforcement for member accounts. This feature is available in the AWS Organizations console as well as AWS CLI/SDK, in all AWS Regions where AWS Organizations and Amazon S3 are supported, with no additional charges. For more information, visit the AWS Organizations User Guide and Amazon S3 Block Public Access documentation.
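For orientation, the organization-level policy governs the same four settings as account-level BPA. The JSON body below mirrors those settings; the exact organization-policy schema and attachment details are assumptions — the “Block all public access” checkbox or JSON editor in the Organizations console is the authoritative path.

```python
# Sketch of an org-wide S3 Block Public Access configuration. The four
# keys mirror the existing account-level BPA settings; the surrounding
# organization-policy schema is an assumption for illustration.
import json

bpa_policy = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

# Attaching at the root or an OU propagates to all member accounts, e.g.:
# org = boto3.client("organizations")
# org.attach_policy(PolicyId="p-example123", TargetId="r-exampleroot")
print(json.dumps(bpa_policy))
```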
Improved AWS Health event triage
AWS Health now includes two new properties in its event schema, actionability and persona, enabling customers to identify the most relevant events. These properties allow organizations to programmatically identify events requiring customer action and direct them to relevant teams. The enhanced event schema is accessible through both the AWS Health API and Health EventBridge communication channels, improving operational efficiency and team coordination.

AWS customers receive various operational notifications and scheduled changes, including Planned Lifecycle Events. With the new actionability property, teams can quickly distinguish between events requiring action and those shared for awareness. The persona property streamlines event routing and visibility to specific teams like security and billing, ensuring critical information reaches appropriate stakeholders. These structured properties streamline integration with existing operational tools, allowing teams to effectively identify and remediate affected resources while maintaining appropriate visibility across the organization. This enhancement is available across all AWS Commercial and AWS GovCloud (US) Regions. To learn more about implementing these new properties, see the AWS Health User Guide and the API and EventBridge schema documentation.
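A hedged sketch of how the new properties could drive EventBridge routing. The `source` key and Health event envelope are standard EventBridge fields; the exact placement, casing, and allowed values of actionability and persona inside `detail` are assumptions — check the EventBridge schema documentation for the real shape.

```python
# Hypothetical EventBridge event pattern routing only actionable AWS Health
# events to a security team. Placement and values of the new properties
# inside "detail" are assumptions, not the documented schema.
import json

event_pattern = {
    "source": ["aws.health"],
    "detail": {
        "actionability": ["ACTIONABLE"],  # assumed value; vs. awareness-only events
        "persona": ["security"],          # assumed value; route to the security team
    },
}

# This JSON would be supplied as the EventPattern of an EventBridge rule.
print(json.dumps(event_pattern))
```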
Amazon CloudWatch now supports deletion protection for logs
Amazon CloudWatch now supports configuring deletion protection on your CloudWatch log groups, helping customers safeguard their critical logging data from accidental or unintended deletion. This feature provides an additional layer of protection for logs maintaining audit trails, compliance records, and operational logs that must be preserved.

With deletion protection enabled, administrators can prevent unintended deletions of their most important log groups. Once enabled, log groups cannot be deleted until the protection is explicitly turned off, helping safeguard critical operational, security, and compliance data. This protection is particularly valuable for preserving audit logs and production application logs needed for troubleshooting and analysis. Log group deletion protection is available in all AWS commercial Regions. You can enable deletion protection during log group creation or on existing log groups using the Amazon CloudWatch console, AWS Command Line Interface (AWS CLI), AWS Cloud Development Kit (AWS CDK), and AWS SDKs. For more information, visit the Amazon CloudWatch Logs User Guide.
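A minimal sketch of enabling protection at creation time. CreateLogGroup is an existing CloudWatch Logs API; the deletion-protection flag name below is a placeholder assumption — check the CloudWatch Logs User Guide for the actual parameter or API introduced with this feature.

```python
# Sketch of creating a protected log group. The deletion-protection flag
# name is a hypothetical placeholder, not the documented parameter.
create_log_group_request = {
    "logGroupName": "/prod/payments/audit",   # hypothetical log group
    "deletionProtectionEnabled": True,        # assumed flag name
}

# logs = boto3.client("logs")
# logs.create_log_group(**create_log_group_request)  # flag name is assumed
print(create_log_group_request["deletionProtectionEnabled"])
```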
Amazon SageMaker HyperPod now supports programmatic node reboot and replacement
Today, Amazon SageMaker HyperPod announces the general availability of new APIs that enable programmatic rebooting and replacement of SageMaker HyperPod cluster nodes. SageMaker HyperPod helps you provision resilient clusters for running machine learning (ML) workloads and developing state-of-the-art models such as large language models (LLMs), diffusion models, and foundation models (FMs). The new BatchRebootClusterNodes and BatchReplaceClusterNodes APIs enable customers to programmatically reboot or replace unresponsive or degraded cluster nodes, providing a consistent, orchestrator-agnostic approach to node recovery operations.

The new APIs enhance node management capabilities for both Slurm and EKS orchestrated clusters, complementing existing node reboot and replacement workflows. Existing orchestrator-specific methods, such as Kubernetes labels for EKS clusters and Slurm commands for Slurm clusters, remain available alongside the newly introduced programmatic capabilities for reboot and replace operations through these purpose-built APIs. When cluster nodes become unresponsive due to issues such as memory overruns or hardware degradation, recovery operations such as node reboots and replacements may be necessary and can be initiated through these new APIs. These capabilities are particularly valuable when running time-sensitive workloads. For instance, when a Slurm controller, login, or compute node becomes unresponsive, administrators can trigger a reboot operation using the API and monitor its progress to get nodes back to operational status. Similarly, EKS cluster administrators can replace degraded worker nodes programmatically. Each API supports batch operations of up to 25 instances, enabling efficient management of large-scale recovery scenarios.
The reboot and replace APIs are currently supported in three AWS Regions where SageMaker HyperPod is available: US East (Ohio), Asia Pacific (Mumbai), and Asia Pacific (Tokyo). The APIs can be accessed through the AWS CLI, SDK, or API calls. For more information, see the Amazon SageMaker HyperPod documentation for BatchRebootClusterNodes and BatchReplaceClusterNodes.
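A sketch of a programmatic node reboot. The operation name comes from the announcement (BatchRebootClusterNodes); the corresponding boto3 method and parameter names are assumptions pending the published SDK shape.

```python
# Sketch of a batch node reboot via the new API. Method and parameter
# names below are assumptions based on the announced operation name.
node_ids = ["i-0123456789abcdef0", "i-0fedcba9876543210"]  # hypothetical node IDs
assert len(node_ids) <= 25  # documented per-call batch limit

batch_reboot_request = {
    "ClusterName": "my-hyperpod-cluster",  # hypothetical cluster name
    "NodeIds": node_ids,
}

# sm = boto3.client("sagemaker")
# sm.batch_reboot_cluster_nodes(**batch_reboot_request)  # assumed method name
print(len(batch_reboot_request["NodeIds"]))
```

BatchReplaceClusterNodes would take the same batched shape for replacing degraded nodes instead of rebooting them.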
AWS Compute Optimizer now supports unused NAT Gateway recommendations
Today, AWS announces that AWS Compute Optimizer now supports idle resource recommendations for NAT Gateways. With this new recommendation type, you will be able to identify NAT Gateways that are unused, resulting in cost savings.

With the new unused NAT Gateway recommendation, you will be able to identify NAT Gateways that show no traffic activity over a 32-day analysis period. Compute Optimizer analyzes CloudWatch metrics including active connection count, incoming packets from source, and incoming packets from destination to validate that NAT Gateways are truly unused. To avoid recommending removal of critical backup resources, Compute Optimizer also examines whether the NAT Gateway is referenced in any route tables. You can view the total savings potential of these unused NAT Gateways and access detailed utilization metrics to verify unused conditions before taking action. This new feature is available in all AWS Regions where AWS Compute Optimizer is available except the AWS GovCloud (US) and the China Regions. To learn more about the new feature updates, please visit Compute Optimizer’s product page and user guide.
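A sketch of listing the new findings programmatically. Compute Optimizer already exposes idle resource recommendations through its API; filtering by a NAT Gateway resource type is assumed to follow the same filter shape as the existing idle resource types.

```python
# Sketch of requesting unused NAT Gateway findings from Compute Optimizer.
# The filter value "NatGateway" is an assumption modeled on the filter
# shape used for other idle resource types.
get_idle_recommendations_request = {
    "filters": [
        {"name": "ResourceType", "values": ["NatGateway"]}
    ],
}

# co = boto3.client("compute-optimizer")
# resp = co.get_idle_recommendations(**get_idle_recommendations_request)
# for rec in resp["idleRecommendations"]:
#     print(rec["resourceArn"], rec.get("savingsOpportunity"))
print(get_idle_recommendations_request["filters"][0]["values"])
```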
The AWS API MCP Server is now available on AWS Marketplace
AWS announces the availability of the AWS API MCP Server on AWS Marketplace, enabling customers to deploy the Model Context Protocol (MCP) server to Amazon Bedrock AgentCore. The marketplace entry includes step-by-step configuration and deployment instructions for deploying the AWS API MCP Server as a managed service with built-in authentication and session isolation to the Amazon Bedrock AgentCore Runtime.

The AWS Marketplace deployment simplifies container management while providing enterprise-grade security, scalability, and session isolation through Amazon Bedrock AgentCore Runtime. Customers can deploy the AWS API MCP Server with configurable authentication methods (SigV4 or JWT), implement least-privilege IAM policies, and leverage AgentCore’s built-in logging and monitoring capabilities. The deployment lets customers configure IAM roles, authentication methods, and network settings according to their security requirements. The AWS API MCP Server can now be deployed from AWS Marketplace in all AWS Regions where Amazon Bedrock AgentCore is supported. Get started by visiting the AWS API MCP Server listing on AWS Marketplace or explore the deployment guide on AWS Labs GitHub repository. Learn more about Amazon Bedrock AgentCore in the AWS documentation.
Amazon Route 53 announces accelerated recovery for managing public DNS records
Amazon Route 53 is excited to release the accelerated recovery option for managing DNS records in public hosted zones. Accelerated recovery targets a 60-minute recovery time objective (RTO) for regaining the ability to make DNS changes to your DNS records in Route 53 public hosted zones, if AWS services in US East (N. Virginia) become temporarily unavailable.

The Route 53 public DNS service API is used by customers today for making changes to DNS records in order to facilitate software deployments, run infrastructure operations, and onboard new users. Customers in banking, financial technology (FinTech), and software-as-a-service (SaaS) in particular need a predictable and short RTO for meeting business continuity and disaster recovery objectives. In the past, if AWS services in US East (N. Virginia) became unavailable, customers would not be able to modify or recreate DNS records to point users and internal services to updated endpoints. Now, when you enable the accelerated recovery option on your Route 53 public hosted zone, you can make changes to Route 53 public DNS records (Resource Record Sets) in that hosted zone soon after such an interruption, most often in less than one hour. Accelerated recovery for managing public DNS records is available globally, except in AWS GovCloud and Amazon Web Services in China. There is no additional charge for using this feature. To learn more about the accelerated recovery option, visit our documentation.
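The DNS change that accelerated recovery protects is an ordinary ChangeResourceRecordSets call, an existing Route 53 API. The sketch below shows the change-batch shape for failing an A record over to a standby endpoint; the zone ID, record name, and IP are hypothetical. Enabling accelerated recovery itself is a hosted-zone option (see the documentation).

```python
# Change-batch shape for the existing Route 53 ChangeResourceRecordSets
# API -- the kind of update that stays possible under accelerated recovery.
# Record name, IP, and zone ID are hypothetical.
change_batch = {
    "Comment": "Fail over to standby endpoint",
    "Changes": [
        {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "api.example.com.",
                "Type": "A",
                "TTL": 60,
                "ResourceRecords": [{"Value": "203.0.113.10"}],
            },
        }
    ],
}

# r53 = boto3.client("route53")
# r53.change_resource_record_sets(HostedZoneId="Z0EXAMPLE", ChangeBatch=change_batch)
print(change_batch["Changes"][0]["Action"])
```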
Amazon Lex now supports LLMs as the primary option for natural language understanding
Amazon Lex now allows you to use Large Language Models (LLMs) as the primary option to understand customer intent across voice and chat interactions. With this capability, your voice and chat bots can better understand customer requests, handle complex utterances, maintain accuracy despite spelling errors, and extract key information from verbose inputs. When customer intent is unclear, bots can intelligently ask follow-up questions to fulfill requests accurately. For example, when a customer says “I need help with my flight,” the LLM automatically clarifies whether the customer wants to check their flight status, upgrade their flight, or change their flight.

This feature is available in all AWS commercial regions where Amazon Connect and Lex operate. To learn more, visit the Amazon Lex documentation or explore the Amazon Connect website to learn how Amazon Connect and Amazon Lex deliver seamless end-customer self-service experiences.
Amazon EMR and AWS Glue now support audit context with Lake Formation
Amazon EMR and AWS Glue now provide comprehensive audit context support for AWS Lake Formation credential vending APIs and AWS Glue Data Catalog GetTable and GetTables API calls. This auditing capability helps you maintain compliance with regulatory frameworks, including the Digital Markets Act (DMA) and data protection regulations. The feature is enabled by default, offering seamless integration into existing workflows while strengthening security and compliance monitoring across your data lake infrastructure.

You can view this audit context information in AWS CloudTrail logs, enabling enhanced security auditing, regulatory compliance, and improved troubleshooting for Amazon EMR Apache Spark native fine-grained access control (FGAC) and full table access jobs. The audit logging feature automatically records the platform type (EMR-EC2, EMR on EKS, EMR Serverless, or AWS Glue) and its corresponding identifiers, such as Cluster ID, Step ID, Job Run ID, and Virtual Cluster ID. This enables security teams to track and correlate API calls from individual Spark jobs, streamline compliance reporting, and analyze historical data access patterns. Additionally, data engineers can quickly troubleshoot access-related issues by connecting them to specific job executions, resolve FGAC permission challenges, and monitor access patterns across different compute platforms. This feature is available in all AWS Regions that support Amazon EMR, AWS Glue, and AWS Lake Formation, requiring EMR version 7.12+ or AWS Glue version 5.1+.
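A sketch of pulling the GetTable calls that carry this audit context out of CloudTrail. LookupEvents and the EventName lookup attribute are existing CloudTrail API elements; where exactly the platform type and identifiers such as Job Run ID appear inside each event record is left to the user guide.

```python
# Sketch of querying CloudTrail for Data Catalog GetTable calls, which
# now carry audit context (platform type, Cluster/Step/Job Run IDs).
# lookup_events and the EventName attribute are existing API elements.
lookup_request = {
    "LookupAttributes": [
        {"AttributeKey": "EventName", "AttributeValue": "GetTable"}
    ],
    "MaxResults": 50,
}

# ct = boto3.client("cloudtrail")
# for event in ct.lookup_events(**lookup_request)["Events"]:
#     # The audit context lives inside the serialized CloudTrailEvent JSON.
#     print(event["EventName"], event["EventTime"])
print(lookup_request["LookupAttributes"][0]["AttributeValue"])
```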
AWS Knowledge MCP Server now supports topic-based search
Today, AWS announces enhanced search capabilities for the AWS Knowledge MCP Server, which now supports topic-based search across specialized AWS documentation domains. The AWS Knowledge MCP Server is a Model Context Protocol (MCP) server that provides AI agents and developers with programmatic access to AWS documentation and knowledge resources. This enhancement enables more precise and relevant search results by allowing MCP clients and agentic frameworks to query specific documentation domains such as Troubleshooting, AWS Amplify, AWS CDK, CDK Constructs, and AWS CloudFormation, reducing noise and improving response accuracy for domain-specific queries.\n These topic-based searches complement existing capabilities for searching API references, What’s New announcements, and general AWS documentation. Developers building AI agents can now retrieve targeted information for specific use cases—for example, searching Troubleshooting documentation for error resolution, Amplify documentation for frontend development guidance, or CDK Constructs for production-ready architectural patterns. This focused approach accelerates development workflows and improves the quality of AI-generated responses for AWS-specific queries. The enhanced search capabilities are available immediately at no additional cost through the AWS Knowledge MCP Server. Usage remains subject to standard rate limits. To learn more and get started, see the AWS Knowledge MCP Server documentation.
Amazon EMR and AWS Glue now support fine-grained access control on write operations with Lake Formation
Amazon EMR and AWS Glue now enable you to enforce fine-grained access control (FGAC) on both read and write operations for AWS Lake Formation registered tables in your Apache Spark jobs. Previously, you could only apply Lake Formation’s table, column, and row-level permissions for read operations (SELECT, DESCRIBE). This simplifies data workflows by allowing both read and write tasks in a single Spark job, eliminating the need for separate clusters or applications. Organizations can now execute end-to-end data workflows with consistent security controls, streamlining operations and reducing infrastructure costs.

With this launch, administrators can control who is authorized to insert new data, update specific records, or merge changes through DML and DDL operations (CREATE, ALTER, INSERT, UPDATE, DELETE, MERGE INTO, DROP), ensuring that all data modifications adhere to specified security policies to mitigate the risk of unauthorized data modification or misuse. This launch simplifies data governance and security frameworks by providing a single point for defining access rules in AWS Lake Formation and enforcing these rules in Spark for both read and write operations. This feature is available in all AWS Regions where Amazon EMR (EC2, EKS and Serverless), AWS Glue and AWS Lake Formation are available. To learn more, visit the open table format support documentation.
Amazon Bedrock introduces Reserved Service tier
Today, Amazon Bedrock introduces a new Reserved service tier designed for workloads requiring predictable performance and guaranteed tokens-per-minute capacity. The Reserved tier provides the ability to reserve prioritized compute capacity, keeping service levels predictable for your mission critical applications. It also includes the flexibility to allocate different input and output tokens-per-minute capacities to match the exact requirements of your workload and control cost. This is particularly valuable because many workloads have asymmetric token usage patterns. For instance, summarization tasks consume many input tokens but generate fewer output tokens, while content generation applications require less input and more output capacity. When your application needs more tokens-per-minute capacity than what you reserved, the service automatically overflows to the pay-as-you-go Standard tier, ensuring uninterrupted operations. The Reserved tier targets 99.5% uptime for model response and is available today for Anthropic Claude Sonnet 4.5. Customers can reserve capacity for a 1-month or 3-month duration. Customers pay a fixed price per 1K tokens-per-minute and are billed monthly.

With the Reserved service tier, Amazon Bedrock continues to provide more choice to customers, helping them develop, scale, and deploy applications and agents that improve productivity and customer experiences while balancing performance and cost requirements. For more information about the AWS Regions where Amazon Bedrock Reserved is available, refer to the Documentation. To get access to the Reserved tier, please contact your AWS account team.
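A worked example of the asymmetric sizing described above, with illustrative numbers only: a summarization workload that reads long documents and emits short summaries needs far more input than output tokens-per-minute.

```python
# Illustrative sizing of asymmetric reserved capacity for a summarization
# workload. All numbers are hypothetical inputs, not Bedrock defaults.
requests_per_minute = 20
avg_input_tokens = 8_000    # long document per request
avg_output_tokens = 500     # short summary per request
headroom = 1.2              # 20% buffer before overflowing to the Standard tier

reserved_input_tpm = int(requests_per_minute * avg_input_tokens * headroom)
reserved_output_tpm = int(requests_per_minute * avg_output_tokens * headroom)

print(reserved_input_tpm, reserved_output_tpm)  # 192000 12000
```

Traffic beyond these reserved tokens-per-minute figures would overflow to the pay-as-you-go Standard tier rather than being throttled.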
AWS Glue 5.1 is now generally available
AWS Glue 5.1 is now generally available, delivering improved performance, security updates, expanded Apache Iceberg capabilities, and AWS Lake Formation write support for data integration workloads.

AWS Glue is a serverless, scalable data integration service that simplifies discovering, preparing, moving, and integrating data from multiple sources. This release upgrades core engines to Apache Spark 3.5.6, Python 3.11, and Scala 2.12.18, bringing performance and security enhancements. It also updates support for open table format libraries, including Apache Hudi 1.0.2, Apache Iceberg 1.10.0, and Delta Lake 3.3.2.
AWS Glue 5.1 introduces support for Apache Iceberg format version 3.0, adding default column values, deletion vectors for merge-on-read tables, multi-argument transforms, and row lineage tracking. This release also extends AWS Lake Formation fine-grained access control to write operations (both DML and DDL) for Spark DataFrames and Spark SQL. Previously, this capability was limited to read operations only. AWS Glue 5.1 also adds full-table access control in Apache Spark for Apache Hudi and Delta Lake tables, providing more comprehensive security options for your data.
AWS Glue 5.1 is available in US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Europe (Stockholm), Europe (Frankfurt), Europe (Spain), Asia Pacific (Hong Kong), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Asia Pacific (Malaysia), Asia Pacific (Thailand), Asia Pacific (Mumbai), and South America (São Paulo). Visit the AWS Glue documentation for more information.
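Adopting the new runtime is a matter of targeting it in the job definition. GlueVersion is a real CreateJob/UpdateJob parameter; the job name, role ARN, and script location below are hypothetical.

```python
# Sketch of a Glue job definition targeting the 5.1 runtime. GlueVersion
# is an existing CreateJob parameter; names and ARNs are hypothetical.
create_job_request = {
    "Name": "iceberg-etl",                                 # hypothetical job name
    "Role": "arn:aws:iam::123456789012:role/GlueJobRole",  # hypothetical role
    "GlueVersion": "5.1",
    "Command": {
        "Name": "glueetl",
        "ScriptLocation": "s3://my-bucket/scripts/iceberg_etl.py",
    },
}

# boto3.client("glue").create_job(**create_job_request)
print(create_job_request["GlueVersion"])
```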
Amazon Quick Research now includes trusted third-party industry intelligence
Amazon Quick Suite, the AI-powered workspace helping organizations get answers from their enterprise data and move swiftly from insights to action, enhances Quick Research with access to specialized third-party datasets.

Quick Research transforms how business professionals tackle complex business problems by completing weeks of data discovery, analysis, and insight generation in minutes. Today, Quick Research launches its partner ecosystem with industry intelligence providers S&P Global, FactSet, and IDC, with more to come. Users with existing subscriptions can combine these authoritative datasets with all of their business data and real-time web search, accelerating their path to deeper insights and strategic decision-making. Additionally, all users have access to decades of US Patent and Trademark Office data along with millions of PubMed citations and abstracts in biomedical and life sciences literature. Business professionals from any industry can now access and analyze multiple data sources in one unified workspace, eliminating the need to switch between platforms. For example, a financial analyst can evaluate investment opportunities using FactSet’s financial data alongside real-time web search and internal market reports, while energy teams can optimize trading strategies using S&P Global’s commodity data combined with insights from their strategy teams. Similarly, sales and product teams can spot emerging trends faster by leveraging IDC’s industry intelligence with their customer data. By bringing critical data sources together in one place, organizations can move from insight to action with greater speed and confidence. Quick Research’s third-party data integration is available in the following AWS Regions: US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney), and Europe (Ireland). To learn more, read our User Guide.
AWS Blogs
AWS Japan Blog (Japanese)
- Identifying Suspected Areas Using Graph Data with a Network Digital Twin and Agentic AI
- Generative AI agents to accelerate the migration from Oracle Database to Amazon Aurora PostgreSQL
- How to Manage Spec Files in Kiro Without Letting Them Become Technical Debt
- AWS re:Invent 2025 Guide to Ad and Marketing Technology
AWS News Blog
AWS Architecture Blog
AWS Cloud Financial Management
AWS Cloud Operations Blog
AWS Big Data Blog
- Achieve 2x faster data lake query performance with Apache Iceberg on Amazon Redshift
- Introducing catalog federation for Apache Iceberg tables in the AWS Glue Data Catalog
- Accelerate data lake operations with Apache Iceberg V3 deletion vectors and row lineage
- How Octus achieved 85% infrastructure cost reduction with zero downtime migration to Amazon OpenSearch Service
- Getting started with Apache Iceberg write support in Amazon Redshift
AWS Compute Blog
Containers
- Amazon EKS Blueprints for CDK: Now supporting Amazon EKS Auto Mode
- Enhancing and monitoring network performance when running ML Inference on Amazon EKS
- Data-driven Amazon EKS cost optimization: A practical guide to workload analysis
AWS Database Blog
- How Letta builds production-ready AI agents with Amazon Aurora PostgreSQL
- Lower cost and latency for AI using Amazon ElastiCache as a semantic cache with Amazon Bedrock
- Build persistent memory for agentic AI applications with Mem0 Open Source, Amazon ElastiCache for Valkey, and Amazon Neptune Analytics
AWS HPC Blog
AWS for Industries
- Life Sciences Agents in Production: Early Research
- Announcing agentic AI for healthcare patient engagement in Amazon Connect (Preview)
Artificial Intelligence
- Apply fine-grained access control with Bedrock AgentCore Gateway interceptors
- How Condé Nast accelerated contract processing and rights analysis with Amazon Bedrock
- Building AI-Powered Voice Applications: Amazon Nova Sonic Telephony Integration Guide
- University of California Los Angeles delivers an immersive theater experience with AWS generative AI services
- Optimizing Mobileye’s REM™ with AWS Graviton: A focus on ML inference and Triton integration
- Evaluate models with the Amazon Nova evaluation container using Amazon SageMaker AI
- Beyond the technology: Workforce changes for AI
- Enhanced performance for Amazon Bedrock Custom Model Import
- Amazon SageMaker AI introduces EAGLE based adaptive speculative decoding to accelerate generative AI inference
AWS Messaging Blog
- Implement Tenants in your Amazon SES environment, Part 3: Implementation guide
- Implement Tenants in your Amazon SES environment, Part 2: Assessment and planning
Networking & Content Delivery
- Announcing Amazon Route 53 Accelerated Recovery for managing public DNS records
- re:Invent 2025: Your ultimate AWS Networking guide to this year’s must-attend cloud event
- Snap Inc. uses Amazon CloudFront Origin Shield to improve download and upload latency
- Securing Egress Architectures with Network Firewall Proxy
AWS Security Blog
- AWS Private Certificate Authority now supports partitioned CRLs
- How to use the Secrets Store CSI Driver provider Amazon EKS add-on with Secrets Manager
AWS Storage Blog
- Lower your Amazon S3 backup costs with AWS Backup S3 tiering
- How TUI modernized their backup strategy with AWS: Driving clarity and scale
- PingCAP increased TiDB Cloud stability using Amazon EBS detailed performance statistics