9/15/2025, 12:00:00 AM ~ 9/16/2025, 12:00:00 AM (UTC)

Recent Announcements

Amazon S3 Batch Operations now supports managing buckets or prefixes in a single step in AWS Management Console

Amazon S3 Batch Operations now supports managing objects within an S3 bucket by prefix, suffix, and more, in a single step in the AWS Management Console. When creating an S3 Batch Operations job, customers previously had to specify the exact objects on which to perform the operation. With this feature, you can instead specify an entire bucket, a prefix, a suffix, a creation date, or a storage class. Amazon S3 Batch Operations then applies the operation to all matching objects and notifies you when the job completes.

S3 Batch Operations lets you easily perform one-time or recurring batch workloads at any scale, such as copying objects between staging and production buckets, restoring archived backups from S3 Glacier storage classes, or computing object checksums to verify the content of stored datasets. After you start your job, S3 Batch Operations automatically processes all of the objects that match your filtering criteria and delivers a detailed completion report with the status of each object once the job completes. This feature is available in all AWS Regions. You can get started through the AWS Management Console, AWS Command Line Interface (CLI), or the AWS SDKs. For pricing information, visit the Management & Insights tab of the Amazon S3 pricing page. To learn more about S3 Batch Operations, visit the S3 User Guide.
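As a rough illustration, the filter-based job creation described above can be expressed as a CreateJob request that uses a manifest generator instead of a pre-built manifest. This is a sketch only: the bucket ARNs, account ID, role ARN, and prefix are placeholders, and the field names are modeled on the S3 Control CreateJob API, so verify them against the current API reference before use.

```python
# Sketch of an S3 Batch Operations CreateJob request that targets objects by
# prefix, creation date, and storage class rather than a manifest file.
# All ARNs and IDs below are placeholders. The request would be sent with
# boto3's s3control.create_job(**params).
from datetime import datetime, timezone

params = {
    "AccountId": "111122223333",  # placeholder account ID
    "ConfirmationRequired": False,
    "Operation": {  # copy each matching object to a production bucket
        "S3PutObjectCopy": {
            "TargetResource": "arn:aws:s3:::example-production-bucket",
        }
    },
    "Report": {  # detailed completion report with per-object status
        "Bucket": "arn:aws:s3:::example-report-bucket",
        "Format": "Report_CSV_20180820",
        "Enabled": True,
        "ReportScope": "AllTasks",
    },
    "Priority": 10,
    "RoleArn": "arn:aws:iam::111122223333:role/example-batch-ops-role",
    # Instead of a manifest file, let S3 generate the object list from filters.
    "ManifestGenerator": {
        "S3JobManifestGenerator": {
            "SourceBucket": "arn:aws:s3:::example-staging-bucket",
            "EnableManifestOutput": False,
            "Filter": {
                "CreatedAfter": datetime(2025, 1, 1, tzinfo=timezone.utc),
                "KeyNameConstraint": {"MatchAnyPrefix": ["staging/"]},
                "MatchAnyStorageClass": ["STANDARD"],
            },
        }
    },
}

filter_spec = params["ManifestGenerator"]["S3JobManifestGenerator"]["Filter"]
print(filter_spec["KeyNameConstraint"]["MatchAnyPrefix"])
```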

Amazon SageMaker HyperPod announces health monitoring agent support for Slurm clusters

Today, Amazon SageMaker HyperPod announces the general availability of the health monitoring agent for Slurm clusters. SageMaker HyperPod helps you provision resilient clusters for running machine learning (ML) workloads and developing state-of-the-art models such as large language models (LLMs), diffusion models, and foundation models (FMs). The health monitoring agent performs passive, background health checks of instances to identify problems in key areas without impacting application behavior or performance, flags failures instantly, and replaces any unhealthy instances to keep your training jobs running smoothly.

The agent runs continuously on all GPU- or Trainium-based nodes in your HyperPod cluster, watching for hardware issues such as unresponsive GPUs or NVLink error counters. When a fault is detected, it marks the node as unhealthy and automatically reboots or replaces it with a healthy node, keeping your jobs running without manual intervention. The agent also handles failures in coordination with the job auto-resume functionality available on Slurm clusters: for example, jobs with auto-resume enabled continue from the last saved checkpoint once nodes are replaced by the agent. This hands-free recovery, already available on HyperPod clusters orchestrated with Amazon EKS, now gives Slurm clusters the same resilient environment, helping teams train large models for weeks without disruption and reclaim time and costs that would otherwise be lost to mid-run failures. In addition, customers can now reboot their nodes with a simple command in case of intermittent issues, such as GPU driver issues requiring a reset.

Health monitoring agent for Slurm is available in all regions where HyperPod is generally available. The agent is auto-enabled on all newly created Slurm clusters; to enable it on an existing cluster, simply upgrade to the latest HyperPod AMI by calling the UpdateClusterSoftware API. To learn more, visit the Amazon SageMaker HyperPod documentation.
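Concretely, enabling the agent on an existing cluster is a single UpdateClusterSoftware call. A minimal sketch, with a placeholder cluster name:

```python
# Sketch: parameters for the SageMaker UpdateClusterSoftware API, which
# upgrades an existing HyperPod cluster to the latest AMI and thereby
# enables the health monitoring agent. "my-slurm-cluster" is a placeholder;
# the call would be made with boto3's sagemaker.update_cluster_software(**params).
params = {"ClusterName": "my-slurm-cluster"}
print(params["ClusterName"])
```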

Amazon Connect Cases now supports date range filters in the case list view

Amazon Connect Cases now supports filtering by date ranges in the case list view, enabling contact center managers and agents to efficiently manage their case workloads. For example, users can filter cases created in the last 30 days for monthly reporting, view cases modified in the last 24 hours to monitor recent activity, or surface cases with potential SLA breaches in the next 2 days to help prevent violations.

Amazon Connect Cases is available in the following AWS Regions: US East (N. Virginia), US West (Oregon), Canada (Central), Europe (Frankfurt), Europe (London), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and Africa (Cape Town). To learn more and get started, visit the Amazon Connect Cases webpage and documentation.
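Programmatically, a "created in the last 30 days" filter might look like the following SearchCases request. Treat this as a sketch: the field id (`created_datetime`), the operator name, and the value shape are assumptions modeled on the Amazon Connect Cases SearchCases API, and the domain ID is a placeholder; verify against the current API reference.

```python
# Sketch: a SearchCases filter for cases created in the last 30 days.
# The "created_datetime" field id and "greaterThanOrEqualTo" operator are
# assumptions based on the Connect Cases API; domainId is a placeholder.
from datetime import datetime, timedelta, timezone

cutoff = datetime.now(timezone.utc) - timedelta(days=30)
search_params = {
    "domainId": "example-domain-id",  # placeholder
    "filter": {
        "field": {
            "greaterThanOrEqualTo": {
                "id": "created_datetime",
                "value": {"stringValue": cutoff.isoformat()},
            }
        }
    },
}
# The request would be sent with boto3's connectcases.search_cases(**search_params).
print(list(search_params["filter"]["field"]))
```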

Amazon OpenSearch Service now supports OpenSearch version 3.1

You can now run OpenSearch version 3.1 in Amazon OpenSearch Service. OpenSearch 3.1 introduces several improvements in areas like search relevance and performance, and adds features that simplify development of vector-driven applications for generative AI workloads.

This launch incorporates Lucene 10, which enables optimized vector field indexing for faster indexing times and smaller indexes, sparse indexing for CPU and storage efficiency, and vector quantization to reduce memory usage. Other key improvements include faster range queries, which benefit log analytics and time-series workloads, and reduced latency for high-cardinality aggregations.

This launch also introduces the new Search Relevance Workbench, which provides integrated tools for teams to evaluate and optimize search quality through experimentation. It also improves vector search in two ways: Z-score normalization makes hybrid search more reliable by reducing the impact of outliers and differing score scales, and memory-optimized search lets the Faiss engine operate efficiently by memory-mapping the index file and using the operating system's file cache to serve search requests.

For information on upgrading to OpenSearch 3.1, please see the documentation. OpenSearch 3.1 is now available in all AWS Regions where Amazon OpenSearch Service is available.
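To make the z-score normalization feature concrete, the following is a sketch of a hybrid-search pipeline definition that selects the new technique. The pipeline body would be sent with `PUT /_search/pipeline/zscore-hybrid` to the domain; the processor and technique names follow the OpenSearch normalization-processor documentation, and the pipeline name is arbitrary.

```python
# Sketch: a search pipeline definition selecting z-score normalization for
# hybrid search, new in OpenSearch 3.1. The body would be PUT to
# /_search/pipeline/zscore-hybrid on an OpenSearch 3.1 domain.
pipeline_body = {
    "description": "Hybrid search with z-score normalization",
    "phase_results_processors": [
        {
            "normalization-processor": {
                "normalization": {"technique": "z_score"},
                "combination": {"technique": "arithmetic_mean"},
            }
        }
    ],
}
processor = pipeline_body["phase_results_processors"][0]["normalization-processor"]
print(processor["normalization"]["technique"])
```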

AWS Organizations now provides account state information for member accounts

AWS Organizations provides a new State field in the AWS Organizations console and APIs (DescribeAccount, ListAccounts, and ListAccountsForParent) to enhance AWS account lifecycle visibility. With this launch, the new State field replaces the existing Status field in the AWS Organizations console; both fields will remain available in the APIs until September 9, 2026.

The State field provides more granular account lifecycle information, such as 'SUSPENDED' for AWS-enforced suspension, 'PENDING_CLOSURE' for in-process closure requests, and 'CLOSED' for accounts in their 90-day reinstatement window. After September 9, 2026, the Status field will be fully deprecated. Customers using account-vending pipelines should update their implementations to reference the State field before the deprecation date. This feature is available in all AWS commercial and AWS GovCloud (US) Regions. To get started managing your accounts, see the blog post and documentation.
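During the migration window, account records returned by the APIs carry both fields, so pipelines can read the new field with a fallback. A minimal sketch (the account records below are illustrative, shaped like DescribeAccount/ListAccounts responses):

```python
# Sketch: prefer the new, more granular State field and fall back to the
# legacy Status field, so an account-vending pipeline works both before and
# after the September 9, 2026 deprecation. Example records are illustrative.
def account_lifecycle(account: dict) -> str:
    """Return the account's lifecycle value, preferring the new State field."""
    return account.get("State") or account.get("Status", "UNKNOWN")

new_style = {"Id": "111122223333", "State": "PENDING_CLOSURE", "Status": "ACTIVE"}
old_style = {"Id": "444455556666", "Status": "SUSPENDED"}
print(account_lifecycle(new_style))  # → PENDING_CLOSURE
print(account_lifecycle(old_style))  # → SUSPENDED
```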

Announcing on-demand deployment for custom Meta Llama models in Amazon Bedrock

Starting today, customers can use the on-demand deployment option in Amazon Bedrock for Meta Llama 3.3 models that have been fine-tuned or distilled in Bedrock. Models customized on or after September 15, 2025 are eligible.

This enables Bedrock customers to reduce costs by processing requests in real time without pre-provisioned compute resources; customers pay only for what they use, eliminating the need for always-on infrastructure. Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models from leading AI companies via a single API, along with a broad set of capabilities customers need to build generative AI applications with security, privacy, and responsible AI built in. To get started, visit the documentation.
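A request sketch for creating such a deployment is below. The operation and parameter names are assumptions modeled on Bedrock's CreateCustomModelDeployment API and should be verified against the current API reference; the deployment name and model ARN are placeholders.

```python
# Sketch: creating an on-demand deployment for a fine-tuned Llama 3.3 model.
# Parameter names are modeled on Bedrock's CreateCustomModelDeployment API
# (verify against the API reference); the name and model ARN are placeholders.
deployment_params = {
    "modelDeploymentName": "llama33-support-assistant",  # placeholder name
    "modelArn": "arn:aws:bedrock:us-east-1:111122223333:custom-model/example",
    "description": "On-demand deployment of a fine-tuned Llama 3.3 model",
}
# The request would be sent with boto3's
# bedrock.create_custom_model_deployment(**deployment_params); inference is
# then billed per request rather than per provisioned unit.
print(deployment_params["modelDeploymentName"])
```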

Now generally available: Amazon EC2 R8gn instances

Today, AWS announces the general availability of the new Amazon Elastic Compute Cloud (Amazon EC2) R8gn instances. These instances are powered by AWS Graviton4 processors, delivering up to 30% better compute performance than AWS Graviton3 processors. R8gn instances feature the latest 6th-generation AWS Nitro Cards and offer up to 600 Gbps of network bandwidth, the highest among network-optimized EC2 instances.

Take advantage of the enhanced networking capabilities of R8gn to scale the performance and throughput of network-intensive workloads such as SQL, NoSQL, and in-memory databases. For increased scalability, these instances offer sizes up to 48xlarge, including two metal sizes, up to 1,536 GiB of memory, and up to 60 Gbps of bandwidth to Amazon Elastic Block Store (EBS). The instances support Elastic Fabric Adapter (EFA) networking on the 16xlarge, 24xlarge, 48xlarge, metal-24xl, and metal-48xl sizes, enabling lower latency and improved performance for workloads deployed on tightly coupled clusters. The new instances are available in the US East (N. Virginia) and US West (Oregon) AWS Regions; metal sizes are available only in US East (N. Virginia). To learn more, see Amazon R8gn Instances. To begin your Graviton journey, visit the Level up your compute with AWS Graviton page. To get started, use the AWS Management Console, AWS Command Line Interface (AWS CLI), or the AWS SDKs.
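For the EFA-capable sizes mentioned above, a launch request attaches an EFA network interface explicitly. A sketch, with placeholder AMI, subnet, and security group IDs (EFA requires one of the supported sizes and an arm64 AMI):

```python
# Sketch: a RunInstances request for an EFA-enabled R8gn instance. EFA is
# supported on the 16xlarge and larger sizes; the AMI, subnet, and security
# group IDs are placeholders. Sent via boto3's ec2.run_instances(**params).
params = {
    "ImageId": "ami-0123456789abcdef0",  # placeholder Graviton (arm64) AMI
    "InstanceType": "r8gn.16xlarge",
    "MinCount": 1,
    "MaxCount": 1,
    "NetworkInterfaces": [
        {
            "DeviceIndex": 0,
            "InterfaceType": "efa",  # request an Elastic Fabric Adapter
            "SubnetId": "subnet-0123456789abcdef0",
            "Groups": ["sg-0123456789abcdef0"],
        }
    ],
}
print(params["InstanceType"])
```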

Amazon Managed Service for Prometheus now available in 11 additional AWS Regions

Amazon Managed Service for Prometheus is now available in Asia Pacific (Jakarta), Asia Pacific (Hyderabad), Asia Pacific (Osaka), Asia Pacific (Melbourne), Asia Pacific (Taipei), Canada West (Calgary), Europe (Spain), Israel (Tel Aviv), Mexico (Central), Middle East (Bahrain), and US West (N. California). Amazon Managed Service for Prometheus is a fully managed Prometheus-compatible monitoring service that makes it easy to monitor and alarm on operational metrics at scale.

Customers can send up to 1 billion active metrics to a single workspace and can create multiple workspaces per account, where a workspace is a logical space dedicated to the storage and querying of Prometheus metrics. The full list of Regions where Amazon Managed Service for Prometheus is generally available can be found in the user guide.

To learn more about Amazon Managed Service for Prometheus, visit the user guide or product page.

AWS Blogs

AWS News Blog

AWS Cloud Operations Blog

AWS Big Data Blog

Artificial Intelligence

AWS Security Blog

Open Source Projects

AWS CLI

AWS CDK

Bottlerocket OS

Karpenter