8/12/2025, 12:00:00 AM ~ 8/13/2025, 12:00:00 AM (UTC)
Recent Announcements
Amazon EC2 Single GPU P5 instances are now generally available
Today, AWS announces a new Amazon Elastic Compute Cloud (Amazon EC2) P5 instance size with one NVIDIA H100 GPU, allowing businesses to right-size their machine learning (ML) and high-performance computing (HPC) resources cost-effectively.
The new instance size enables customers to start small and scale in granular increments, providing more flexible control over infrastructure costs. Customers developing small to medium Large Language Models (LLMs), such as chatbots or specialized language translation tools, can now run inference tasks more economically. Customers can also use these instances to deploy HPC applications for pharmaceutical discovery, fluid flow analysis, and financial modeling without committing to expensive, large-scale GPU deployments.
P5.4xlarge instances are now available through Amazon EC2 Capacity Blocks for ML in the US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (London), Asia Pacific (Mumbai), Asia Pacific (Sydney), Asia Pacific (Tokyo), and South America (Sao Paulo) AWS Regions. These instances can also be purchased On-Demand, as Spot Instances, or through Savings Plans in the Europe (London), Asia Pacific (Mumbai), Asia Pacific (Jakarta), Asia Pacific (Tokyo), and South America (Sao Paulo) Regions.
To learn more about P5.4xlarge instances, visit Amazon EC2 P5 instances.
Amazon SageMaker AI now supports P6e-GB200 UltraServers
Today, Amazon SageMaker AI announces support for P6e-GB200 UltraServers in SageMaker HyperPod and Training Jobs. With P6e-GB200 UltraServers, you can leverage up to 72 NVIDIA Blackwell GPUs under one NVLink domain to accelerate training and deployment of foundation models at trillion-parameter scale. P6e-GB200 UltraServers are available in two sizes: ml.u-p6e-gb200x72 (72 GPUs within NVLink) and ml.u-p6e-gb200x36 (36 GPUs within NVLink).
P6e-GB200 UltraServers deliver over 20x the compute and over 11x the memory of P5en instances within a single NVIDIA NVLink domain. Within each NVLink domain, you can leverage 360 petaflops of FP8 compute (without sparsity) and 13.4 TB of total high-bandwidth memory (HBM3e). When you use P6e-GB200 UltraServers on SageMaker AI, you get the GB200’s superior performance combined with SageMaker’s managed infrastructure, such as security, built-in fault tolerance, topology-aware scheduling (SageMaker HyperPod EKS and Slurm), integrated monitoring capabilities, and native integration with other SageMaker AI and AWS services. The UltraServers are available through SageMaker Flexible Training Plans in the Dallas Local Zone (“us-east-1-dfw-2a”), an extension of the US East (N. Virginia) AWS Region. For on-demand reservation of GB200 UltraServers, please reach out to your account manager. Amazon SageMaker AI lets you easily train and deploy machine learning models at scale using fully managed infrastructure optimized for performance and cost. To get started with UltraServers on SageMaker AI, visit the documentation.
Announcing new incentives for ISVs selling in AWS Marketplace
Amazon Web Services, Inc. (AWS) announces the launch of the AWS Marketplace Private Offer Promotion Program (MPOPP) in AWS Partner Central to support independent software vendors (ISVs) in driving new customer acquisition. The program is designed to accelerate sales through AWS Marketplace by offering AWS Promotional Credits to customers as an incentive for purchasing from participating ISVs. MPOPP offers benefits for AWS Partners at different stages of their AWS Marketplace journey: new AWS Marketplace Sellers can benefit from immediate funding support, and established sellers can benefit from special incentives for driving AWS Marketplace renewals.
Eligible Partners can submit self-service requests for funds through the AWS Partner Funding Portal year-round, enabling funding to be targeted for next-business-day delivery following Private Offer acceptance. The simplified funding template can help accelerate deal closure and improve speed-to-market with a fully automated approval process. Following deal completion and AWS Marketplace transaction verification, Promotional Credits are issued to the customer’s AWS account based on the Total Contract Value (TCV) and applicable program rates, streamlining the entire process from planning to credit disbursement. To learn more about the MPOPP, eligibility, and benefits, visit the AWS Partner Funding Benefits Guide (AWS Partner Central login required).
Amazon SageMaker HyperPod now supports custom AMIs (Amazon Machine Images)
Amazon SageMaker HyperPod now supports custom AMIs, enabling customers to deploy clusters with pre-configured, security-hardened environments that meet their specific organizational requirements. Customers deploying AI/ML workloads on HyperPod need customized environments that satisfy strict security, compliance, and operational requirements while maintaining fast cluster startup times, but they often struggle with complex lifecycle configuration scripts that slow deployment and create inconsistencies across cluster nodes.
This capability allows customers to build upon HyperPod’s performance-optimized base AMIs while incorporating customized security agents, compliance tools, proprietary libraries, and specialized drivers directly into the image, delivering faster startup times, improved reliability, and enhanced security compliance. Security teams can embed organizational policies directly into base images, allowing AI/ML teams to use pre-approved environments that accelerate time-to-training while meeting enterprise security standards. You can specify custom AMIs when creating new HyperPod clusters with the CreateCluster API, adding instance groups with the UpdateCluster API, or patching existing clusters with the UpdateClusterSoftware API. Custom AMIs must be built from HyperPod’s public base AMIs to maintain compatibility with distributed training libraries and cluster management capabilities. This feature is available in all AWS Regions where Amazon SageMaker HyperPod is supported. To learn more about custom AMI support, see the Amazon SageMaker HyperPod User Guide.
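To illustrate where a custom AMI fits in a CreateCluster call, the sketch below assembles a request dictionary of the kind you might pass to boto3. The "ImageId" field name, instance types, and ARNs are illustrative assumptions rather than confirmed API details; consult the Amazon SageMaker HyperPod User Guide for the exact request shape.

```python
# Hypothetical sketch: specifying a custom AMI per instance group when
# creating a SageMaker HyperPod cluster. Field names such as "ImageId"
# are assumptions for illustration -- verify against the HyperPod docs.

def build_create_cluster_request(cluster_name, ami_id, role_arn):
    """Assemble a CreateCluster request that pins a custom AMI."""
    return {
        "ClusterName": cluster_name,
        "InstanceGroups": [
            {
                "InstanceGroupName": "worker-group",
                "InstanceType": "ml.p5.48xlarge",
                "InstanceCount": 2,
                "ExecutionRole": role_arn,
                # Custom AMIs must be built from a HyperPod public base AMI.
                "ImageId": ami_id,
            }
        ],
    }

request = build_create_cluster_request(
    "my-hyperpod-cluster",
    "ami-0123456789abcdef0",  # placeholder AMI ID
    "arn:aws:iam::111122223333:role/HyperPodExecutionRole",
)
# The dict could then be passed to a boto3 SageMaker client, e.g.
# sagemaker_client.create_cluster(**request).
```

The same AMI reference would apply when adding instance groups via UpdateCluster or patching with UpdateClusterSoftware.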
Anthropic’s Claude Sonnet 4 in Amazon Bedrock now offers an expanded context window
Anthropic’s Claude Sonnet 4 in Amazon Bedrock is launching today with a significantly expanded context window in public preview. The context window has been increased from 200,000 to 1 million tokens, a 5x expansion. This enhancement allows Claude to process and reason over much larger amounts of text in a single request, opening up new possibilities for comprehensive analysis and generation tasks.
The expanded context window for Sonnet 4 brings many benefits to customers. For large-scale code analysis, users can now load entire codebases, including source files, tests, and documentation, enabling Sonnet 4 to understand project architecture, identify cross-file dependencies, and suggest improvements that account for the complete system design. In document synthesis, the model can now process extensive document sets like legal contracts, lengthy research papers, large datasets, or technical specifications in a single API call, analyzing relationships across hundreds of documents while maintaining full context. Additionally, this expansion allows for the creation of more sophisticated context-aware agents that can maintain coherence across hundreds of tool calls and multi-step workflows, including complete API documentation and interaction histories. The expanded context window for Claude Sonnet 4 is now available in public preview in Amazon Bedrock in the US West (Oregon), US East (N. Virginia), and US East (Ohio) AWS Regions. Prompts over 200,000 tokens will incur approximately twice the token price for input and 1.5 times for output. To get started with the expanded context window for Claude Sonnet 4, visit the Amazon Bedrock console.
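The long-context surcharge above can be illustrated with a small cost estimator: prompts exceeding 200,000 tokens are billed at roughly 2x the input-token rate and 1.5x the output-token rate. The base per-1K-token rates used here are placeholders, not actual Amazon Bedrock prices.

```python
# Illustrative estimator for the long-context pricing described above.
# Rates passed in are placeholders; check Amazon Bedrock pricing for
# the real Claude Sonnet 4 rates.

LONG_CONTEXT_THRESHOLD = 200_000  # tokens

def estimate_cost(input_tokens, output_tokens,
                  input_rate_per_1k, output_rate_per_1k):
    """Estimated request cost, applying the long-context multipliers."""
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        input_rate_per_1k *= 2.0    # ~2x input price beyond 200k tokens
        output_rate_per_1k *= 1.5   # ~1.5x output price
    return (input_tokens / 1000) * input_rate_per_1k \
         + (output_tokens / 1000) * output_rate_per_1k

# A 150k-token prompt is billed at base rates; a 500k-token prompt is not.
standard = estimate_cost(150_000, 2_000, 0.003, 0.015)   # 0.48
long_ctx = estimate_cost(500_000, 2_000, 0.003, 0.015)   # 3.045
```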
AWS Direct Connect announces 100G expansion in Cape Town, South Africa
Today, AWS announced the expansion of 100 Gbps dedicated connections at the AWS Direct Connect location in the Teraco CT1 data center near Cape Town, South Africa. You can now establish private, direct network access to all public AWS Regions (except those in China), AWS GovCloud Regions, and AWS Local Zones from this location. This is the second AWS Direct Connect location in South Africa to provide 100 Gbps connections with MACsec encryption capabilities.
The Direct Connect service enables you to establish a private, physical network connection between AWS and your data center, office, or colocation environment. These private connections can provide a more consistent network experience than those made over the public internet.
For more information on the over 142 Direct Connect locations worldwide, visit the locations section of the Direct Connect product detail pages. Or, visit our getting started page to learn more about how to purchase and deploy Direct Connect.
AWS Deadline Cloud introduces new cost-saving compute option
AWS Deadline Cloud is a fully managed service that simplifies render management for teams creating computer-generated graphics and visual effects for films, television, broadcasting, web content, and design. Today, we’re excited to announce a new wait and save feature for Deadline Cloud service-managed fleets that can reduce rendering costs, with prices starting as low as $0.006 per vCPU-hour.
This new feature is ideal for rendering workloads that are not time-sensitive and have flexible completion times. Submitting jobs using the wait and save approach allows you to achieve significant cost savings so you can do more creative iteration and exploration on your next project. The feature complements existing AWS Deadline Cloud compute options in its service-managed fleets, giving you more flexibility to optimize resource utilization across different priorities and budgets. AWS Deadline Cloud wait and save is available in all AWS Regions where AWS Deadline Cloud is offered. To learn more about this new cost-saving feature and how it can help optimize your rendering workloads, visit the AWS Deadline Cloud product page or review the AWS Deadline Cloud documentation.
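As a back-of-the-envelope illustration of the advertised $0.006 per vCPU-hour floor price, the snippet below estimates the cost of a render job. Actual wait and save pricing varies by Region and fleet configuration; this is arithmetic only, not a quote.

```python
# Cost arithmetic at the advertised wait-and-save starting price.
# Real prices depend on Region and configuration.

WAIT_AND_SAVE_RATE = 0.006  # USD per vCPU-hour (advertised starting price)

def render_cost(vcpus, hours, rate=WAIT_AND_SAVE_RATE):
    """Cost of a job consuming `vcpus` for `hours` wall-clock hours."""
    return vcpus * hours * rate

# e.g. a 64-vCPU worker rendering for 10 hours = 640 vCPU-hours:
cost = render_cost(64, 10)   # 3.84 USD at the floor rate
```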
Amazon OpenSearch Serverless now supports kNN Byte vector and new data types
Amazon OpenSearch Serverless has introduced several new features, including kNN Byte vector support, radial search capabilities for Vector collections, and new data types and mapping parameters such as strict_allow_templates, the wildcard field type, and the kuromoji_completion analyzer.
These enhancements deliver significant benefits for search and analytics workloads. kNN Byte vector support helps reduce costs through lower memory and storage requirements while improving latency and performance. Additional features, such as nested fields for storing multiple vectors in a single document, and the new mapping parameters provide greater flexibility and control in managing search operations without the complexity of infrastructure management. Please refer to the AWS Regional Services List for more information about Amazon OpenSearch Service availability. To learn more about OpenSearch Serverless, see the documentation.
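To show what a kNN Byte vector field looks like in practice, here is a sketch of an index mapping in the style of the open-source OpenSearch k-NN plugin, where byte vectors store each dimension as a signed 8-bit integer instead of a 32-bit float. The field names are made up and the exact mapping keys should be verified against the OpenSearch Serverless documentation.

```python
# Hypothetical index mapping with a kNN Byte vector field.
# "data_type": "byte" follows the open-source k-NN plugin convention;
# confirm the exact keys in the OpenSearch Serverless docs.
import json

mapping = {
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 128,
                "data_type": "byte",  # 8-bit ints (-128..127) vs 32-bit floats
            },
            "title": {"type": "text"},
        }
    }
}

# Serialized body for an index-creation request:
body = json.dumps(mapping)
```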
Amazon EC2 announces new capabilities for On-Demand Capacity Reservations in Cluster Placement Groups
Today, we are introducing multiple enhancements to Amazon EC2 On-Demand Capacity Reservations in Cluster Placement Groups (CPG-ODCRs). CPG-ODCRs provide customers with assured capacity and offer low latency and high throughput between instances within the same Cluster Placement Group (CPG). Customers using CPG-ODCRs can now benefit from two additional capabilities that make them easier to use. First, customers can add ODCRs belonging to different CPGs to Resource Groups, enabling them to manage and target groups of reservations spread across multiple Placement Groups. Second, customers can share CPG-ODCRs across multiple AWS accounts through AWS Resource Access Manager, which allows them to create central pools of capacity and use them efficiently across workloads in different accounts.
Customers can get started with these CPG-ODCR capabilities by using the AWS CLI/APIs or by visiting the AWS Management Console. These capabilities are now available in all AWS Regions except those in China, at no additional cost. To learn more, please refer to the Capacity Reservations user guide.
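The cross-account sharing flow can be sketched as a request to AWS Resource Access Manager's CreateResourceShare API. The ARN and account IDs below are placeholders, and the parameter names, while modeled on RAM's API, should be verified against the AWS RAM documentation before use.

```python
# Illustrative sketch: sharing a CPG-ODCR across accounts via AWS RAM.
# All identifiers are placeholders; verify field names in the RAM docs.

def build_resource_share(share_name, reservation_arn, account_ids):
    """Assemble a CreateResourceShare request for a capacity reservation."""
    return {
        "name": share_name,
        "resourceArns": [reservation_arn],  # the CPG-ODCR to share
        "principals": account_ids,          # consumer AWS account IDs
        "allowExternalPrincipals": False,   # keep within the organization
    }

share = build_resource_share(
    "cpg-odcr-pool",
    "arn:aws:ec2:us-east-1:111122223333:capacity-reservation/cr-0abc123",
    ["444455556666"],
)
# The dict could then be passed to a boto3 RAM client, e.g.
# ram_client.create_resource_share(**share).
```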
AWS Blogs
AWS Japan Blog (Japanese)
- Proven practices for a successful multi-cloud strategy
- Utilizing data to support challenges in the logistics industry — from the case of Nippon Express
- Weekly Generative AI with AWS — 2025/8/4
- AWS Weekly — 2025/8/4
- Information on the release of materials and videos for the AWS Black Belt webinar in June and July 2025
- Development of large-scale language models specialized for industry tasks (Interview with Nomura Research Institute)
- Ministry of Economy, Trade and Industry begins support for selected businesses in GENIAC Infrastructure Model Development Support Project (Phase 3)
AWS for Industries
- Lotte Homeshopping Reduces Human Agent Workload by 40% with Sendbird on AWS
- Smoking Out Costs: How Traeger Grills Cut Per-Device Cloud Costs by 50% with AWS
Artificial Intelligence
- Train and deploy AI models at trillion-parameter scale with Amazon SageMaker HyperPod support for P6e-GB200 UltraServers
- How Indegene’s AI-powered social intelligence for life sciences turns social media conversations into insights
- Unlocking enhanced legal document review with Lexbe and Amazon Bedrock
- Automate AIOps with SageMaker Unified Studio Projects, Part 2: Technical implementation
- Automate AIOps with Amazon SageMaker Unified Studio projects, Part 1: Solution architecture