ASAP · Ai Soon As Possible · AI & tech, delivered fastest

Amazon EC2 G7 launches: NVIDIA Blackwell brings 4.6x AI inference and 10x vector search

2026-06-24 · 3 min read

Cloud AI inference and vector search are getting faster at the same time. On June 23, 2026, NVIDIA and AWS launched the Blackwell-based Amazon EC2 G7 instances. G7 uses NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs to deliver up to 4.6x the AI inference performance of the prior G6 generation, while Amazon OpenSearch Serverless uses NVIDIA cuVS to index vectors up to 10x faster than on CPU at a quarter of the cost. ASAP summarizes the announcement from the primary source.

Blackwell-based EC2 G7: 4.6x inference, 2.1x graphics

Amazon EC2 G7 instances use NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs to deliver up to 4.6x the AI inference performance and up to 2.1x the graphics performance of the prior G6 generation. A single instance packs up to 8 GPUs, 256GB of total GPU memory, 700 Gbps of EFA networking, and 7.6TB of NVMe SSD storage. G7 works with AWS Deep Learning AMIs and Containers, EMR, EKS, and ECS, with SageMaker AI support coming soon.
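To put the memory figure in per-GPU terms, here is a minimal Python sketch. It assumes the 256GB total applies to the 8-GPU configuration and is split evenly across GPUs, which the announcement does not spell out. The function names and the model sizes in the usage lines are hypothetical, and the estimate counts model weights only, ignoring KV cache and activations.

```python
# Figures quoted in the announcement for the largest G7 instance.
TOTAL_GPU_MEMORY_GB = 256
MAX_GPUS = 8


def memory_per_gpu_gb(total_gb: float = TOTAL_GPU_MEMORY_GB,
                      gpus: int = MAX_GPUS) -> float:
    """Per-GPU memory, assuming the total is split evenly (an assumption)."""
    return total_gb / gpus


def weights_footprint_gb(params_billion: float, bytes_per_param: float) -> float:
    """Rough weight size in decimal GB: 1e9 params * bytes / 1e9 bytes per GB."""
    return params_billion * bytes_per_param


def fits_on_one_gpu(params_billion: float, bytes_per_param: float) -> bool:
    """True if the weights alone fit in one GPU's share of memory."""
    return weights_footprint_gb(params_billion, bytes_per_param) <= memory_per_gpu_gb()


if __name__ == "__main__":
    print(memory_per_gpu_gb())     # per-GPU share of the 256GB total
    print(fits_on_one_gpu(8, 2))   # hypothetical 8B-parameter model at 16-bit
    print(fits_on_one_gpu(70, 2))  # hypothetical 70B-parameter model at 16-bit
```

Under these assumptions each GPU gets 32GB, so a model whose weights exceed that would need to be sharded across GPUs or quantized.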

OpenSearch Serverless plus cuVS: 10x vector indexing

Amazon OpenSearch Serverless makes GPU-accelerated vector indexing a default capability through NVIDIA cuVS, running up to 10x faster than CPU at a quarter of the cost. NVIDIA and AWS say billion-scale vector databases become practical to build in under an hour. The change cuts both the cost and the time of indexing for RAG and search services at once.
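The "billion-scale in under an hour" and "10x at a quarter of the cost" claims can be turned into rough numbers. The Python sketch below is back-of-the-envelope arithmetic, not a benchmark: it treats the quoted 10x and one-quarter figures as best-case multipliers, and the CPU baseline in the usage lines (10 hours, 100 cost units) is a hypothetical placeholder rather than a figure from the source.

```python
def required_throughput(num_vectors: float, seconds: float) -> float:
    """Sustained vectors per second needed to index num_vectors in the given time."""
    return num_vectors / seconds


def best_case_gpu_estimate(cpu_hours: float, cpu_cost: float,
                           speedup: float = 10.0,
                           cost_ratio: float = 0.25) -> tuple[float, float]:
    """Scale a CPU baseline by the announced maximums (up to 10x faster, 1/4 cost).

    Real workloads may see smaller gains; these are upper bounds.
    """
    return cpu_hours / speedup, cpu_cost * cost_ratio


if __name__ == "__main__":
    # One billion vectors in one hour (3600 seconds).
    print(round(required_throughput(1e9, 3600)))

    # Hypothetical CPU baseline: 10 hours and 100 cost units.
    print(best_case_gpu_estimate(10.0, 100.0))
```

Indexing a billion vectors in an hour works out to roughly 278,000 vectors per second sustained, which gives a sense of the scale behind the claim.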

Extending to GB300 training: AWS earns NVIDIA Exemplar Cloud status

AWS earned NVIDIA Exemplar Cloud certification for NVIDIA GB300 training workloads. Exemplar status means AWS meets NVIDIA's reference-architecture performance benchmarks. Verification now spans not just inference GPUs but large-scale training infrastructure as well.

What it means: inference, search, and training in one cloud stack

The NVIDIA and AWS announcement ties together G7 for inference, cuVS for search, and GB300 for training in a single cloud stack. The 4.6x and 10x figures are maximums, measured against the prior G6 generation and against CPU indexing respectively, so real workloads may differ. The more durable message is the unified path from inference to training inside one cloud.

Wrap-up

NVIDIA and AWS launched the Blackwell-based Amazon EC2 G7 instances. The core of the announcement is threefold: up to 4.6x inference from RTX PRO 4500 Blackwell, up to 10x faster vector indexing from cuVS in OpenSearch Serverless, and AWS's Exemplar Cloud certification for GB300 training. Inference, search, and training converge into one cloud stack.

Source: ASAP summary of NVIDIA's blog "NVIDIA and AWS Collaborate to Bring AI to Production at Scale" (June 23, 2026; Amazon EC2 G7 with RTX PRO 4500 Blackwell Server Edition, up to 4.6x AI inference and up to 2.1x graphics over G6, up to 8 GPUs, 256GB, 700 Gbps EFA, 7.6TB NVMe, Amazon OpenSearch Serverless with NVIDIA cuVS for up to 10x vector indexing at a quarter of the cost, billion-scale vector databases in under an hour, and AWS NVIDIA GB300 Exemplar Cloud certification).
