Amazon EC2 G7 launches: NVIDIA Blackwell brings up to 4.6x AI inference and 10x vector indexing
Cloud AI inference and vector search are getting faster at the same time. On June 23, 2026, NVIDIA and AWS launched the Blackwell-based Amazon EC2 G7 instances. G7 uses NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs to deliver up to 4.6x the AI inference performance of the prior G6 generation, while Amazon OpenSearch Serverless uses NVIDIA cuVS to index vectors up to 10x faster than on CPU at a quarter of the cost. ASAP summarizes the announcement from the primary source.
Blackwell-based EC2 G7: 4.6x inference, 2.1x graphics
Amazon EC2 G7 instances use NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs to deliver up to 4.6x the AI inference performance and up to 2.1x the graphics performance of the prior G6 generation. A single instance packs up to 8 GPUs, 256 GB of total GPU memory, 700 Gbps EFA networking, and 7.6 TB of NVMe SSD storage. G7 is supported by AWS Deep Learning AMIs and Containers, EMR, EKS, and ECS, with SageMaker AI support coming soon.
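As a quick sanity check on the instance figures, the Python sketch below divides the 256 GB total by the 8 GPUs to get per-GPU memory, then makes a rough estimate of how many GPUs a model's weights would occupy. The function names, the 1.2x overhead factor, and the example model sizes are illustrative assumptions, not numbers from the announcement.

```python
import math


def per_gpu_memory_gb(total_gb: float, gpu_count: int) -> float:
    """Split an instance's total GPU memory evenly across its GPUs."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return total_gb / gpu_count


def min_gpus_for_weights(
    params_billions: float,
    bytes_per_param: float,
    gpu_gb: float,
    overhead: float = 1.2,  # assumed headroom for activations and caches
) -> int:
    """Rough lower bound on the GPUs needed just to hold model weights."""
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    weights_gb = params_billions * bytes_per_param
    return math.ceil(weights_gb * overhead / gpu_gb)


# Figures from the article: up to 8 GPUs and 256 GB of total GPU memory.
gpu_gb = per_gpu_memory_gb(256, 8)
print(gpu_gb)  # 32.0

# Hypothetical models: 8B and 70B parameters at 2 bytes per parameter.
print(min_gpus_for_weights(8, 2, gpu_gb))
print(min_gpus_for_weights(70, 2, gpu_gb))
```

The estimate ignores tensor-parallel communication and KV-cache growth, so treat it as a floor for sizing rather than a deployment plan.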
OpenSearch Serverless plus cuVS: 10x vector indexing
Amazon OpenSearch Serverless makes GPU-accelerated vector indexing a default capability through NVIDIA cuVS, indexing up to 10x faster than on CPU at a quarter of the cost. NVIDIA and AWS say billion-scale vector databases become practical to build in under an hour. For RAG and search services, the change cuts both the cost and the time of indexing at once.
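To make the "up to 10x faster at a quarter of the cost" claim concrete, the Python sketch below projects a best-case GPU indexing job from a CPU baseline. The baseline of 8 hours and $400 is a made-up example, and the `IndexingJob` class and `best_case_gpu_projection` function are hypothetical names. Because the published figures are maximums, the result is an upper bound on the improvement, not a prediction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexingJob:
    hours: float
    cost_usd: float


def best_case_gpu_projection(
    cpu: IndexingJob,
    speedup: float = 10.0,       # "up to 10x faster" from the article
    cost_fraction: float = 0.25, # "a quarter of the cost" from the article
) -> IndexingJob:
    """Apply the headline maximums to a CPU baseline (best case only)."""
    if speedup <= 0 or cost_fraction <= 0:
        raise ValueError("speedup and cost_fraction must be positive")
    return IndexingJob(
        hours=cpu.hours / speedup,
        cost_usd=cpu.cost_usd * cost_fraction,
    )


# Hypothetical CPU baseline: an 8-hour index build costing $400.
baseline = IndexingJob(hours=8.0, cost_usd=400.0)
projected = best_case_gpu_projection(baseline)
print(f"{projected.hours:.1f} h, ${projected.cost_usd:.2f}")  # 0.8 h, $100.00
```

Passing a smaller `speedup` or a larger `cost_fraction` gives a more conservative projection for workloads that fall short of the maximums.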
Extending to GB300 training: AWS earns NVIDIA Exemplar Cloud status
AWS earned NVIDIA Exemplar Cloud certification for NVIDIA GB300 training workloads. Exemplar status means AWS meets NVIDIA's reference-architecture performance benchmarks. Verification now spans not just inference GPUs but large-scale training infrastructure as well.
What it means: inference, search, and training in one cloud stack
The NVIDIA and AWS announcement ties together G7 for inference, cuVS for search, and GB300 for training in a single cloud stack. The 4.6x and 10x figures are maximums, measured against the prior G6 generation and against CPU indexing respectively, so real workloads may differ. The more durable message is the unified path from inference to training inside one cloud.
Wrap-up
NVIDIA and AWS launched the Blackwell-based Amazon EC2 G7 instances. Three points form the core of the announcement: up to 4.6x inference from RTX PRO 4500 Blackwell, up to 10x faster vector indexing from OpenSearch Serverless with cuVS, and AWS's Exemplar Cloud certification for GB300 training. Inference, search, and training converge into one cloud stack.
Source: ASAP summary of NVIDIA's blog "NVIDIA and AWS Collaborate to Bring AI to Production at Scale" (June 23, 2026; Amazon EC2 G7 with RTX PRO 4500 Blackwell Server Edition, up to 4.6x AI inference and up to 2.1x graphics over G6, up to 8 GPUs, 256GB, 700 Gbps EFA, 7.6TB NVMe, Amazon OpenSearch Serverless with NVIDIA cuVS for up to 10x vector indexing at a quarter of the cost, billion-scale vector databases in under an hour, and AWS NVIDIA GB300 Exemplar Cloud certification).
Ai Soon As Possible · asapai.co.kr
