AMD has wrapped up its Advancing AI 2024 keynote, unveiling a range of new hardware aimed at meeting growing demand across the industrial and business sectors.
Leading the charge are the 5th Gen EPYC processors, codenamed “Turin” and first announced at COMPUTEX 2024. Built on the Zen 5 architecture, these chips offer a massive performance boost, delivering more than 2.7x the performance of competing processors. The cream of the crop is the flagship EPYC 9965, featuring an astounding 192 cores and 384 threads.
AMD is offering a variety of SKUs with configurations ranging from core counts and boost clocks to L3 cache capacities, ensuring there’s something for everyone. The lineup even includes an 8-core entry-level model, ideal for users with lighter processing needs who still require professional-grade features.
The platform supports 12-channel DDR5 memory at speeds up to DDR5-6400 and includes AVX-512 with full 512-bit data paths. Designed for modern data center workloads, the chips offer up to 17% higher Instructions Per Clock (IPC) in enterprise and cloud tasks, and up to 37% higher IPC in AI and HPC workloads compared to the previous generation.
The EPYC 9965 excels with:
– 4x faster video transcoding
– 3.9x quicker scientific computation in HPC
– 1.6x better virtualized infrastructure performance
For AI-driven workloads, the EPYC 9965 offers up to 3.7x the performance of competing processors, particularly excelling in small to medium generative AI models like Meta’s Llama 3.1-8B, providing nearly twice the throughput. The AI-specialized EPYC 9575F, boasting a 5GHz boost clock, powers large-scale AI clusters, delivering up to 700,000 more inference tokens per second.
Here’s the full list of SKUs for the initial launch.
| Model | Cores | CCD | Base / Boost (up to GHz) | Default TDP (W) | L3 Cache (MB) | Price |
|---|---|---|---|---|---|---|
| 9965 | 192 | Zen 5c | 2.25 / 3.7 | 500 | 384 | $14,813 |
| 9845 | 160 | Zen 5c | 2.1 / 3.7 | 390 | 320 | $13,564 |
| 9825 | 144 | Zen 5c | 2.2 / 3.7 | 390 | 384 | $13,006 |
| 9755 | 128 | Zen 5 | 2.7 / 4.1 | 500 | 512 | $12,984 |
| 9745 | 128 | Zen 5c | 2.4 / 3.7 | 400 | 256 | — |
| 9655 | 96 | Zen 5 | 2.6 / 4.5 | 400 | 384 | $11,852 |
| 9655P | 96 | Zen 5 | 2.6 / 4.5 | 400 | 384 | — |
| 9645 | 96 | Zen 5c | 2.3 / 3.7 | 320 | 384 | — |
| 9565 | 72 | Zen 5 | 3.15 / 4.3 | 400 | 384 | $10,486 |
| 9575F | 64 | Zen 5 | 3.3 / 5.0 | 400 | 256 | $11,791 |
| 9555 | 64 | Zen 5 | 3.2 / 4.4 | 360 | 256 | — |
| 9555P | 64 | Zen 5 | 3.2 / 4.4 | 360 | 256 | — |
| 9535 | 64 | Zen 5 | 2.4 / 4.3 | 300 | 256 | — |
| 9475F | 48 | Zen 5 | 3.65 / 4.8 | 400 | 256 | $7,592 |
| 9455 | 48 | Zen 5 | — | 300 | 192 | $5,412 |
| 9455P | 48 | Zen 5 | — | 300 | 192 | $4,819 |
| 9365 | 36 | Zen 5 | 3.4 / 4.3 | 300 | 256 | $4,341 |
| 9375F | 32 | Zen 5 | 3.8 / 4.8 | 320 | 256 | $5,306 |
| 9355 | 32 | Zen 5 | — | 280 | 256 | $3,694 |
| 9355P | 32 | Zen 5 | — | 280 | 256 | $2,998 |
| 9335 | 32 | Zen 5 | — | 210 | 256 | $3,178 |
| 9275F | 24 | Zen 5 | 4.1 / 4.8 | 320 | 256 | $3,439 |
| 9255 | 24 | Zen 5 | 3.25 / 4.3 | 200 | 128 | — |
| 9175F | 16 | Zen 5 | 4.2 / 5.0 | 320 | 512 | $4,256 |
| 9135 | 16 | Zen 5 | 3.65 / 4.3 | 200 | 64 | $1,214 |
| 9115 | 16 | Zen 5 | 2.6 / 4.1 | 125 | 64 | $726 |
| 9015 | 8 | Zen 5 | 3.6 / 4.1 | 125 | 64 | $527 |
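The spread of the lineup is easier to see as cost per core. The snippet below is a quick illustration using core counts and list prices from the SKU table above; the `price_per_core` helper is my own, not an AMD metric.

```python
# Core counts and list prices for three SKUs, taken from the table above.
skus = {
    "9965": (192, 14813),  # flagship, Zen 5c
    "9755": (128, 12984),  # top classic-Zen 5 part
    "9015": (8, 527),      # entry-level model
}

def price_per_core(model: str) -> float:
    """Dollars per core, rounded to cents (illustrative helper)."""
    cores, price = skus[model]
    return round(price / cores, 2)

for model, (cores, price) in skus.items():
    print(f"EPYC {model}: ${price_per_core(model)}/core ({cores} cores at ${price:,})")
```

Interestingly, the dense Zen 5c flagship and the 8-core entry part land in a similar dollars-per-core band, while the frequency-optimized parts sit well above it.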
AMD also unveiled the highly anticipated Instinct MI325X accelerator, built on the CDNA 3 architecture. With 256GB of HBM3E memory and 6.0TB/s of memory bandwidth, it surpasses NVIDIA’s H200, offering 1.8x the capacity, 1.3x the bandwidth, and superior peak FP16 and FP8 compute performance. It also shows a notable 1.3x improvement in inference on models such as Mistral 7B at FP16 and Llama 3.1 70B at FP8.
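Those multiples can be sanity-checked with simple division. The H200 figures below (141 GB of HBM3E, 4.8 TB/s) are my assumption from NVIDIA's published specs, not from the article; note the raw bandwidth ratio works out to 1.25x, which the 1.3x marketing claim rounds up.

```python
# MI325X specs as quoted in the article; H200 specs are assumed published figures.
mi325x = {"hbm3e_gb": 256, "bandwidth_tbs": 6.0}
h200 = {"hbm3e_gb": 141, "bandwidth_tbs": 4.8}  # assumption, not from the article

capacity_ratio = mi325x["hbm3e_gb"] / h200["hbm3e_gb"]
bandwidth_ratio = mi325x["bandwidth_tbs"] / h200["bandwidth_tbs"]

print(f"capacity: {capacity_ratio:.2f}x")    # ~1.82x, matching the quoted 1.8x
print(f"bandwidth: {bandwidth_ratio:.2f}x")  # 1.25x, rounded to the quoted 1.3x
```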
Availability of the MI325X is expected in Q4 2024, with platform providers such as Dell, Hewlett Packard Enterprise, Gigabyte, and Lenovo offering systems starting in Q1 2025. AMD also previewed the future Instinct MI350 lineup, featuring CDNA 4 architecture, which promises up to 35x performance improvements and up to 288GB of HBM3E memory, launching in the latter half of 2025.
In the DPU sector, AMD introduced the 3rd generation Pensando Salina DPU, offering double the performance, bandwidth, and scalability of its predecessor. Supporting 400G throughput, it enhances security and scalability for AI-driven data applications.
Additionally, the Pensando Pollara 400 NIC, featuring Ultra Ethernet Consortium (UEC) readiness and a P4 programmable engine, leads in performance and efficiency for accelerator-to-accelerator communication. Both parts will reach customers in Q4 2024, with general availability in early 2025.
For business users, AMD introduced PRO versions of the Ryzen AI 300 series processors, comprising two Ryzen AI 9 models and one Ryzen AI 7 model, powering business-ready laptops by the end of 2024. These chips match the performance of their consumer counterparts while adding exclusive security features such as Secure Processor, Shadow Stack, and Platform Secure Boot, plus new additions like Cloud Bare Metal Recovery and AMD Device Identity.
On the software front, AMD ROCm has been updated to version 6.2, adding support for the FP8 datatype, Flash Attention 3, and kernel fusion, enabling up to 2.4x faster inference and 1.8x faster LLM training. The ecosystem supports major frameworks such as PyTorch, Triton, and Hugging Face, along with generative AI models like Stable Diffusion 3 and Meta Llama 3.1, ensuring broad compatibility with AMD’s Instinct accelerators.
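The FP8 datatype that ROCm 6.2 adds is a hardware format, but its behavior is easy to illustrate. Below is a rough pure-Python sketch of rounding to the common E4M3 variant (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits); this is my own illustration under those assumptions, not ROCm code, and it ignores NaN encodings.

```python
import math

def fp8_e4m3_round(x: float) -> float:
    """Round x to the nearest value representable in FP8 E4M3 (sketch)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    # Clamp the exponent to E4M3's normal/subnormal range [-6, 8].
    exp = max(min(math.floor(math.log2(mag)), 8), -6)
    step = 2.0 ** (exp - 3)      # spacing of representable values at this exponent
    q = round(mag / step) * step
    return sign * min(q, 448.0)  # 448 is the largest finite E4M3 value

print(fp8_e4m3_round(0.3))     # quantizes to 0.3125
print(fp8_e4m3_round(1000.0))  # saturates at 448.0
```

The coarse 3-bit mantissa and saturation at 448 show why FP8 roughly doubles throughput over FP16 at the cost of precision, which is exactly the trade-off inference workloads exploit.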