Microsoft, Dell Technologies, HPE, Lenovo, Meta, Oracle, Supermicro and others adopt new AMD Instinct MI300X and MI300A data center AI accelerators for training and inference solutions

AMD also launches ROCm 6 software stack with significant performance optimizations and new features for Large Language Models and Ryzen 8040 Series notebook processors for AI PCs

SAN JOSE, Calif., Dec. 06, 2023 (GLOBE NEWSWIRE) Today at the Advancing AI event, AMD (NASDAQ: AMD) was joined by industry leaders including Microsoft, Meta, Oracle, Dell Technologies, HPE, Lenovo, Supermicro, Arista, Broadcom and Cisco to showcase how these companies are working with AMD to deliver advanced AI solutions spanning from cloud to enterprise and PCs. AMD launched multiple new products at the event, including the AMD Instinct™ MI300 Series data center AI accelerators, ROCm™ 6 open software stack with significant optimizations and new features supporting Large Language Models (LLMs) and Ryzen™ 8040 Series processors with Ryzen AI.

"AI is the future of computing and AMD is uniquely positioned to power the end-to-end infrastructure that will define this AI era, from massive cloud installations to enterprise clusters and AI-enabled intelligent embedded devices and PCs," said AMD Chair and CEO Dr. Lisa Su. "We are seeing very strong demand for our new Instinct MI300 GPUs, which are the highest-performance accelerators in the world for generative AI. We are also building significant momentum for our data center AI solutions with the largest cloud companies, the industry's top server providers, and the most innovative AI startups, who we are working closely with to rapidly bring Instinct MI300 solutions to market that will dramatically accelerate the pace of innovation across the entire AI ecosystem."¹

Advancing Data Center AI from the Cloud to Enterprise Data Centers and Supercomputers

AMD was joined by multiple partners during the event to highlight the strong adoption and growing momentum for the AMD Instinct data center AI accelerators.

Microsoft detailed how it is deploying AMD Instinct MI300X accelerators to power the new Azure ND MI300x v5 Virtual Machine (VM) series optimized for AI workloads.

Bringing an Open, Proven and Ready AI Software Platform to Market

AMD highlighted significant progress expanding the software ecosystem supporting AMD Instinct data center accelerators.

AMD unveiled the latest version of the open-source software stack for AMD Instinct GPUs, ROCm 6, which has been optimized for generative AI, particularly large language models. ROCm 6 boasts support for new data types, advanced graph and kernel optimizations, optimized libraries and state-of-the-art attention algorithms, which together with MI300X deliver an ~8x performance increase for overall latency in text generation on Llama 2 compared to ROCm 5 running on the MI250.²

Read here for more information about AMD Instinct MI300 Series accelerators, ROCm 6 and the growing ecosystem of AMD-powered AI solutions.

Continued Leadership in Advancing AI PCs

With millions of AI PCs shipped to date, AMD announced new leadership mobile processors with the launch of the latest AMD Ryzen 8040 Series processors that deliver robust AI compute capability. AMD also launched Ryzen AI 1.0 Software, a software stack that enables developers to easily deploy apps that use pretrained models to add AI capabilities for Windows applications. AMD also disclosed that the upcoming next-gen "Strix Point" CPUs, planned to launch in 2024, will include the AMD XDNA™ 2 architecture designed to deliver more than a 3x increase in AI compute performance compared to the prior generation³ that will enable new generative AI experiences. Microsoft also joined to discuss how they are working closely with AMD on future AI experiences for Windows PCs.

1 Measurements conducted by AMD Performance Labs as of November 11th, 2023 on the AMD Instinct™ MI300X (750W) GPU designed with AMD CDNA™ 3 5nm | 6nm FinFET process technology at 2,100 MHz peak boost engine clock resulted in 163.4 TFLOPs peak theoretical double precision Matrix (FP64 Matrix), 81.7 TFLOPs peak theoretical double precision (FP64), 163.4 TFLOPs peak theoretical single precision Matrix (FP32 Matrix), 163.4 TFLOPs peak theoretical single precision (FP32), 653.7 TFLOPs peak theoretical TensorFloat-32 (TF32), 1307.4 TFLOPs peak theoretical half precision (FP16), 1307.4 TFLOPs peak theoretical Bfloat16 format precision (BF16), 2614.9 TFLOPs peak theoretical 8-bit precision (FP8), 2614.9 TOPs INT8 floating-point performance.

Published results on Nvidia H100 SXM (80GB) GPU resulted in 66.9 TFLOPs peak theoretical double precision tensor (FP64 Tensor), 33.5 TFLOPs peak theoretical double precision (FP64), 66.9 TFLOPs peak theoretical single precision (FP32), 494.7 TFLOPs peak TensorFloat-32 (TF32)*, 989.4 TFLOPs peak theoretical half precision tensor (FP16 Tensor), 133.8 TFLOPs peak theoretical half precision (FP16), 989.4 TFLOPs peak theoretical Bfloat16 tensor format precision (BF16 Tensor), 133.8 TFLOPs peak theoretical Bfloat16 format precision (BF16), 1,978.9 TFLOPs peak theoretical 8-bit precision (FP8), 1,978.9 TOPs peak theoretical INT8 floating-point performance.

Nvidia H100 source:

https://resources.nvidia.com/en-us-tensor-core/

* Nvidia H100 GPUs don't support FP32 Tensor.

MI300-18

2 Text generated with Llama2-70b chat using input sequence length of 4096 and 32 output token comparison using custom docker container for each system based on AMD internal testing as of 11/17/2023. Configurations: 2P Intel Xeon Platinum CPU server using 4x AMD Instinct™ MI300X (192GB, 750W) GPUs, ROCm® 6.0 pre-release, PyTorch 2.2.0, vLLM for ROCm, Ubuntu® 22.04.2. Vs. 2P AMD EPYC 7763 CPU server using 4x AMD Instinct™ MI250 (128 GB HBM2e, 560W) GPUs, ROCm® 5.4.3, PyTorch 2.0.0, HuggingFace Transformers 4.35.0, Ubuntu 22.04.6.

4 GPUs on each system were used in this test. Server manufacturers may vary configurations, yielding different results. Performance may vary based on use of latest drivers and optimizations. MI300-33

3 An AMD Ryzen "Strix Point" processor is projected to offer 3x faster NPU performance for AI workloads when compared to an AMD Ryzen 7040 series processor. Performance projection by AMD engineering staff. Engineering projections are not a guarantee of final performance. Specific projections are based on reference design platforms and are subject to change when final products are released in market. STX-01.