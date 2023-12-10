— Micro­soft, Dell Tech­no­lo­gies, HPE, Leno­vo, Meta, Ora­cle, Super­mi­cro and others adopt new AMD Instinct MI300X and MI300A data cen­ter AI acce­le­ra­tors for trai­ning and infe­rence solutions —

— AMD also laun­ches ROCm 6 soft­ware stack with signi­fi­cant per­for­mance opti­miza­ti­ons and new fea­tures for Lar­ge Lan­guage Models and Ryzen 8040 Series note­book pro­ces­sors for AI PCs —

SAN JOSE, Calif., Dec. 06, 2023 (GLOBE NEWSWIRE) — Today at the “Advan­cing AI” event, AMD (NASDAQ: AMD) was joi­n­ed by indus­try lea­ders inclu­ding Micro­soft, Meta, Ora­cle, Dell Tech­no­lo­gies, HPE, Leno­vo, Super­mi­cro, Aris­ta, Broad­com and Cis­co to show­ca­se how the­se com­pa­nies are working with AMD to deli­ver advan­ced AI solu­ti­ons span­ning from cloud to enter­pri­se and PCs. AMD laun­ched mul­ti­ple new pro­ducts at the event, inclu­ding the AMD Instinct™ MI300 Series data cen­ter AI acce­le­ra­tors, ROCm™ 6 open soft­ware stack with signi­fi­cant opti­miza­ti­ons and new fea­tures sup­port­ing Lar­ge Lan­guage Models (LLMs) and Ryzen™ 8040 Series pro­ces­sors with Ryzen AI.

“AI is the future of computing and AMD is uniquely positioned to power the end-to-end infrastructure that will define this AI era, from massive cloud installations to enterprise clusters and AI-enabled intelligent embedded devices and PCs,” said AMD Chair and CEO Dr. Lisa Su. “We are seeing very strong demand for our new Instinct MI300 GPUs, which are the highest-performance accelerators in the world for generative AI. We are also building significant momentum for our data center AI solutions with the largest cloud companies, the industry’s top server providers, and the most innovative AI startups, who we are working closely with to rapidly bring Instinct MI300 solutions to market that will dramatically accelerate the pace of innovation across the entire AI ecosystem.”¹

Advan­cing Data Cen­ter AI from the Cloud to Enter­pri­se Data Cen­ters and Supercomputers

AMD was joi­n­ed by mul­ti­ple part­ners during the event to high­light the strong adop­ti­on and gro­wing momen­tum for the AMD Instinct data cen­ter AI accelerators.

Microsoft detailed how it is deploying AMD Instinct MI300X accelerators to power the new Azure ND MI300x v5 Virtual Machine (VM) series optimized for AI workloads.

Meta shared that the com­pa­ny is adding AMD Instinct MI300X acce­le­ra­tors to its data cen­ters in com­bi­na­ti­on with ROCm 6 to power AI infe­ren­cing workloads and reco­gni­zed the ROCm 6 opti­miza­ti­ons AMD has done on the Llama 2 fami­ly of models.

Ora­cle unvei­led plans to offer OCI bare metal com­pu­te solu­ti­ons fea­turing AMD Instinct MI300X acce­le­ra­tors as well as plans to include AMD Instinct MI300X acce­le­ra­tors in their upco­ming gene­ra­ti­ve AI service.

The largest data center infrastructure providers announced plans to integrate AMD Instinct MI300 accelerators across their product portfolios.

Dell announced the integration of AMD Instinct MI300X accelerators with their PowerEdge XE9680 server solution to deliver groundbreaking performance for generative AI workloads in a modular and scalable format for customers.

HPE announced plans to bring AMD Instinct MI300 accelerators to its enterprise and HPC offerings.

Lenovo shared plans to bring AMD Instinct MI300X accelerators to the Lenovo ThinkSystem platform to deliver AI solutions across industries including retail, manufacturing, financial services and healthcare.

Supermicro announced plans to offer AMD Instinct MI300 GPUs across their AI solutions portfolio.

Asus, Gigabyte, Ingrasys, Inventec, QCT, Wistron and Wiwynn also all plan to offer solutions powered by AMD Instinct MI300 accelerators.

Specialized AI cloud providers including Aligned, Arkon Energy, Cirrascale, Crusoe, Denvr Dataworks and Tensorwaves all plan to provide offerings that will expand access to AMD Instinct MI300X GPUs for developers and AI startups.

Brin­ging an Open, Pro­ven and Rea­dy AI Soft­ware Plat­form to Market

AMD high­ligh­ted signi­fi­cant pro­gress expan­ding the soft­ware eco­sys­tem sup­port­ing AMD Instinct data cen­ter accelerators.

AMD unveiled the latest version of the open-source software stack for AMD Instinct GPUs, ROCm 6, which has been optimized for generative AI, particularly large language models. ROCm 6 boasts support for new data types, advanced graph and kernel optimizations, optimized libraries and state-of-the-art attention algorithms, which together with MI300X deliver an ~8x performance increase for overall latency in text generation on Llama 2 compared to ROCm 5 running on the MI250.²

Databricks, Essential AI and Lamini, three AI startups building emerging models and AI solutions, joined AMD on stage to discuss how they're leveraging AMD Instinct MI300X accelerators and the open ROCm 6 software stack to deliver differentiated AI solutions for enterprise customers.

Ope­nAI is adding sup­port for AMD Instinct acce­le­ra­tors to Tri­ton 3.0, pro­vi­ding out-of-the-box sup­port for AMD acce­le­ra­tors that will allow deve­lo­pers to work at a hig­her level of abs­trac­ti­on on AMD hardware.

Read here for more infor­ma­ti­on about AMD Instinct MI300 Series acce­le­ra­tors, ROCm 6 and the gro­wing eco­sys­tem of AMD-powered AI solutions.

Con­tin­ued Lea­der­ship in Advan­cing AI PCs

With millions of AI PCs shipped to date, AMD announced new leadership mobile processors with the launch of the latest AMD Ryzen 8040 Series processors that deliver robust AI compute capability. AMD also launched Ryzen AI 1.0 Software, a software stack that enables developers to easily deploy apps that use pretrained models to add AI capabilities to Windows applications. In addition, AMD disclosed that the upcoming next-gen “Strix Point” CPUs, planned to launch in 2024, will include the AMD XDNA™ 2 architecture, designed to deliver more than a 3x increase in AI compute performance compared to the prior generation³ and to enable new generative AI experiences. Microsoft joined to discuss how they are working closely with AMD on future AI experiences for Windows PCs.

1 Mea­su­re­ments con­duc­ted by AMD Per­for­mance Labs as of Novem­ber 11th, 2023 on the AMD Instinct™ MI300X (750W) GPU desi­gned with AMD CDNA™ 3 5nm | 6nm Fin­FET pro­cess tech­no­lo­gy at 2,100 MHz peak boost engi­ne clock resul­ted in 163.4 TFLOPs peak theo­re­ti­cal dou­ble pre­cis­i­on Matrix (FP64 Matrix), 81.7 TFLOPs peak theo­re­ti­cal dou­ble pre­cis­i­on (FP64), 163.4 TFLOPs peak theo­re­ti­cal sin­gle pre­cis­i­on Matrix (FP32 Matrix), 163.4 TFLOPs peak theo­re­ti­cal sin­gle pre­cis­i­on (FP32), 653.7 TFLOPs peak theo­re­ti­cal Ten­sor­Float-32 (TF32), 1307.4 TFLOPs peak theo­re­ti­cal half pre­cis­i­on (FP16), 1307.4 TFLOPs peak theo­re­ti­cal Bfloat16 for­mat pre­cis­i­on (BF16), 2614.9 TFLOPs peak theo­re­ti­cal 8‑bit pre­cis­i­on (FP8), 2614.9 TOPs INT8 floa­ting-point performance.

Published results on Nvi­dia H100 SXM (80GB) GPU resul­ted in 66.9 TFLOPs peak theo­re­ti­cal dou­ble pre­cis­i­on ten­sor (FP64 Ten­sor), 33.5 TFLOPs peak theo­re­ti­cal dou­ble pre­cis­i­on (FP64), 66.9 TFLOPs peak theo­re­ti­cal sin­gle pre­cis­i­on (FP32), 494.7 TFLOPs peak Ten­sor­Float-32 (TF32)*, 989.4 TFLOPs peak theo­re­ti­cal half pre­cis­i­on ten­sor (FP16 Ten­sor), 133.8 TFLOPs peak theo­re­ti­cal half pre­cis­i­on (FP16), 989.4 TFLOPs peak theo­re­ti­cal Bfloat16 ten­sor for­mat pre­cis­i­on (BF16 Ten­sor), 133.8 TFLOPs peak theo­re­ti­cal Bfloat16 for­mat pre­cis­i­on (BF16), 1,978.9 TFLOPs peak theo­re­ti­cal 8‑bit pre­cis­i­on (FP8), 1,978.9 TOPs peak theo­re­ti­cal INT8 floa­ting-point performance.

Nvi­dia H100 source:

https://resources.nvidia.com/en-us-tensor-core/

* Nvi­dia H100 GPUs don’t sup­port FP32 Tensor.

MI300-18
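
For readers who want to compare the two sets of peak figures in this footnote directly, the following is a minimal Python sketch that divides each MI300X number by the corresponding H100 number. The values are copied from the footnote. The dictionary names, the `peak_ratios` helper, and the pairing of the MI300X FP64 Matrix, FP16 and BF16 figures with the H100 tensor figures are illustrative assumptions, not part of AMD's stated methodology.

```python
# Peak theoretical throughput in TFLOPs (TOPs for INT8), copied from the footnote.
MI300X_PEAK = {
    "FP64 Matrix": 163.4,
    "FP64": 81.7,
    "FP32": 163.4,
    "TF32": 653.7,
    "FP16": 1307.4,
    "BF16": 1307.4,
    "FP8": 2614.9,
    "INT8": 2614.9,
}

# Assumption: the H100 tensor figures are used for FP64 Matrix, FP16 and BF16,
# since those are the closest published counterparts to the MI300X entries.
H100_PEAK = {
    "FP64 Matrix": 66.9,
    "FP64": 33.5,
    "FP32": 66.9,
    "TF32": 494.7,
    "FP16": 989.4,
    "BF16": 989.4,
    "FP8": 1978.9,
    "INT8": 1978.9,
}


def peak_ratios(numerator, denominator):
    """Return numerator/denominator for every format present in both tables."""
    return {
        fmt: numerator[fmt] / denominator[fmt]
        for fmt in numerator
        if fmt in denominator
    }


ratios = peak_ratios(MI300X_PEAK, H100_PEAK)

for fmt, ratio in ratios.items():
    print(f"{fmt:12s} MI300X / H100 = {ratio:.2f}x")
```

Under these assumptions the ratios come out to roughly 2.4x for the FP64 and FP32 entries and roughly 1.3x for TF32, FP16, BF16, FP8 and INT8. These are ratios of peak theoretical numbers only and do not describe measured application performance.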

2 Text generated with Llama2-70b chat using an input sequence length of 4096 and 32 output tokens, compared using a custom docker container for each system, based on AMD internal testing as of 11/17/2023. Configurations: 2P Intel Xeon Platinum CPU server using 4x AMD Instinct™ MI300X (192GB, 750W) GPUs, ROCm® 6.0 pre-release, PyTorch 2.2.0, vLLM for ROCm, Ubuntu® 22.04.2. Vs. 2P AMD EPYC 7763 CPU server using 4x AMD Instinct™ MI250 (128 GB HBM2e, 560W) GPUs, ROCm® 5.4.3, PyTorch 2.0.0, HuggingFace Transformers 4.35.0, Ubuntu 22.04.6.

4 GPUs on each system were used in this test. Server manufacturers may vary configurations, yielding different results. Performance may vary based on use of latest drivers and optimizations. MI300-33

3 An AMD Ryzen “Strix Point” processor is projected to offer 3x faster NPU performance for AI workloads when compared to an AMD Ryzen 7040 series processor. Performance projection by AMD engineering staff. Engineering projections are not a guarantee of final performance. Specific projections are based on reference design platforms and are subject to change when final products are released in market. STX-01.