New AMD Instinct™ MI200 Series Accelerators Bring Leadership HPC and AI Performance to Power Exascale Systems and More

November 08, 2021 12:00pm EST

- With new AMD CDNA™ 2 architecture, AMD Instinct MI200 series accelerators deliver a groundbreaking 4.9x advantage in HPC performance1 compared to competing data center accelerators, expediting science and discovery -

- MI200 series accelerators are the first multi-die GPUs, the first to support 128GB of HBM2e memory, and deliver a substantial boost for applications critical to the foundation of science -

SANTA CLARA, Calif., Nov. 08, 2021 (GLOBE NEWSWIRE) — AMD (NASDAQ: AMD) today announced the new AMD Instinct™ MI200 series accelerators, the first exascale-class GPU accelerators. The AMD Instinct MI200 series accelerators include the world’s fastest high performance computing (HPC) and artificial intelligence (AI) accelerator,1 the AMD Instinct™ MI250X.

Built on AMD CDNA™ 2 architecture, AMD Instinct MI200 series accelerators deliver leading application performance for a broad set of HPC workloads.2 The AMD Instinct MI250X accelerator provides up to 4.9X better performance than competitive accelerators for double precision (FP64) HPC applications and surpasses 380 teraflops of peak theoretical half-precision (FP16) performance for AI workloads, enabling disruptive approaches that further accelerate data-driven research.1

“AMD Instinct MI200 accelerators deliver leadership HPC and AI performance, helping scientists make generational leaps in research that can dramatically shorten the time between initial hypothesis and discovery,” said Forrest Norrod, senior vice president and general manager, Data Center and Embedded Solutions Business Group, AMD. “With key innovations in architecture, packaging and system design, the AMD Instinct MI200 series accelerators are the most advanced data center GPUs ever, providing exceptional performance for supercomputers and data centers to solve the world’s most complex problems.”

Exascale With AMD
AMD, in collaboration with the U.S. Department of Energy, Oak Ridge National Laboratory, and HPE, designed the Frontier supercomputer, which is expected to deliver more than 1.5 exaflops of peak computing power. Powered by optimized 3rd Gen AMD EPYC™ CPUs and AMD Instinct MI250X accelerators, Frontier will push the boundaries of scientific discovery by dramatically enhancing the performance of AI, analytics, and simulation at scale, helping scientists pack in more calculations, identify new patterns in data, and develop innovative data analysis methods to accelerate the pace of scientific discovery.

“The Frontier supercomputer is the culmination of a strong collaboration between AMD, HPE and the U.S. Department of Energy to provide an exascale-capable system that pushes the boundaries of scientific discovery by dramatically enhancing the performance of artificial intelligence, analytics, and simulation at scale,” said Thomas Zacharia, director, Oak Ridge National Laboratory.

Powering The Future of HPC
The AMD Instinct MI200 series accelerators, combined with 3rd Gen AMD EPYC CPUs and the ROCm™ 5.0 open software platform, are designed to propel new discoveries for the exascale era and tackle our most pressing challenges, from climate change to vaccine research.

Key capabilities and features of the AMD Instinct MI200 series accelerators include:

  • AMD CDNA™ 2 architecture – 2nd Gen Matrix Cores accelerating FP64 and FP32 matrix operations, delivering up to 4X the peak theoretical FP64 performance vs. AMD previous-gen GPUs. 1,3,4
  • Leadership Packaging Technology – Industry-first multi-die GPU design with 2.5D Elevated Fanout Bridge (EFB) technology delivers 1.8X more cores and 2.7X higher memory bandwidth vs. AMD previous-gen GPUs, offering the industry’s best aggregate peak theoretical memory bandwidth at 3.2 terabytes per second. 4,5,6
  • 3rd Gen AMD Infinity Fabric™ technology – Up to 8 Infinity Fabric links connect the AMD Instinct MI200 with 3rd Gen EPYC CPUs and other GPUs in the node to enable unified CPU/GPU memory coherency and maximize system throughput, allowing for an easier on-ramp for CPU codes to tap the power of accelerators.
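
To make the unified CPU/GPU memory point above concrete, here is a minimal sketch, assuming a ROCm installation with the HIP runtime and compiler (hipcc); the kernel, buffer size and variable names are purely illustrative and are not AMD sample code.

```cpp
// Minimal sketch: one managed allocation is written by the host, updated by a
// GPU kernel, and read back by the host through the same pointer; this is the
// style of code that unified CPU/GPU memory access is meant to make simple.
#include <hip/hip_runtime.h>
#include <cstdio>

__global__ void scale(double* x, double a, size_t n) {
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main() {
    const size_t n = 1 << 20;
    double* x = nullptr;
    // Managed memory: accessible from both the CPU and the GPU.
    if (hipMallocManaged(reinterpret_cast<void**>(&x), n * sizeof(double)) != hipSuccess) {
        return 1;
    }
    for (size_t i = 0; i < n; ++i) x[i] = 1.0;   // host writes directly

    // GPU kernel updates the same buffer in place.
    hipLaunchKernelGGL(scale, dim3(static_cast<unsigned int>((n + 255) / 256)),
                       dim3(256), 0, 0, x, 2.0, n);
    hipDeviceSynchronize();

    std::printf("x[0] = %f\n", x[0]);            // host reads the result: 2.0
    hipFree(x);
    return 0;
}
```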

Software for Enabling Exascale Science
AMD ROCm™ is an open software platform allowing researchers to tap the power of AMD Instinct™ accelerators to drive scientific discoveries. The ROCm platform is built on the foundation of open portability, supporting environments across multiple accelerator vendors and architectures. With ROCm 5.0, AMD extends its open platform powering top HPC and AI applications with AMD Instinct MI200 series accelerators, increasing accessibility of ROCm for developers and delivering leadership performance across key workloads.
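
As a small, hedged illustration of working against the ROCm platform at the runtime level, the sketch below uses the HIP runtime API to enumerate the accelerators a node exposes; it is our own example under the assumption of a working ROCm install, not code shipped with ROCm 5.0.

```cpp
// Minimal sketch: list the accelerators visible to the ROCm/HIP runtime,
// reporting the device name, compute-unit count and memory size.
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    if (hipGetDeviceCount(&count) != hipSuccess || count == 0) {
        std::printf("No HIP-capable devices found.\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        hipDeviceProp_t prop;
        if (hipGetDeviceProperties(&prop, i) != hipSuccess) continue;
        std::printf("Device %d: %s, %d compute units, %.1f GB of device memory\n",
                    i, prop.name, prop.multiProcessorCount,
                    prop.totalGlobalMem / 1.0e9);
    }
    return 0;
}
```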

Through the AMD Infinity Hub, researchers, data scientists and end-users can easily find, download and install containerized HPC apps and ML frameworks that are optimized and supported on AMD Instinct accelerators and ROCm. The hub currently offers a range of containers supporting Radeon Instinct™ MI50, AMD Instinct™ MI100 or AMD Instinct MI200 accelerators, including applications such as Chroma, CP2K, LAMMPS, NAMD, OpenMM and more, along with the popular ML frameworks TensorFlow and PyTorch. New containers are continually being added to the hub.

Available Server Solutions
The AMD Instinct MI250X and AMD Instinct MI250 are available in the open-hardware compute accelerator module, or OCP Accelerator Module (OAM), form factor. The AMD Instinct MI210 will be available in a PCIe® card form factor in OEM servers.

The AMD MI250X accelerator is currently available from HPE in the HPE Cray EX Supercomputer, and additional AMD Instinct MI200 series accelerators are expected in systems from major OEM and ODM partners in enterprise markets in Q1 2022, including ASUS, ATOS, Dell Technologies, Gigabyte, Hewlett Packard Enterprise (HPE), Lenovo, Penguin Computing and Supermicro.

MI200 Series Specifications

Models | Compute Units | Stream Processors | FP64 / FP32 Vector (Peak) | FP64 / FP32 Matrix (Peak) | FP16 / bf16 (Peak) | INT4 / INT8 (Peak) | HBM2e ECC Memory | Memory Bandwidth | Form Factor
AMD Instinct MI250X | 220 | 14,080 | Up to 47.9 TF | Up to 95.7 TF | Up to 383.0 TF | Up to 383.0 TOPS | 128GB | 3.2 TB/sec | OCP Accelerator Module
AMD Instinct MI250 | 208 | 13,312 | Up to 45.3 TF | Up to 90.5 TF | Up to 362.1 TF | Up to 362.1 TOPS | 128GB | 3.2 TB/sec | OCP Accelerator Module

Supporting Resources

About AMD
For more than 50 years AMD has driven innovation in high-performance computing, graphics and visualization technologies ― the building blocks for gaming, immersive platforms and the data center. Hundreds of millions of consumers, leading Fortune 500 businesses and cutting-edge scientific research facilities around the world rely on AMD technology daily to improve how they live, work and play. AMD employees around the world are focused on building great products that push the boundaries of what is possible. For more information about how AMD is enabling today and inspiring tomorrow, visit the AMD (NASDAQ: AMD) website, Facebook, LinkedIn and Twitter pages.

CAUTIONARY STATEMENT
This press release contains forward-looking statements concerning Advanced Micro Devices, Inc. (AMD) such as the features, functionality, performance, availability, timing and expected benefits of AMD products including the AMD Instinct™ MI200 series accelerators, which are made pursuant to the Safe Harbor provisions of the Private Securities Litigation Reform Act of 1995. Forward-looking statements are commonly identified by words such as “would,” “may,” “expects,” “believes,” “plans,” “intends,” “projects” and other terms with similar meaning. Investors are cautioned that the forward-looking statements in this press release are based on current beliefs, assumptions and expectations, speak only as of the date of this press release and involve risks and uncertainties that could cause actual results to differ materially from current expectations. Such statements are subject to certain known and unknown risks and uncertainties, many of which are difficult to predict and generally beyond AMD’s control, that could cause actual results and other future events to differ materially from those expressed in, or implied or projected by, the forward-looking information and statements. Material factors that could cause actual results to differ materially from current expectations include, without limitation, the following: Intel Corporation’s dominance of the microprocessor market and its aggressive business practices; global economic uncertainty; loss of a significant customer; impact of the COVID-19 pandemic on AMD’s business, financial condition and results of operations; competitive markets in which AMD’s products are sold; market conditions of the industries in which AMD products are sold; cyclical nature of the semiconductor industry; quarterly and seasonal sales patterns; AMD’s ability to adequately protect its technology or other intellectual property; unfavorable currency exchange rate fluctuations; ability of third party manufacturers to manufacture AMD’s products on a timely basis in sufficient quantities and using competitive technologies; availability of essential equipment, materials, substrates or manufacturing processes; ability to achieve expected manufacturing yields for AMD’s products; AMD’s ability to introduce products on a timely basis with expected features and performance levels; AMD’s ability to generate revenue from its semi-custom SoC products; potential security vulnerabilities; potential security incidents including IT outages, data loss, data breaches and cyber-attacks; uncertainties involving the ordering and shipment of AMD’s products; AMD’s reliance on third-party intellectual property to design and introduce new products in a timely manner; AMD’s reliance on third-party companies for design, manufacture and supply of motherboards, software and other computer platform components; AMD’s reliance on Microsoft and other software vendors’ support to design and develop software to run on AMD’s products; AMD’s reliance on third-party distributors and add-in-board partners; impact of modification or interruption of AMD’s internal business processes and information systems; compatibility of AMD’s products with some or all industry-standard software and hardware; costs related to defective products; efficiency of AMD’s supply chain; AMD’s ability to rely on third party supply-chain logistics functions; AMD’s ability to effectively control sales of its products on the gray market; impact of government actions and regulations such as export administration regulations, tariffs and trade protection measures; AMD’s ability to realize its deferred tax assets; potential tax liabilities; current and future claims and litigation; impact of environmental laws, conflict minerals-related provisions and other laws or regulations; impact of acquisitions, joint ventures and/or investments on AMD’s business, including the announced acquisition of Xilinx, and ability to integrate acquired businesses; AMD’s ability to complete the Xilinx merger; impact of the announcement and pendency of the Xilinx merger on AMD’s business; impact of any impairment of the combined company’s assets on the combined company’s financial position and results of operation; restrictions imposed by agreements governing AMD’s notes and the revolving credit facility; AMD’s indebtedness; AMD’s ability to generate sufficient cash to meet its working capital requirements or generate sufficient revenue and operating cash flow to make all of its planned R&D or strategic investments; political, legal, economic risks and natural disasters; future impairments of goodwill and technology license purchases; AMD’s ability to attract and retain qualified personnel; AMD’s stock price volatility; and worldwide political conditions. Investors are urged to review in detail the risks and uncertainties in AMD’s Securities and Exchange Commission filings, including but not limited to AMD’s most recent reports on Forms 10-K and 10-Q.

©2021 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, AMD CDNA, EPYC, AMD Instinct, Infinity Fabric, Radeon Instinct, ROCm and combinations thereof are trademarks of Advanced Micro Devices, Inc. PyTorch is a trademark or registered trademark of PyTorch. TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

Additional benchmark data is available on AMD.com.

  1. World’s fastest data center GPU is the AMD Instinct™ MI250X. Calculations conducted by AMD Performance Labs as of Sep 15, 2021, for the AMD Instinct™ MI250X (128GB HBM2e OAM module) accelerator at 1,700 MHz peak boost engine clock resulted in 95.7 TFLOPS peak theoretical double precision matrix (FP64 Matrix), 47.9 TFLOPS peak theoretical double precision (FP64), 95.7 TFLOPS peak theoretical single precision matrix (FP32 Matrix), 47.9 TFLOPS peak theoretical single precision (FP32), 383.0 TFLOPS peak theoretical half precision (FP16), and 383.0 TFLOPS peak theoretical Bfloat16 format precision (BF16) floating-point performance. Calculations conducted by AMD Performance Labs as of Sep 18, 2020 for the AMD Instinct™ MI100 (32GB HBM2 PCIe® card) accelerator at 1,502 MHz peak boost engine clock resulted in 11.54 TFLOPS peak theoretical double precision (FP64), 46.1 TFLOPS peak theoretical single precision matrix (FP32), 23.1 TFLOPS peak theoretical single precision (FP32), and 184.6 TFLOPS peak theoretical half precision (FP16) floating-point performance. Published results on the NVIDIA Ampere A100 (80GB) GPU accelerator, boost engine clock of 1,410 MHz, resulted in 19.5 TFLOPS peak double precision (FP64 Tensor Core), 9.7 TFLOPS peak double precision (FP64), 19.5 TFLOPS peak single precision (FP32), 78 TFLOPS peak half precision (FP16), 312 TFLOPS peak half precision (FP16 Tensor Core), 39 TFLOPS peak Bfloat16 (BF16), and 312 TFLOPS peak Bfloat16 format precision (BF16 Tensor Core) theoretical floating-point performance. The TF32 data format is not IEEE compliant and not included in this comparison. https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf, page 15, Table 1. MI200-01
  2. AMD Instinct MI250X accelerator application and benchmark performance can be found at https://www.amd.com/en/graphics/server-accelerators-benchmarks.
  3. Calculations conducted by AMD Performance Labs as of Sep 15, 2021, for the AMD Instinct™ MI250X accelerator (128GB HBM2e OAM module) at 1,700 MHz peak boost engine clock resulted in 95.7 TFLOPS peak theoretical double precision matrix (FP64 Matrix) floating-point performance. Published results on the NVIDIA Ampere A100 (80GB) GPU accelerator resulted in 19.5 TFLOPS peak theoretical double precision (FP64 Tensor Core) floating-point performance. Results found at: https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf, page 15, Table 1. MI200-02
  4. Calculations conducted by AMD Performance Labs as of Sep 21, 2021, for the AMD Instinct™ MI250X and MI250 (128GB HBM2e) OAM accelerators designed with AMD CDNA™ 2 6nm FinFET process technology at 1,600 MHz peak memory clock resulted in 128GB HBM2e memory capacity and 3.2768 TB/s peak theoretical memory bandwidth performance. The MI250/MI250X memory bus interface is 4,096 bits times 2 die and the memory data rate is 3.20 Gbps, for a total memory bandwidth of 3.2768 TB/s ((3.20 Gbps*(4,096 bits*2))/8). The highest published results on the NVIDIA Ampere A100 (80GB) SXM GPU accelerator resulted in 80GB HBM2e memory capacity and 2.039 TB/s GPU memory bandwidth performance. https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-us-nvidia-1758950-r4-web.pdf MI200-07
  5. The AMD Instinct™ MI250X accelerator has 220 compute units (CUs) and 14,080 stream cores. The AMD Instinct™ MI100 accelerator has 120 compute units (CUs) and 7,680 stream cores. MI200-027
  6. Calculations conducted by AMD Performance Labs as of Sep 21, 2021, for the AMD Instinct™ MI250X and MI250 (128GB HBM2e) OAM accelerators designed with AMD CDNA™ 2 6nm FinFET process technology at 1,600 MHz peak memory clock resulted in 3.2768 TB/s peak theoretical memory bandwidth performance. The MI250/MI250X memory bus interface is 4,096 bits times 2 die and the memory data rate is 3.20 Gbps, for a total memory bandwidth of 3.2768 TB/s ((3.20 Gbps*(4,096 bits*2))/8). Calculations by AMD Performance Labs as of Oct 5, 2020 for the AMD Instinct™ MI100 accelerator designed with AMD CDNA 7nm FinFET process technology at 1,200 MHz peak memory clock resulted in 1.2288 TB/s peak theoretical memory bandwidth performance. The MI100 memory bus interface is 4,096 bits and the memory data rate is 2.40 Gbps, for a total memory bandwidth of 1.2288 TB/s ((2.40 Gbps*4,096 bits)/8). MI200-33