Xilinx Launches Alveo U55C, Its Most Powerful Accelerator Card Ever, Purpose-Built for HPC and Big Data Workloads

Breakthrough HPC clustering solution and simplified programmability enable massive scale-out of cutting-edge compute across existing customer infrastructure and network

ST. LOUIS–(BUSINESS WIRE)–SC21–Xilinx, Inc. (NASDAQ: XLNX), the leader in adaptive computing, today at the SC21 supercomputing conference introduced the Alveo™ U55C data center accelerator card and a new standards-based, API-driven clustering solution for deploying FPGAs at massive scale. The Alveo U55C accelerator brings superior performance-per-watt to high performance computing (HPC) and database workloads and easily scales through the Xilinx® HPC clustering solution.

Purpose-built for HPC and big data workloads, the new Alveo U55C card is the company’s most powerful Alveo accelerator card ever, offering the highest compute density and HBM capacity in the Alveo accelerator portfolio. Together with the new Xilinx RoCE v2-based clustering solution, a broad spectrum of customers with large-scale compute workloads can now implement powerful FPGA-based HPC clustering using their existing data center infrastructure and network.

“Scaling out Alveo compute capabilities to target HPC workloads is now easier, more efficient and more powerful than ever,” said Salil Raje, executive vice president and general manager, Data Center Group at Xilinx. “Architecturally, FPGA-based accelerators like Alveo cards provide the highest performance at the lowest cost for many compute-intensive workloads. By introducing a standards-based methodology that enables the creation of Alveo HPC clusters using a customer’s existing infrastructure and network, we’re delivering those key advantages at massive scale to any data center. This is a major leap forward for even broader adoption of Alveo and adaptive computing throughout the data center.”

Built for HPC and big data applications

The Alveo U55C card combines many key features that today’s HPC workloads require. It delivers more parallelism of data pipelines, superior memory management, optimized data movement throughout the pipeline, and the highest performance-per-watt in the Alveo portfolio. The Alveo U55C card is a single-slot, full height, half length (FHHL) form factor with a low 150W max power. It offers superior compute density and doubles the HBM2 to 16GB compared to its predecessor, the dual-slot Alveo U280 card. The U55C provides more compute in a smaller form factor for creating dense Alveo accelerator-based clusters. It’s built for high-density streaming data, high IO math, and big compute problems that require scale-out like big data analytics and AI applications.

Leveraging RoCE v2 and data center bridging, coupled with 200 Gbps bandwidth, the API-driven clustering solution enables an Alveo network that competes with InfiniBand networks in performance and latency, with no vendor lock-in. MPI integration allows HPC developers to scale out Alveo data pipelining from the Xilinx Vitis™ unified software platform. Utilizing existing open standards and frameworks, it’s now possible to scale out across hundreds of Alveo cards, with shared workloads and memory, regardless of the server platforms and network infrastructure.

Software developers and data scientists can unlock the benefits of Alveo and adaptive computing through high-level programmability of both the application and cluster utilizing the Vitis platform. Xilinx has invested heavily in the Vitis development platform and tools flow to make adaptive computing more accessible to software developers and data scientists without hardware expertise. The major AI frameworks like PyTorch and TensorFlow are supported, as well as high-level programming languages like C, C++ and Python, allowing developers to build domain solutions using specific APIs and libraries, or utilize Xilinx software development kits, to easily accelerate key HPC workloads within an existing data center.

HPC customer use cases

CSIRO, Australia’s national research organization, is utilizing Alveo U55C cards for signal processing in the Square Kilometer Array radio telescope, the world’s largest radio astronomy antenna array. Deploying the Alveo cards as network-attached accelerators with HBM allows for massive throughput at scale across the HPC signal processing cluster. The Alveo accelerator-based cluster allows CSIRO to tackle the massive compute task of aggregating, filtering, preparing and processing data from 131,000 antennas in real time. The 460Gbps of HBM2 bandwidth across the signal processing cluster is served by 420 Alveo U55C cards fully networked together across P4-enabled 100Gbps switches. The Alveo U55C cluster delivers processing performance with overall throughput at 15Tb/s in a compact, power- and cost-efficient footprint. CSIRO is now completing an example Alveo reference design in order to help other radio astronomy or adjacent industries achieve the same success.
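As a rough illustration of the scale these figures imply, the back-of-envelope sketch below combines the numbers quoted in this release (420 cards, 16GB of HBM2 and 150W max power per card, 15Tb/s overall throughput, 131,000 antennas). It assumes the load is spread evenly across cards, that every card draws its maximum rated power (an upper bound for the cards alone), and that 1Tb equals 1,000Gb. The function and constant names are illustrative only and do not come from CSIRO or Xilinx.

```python
# Figures quoted in the release.
CARDS = 420                    # Alveo U55C cards in the CSIRO cluster
HBM_PER_CARD_GB = 16           # HBM2 capacity per card
MAX_POWER_PER_CARD_W = 150     # maximum rated power per card
CLUSTER_THROUGHPUT_TBPS = 15   # overall cluster throughput, Tb/s
ANTENNAS = 131_000             # antennas feeding the cluster


def cluster_summary(cards: int = CARDS) -> dict:
    """Derive aggregate and per-card figures.

    Assumes an even spread of work across cards and worst-case
    (maximum rated) power draw; real deployments will differ.
    """
    return {
        "total_hbm_gb": cards * HBM_PER_CARD_GB,
        "max_card_power_kw": cards * MAX_POWER_PER_CARD_W / 1000,
        "throughput_per_card_gbps": CLUSTER_THROUGHPUT_TBPS * 1000 / cards,
        "antennas_per_card": ANTENNAS / cards,
    }


if __name__ == "__main__":
    for name, value in cluster_summary().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the cluster holds 6,720GB of HBM2 in total, draws at most 63kW for the cards themselves, and each card handles roughly 36Gb/s of the overall throughput, which fits within a single 100Gbps switch port.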

Ansys LS-DYNA crash simulation software is used by nearly every automotive company in the world. The design of safety and structural systems hinges on the performance of models as they mitigate the costs of physical crash testing with computer-aided design finite element method (FEM) simulations. FEM solvers are the primary algorithms driving simulations with hundreds of millions of degrees of freedom. These enormous algorithms can be broken out into more rudimentary solvers like PCG, sparse matrices and ICCG. By scaling out across many Alveo cards with hyperparallel data pipelining, LS-DYNA can accelerate performance by more than 5X in comparison to x86 CPUs. This results in more work per clock cycle in an Alveo pipeline, with LS-DYNA customers benefiting from game-changing simulation times.

“In the spirit of relentless innovation, we’re excited about collaborating with Xilinx to significantly accelerate the finite-element solvers, which can represent 90% of the compute workload for implicit mechanics, in our LS-DYNA simulation application,” said Wim Slagter, strategic partnerships director at Ansys. “We look forward to Xilinx acceleration helping us in our mission to support innovators in engineering what’s ahead.”
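The 90% figure gives a sense of how much solver acceleration can affect end-to-end simulation time. The sketch below applies Amdahl's law, a standard model that is not taken from the release, under the assumption that only the solver portion is accelerated and the remaining 10% runs at its original speed. The function name and the example speedup values are illustrative, not measured results.

```python
def overall_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only part of a workload
    is accelerated.

    accelerated_fraction: share of the original runtime spent in the
        accelerated part (0.9 for the solver share quoted above).
    kernel_speedup: how many times faster that part runs (hypothetical).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / kernel_speedup)


if __name__ == "__main__":
    # Hypothetical solver speedups, for illustration only.
    for s in (5, 9, 100):
        print(f"solver {s}x faster -> overall {overall_speedup(0.9, s):.2f}x")
```

With a 90% solver share, the overall speedup in this model can never exceed 10X, and a 5X end-to-end gain would require the solvers themselves to run about 9X faster. Whether the 5X figure quoted above refers to the solvers or to the whole simulation is not stated in the release.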

TigerGraph, provider of a leading graph analytics platform, is using multiple Alveo U55C cards to cluster and accelerate the two most prolific algorithms that drive graph-based recommendation and clustering engines. Graph databases are a disruptive platform for data scientists. Graphs take data from silos and bring focus to the relationships between data. The next frontier for graph is finding those answers in real time. Alveo U55C accelerates the query times and predictions for recommendation engines from minutes down to milliseconds. By utilizing multiple U55C cards to scale up analytics, the superior computational power and memory bandwidth accelerate graph queries by up to 45X compared to CPU-based clusters. The quality of scores also increases by up to 35%, resulting in greater confidence and dramatically lowering false positives to low single digits.

Product availability and easy evaluations

The Alveo U55C card is currently available on Xilinx.com and through Xilinx authorized distributors. It’s also available for easy evaluation via public cloud-based FPGA-as-a-Service providers, as well as select colocation data centers for private previews. Clustering is available now for private previews, with general availability expected in the second quarter of next year.

Xilinx is showcasing the Alveo U55C accelerator card, along with partner solutions, at the SC21 conference taking place this week. Register at SC21 to visit the Xilinx virtual booth.

Follow Xilinx on Twitter, LinkedIn and Facebook.

About Xilinx

Xilinx, Inc. develops highly flexible and adaptive processing platforms that enable rapid innovation across a variety of technologies, from the cloud, to the edge, to the endpoint. Xilinx is the inventor of the FPGA and Adaptive SoCs (including our Adaptive Compute Acceleration Platform, or ACAP), designed to deliver the most dynamic computing technology in the industry. We collaborate with our customers to create scalable, differentiated and intelligent solutions that enable the adaptable, intelligent and connected world of the future. For more information, visit xilinx.com.