Xilinx Expands Alveo Portfolio with Industry’s First Adaptable Compute, Network and Storage Accelerator Card Built for Any Server, Any Cloud

First low-profile PCIe Gen 4 card delivers dramatic improvements in throughput, latency and power efficiency for critical data center workloads

SAN JOSE, Calif., Aug. 6, 2019 /PRNewswire/ -- Xilinx, Inc. (NASDAQ: XLNX), the leader in adaptive and intelligent computing, today expanded its Alveo data center accelerator card portfolio with the launch of the Alveo U50. The U50 card is the industry’s first low-profile adaptable accelerator with PCIe Gen 4 support, uniquely designed to supercharge a broad range of critical compute, network and storage workloads, all on one reconfigurable platform.

 

Alveo U50 Data Center Accelerator Card (PRNewsfoto/Xilinx, Inc.)

The Alveo U50 provides customers with a programmable, low-profile and low-power accelerator platform built for scale-out architectures and domain-specific acceleration of any server deployment: on premises, in the cloud and at the edge. To meet the challenges of emerging dynamic workloads such as cloud microservices, the Alveo U50 delivers 10–20x improvements in throughput, latency and power efficiency. For accelerated networking and storage workloads, the U50 card helps developers identify and eliminate latency and data movement bottlenecks by moving compute closer to the data.

Powered by the Xilinx® UltraScale+™ architecture, the Alveo U50 card is the first in the Alveo portfolio to be packaged in a half-height, half-length form factor and low 75-Watt power envelope. The card features high-bandwidth memory (HBM2), 100 gigabit per second (100 Gbps) networking connectivity, and support for the PCIe Gen 4 and CCIX interconnects. By fitting into standard PCIe server slots and using one-third the power, the Alveo U50 significantly expands the scope in which adaptable acceleration can be deployed to unlock dramatic throughput and latency improvements for demanding compute, network and storage workloads. The 8GB of HBM2 delivers over 400 Gbps data transfer speeds and the QSFP ports provide up to 100 Gbps network connectivity. The high-speed networking I/O also supports advanced applications like NVMe-oF™ solutions (NVM Express over Fabrics™), disaggregated computational storage and specialized financial services applications.

From machine learning inference, video transcoding and data analytics to computational storage, electronic trading and financial risk modeling, the Alveo U50 brings programmability, flexibility, and high throughput and low latency performance advantages to any server deployment. Unlike fixed architecture alternatives, the software and hardware programmability of the Alveo U50 allows customers to meet ever-changing demands and optimize application performance as workloads and algorithms continue to evolve.

Alveo U50 accelerated solutions deliver significant customer value across a range of applications, including:

  • Deep learning inference acceleration (speech translation): delivers up to 25x lower latency, 10x higher throughput and significantly improved power efficiency per node compared to GPU-only speech translation performance [1];
  • Data analytics acceleration (database query): running the TPC-H Query benchmark, Alveo U50 delivers 4x higher throughput per hour and 3x lower operational costs compared to in-memory CPU [2];
  • Computational storage acceleration (compression): delivers 20x more compression/decompression throughput, faster Hadoop and big data analytics, and over 30 percent lower cost per node compared to CPU-only nodes [3];
  • Network acceleration (electronic trading): delivers 20x lower latency and sub-500ns trading time compared to CPU-only latency of 10us [4];
  • Financial modeling (grid computing): running the Monte Carlo simulation, Alveo U50 delivers 7x greater power efficiency compared to GPU-only performance [5] for a faster time to insight, deterministic latency and reduced operational costs.

“Ever-growing demands on the data center are pushing existing infrastructure to its limit, driving the need for adaptable solutions that can optimize performance across a broad range of workloads and extend the lifecycle of existing infrastructure, ultimately reducing TCO,” said Salil Raje, executive vice president and general manager, Data Center Group, at Xilinx. “The new Alveo U50 brings an optimized form factor and unprecedented performance and adaptability to data center workloads, and we continue to build out solution stacks with a growing ecosystem of application partners to deliver previously unthinkable capabilities to a range of industries.”

Industry Support

“The forthcoming 2nd Gen AMD EPYC processor is ideally suited for data center-first accelerators like the Alveo U50 that combine compute, network and storage acceleration all on the same platform,” said Raghu Nambiar, vice president & CTO of application engineering at AMD. “Taking advantage of AMD’s leadership, first x86 server-class PCIe 4.0 CPU, the Alveo U50 will be the industry’s first adaptable accelerator card with PCIe 4.0 support. We look forward to working with Xilinx to combine the benefits of AMD EPYC based solutions with Alveo acceleration to hyperscale and enterprise customers.”

“IBM is excited about the expansion of the Xilinx Alveo portfolio with the addition of the Alveo U50 adaptable accelerator card,” said Steve Fields, Chief Architect for IBM Power Systems. “We believe the combination of low-profile form-factor, HBM2 memory performance, and PCIe Gen 4 speed to interface with IBM Power processors will enable the OpenPOWER ecosystem to provide cutting edge adaptable acceleration solutions.”

“With the smaller design and advanced features of the Alveo U50, Xilinx is well positioned to expand the markets for acceleration with configurable logic,” said Karl Freund, senior analyst, HPC and deep learning, Moor Insights & Strategy. “The new Alveo U50 should allow them to break through the market noise with demonstrated and dramatic performance advantages in high-growth use cases.”

“We are excited to be collaborating with Xilinx at FMS, showcasing the flexibility and performance of the Alveo U50 and our OpenFlex composable NVMe-oF platform,” said Scott Hamilton, senior director of product management, Data Center Systems business unit at Western Digital. “Xilinx is leading the charge in fabric-based computational storage using NVMe-oF to enable full disaggregation of server resources. We believe the new Alveo U50 will be an important part of the ecosystem as organizations take a truly disaggregated approach to SDS infrastructure.”

Availability:
The Alveo U50 is sampling now with OEM system qualifications in process. General availability is slated for fall 2019.

Flash Memory Summit:
Xilinx will be showcasing the Alveo U50 and other product demonstrations in booth 313 at Flash Memory Summit (FMS) 2019, taking place August 6–8 at the Santa Clara Convention Center in Santa Clara, Calif.

Additionally, Salil Raje, executive vice president and general manager, Data Center Group, at Xilinx, will be giving a keynote titled “FPGAs: The Key to Accelerating High-Speed Storage Systems” on August 7 at 2:40 p.m. PT in the Mission City Ballroom.

About Xilinx
Xilinx develops highly flexible and adaptive processing platforms that enable rapid innovation across a variety of technologies – from the endpoint to the edge to the cloud. Xilinx is the inventor of the FPGA, hardware programmable SoCs, and the ACAP, designed to deliver the most dynamic processor technology in the industry and enable the adaptable, intelligent and connected world of the future. For more information, visit www.xilinx.com.

Footnotes:

  1. Performance of Alveo U50, with both Alveo U50 and Nvidia Tesla T4 running (B=2, L=8), Tesla T4 (B=8, L=8) (estimated data)
  2. Alveo U50 = 24ms, 150k query/hr; CPU query time = 210ms, 34k query/hr, based on Intel Xeon Platinum 8260 Processor (35.75M Cache, 2.40 GHz), 24 core
  3. Intel Skylake-SP 6152 @2.10GHz CPU (Ubuntu 16.04) CPU Query time = 210ms, 34k query/hr. Alveo U50=24ms, 150k query/hr Xilinx Alveo U50 SDAccel 2018.3 (estimate) GB/s compression per CPU core = .0229. Alveo U50 = 10GB/s (estimate)
  4. Alveo U50 latency is <0.5us, CPU latency is 10us. Measured from start of packet in on Tick (Market Data) to start of packet out on the Order (estimate)
  5. Intel Xeon E5-2697 v4 GCC 5.4.0 Nvidia Tesla V100 16GB PCIe CUDA 10.1 / GCC 5.4.0 Intel Skylake-SP 6152 @2.10GHz CPU (Ubuntu 16.04) CPU Query time = 210ms, 34k query/hr. Alveo U50=24ms, 150k query/hr Xilinx Alveo U50 SDAccel 2018.3 (estimated data).
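
For readers who want to check how the headline multipliers relate to the raw figures, the arithmetic behind footnotes 2 and 4 can be sketched in a few lines of Python. This is only an illustration: the numbers are the vendor estimates quoted in the footnotes, the variable and function names are made up for this sketch, and the U50 trading latency is taken at its stated upper bound of 0.5us.

```python
# Figures copied from footnotes 2 and 4 of this release (vendor estimates).
# All names below are illustrative assumptions, not Xilinx identifiers.

def ratio(larger: float, smaller: float) -> float:
    """Return how many times `larger` exceeds `smaller`."""
    return larger / smaller

# Footnote 2: TPC-H query benchmark
u50_queries_per_hr = 150_000
cpu_queries_per_hr = 34_000
u50_query_ms = 24
cpu_query_ms = 210

# Footnote 4: electronic trading latency ("<0.5us" treated as an upper bound)
u50_trade_latency_us = 0.5
cpu_trade_latency_us = 10.0

throughput_gain = ratio(u50_queries_per_hr, cpu_queries_per_hr)
query_time_gain = ratio(cpu_query_ms, u50_query_ms)
trade_latency_gain = ratio(cpu_trade_latency_us, u50_trade_latency_us)

print(f"Query throughput gain: {throughput_gain:.1f}x")
print(f"Query time reduction: {query_time_gain:.2f}x")
print(f"Trading latency reduction: at least {trade_latency_gain:.0f}x")
```

On these inputs the throughput ratio works out to roughly 4.4x and the trading latency ratio to 20x, in line with the "4x" and "20x" figures quoted in the application list.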

© Copyright 2019 Xilinx, Inc. Xilinx, the Xilinx logo, and other designated brands included herein are trademarks of Xilinx in the United States and other countries. NVMe-oF and NVM Express over Fabrics are trademarks of NVM Express, Inc. PCI, PCIe and PCI Express are trademarks of PCI-SIG and used under license. All other trademarks are the property of their respective owners.