{"id":22990,"date":"2016-04-11T15:54:03","date_gmt":"2016-04-11T13:54:03","guid":{"rendered":"http:\/\/www.planet3dnow.de\/cms\/?p=22990"},"modified":"2016-04-11T17:36:19","modified_gmt":"2016-04-11T15:36:19","slug":"neuer-compiler-gcc-6-mit-hsa-und-zen-support","status":"publish","type":"post","link":"https:\/\/www.planet3dnow.de\/cms\/22990-neuer-compiler-gcc-6-mit-hsa-und-zen-support\/","title":{"rendered":"New compiler <span class=\"caps\">GCC<\/span> 6 with <span class=\"caps\">HSA<\/span> and Zen support"},"content":{"rendered":"<p>The long-announced version 6 of the <span class=\"caps\">GCC<\/span> compiler is apparently close to release. <span class=\"caps\">GCC<\/span> is the standard compiler on Linux, used to translate C source code into executable machine code.<\/p>\n<p>New in version 6 is support for AMD's upcoming Zen CPU architecture as well as for <span class=\"caps\">HSA<\/span>, that is, memory that <span class=\"caps\">CPU<\/span> and <span class=\"caps\">GPU<\/span> share while working on a task together, first supported in AMD's Kaveri APUs and fully in Carrizo.<\/p>\n<p>Here is the feature list of <span class=\"caps\">GCC<\/span>&nbsp;6 that is relevant to the x86 world:<\/p>\n<blockquote><p><strong>Heterogeneous Systems Architecture<\/strong><\/p>\n<p><span class=\"caps\">GCC<\/span> can now generate <span class=\"caps\">HSAIL<\/span> (Heterogeneous System Architecture Intermediate Language) for simple OpenMP device constructs if configured with 
--enable-offload-targets=hsa. A new libgomp plugin then runs the <span class=\"caps\">HSA<\/span> <span class=\"caps\">GPU<\/span> kernels implementing these constructs on <span class=\"caps\">HSA<\/span> capable GPUs via a standard <span class=\"caps\">HSA<\/span> run&nbsp;time.<\/p>\n<p>If the <span class=\"caps\">HSA<\/span> compilation back end determines it cannot output <span class=\"caps\">HSAIL<\/span> for a particular input, it gives a warning by default. These warnings can be suppressed with -Wno-hsa. To give a few examples, the <span class=\"caps\">HSA<\/span> back end does not implement compilation of code using function pointers, automatic allocation of variable sized arrays, functions with variadic arguments as well as a number of other less common programming constructs.<\/p>\n<p>When compilation for <span class=\"caps\">HSA<\/span> is enabled, the compiler attempts to compile composite OpenMP constructs<br>\n#pragma omp target teams distribute parallel for<br>\ninto parallel <span class=\"caps\">HSA<\/span> <span class=\"caps\">GPU<\/span> kernels.<\/p>\n<p><strong><span class=\"caps\">IA-32<\/span>\/x86-64<\/strong><\/p>\n<p><span class=\"caps\">GCC<\/span> now supports the Intel <span class=\"caps\">CPU<\/span> named Skylake with <span class=\"caps\">AVX-512<\/span> extensions through -march=skylake-avx512. 
The switch enables the following <span class=\"caps\">ISA<\/span> extensions: <span class=\"caps\">AVX-512F<\/span>, <span class=\"caps\">AVX512VL<\/span>, <span class=\"caps\">AVX-512CD<\/span>, <span class=\"caps\">AVX-512BW<\/span>, <span class=\"caps\">AVX-512DQ<\/span>.<\/p>\n<p>Support for new <span class=\"caps\">AMD<\/span> instructions monitorx and mwaitx has been added. This includes new intrinsic and built-in support. It is enabled through option -mmwaitx. The instructions monitorx and mwaitx implement the same functionality as the old monitor and mwait instructions. In addition mwaitx adds a configurable timer. The timer value is received as third argument and stored in register %ebx.<\/p>\n<p>x86-64 targets now allow stack realignment from a word-aligned stack pointer using the command-line option -mstackrealign or __attribute__ ((force_align_arg_pointer)). This allows functions compiled with a vector-aligned stack to be invoked from objects that keep only word-alignment.<\/p>\n<p>Support for address spaces __seg_fs, __seg_gs, and __seg_tls. These can be used to access data via the %fs and %gs segments without having to resort to inline assembly. 
Please refer to the documentation for usage instructions.<\/p>\n<p>Support for <span class=\"caps\">AMD<\/span> Zen (family 17h) processors is now available through the -march=znver1 and -mtune=znver1 options.<\/p><\/blockquote>\n<p>Unfortunately, most distributions and applications will not benefit from these optimizations automatically, because packages are usually downloaded precompiled from repositories. Anyone who wants optimized code has to choose a distribution that is built from source, or at least compile the relevant applications from source themselves, which is at least theoretically feasible for open-source software, provided you have the necessary skills.<\/p>\n<p>How much optimized code can deliver is shown by the GCC builds with different CPU flags that we <a href=\"http:\/\/www.planet3dnow.de\/cms\/18564-amd-piledriver-vs-steamroller-vs-excavator-leistungsvergleich-der-architekturen\/subpage-praxistests-fritzchess-lame\/\">published for the Bulldozer architecture<\/a> some time ago. Here is a comment on this from that article:<\/p>\n<blockquote><p>Unsurprisingly, encoding a <span class=\"caps\">WAV<\/span> file to MP3 takes less time the more closely the build is optimized for the CPU architecture. 
Within each build, however, the differences are always at a similar level. Steamroller is ~5 % faster than Piledriver, and Excavator adds another 10 percentage points. Between encoding with the standard build on Piledriver and with the maximally optimized build on Excavator, there is thus a performance gain of no less than 65 % for the same <span class=\"caps\">WAV<\/span> file at identical configuration and identical clock speed. Hello, software developers! There is untapped potential here that can be activated simply with a compiler&nbsp;flag!<\/p><\/blockquote>\n<p><strong>Source:<\/strong> <a href=\"https:\/\/gcc.gnu.org\/gcc-6\/changes.html\" target=\"_blank\"><span class=\"caps\">GCC<\/span> 6 Release Series Changes, New Features, and&nbsp;Fixes<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The long-announced version 6 of the <span class=\"caps\">GCC<\/span> compiler is apparently close to release. <span class=\"caps\">GCC<\/span> is the standard compiler on Linux, used to translate C source code into executable machine code. 
New in version 6 is support for AMD's upcoming Zen CPU architecture as well as for <span class=\"caps\">HSA<\/span>, that is, memory that <span class=\"caps\">CPU<\/span> and <span class=\"caps\">GPU<\/span> share while working on a task together, first supported in AMD's Kaveri APUs and fully in Carrizo. (\u2026) <a class=\"moretag\" href=\"https:\/\/www.planet3dnow.de\/cms\/22990-neuer-compiler-gcc-6-mit-hsa-und-zen-support\/\">Read&nbsp;more&nbsp;\u00bb<\/a><\/p>\n","protected":false},"author":2,"featured_media":5194,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"wp_typography_post_enhancements_disabled":false,"ngg_post_thumbnail":0,"footnotes":""},"categories":[12,11],"tags":[1074,1003,985,656],"class_list":["post-22990","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-aktuelles","category-news","tag-gcc","tag-hsa","tag-linux","tag-zen","entry"],"share_on_mastodon":{"url":"","error":""},"_links":{"self":[{"href":"https:\/\/www.planet3dnow.de\/cms\/wp-json\/wp\/v2\/posts\/22990","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.planet3dnow.de\/cms\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.planet3dnow.de\/cms\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.planet3dnow.de\/cms\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.planet3dnow.de\/cms\/wp-json\/wp\/v2\/comments?post=22990"}],"version-history":[{"count":12,"href":"https:\/\/www.planet3dnow.de\/cms\/wp-json\/wp\/v2\/posts\/22990\/revisions"}],"predecessor-version":[{"id":23030,"href":"https:\/\/www.planet3dnow.de\/cms\/wp-json\/wp\/v2\/posts\/22990\/revisions\/23030"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.planet3dnow.d
e\/cms\/wp-json\/wp\/v2\/media\/5194"}],"wp:attachment":[{"href":"https:\/\/www.planet3dnow.de\/cms\/wp-json\/wp\/v2\/media?parent=22990"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.planet3dnow.de\/cms\/wp-json\/wp\/v2\/categories?post=22990"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.planet3dnow.de\/cms\/wp-json\/wp\/v2\/tags?post=22990"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}