AMD, the AVX-512 controversy and the performance of the Ryzen 7000

AMD’s Ryzen 7000 presentation is still making waves two days later. After unpacking the bulk of the data shown, today it is the turn of AVX-512, the “new” Intel instruction set that now also reaches the red team’s processors. The performance figures shown were very good, but because little data was offered and the tests disclosed are limited, controversy is heating up between critics of the implementation on one side and those who believe it is a step forward on the other.

There are still too many gaps to judge what Lisa Su’s team has done with AVX-512 in these new processors, so we will look at the arguments on each side and, above all, put the data that is emerging into context.

AVX instructions, worthy successors to SSE and still up to date

This type of instruction is simply an extension created by Intel to accelerate workloads in scientific computing, financial analysis, or AI and deep learning, which are the main reasons it is used in PCs, whether consumer or HEDT, as well as in servers.

They are also used in 3D modeling and analysis, image and video processing, cryptography, and data compression. Their uses are many, but they all rely on the same foundation to operate: the SIMD units. This type of unit parallelizes the work according to how it is implemented. Intel, for example, defines several subsets of these instructions on top of the AVX-512F base, including:

  • AVX-512-CD
  • AVX-512-ER
  • AVX-512-PF
  • AVX-512-VBMI
  • AVX-512-VL
  • AVX-512-DQ
  • AVX-512-BW
  • AVX-512-VNNI

Each one specializes in different tasks and they are not interchangeable, so we cannot be sure which ones AMD has chosen for the Ryzen 7000. With that in mind, and given that the red team’s idea is precisely to avoid needing a powerful iGPU to accelerate these tasks while keeping power consumption within what it can afford, the data it has provided is controversial, to say the least.
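Which subsets a given chip exposes can be checked from software. The following is a minimal sketch, assuming a Linux system where the kernel lists CPU features on the "flags" line of /proc/cpuinfo. The helper names detect_avx512 and read_cpu_flags are illustrative and do not come from AMD or Intel material.

```python
# Map the subset names used in this article to the flag names the Linux
# kernel prints in /proc/cpuinfo (assumption: Linux naming conventions).
AVX512_FLAGS = {
    "AVX-512-F": "avx512f",
    "AVX-512-CD": "avx512cd",
    "AVX-512-ER": "avx512er",
    "AVX-512-PF": "avx512pf",
    "AVX-512-VBMI": "avx512vbmi",
    "AVX-512-VL": "avx512vl",
    "AVX-512-DQ": "avx512dq",
    "AVX-512-BW": "avx512bw",
    "AVX-512-VNNI": "avx512_vnni",
}


def detect_avx512(flags_line):
    """Return the AVX-512 subsets present in a space-separated flags string."""
    present = set(flags_line.split())
    return [name for name, flag in AVX512_FLAGS.items() if flag in present]


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the first 'flags' line from /proc/cpuinfo; empty string if absent."""
    try:
        with open(path) as handle:
            for line in handle:
                if line.startswith("flags"):
                    return line.split(":", 1)[1]
    except OSError:
        pass
    return ""
```

Calling detect_avx512(read_cpu_flags()) on a Linux machine returns the list of subsets the processor reports. On other operating systems the file does not exist and the result is simply an empty list.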

AMD Ryzen 7000 with AVX-512, as good as it looks?

Before we get into AMD’s data, let’s see what Intel officially says about its instructions:

Applications can pack 32 double-precision and 64 single-precision floating-point operations per clock cycle within the 512-bit vectors, as well as eight 64-bit and sixteen 32-bit integers, with up to two 512-bit fused multiply-add (FMA) units, thus doubling the width of the data registers, the number of registers, and the width of the FMA units compared to Intel Advanced Vector Extensions 2 (Intel AVX2).
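The figures in that quote follow from simple arithmetic: the number of lanes that fit in a vector, times two operations per FMA (one multiply and one add), times the number of FMA units. A small sketch, where the function name flops_per_cycle is purely illustrative:

```python
def flops_per_cycle(vector_bits, element_bits, fma_units):
    """Peak floating-point operations per clock cycle for one core.

    Each FMA counts as two operations (one multiply, one add) per lane.
    """
    lanes = vector_bits // element_bits
    return lanes * 2 * fma_units


# AVX-512 with two 512-bit FMA units
print(flops_per_cycle(512, 64, 2))  # double precision
print(flops_per_cycle(512, 32, 2))  # single precision
# AVX2 with two 256-bit FMA units, for comparison
print(flops_per_cycle(256, 64, 2))
```

The first two calls reproduce the 32 and 64 operations per cycle that Intel quotes, and the third shows the halved figure for a 256-bit design.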

With that understood, we can move on to the empirical data: what we already know on the one hand, and what has reached us through leaks on the other. According to AMD’s slide, the company claims 30% more FP32 performance than the Ryzen 5000 and 150% more in INT8 (AVX-512 VNNI) against the same series of processors, both measured in ONNX.

The controversy here comes from ONNX itself and the instructions used, since they would not be the best basis for comparing performance, let alone against Intel and Raptor Lake. There is a second controversy as well: the Geekbench 5 scores of 13th-gen Core processors against the Ryzen 7000 would not be quite representative, because rumors indicate that Intel’s engineering samples (ES) do not have AVX-512 enabled while AMD’s chips do.

Geekbench has supported these instructions since version 5.1.0, where they help in the AES-XTS workload, so AMD would have a slight scoring advantage if the Raptor Lake rumor turns out to be true. How much of an advantage? Since it is only one workload among many, not much: estimates put it at between 2% and 3% of the total score on average. There is no need to worry that it seriously distorts the data, but the nuance had to be made, especially since Intel with an AVX-512 unit enabled might stretch its lead a bit further if it manages to keep frequencies high.
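To see why a single accelerated workload moves the total so little, here is a rough sketch. It assumes the overall score is a weighted average of section scores and that the cryptography section, where AES-XTS sits, carries a 5% weight. Both the weights and the 1.5x AES-XTS speedup are assumptions chosen for illustration, not measured values.

```python
# Assumed section weights (illustrative; they must sum to 1).
WEIGHTS = {"crypto": 0.05, "integer": 0.65, "floating_point": 0.30}


def overall_score(section_scores, weights=WEIGHTS):
    """Weighted arithmetic mean of per-section scores."""
    return sum(weights[name] * section_scores[name] for name in weights)


def relative_gain(baseline, boosted, weights=WEIGHTS):
    """Fractional change in the overall score between two runs."""
    return overall_score(boosted, weights) / overall_score(baseline, weights) - 1.0


# Hypothetical run: identical sections except crypto, which is 1.5x faster.
baseline = {"crypto": 1000.0, "integer": 1000.0, "floating_point": 1000.0}
boosted = dict(baseline, crypto=1500.0)
print(f"{relative_gain(baseline, boosted):.1%}")
```

Under these assumptions the overall gain comes out at about 2.5%, which sits inside the 2% to 3% range mentioned in the text.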

AVX-512 1 x FMA or 2 x FMA, which is better?

The correct way to frame this is in terms of the FMA (Fused Multiply-Add) units available, since from AVX2 onward a processor may have several of them depending on its architecture (AVX512F and its extensions, in this case, from the programmer’s point of view).

What is the problem? AMD will implement two 256-bit FMAs instead of a single 512-bit unit, which means each AVX-512 instruction needs two clock cycles to do the job instead of just one. This subtle difference changes everything.

Intel disabled the AVX-512 units in Alder Lake because of a fairly simple “problem”: using them raised power consumption somewhat and lowered the frequency. The reason is easy to understand. When these instructions run, the part of the processor that implements them is stressed in both voltage and workload, and the chip reduces its clock speed in order to execute them safely (mainly because of the temperature that the extra consumption produces). Moreover, the load is so heavy that it drags down the performance of the other instructions running alongside, given the amount of resources and registers it needs.

In other words, using AVX-512 on Intel is synonymous with a frequency drop and therefore lower performance in the work to be done. Despite this drop, Intel remains committed to the instructions and includes them in the P-Cores of Alder Lake, even though they are disabled either by firmware or physically on the die, depending on when the CPU was purchased.
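The two-cycle claim can be illustrated with a toy model. It assumes a single execution unit that issues one instruction at a time and splits any instruction wider than its datapath into several passes. The function name cycles_needed and the array size are hypothetical, and real cores overlap work in ways this sketch ignores.

```python
import math


def cycles_needed(num_elements, element_bits, datapath_bits, vector_bits=512):
    """Cycles for one execution unit to process num_elements with 512-bit instructions.

    Toy model: one instruction issues at a time, and an instruction wider
    than the datapath is split into vector_bits / datapath_bits passes.
    """
    lanes = vector_bits // element_bits
    instructions = math.ceil(num_elements / lanes)
    passes = max(1, vector_bits // datapath_bits)
    return instructions * passes


n = 1024  # hypothetical array of 32-bit floats
print(cycles_needed(n, 32, datapath_bits=512))
print(cycles_needed(n, 32, datapath_bits=256))
```

The model leaves out pipelining, multiple execution ports, memory bandwidth and clock speed, so it only illustrates the 2:1 ratio described in the text and says nothing about real-world results.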
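The trade-off between vector width and clock speed can be put into numbers. This is a sketch with made-up figures: the clock speeds below are hypothetical and the function names are illustrative.

```python
def peak_gflops(flops_per_cycle, clock_ghz):
    """Peak throughput of one core in GFLOP/s."""
    return flops_per_cycle * clock_ghz


def break_even_clock(narrow_flops, narrow_ghz, wide_flops):
    """Clock at which the wide unit only matches the narrow unit."""
    return narrow_flops * narrow_ghz / wide_flops


# Hypothetical figures: 16 FLOPs/cycle at 5.0 GHz on 256-bit units
# versus 32 FLOPs/cycle at a reduced 4.4 GHz on 512-bit units.
print(peak_gflops(16, 5.0))
print(peak_gflops(32, 4.4))
print(break_even_clock(16, 5.0, 32))
```

This only covers the vector portion of a workload. Code that does not use the wide units gains nothing from them and simply runs at the lower clock, which is the loss the text describes.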

So, knowing that AMD uses 2 x 256 bits for these instructions on the Ryzen 7000, and that this carries a performance penalty compared to a single 512-bit SIMD unit per core, the debate is whether this is the smartest option and whether Intel is logically obliged (in theory) to move Raptor Lake from 2 x 256 bits to the reverse of the Ryzen 7000 scheme, i.e. 1 x 512 bits. Alder Lake is left out of the equation for the reasons already discussed.

If Intel managed to do this with no (or minimal) performance loss, Raptor Lake would easily beat the AMD Ryzen 7000 (Zen 4) in AVX-512, since it starts from a higher frequency, though also from higher power consumption (its main handicap). On the other hand, 1 x 512 bits takes up more physical space on the die than 2 x 256 bits because of the hierarchy itself and the layout of the SIMD FMA units, and therefore also a proportionally larger share of the total area of a 13th-gen Core.

Since for now we do not know how either contender performs in pure AVX-512, we can only wait, but we already have the explanation and the background for when the data starts to come out.

About the author

admin
