Nvidia presented its Grace ARM server processor back in April 2021. The processor contains 72 ARM v9 Neoverse (N2 Perseus) cores, and processors can be connected to each other via the NVLink-C2C interface with a throughput of 900 GB/s. It is equipped with an LPDDR5X memory controller with ECC support and achieves a memory throughput of 512 GB/s.
At Hot Chips, Nvidia and ARM showed a slide presenting the performance of Grace in five different workloads:
source: Nvidia
The left chart shows raw performance; the right chart shows performance for systems sized to the same power consumption.
It is logical that a manufacturer tries to present its product in the best possible light. However, some elements are not quite right. First of all, the footnote (in fine print at the bottom) lists the specs only for the right chart, not for the left one. It can therefore be neither ruled out nor confirmed that, in the tests behind the left chart, a single x86 processor was pitted against a dual-processor Grace Superchip module (144 cores + 1 TB/s).
One can also ask why Nvidia is comparing this year's server processor with last year's model from AMD. For several months AMD has been offering Epyc Bergamo, which is more power-efficient and at the same time more powerful than Epyc Genoa, as well as Epyc Genoa-X equipped with V-cache, which in some of the workloads tested here delivers up to several times higher performance at the same TDP.
Epyc Genoa-X: V-cache effect on performance (Phoronix)
For example, in OpenFOAM, according to the Phoronix test, V-cache increases performance by 2.05x. So with Genoa-X, the Epyc bar in the first chart would be not ~10% lower than the Grace bar but roughly 2x higher, and in the second chart it would be not about half as high but 8% higher. This is just to illustrate how much an "appropriate" selection of competing hardware can affect how a comparison chart turns out.
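The OpenFOAM arithmetic can be checked with a minimal sketch. The function names are hypothetical, and the inputs (0.90 and 0.50 as the Epyc bar heights relative to Grace) are only rounded readings of the chart as described in the text, not exact published figures. The sketch also assumes that the 2.05x V-cache gain measured by Phoronix would carry over unchanged to Nvidia's test setup, which is not confirmed.

```python
def rescale_bar(epyc_vs_grace: float, speedup: float) -> float:
    """Epyc bar height relative to Grace (Grace = 1.0) after applying a speedup."""
    return epyc_vs_grace * speedup


def implied_baseline(target_vs_grace: float, speedup: float) -> float:
    """Baseline bar height that would yield the target height after the speedup."""
    return target_vs_grace / speedup


V_CACHE_SPEEDUP = 2.05  # OpenFOAM gain from V-cache, per the Phoronix test

# Left chart: Epyc read as roughly 10% below Grace (assumed reading).
left = rescale_bar(0.90, V_CACHE_SPEEDUP)

# Right chart: Epyc read as roughly half of Grace (assumed reading).
right_if_exactly_half = rescale_bar(0.50, V_CACHE_SPEEDUP)

# Baseline that would correspond to the "8% higher" result quoted in the text.
baseline_for_8_percent = implied_baseline(1.08, V_CACHE_SPEEDUP)

print(f"left chart, Genoa-X vs Grace:      {left:.3f}x")
print(f"right chart, if Epyc exactly half: {right_if_exactly_half:.3f}x")
print(f"baseline implied by +8%:           {baseline_for_8_percent:.3f}")
```

With these rounded inputs the left-chart result lands a little under 2x. The quoted 8% lead in the right chart corresponds to an Epyc baseline slightly above one half of Grace, so "half" in the text should be read as an approximation of the bar height.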
2023-08-31 22:08:32