Nvidia has added new models based on the Ada Lovelace architecture to its range of professional graphics cards. The architecture brings up to twice the FP32 throughput and up to twice the performance of the RT and Tensor cores. It also supports DLSS 3, which will be useful not only in games but also in professional applications, where AI accelerates rendering. The cards also carry more memory with ECC (error correction), and they are optimized for the various kinds of extended reality (AR, VR and MR).
The range starts with two RTX 4000 cards: the existing smaller SFF version and a new full-size version. They differ in chip and memory clocks, which affects both performance and power consumption. The SFF version makes do with only 70 W while delivering 19.2 TFLOPS in FP32. That places it between the GeForce RTX 4060 (15.1 TFLOPS) and the RTX 4060 Ti (22.1 TFLOPS), which consume 115 W and 160 W respectively. A 160-bit memory bus is used here, along with the unusual memory capacity of 20 GB. The full-size version with higher clocks reaches 26.7 TFLOPS, slightly below the GeForce RTX 4070, but it needs only 130 W against 200 W. Compared with the professional RTX A2000, the new card offers 2.1 times the graphics performance, 70% more in CAD, twice the rendering performance, 80% more in AI tasks and 2.1 times more in HPC.
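To make the efficiency comparison concrete, the following minimal sketch divides the FP32 throughput by the power consumption for the four cards whose figures are quoted above. It is only an illustration: the dictionary and the helper name `gflops_per_watt` are made up for this example, and peak FP32 throughput per watt is a rough proxy, not a measure of real application performance.

```python
# FP32 throughput (TFLOPS) and power consumption (W) as quoted in the text.
cards = {
    "RTX 4000 SFF": (19.2, 70),
    "RTX 4000": (26.7, 130),
    "GeForce RTX 4060": (15.1, 115),
    "GeForce RTX 4060 Ti": (22.1, 160),
}


def gflops_per_watt(tflops, watts):
    """Return peak FP32 throughput per watt in GFLOPS/W (hypothetical helper)."""
    return tflops * 1000.0 / watts


efficiency = {name: gflops_per_watt(t, w) for name, (t, w) in cards.items()}

# Print the cards from most to least efficient.
for name, value in sorted(efficiency.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {value:6.1f} GFLOPS/W")
```

On these figures the SFF card comes out roughly twice as efficient as either GeForce model, which matches the point of the paragraph.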
| | RTX 4000 SFF | RTX 4000 | RTX 4500 | RTX 5000 | RTX 6000 |
|---|---|---|---|---|---|
| CUDA cores | 6144 | 6144 | 7680 | 12800 | 18176 |
| Tensor cores | 192 | 192 | 240 | 400 | 568 |
| RT cores | 48 | 48 | 60 | 100 | 142 |
| Frequency | approx. 1.56 GHz | approx. 2.17 GHz | approx. 2.58 GHz | approx. 2.55 GHz | approx. 2.51 GHz |
| FP32 performance | 19.2 TFLOPS | 26.7 TFLOPS | 39.6 TFLOPS | 65.3 TFLOPS | 91.1 TFLOPS |
| RT performance | 44.3 TFLOPS | 61.8 TFLOPS | 91.6 TFLOPS | 151 TFLOPS | 210.6 TFLOPS |
| Tensor performance | 306.8 TFLOPS | 327.6 TFLOPS | 634 TFLOPS | 1044.4 TFLOPS | 1457 TFLOPS |
| Interface | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x16 |
| Memory | 20 GB GDDR6 | 20 GB GDDR6 | 24 GB GDDR6 | 32 GB GDDR6 | 48 GB GDDR6 |
| Memory bus | 160-bit | 160-bit | 192-bit | 256-bit | 384-bit |
| Memory speed | 14 Gbps | 18 Gbps | 18 Gbps | 18 Gbps | 20 Gbps |
| Memory bandwidth | 280 GB/s | 360 GB/s | 432 GB/s | 576 GB/s | 960 GB/s |
| Consumption | 70 W | 130 W | 210 W | 250 W | 300 W |
| Dimensions | 69×168 mm | 112×241 mm | 112×267 mm | 112×267 mm | 112×267 mm |
| Ports | 4× miniDP 1.4a | 4× DP 1.4a | 4× DP 1.4a | 4× DP 1.4a | 4× DP 1.4a |
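The FP32 figures in the table can be cross-checked from the core counts and clocks. The sketch below assumes the usual convention that each CUDA core performs one fused multiply-add (counted as two floating-point operations) per clock, and that the approximate frequencies listed are the clocks the FP32 figures refer to. The function name `fp32_tflops` and the data layout are illustrative only.

```python
# (CUDA cores, approx. clock in GHz, FP32 TFLOPS as listed in the table)
specs = {
    "RTX 4000 SFF": (6144, 1.56, 19.2),
    "RTX 4000": (6144, 2.17, 26.7),
    "RTX 4500": (7680, 2.58, 39.6),
    "RTX 5000": (12800, 2.55, 65.3),
    "RTX 6000": (18176, 2.51, 91.1),
}


def fp32_tflops(cuda_cores, clock_ghz):
    """Peak FP32 throughput, assuming 2 FLOPs (one FMA) per core per clock."""
    return 2 * cuda_cores * clock_ghz / 1000.0


# Compare the estimate with the listed value for each card.
estimates = {}
for name, (cores, clock, listed) in specs.items():
    estimates[name] = fp32_tflops(cores, clock)
    print(f"{name:<13} estimated {estimates[name]:6.2f} TFLOPS, listed {listed} TFLOPS")
```

Because the clocks are rounded, the estimates only approximate the listed values, but they land within a few tenths of a TFLOPS for every card.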
The RTX 4500 has 7680 CUDA cores, the same as the GeForce RTX 4070 Ti, and with similar clocks it delivers similar performance; however, it consumes only 210 W instead of 285 W. Its 192-bit interface is paired with 24 GB of memory, although the memory runs at lower clocks. Nvidia says it will offer 40% more performance than the RTX A4500 in generative AI, 60% more in graphics and 50% more in rendering, and the same applies to AI inference and CAD. In Omniverse, performance is supposed to be 2.7 times higher.
The RTX 5000 completes the trio of new cards. It has 12,800 CUDA cores, significantly more than the GeForce RTX 4080 with 9,728. This gives it 34% higher raw performance, yet it makes do with 250 W, while Nvidia states 320 W for the GeForce RTX 4080. This card also has a 256-bit memory bus, but with twice the capacity at 32 GB, and the memory runs much slower (18 Gbps instead of 22.4 Gbps). Against the RTX A5500, it offers 50% more performance in training and generative AI, rendering is 2.1 times faster, and graphics performance is 90% higher. HPC calculations are supposed to be 2.2 times faster on average, and Omniverse as much as 3.3 times faster. The RTX 6000 is already familiar.
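The effect of the slower memory on the RTX 5000 can be estimated with the standard relation between bus width, per-pin data rate and peak bandwidth. The following sketch uses only the figures quoted above (256-bit bus, 18 Gbps versus 22.4 Gbps); the function name `memory_bandwidth_gbs` is a made-up helper for illustration.

```python
def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8


# Both cards use a 256-bit bus; only the memory speed differs.
rtx_5000 = memory_bandwidth_gbs(256, 18.0)
rtx_4080 = memory_bandwidth_gbs(256, 22.4)

# Relative bandwidth deficit of the RTX 5000 versus the GeForce RTX 4080.
deficit = 1 - rtx_5000 / rtx_4080

print(f"RTX 5000:         {rtx_5000:.1f} GB/s")
print(f"GeForce RTX 4080: {rtx_4080:.1f} GB/s")
print(f"RTX 5000 deficit: {deficit:.1%}")
```

So the professional card trades roughly a fifth of the peak memory bandwidth for double the capacity. The same formula reproduces the memory bandwidth row of the table, for example 160 bits at 14 Gbps gives 280 GB/s.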
2023-08-09 17:29:40