A new king of graphics card memory is on the horizon. Hynix has already developed HBM3 memory, whose performance will sweep aside everything we have seen so far.
The next GDDR generation, GDDR7, is probably still far off (although Nvidia, together with Micron, seems to have gotten ahead of standardization by creating its own GDDR6X). But powerful GPUs could get another new memory technology in the near future: HBM3. Hynix has now announced that it has developed the first generation of this memory, which could raise graphics memory throughput by almost an order of magnitude. A single HBM3 package has higher throughput than the entire memory subsystem of the GeForce RTX 3080.
SK Hynix, which was also behind the very first generation of this technology (HBM), has now announced the development of its third generation. Or the fourth, if HBM2E, which raised frequency and capacity over HBM2 but otherwise did not differ much, is counted as a separate version.
HBM3 has been talked about for some time, but only now is it slowly getting ready for the real world. In the summer, Synopsys and Rambus announced the availability of memory controller IP, and now the memory itself has been announced too. However, the standard does not seem to have been finally ratified yet, so we will probably have to wait until 2022 for finished devices (GPUs, CPUs and accelerators) with this memory.
Giant throughput
HBM3 will naturally increase performance, and by quite a bit. It is designed for an effective frequency of up to 6400 MHz, which is on the level of LPDDR5, GDDR5 or future overclocked DDR5. HBM started out at relatively low frequencies but is gradually closing that gap. This means the memory will have 3.2 times the throughput of HBM2 running at an effective 2.0 GHz, as used for example on the AMD Radeon VII. It is also twice as fast as 3.2GHz HBM2E, although faster variants beyond the standard have appeared since.
HBM3 will again be manufactured as multi-chip packages with a very wide bus (1024 bits), so in this case one package is not a single chip but several dies stacked inside. One such package will replace several GDDR6 chips, and a graphics card could theoretically make do with just one.
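The speed-up figures above follow directly from the per-pin data rates, because HBM2, HBM2E and HBM3 packages all use a 1024-bit bus. A minimal sketch of that arithmetic, with illustrative function and constant names that are not taken from any Hynix material:

```python
# Per-pin effective data rates in MT/s, as quoted in the article.
HBM3_RATE = 6400
HBM2_RATE_RADEON_VII = 2000
HBM2E_RATE = 3200


def speedup(new_rate_mtps: float, old_rate_mtps: float) -> float:
    """Throughput ratio of two memories that share the same bus width."""
    return new_rate_mtps / old_rate_mtps


if __name__ == "__main__":
    print(f"HBM3 vs HBM2 (Radeon VII): {speedup(HBM3_RATE, HBM2_RATE_RADEON_VII):.1f}x")
    print(f"HBM3 vs HBM2E:             {speedup(HBM3_RATE, HBM2E_RATE):.1f}x")
```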
If HBM3 runs at the announced maximum effective clock of 6400 MHz (or, if you prefer, 6400 MT/s per pin), a single package will have a throughput of 819.2 GB/s, which today is the equipment of a high-end graphics card: the Nvidia GeForce RTX 3080 has less (760 GB/s), the GeForce RTX 3080 Ti more (912 GB/s). Those 819 GB/s are, by the way, twice the throughput of the Apple M1 Max processor with its powerful integrated GPU, which impressed with a 512-bit LPDDR5-6400 memory controller delivering 400 GB/s (more precisely, about 409 GB/s). Next to what HBM3 can do, however, that no longer looks like much.
If a GPU were fitted with just two of these packages (as the Radeon RX Vega 56/64 once was), it would have a throughput of 1.638 TB/s, far more than the most powerful gaming graphics cards so far. Four packages (a 4096-bit bus) would give 3.277 TB/s, and with six packages (a 6144-bit bus), which Nvidia's high-end compute GPUs have recently supported, we get almost 5 TB/s.
HBM3 should also make higher capacities available to GPUs and other processors. The HBM3 memory Hynix has developed will come as 16GB and 24GB packages containing 8 or 12 layers of DRAM respectively. Each 16Gb (2GB) die is only about 30 micrometers thick, and the dies are connected by TSVs (through-silicon vias, vertical wires passing through the chip).
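The per-package figure can be checked with the usual peak-bandwidth formula: transfers per second per pin, times bus width in bits, divided by 8 bits per byte. The sketch below applies it to one HBM3 package and to the M1 Max memory interface. The function name is illustrative, and the result is a theoretical peak, not a measured value.

```python
def peak_bandwidth_gb_s(rate_mtps: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal, 1 GB = 1000 MB).

    rate_mtps      -- effective data rate per pin in megatransfers per second
    bus_width_bits -- total width of the memory bus in bits
    """
    bytes_per_transfer = bus_width_bits / 8      # the whole bus moves this many bytes at once
    mb_per_second = rate_mtps * bytes_per_transfer
    return mb_per_second / 1000                  # MB/s -> GB/s


if __name__ == "__main__":
    # One HBM3 package: 6400 MT/s on a 1024-bit bus.
    print(f"HBM3 package: {peak_bandwidth_gb_s(6400, 1024):.1f} GB/s")
    # Apple M1 Max: LPDDR5-6400 on a 512-bit bus.
    print(f"M1 Max:       {peak_bandwidth_gb_s(6400, 512):.1f} GB/s")
```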
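Since every package brings its own 1024-bit interface, total bus width and peak throughput scale linearly with the number of packages. A small sketch of the two-, four- and six-package configurations mentioned above, assuming every package runs at the full 6400 MT/s (names are illustrative):

```python
PACKAGE_BUS_BITS = 1024   # bus width of one HBM3 package
PACKAGE_RATE_MTPS = 6400  # announced maximum per-pin data rate


def total_bus_width_bits(packages: int) -> int:
    """Combined memory bus width for a GPU with the given number of packages."""
    return packages * PACKAGE_BUS_BITS


def total_bandwidth_tb_s(packages: int) -> float:
    """Theoretical peak bandwidth in TB/s (decimal) across all packages."""
    mb_per_second = PACKAGE_RATE_MTPS * total_bus_width_bits(packages) / 8
    return mb_per_second / 1_000_000  # MB/s -> TB/s


if __name__ == "__main__":
    for n in (1, 2, 4, 6):
        print(f"{n} package(s): {total_bus_width_bits(n)}-bit bus, "
              f"{total_bandwidth_tb_s(n):.3f} TB/s")
```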
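The two package capacities follow from the die density: a 16 Gb die holds 2 GB, so the 16GB package corresponds to 8 stacked dies and the 24GB package to 12. A brief sketch of that conversion (function name is illustrative):

```python
DIE_DENSITY_GBIT = 16  # capacity of one DRAM die in gigabits, as quoted above


def package_capacity_gb(layers: int, die_density_gbit: int = DIE_DENSITY_GBIT) -> float:
    """Capacity of one stacked package in gigabytes (8 bits per byte)."""
    return layers * die_density_gbit / 8


if __name__ == "__main__":
    for layers in (8, 12):
        print(f"{layers} layers -> {package_capacity_gb(layers):.0f} GB")
```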
Maybe in Nvidia Hopper
The most likely users of HBM3 will be makers of AI accelerators and similar hardware. These may be specialized ASICs, but of course also Nvidia's most powerful compute GPUs, perhaps even the upcoming Hopper GPU, which is allegedly chiplet-based and possibly very large, since it is said to draw over 1000 W. That would imply a huge number of units and a lot of computing power, so such a monster might really need memory throughput as crazy as the 5 TB/s of those six HBM3 packages.
Then again, who knows, Nvidia may jump to even more packages. There are probably no hard limits: Intel's upcoming Xeon Sapphire Rapids processors with integrated HBM2E, for example, have up to eight packages under the hood (i.e. an 8192-bit bus). Processors could theoretically integrate HBM3 in their package as well, so we may see it in some future generation of Xeons.
Will HBM3 ever make it into gaming graphics cards?
Gaming graphics cards have not used HBM or HBM2 since the Radeon R9 Fury and later the Radeon RX Vega 56 and 64 left the market (the only exception being the Radeon Pro 5600M with AMD's Navi 12 chip, used exclusively by Apple). The reason is that this memory has to be mounted on a silicon interposer, which makes it much more expensive to use. It is therefore now found only in compute GPUs for servers, which sell at significantly higher prices than gaming cards (or at least did before the cryptocurrency bubble). Still, a return of HBM to gaming graphics cards is probably not completely out of the question.
The arrival of advanced packaging techniques such as Intel's EMIB could help. These are small silicon bridges placed under the chips only where the GPU and the HBM memory need to be connected. No large-area interposer is needed, so the assembly is cheaper. Other companies besides Intel should have similar technologies, so HBM3 may become considerably cheaper to use in the future, reopening the door to gaming graphics cards.
The energy saved by HBM compared with relatively power-hungry GDDR memory (GDDR6X in particular seems to raise the TDP of GeForce RTX 3000 cards significantly) could then go toward higher performance of the GPU itself, which would have more of the power budget left. Alternatively, HBM3's small footprint and lower power consumption could enable better GPU performance in gaming notebooks.