True, for video games multiple GPUs did (and in the very limited applications that remain today, still do) need the same data in memory. An affordable solution with inter-GPU bandwidth comparable to a GPU's own on-board memory bandwidth has never been found.
There have been a few cards with 2 GPUs on a single circuit board, built with the idea of creating a kind of multi-GPU ring bus with shared memory, but as far as I know it never got past step 1: a limited-bandwidth SLI / Crossfire bridge on the PCB instead of truly equivalent memory access.
So it has become less and less interesting for all parties. The (VR / ray tracing) rendering complexity keeps increasing, and optimizing all of that for multi-GPU in every game is a financial and technical hassle. Negative reviews about microstutter, and games where SLI / Crossfire even hurt performance, eventually made it die out in my opinion.
With DX12 you can now put multiple video cards to work independently of each other, and you can also use their memory separately for certain tasks: for example, offload physics, spatial audio, AI opponents and ray tracing to a secondary GPU + VRAM, and keep the rendering + textures on the fastest GPU + VRAM. That way you could, for the first time, actually use some of the extra VRAM and save memory on the primary GPU. However, the amount of memory saved is relatively small, while people expect a second card to nearly double performance. It is possible now, but it's technically challenging and hardly anyone has a second fully featured GPU, so in practice almost nobody can use it.
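To give an idea of what "independently" means here, below is a minimal sketch (not production code, error handling stripped down) of the DX12 explicit multi-adapter starting point: enumerate the adapters via DXGI and create a separate ID3D12Device per GPU, each with its own queues and its own VRAM.

```cpp
// Minimal sketch: enumerate GPUs with DXGI and create an independent
// D3D12 device per hardware adapter, the starting point for DX12
// explicit multi-adapter. Error handling is kept to a bare minimum.
#include <d3d12.h>
#include <dxgi1_4.h>
#include <wrl/client.h>
#include <vector>

using Microsoft::WRL::ComPtr;

std::vector<ComPtr<ID3D12Device>> CreateDevicePerGpu()
{
    ComPtr<IDXGIFactory4> factory;
    CreateDXGIFactory1(IID_PPV_ARGS(&factory));

    std::vector<ComPtr<ID3D12Device>> devices;
    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i)
    {
        DXGI_ADAPTER_DESC1 desc = {};
        adapter->GetDesc1(&desc);
        if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE)
            continue; // skip the software (WARP) adapter

        ComPtr<ID3D12Device> device;
        if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                        IID_PPV_ARGS(&device))))
        {
            // Each device has its own command queues and its own VRAM heaps,
            // so e.g. ray tracing or physics work can be submitted to GPU 1
            // while GPU 0 keeps doing the main rendering.
            devices.push_back(device);
        }
    }
    return devices;
}
```

The catch is everything after this point: the game itself has to decide what runs where, copy results between the devices over PCIe, and keep the frames in sync, which is exactly the extra effort hardly anyone is willing to spend.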
AMD takes an approach where you could theoretically gain something with 2x 16GB. However, 16GB today is so much VRAM that the second 16GB won't become relevant for a few more years, and on that time horizon the GPUs themselves will probably be too slow to make use of it anyway.
With Nvidia you have the 3070 with 8GB, the 3080 with 10GB, and the 3090 with 24GB. There, a 3080 with 10GB, or within Nvidia's own lineup the 3090, would be a more logical choice than 2x a 3070. While 16GB and 24GB is already a lot of VRAM, you may wonder how far you will get with 8GB of VRAM in, say, 2023. Offloading enough secondary tasks to make up the 2GB difference between the 3070 and the 3080 won't work out any time soon, so even after all that extra technical effort, a single 3080 remains the better choice than 2x 3070 in any game architecture imaginable.
It's a different story for simulations. That kind of work is often much easier to parallelize, and the initial design usually already pays attention to parallel algorithms and a plan to distribute the calculations over several computers. For such software, working with multiple independent GPUs is child's play compared to multiple computers. Such software usually addresses the GPUs directly, but with DX12 multi-GPU it could probably also distribute some calculations over the cards in a sensible way. Workloads like that easily scale at 80-99% efficiency across multiple cards, and they don't require SLI / Crossfire at all.
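As a rough illustration of why this scales so well, here is a minimal sketch of a slab decomposition across independent GPUs. The function submit_slab_to_gpu is a hypothetical placeholder for whatever per-device API the simulation actually uses (CUDA, OpenCL, DX12 compute, ...); the point is only that each card works on its own chunk of the domain in its own VRAM.

```cpp
// Minimal sketch of why simulations scale so easily across independent GPUs:
// a simple slab decomposition with one worker thread per GPU, no shared
// memory and no SLI-style link required.
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical placeholder: run one time step for rows [begin, end) on the
// given GPU. In a real simulation this would launch kernels on that device.
void submit_slab_to_gpu(int gpu_index, std::size_t begin, std::size_t end)
{
    (void)gpu_index; (void)begin; (void)end; // stub for the sketch
}

void step_all_gpus(std::size_t total_rows, int gpu_count)
{
    std::vector<std::thread> workers;
    const std::size_t rows_per_gpu = (total_rows + gpu_count - 1) / gpu_count;

    for (int gpu = 0; gpu < gpu_count; ++gpu)
    {
        const std::size_t begin = gpu * rows_per_gpu;
        const std::size_t end   = std::min(total_rows, begin + rows_per_gpu);
        // Each GPU owns its own slab in its own VRAM; only the thin boundary
        // rows need to be exchanged between time steps, which is why this
        // kind of workload reaches 80-99% scaling efficiency.
        workers.emplace_back(submit_slab_to_gpu, gpu, begin, end);
    }
    for (auto& w : workers)
        w.join();
}
```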