Intel’s entry into the graphics card market is a fact, and not for a single generation but for several. In other words, there is an Intel GPU roadmap: a series of graphics architectures that will appear over the coming years to compete against NVIDIA and AMD. Let’s see what we know and what we can expect.
So far Intel has announced no fewer than four generations of graphics cards, or at least four successive product ranges, for the coming years: Alchemist, Battlemage, Celestial and Druid. We know the most about Alchemist, since its launch should take place at some point in 2022. About the rest we know rather little beyond the names of the architectures that follow.
What we do know is the naming scheme for each generation. All Alchemist models will be named ARC Axx, such as the ARC A380, while Battlemage models will be Bxx and Celestial models Cxx. It is a logical, if unimaginative, way of naming each product range, and it differs from the usual way of naming products.
However, roadmaps are often partially misleading or change midway, whether through product cancellations or through some products overtaking others. That is why we have decided not only to review the Intel GPU roadmap, but also to square it with the various pieces of information reaching us.
The Intel ARC GPU Roadmap and its architecture
Intel’s ARC Alchemist architecture doesn’t differ much from what the competition offers. As with AMD and NVIDIA, different names are used for units with the same or similar functionality, so we will be concise and focus on what matters.
Intel’s objective is to capture as much market share as possible. To do so it is not going for the jugular of its CPU rival; it wants to take on NVIDIA, which is why Alchemist is designed to compete head to head with the RTX 30 series. Comparing unit by unit, we find a ray tracing unit very similar to NVIDIA’s and superior to AMD’s, able to traverse the BVH tree in hardware. Alchemist also includes an equivalent of the Tensor Cores, which the Radeons lack, and it has 128 ALUs per shader unit instead of 64.
So the first generation of Intel ARC is more of a first entry that says little or nothing about Intel’s ambitions and future plans: a simple calling card in a market so far dominated by the AMD and NVIDIA duopoly.
The development of Ponte Vecchio is key
We know two things about Ponte Vecchio, Intel’s GPU for high-performance computing. The first is that it will not appear in PCs, since it is a design for supercomputers and high-performance computing systems. The second is that several of its concepts will reappear throughout the Intel GPU roadmap. Most important of all, the knowledge accumulated during its development is what will allow Intel to deploy the next generations very quickly compared to the competition. According to chief architect Raja Koduri, we can expect the same unit to be used in both CPUs and GPUs.
Meteor Lake will be a new architecture that allows tiled GPUs (or chiplets) to be integrated into 3D packaging. This is very exciting, since it will make it possible to offer dedicated graphics card performance with the efficiency of integrated graphics.
Among the things Ponte Vecchio uses are Intel’s new 3D packaging and silicon bridge technologies, Foveros and EMIB. These will be key to realizing the Intel ARC roadmap with GPUs built from several tiles or chiplets instead of a single monolithic chip. Intel will not be the only one to take this route, but it has made more progress than the competition.
The importance of TSMC’s 3nm node in Intel’s roadmap
The agreement under which TSMC will build Tile GPUs for both Intel’s graphics cards and its processors will let Pat Gelsinger’s team use the Taiwanese foundry’s 3nm node long before NVIDIA or AMD do. The reason? The small size of each tile, which is key to quickly deploying the successive generations of ARC GPUs. However, the Tile GPU developed for Ponte Vecchio does not have enough FP32 ALUs to take on the RTX 4090.
So Intel has decided to take advantage of its privileged access to the 3nm node to create a Tile GPU with more computing power than the one in Ponte Vecchio, in order to offer a top-of-the-line GPU much more powerful than what NVIDIA can achieve with the RTX 4090 Ti. To do this, it will use the same Tile GPUs as Meteor Lake and Arrow Lake. The difference is that dedicated GPUs will use configurations of 2, 4 and, who knows, perhaps 6 or even 8 tiles on the same GPU.
We can’t give official figures, but there is talk of “320 EU” configurations per 3nm Tile GPU in Intel’s roadmap, which translates to 2560 FP32 ALUs in the single-tile configuration. That would allow Intel to have a GPU with more than 20,000 “cores” in the high-end range; however, at the moment we do not know whether we will see it as Battlemage or Celestial. In any case, the name is the least important part.
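The arithmetic behind those rumored figures can be checked with a short sketch. It assumes 8 FP32 ALUs per EU, which is simply the ratio implied by 320 EU giving 2560 ALUs, and it uses the speculated tile counts mentioned above, not confirmed products.

```python
# Back-of-the-envelope check of the rumored figures (not official data).
EUS_PER_TILE = 320       # rumored EU count per 3nm Tile GPU
FP32_ALUS_PER_EU = 8     # assumption: ratio implied by 320 EU -> 2560 ALUs


def fp32_alus(tiles: int) -> int:
    """Total FP32 ALUs for a GPU built from `tiles` identical Tile GPUs."""
    return tiles * EUS_PER_TILE * FP32_ALUS_PER_EU


if __name__ == "__main__":
    for tiles in (1, 2, 4, 6, 8):
        print(f"{tiles} tile(s): {fp32_alus(tiles)} FP32 ALUs")
```

Under these assumptions, only the 8-tile configuration passes the 20,000 mark, with 20,480 FP32 ALUs.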
How does Intel intend to unite several different GPUs into one?
Here we enter an extremely interesting topic. Desktop GPUs usually draw from a single screen list per frame, so if we want to use several of them the work has to be divided somehow. The classic options each have a drawback:
- Alternating frames, so that each GPU renders a different frame. This means the CPU must have the screen lists for the upcoming frames prepared in advance. Beyond a certain point that is no longer possible, and the approach stops scaling with the number of GPUs.
- Dividing the frame into several regions of the screen. The problem is that everything up to rasterization is done not at screen level but on the geometry of the world, so it is done by a single GPU.
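The two schemes in the list can be sketched as simple work-assignment functions. This is only an illustration: the function names are hypothetical, the split uses horizontal bands as an assumed partition, and a real driver also has to handle synchronization and memory transfers.

```python
from typing import List, Tuple

# Illustrative sketch only; these function names are hypothetical.


def alternate_frame_gpu(frame_index: int, num_gpus: int) -> int:
    """Alternate frames: frame i is rendered by GPU i mod N (round-robin)."""
    return frame_index % num_gpus


def split_frame_bands(height: int, num_gpus: int) -> List[Tuple[int, int]]:
    """Split frame: give each GPU one horizontal band [y_start, y_end)."""
    bounds = [height * i // num_gpus for i in range(num_gpus + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    print([alternate_frame_gpu(i, 4) for i in range(8)])
    print(split_frame_bands(1080, 4))
```

With four GPUs alternating frames, four frames are in flight at once, which is why the CPU has to stay that many screen lists ahead.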
Intel’s idea for its future GPUs is easy to understand. First, the scene is rendered by a single Tile GPU, but without applying shader programs or textures to any graphics primitive. The goal is to know from the outset where each polygon of the scene will end up and which ones will not be visible and must be discarded. Above all, this pass is used to create the screen lists, one for each portion of the frame, so that each tGPU renders its own part of the scene.
This is neither more nor less than adopting the same solution as tile rendering, with the difference that the geometry is sorted before the final rendering of the scene. The pre-render pass that creates the lists runs on the compute pipeline, which allows multiple Tile GPUs to be used in parallel from the start.
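The sorting step can be illustrated with a minimal CPU-side sketch. It is not Intel’s implementation: it assumes triangles already projected to screen space, a screen cut into equal vertical strips (one per tGPU), and binning by bounding box only. It discards triangles that fall completely off screen but does no occlusion testing, and `bin_triangles` is a hypothetical name.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]  # already projected to screen space


def bin_triangles(
    triangles: List[Triangle],
    width: int,
    height: int,
    num_tgpus: int,
) -> Dict[int, List[int]]:
    """Build one screen list (triangle indices) per Tile GPU.

    Assumption: the screen is cut into `num_tgpus` vertical strips of
    equal width, and a triangle is binned by its bounding box only.
    """
    lists: Dict[int, List[int]] = {g: [] for g in range(num_tgpus)}
    strip_w = width / num_tgpus
    for idx, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]
        # Discard triangles whose bounding box is entirely off screen.
        if max(xs) < 0 or min(xs) >= width or max(ys) < 0 or min(ys) >= height:
            continue
        # Clamp the covered strip range to the valid strip indices.
        first = max(0, int(min(xs) // strip_w))
        last = min(num_tgpus - 1, int(max(xs) // strip_w))
        for g in range(first, last + 1):
            lists[g].append(idx)
    return lists
```

A triangle that straddles a strip boundary lands in both lists, so each tGPU can render its region without consulting the others, at the cost of some duplicated work.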