There is skepticism about whether the big.LITTLE concept of Alder Lake processors can work well on PCs and under Windows. But Intel has a technology meant to ensure that: Thread Director provides feedback to Windows 11 so that the small and large cores are used optimally.
We have already published analyses of the CPU architectures in Alder Lake processors: the large Golden Cove cores and the “small” Gracemont cores. But Intel has also unveiled a third, perhaps equally important component: a special scheduler that assigns programs to where they fit, on a small or a large core, so that “big.LITTLE” works properly. It may well depend on this component whether the skeptics’ reservations are confirmed or whether Alder Lake proves them wrong.
This scheduler, or more precisely a component that assists the scheduler, is called Intel Thread Director and is a technology at the interface of hardware, firmware and software. Earlier information about Alder Lake said that it would have a hardware scheduler for tasks on the processor, but that is not quite accurate. What actually happens is that the CPU scheduler in the operating system receives assistance from the hardware in deciding how to assign tasks to the processor cores. This is not about “acceleration”, just about passing information to software. The decision itself must remain with the operating system, but because it receives more information from the CPU, it can make better decisions.
Intel is working with Microsoft on this, and the result is Thread Director on one side and, on the other, special support for this feature in the new Windows 11. You may have already seen reports that Windows 11 is meant to improve the operation of big.LITTLE processors. This hardware assistance for the CPU scheduler is part of that.
How hardware assistance for the CPU scheduler works
Intel Thread Director uses an integrated microcontroller (or at least some kind of control unit; the technology is probably tied to the Performance Monitoring Unit) that monitors performance and various load indicators for each core and thread of the processor. It should track utilization, the number of load/store operations (which show how far the running program is limited by memory rather than by the raw performance of the core), the number of branches, the types of instructions used, and other behavior patterns. It also monitors whether running processes use demanding instructions such as AVX2 or VNNI.
From this data, Thread Director then generates feedback, which it passes to the running operating system (in this case, Windows 11). This should be communicated via the EHFI (Enhanced Hardware Feedback Interface) interface.
Thread Director tells Windows 11 whether a thread running on the processor is “important” (in the sense that it would benefit from being processed faster) or not. On the one hand, it can indicate that a certain process should go to a large core (even at the expense of those already running there), and on the other hand, it also flags the processes that are the most suitable candidates for moving to small cores.
Windows 11 can then use this to decide which programs or processes it can “throttle” when the processor runs out of free cores or threads, for example which threads to move from large cores to small cores. In this way, Thread Director affects the outcome in situations where the OS must choose which tasks to place on the more powerful cores and which on the more efficient ones. Its telemetry should improve the likelihood that the decision is made correctly.
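The choice described above can be pictured as a ranking problem. The following is a minimal Python sketch, not Intel’s or Microsoft’s actual algorithm: the `ThreadHint` record, its `perf_benefit` score and the function `split_by_benefit` are hypothetical names invented for illustration. It assumes the hardware feedback boils down to one number per thread saying how much that thread would gain from a large core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreadHint:
    """Hypothetical per-thread feedback; not a real Windows or Intel structure."""
    name: str
    perf_benefit: float  # assumed score: higher = gains more from a large core


def split_by_benefit(hints, free_big_cores):
    """Keep the threads that benefit most on large cores, demote the rest.

    Returns (on_big, on_small) as lists of thread names.
    """
    ranked = sorted(hints, key=lambda h: h.perf_benefit, reverse=True)
    on_big = [h.name for h in ranked[:free_big_cores]]
    on_small = [h.name for h in ranked[free_big_cores:]]
    return on_big, on_small


# Made-up example workload with made-up scores.
hints = [
    ThreadHint("game_render", 0.9),
    ThreadHint("indexer", 0.2),
    ThreadHint("video_encode", 0.7),
    ThreadHint("updater", 0.1),
]
on_big, on_small = split_by_benefit(hints, free_big_cores=2)
print("large cores:", on_big)
print("small cores:", on_small)
```

With only two large cores free, the two threads with the highest assumed benefit stay on them and the background-style threads are the ones moved to small cores. A real scheduler would also weigh priorities, affinity and fairness, which this sketch ignores.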
Importantly, Thread Director’s analysis is very fast: Intel says it can “recognize” the behavior of a process in as little as 30 microseconds, while the operating system scheduler is said to be much slower, evaluating running tasks and responding to their character over hundreds of milliseconds. Moreover, the OS could probably never analyze a running program in such depth, because examining which instructions are used and so on would itself consume CPU time and slow everything down. If this work is performed by a special unit integrated in the CPU, however, the analysis has no performance impact and the additional power consumption is probably minimal.
![Intel Thread Director 04 Intel Thread Director 04](https://i0.wp.com/www.cnews.cz/wp-content/uploads/2021/08/Intel-Thread-Director-04.png?resize=900%2C506&ssl=1)
Tuning for performance and power consumption
In general, Thread Director should help identify applications (processes) that really need performance and whose speed is critical to users. It then marks such applications or processes for Windows and recommends moving them to large cores. Conversely, it helps determine which processes are background services and similar tasks, which should preferably not raise power consumption or fan speed, nor take performance away from more important activities. Such processes should be marked for moving to small cores.
For example, the microcontroller can detect situations where a program is actually waiting on memory and is not limited by raw performance. In such cases, it is possible to reduce the frequency and voltage of a large core (and thus reduce power consumption) without compromising performance. Thread Director should also be able to recognize that an application which seems to be fully utilizing a core is actually just spinning in a wait loop that does no real work. By tagging such a process for the operating system, it again helps redirect it to a small core.
Another thing that Thread Director tracks is the use of advanced instructions such as AVX/AVX2 or VNNI (an extension designed to accelerate artificial intelligence). Their use will probably lead to processes being marked for assignment to large cores.
Thread Director will focus not only on performance but also on power consumption, which matters in notebooks running on battery. There, efficiency plays a role in all of these decisions, so the operating system, in conjunction with Thread Director, can preferentially send many jobs to small cores even though on a desktop they would be routed to large cores. The whole technology should be able to make dynamic decisions based on input parameters that include the current power settings as well as the temperature.
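To make the idea of telling a memory-bound program from a spinning one more concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the `CounterSample` fields, the `classify` function and all thresholds are hypothetical and are not counters or values published by Intel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """Hypothetical counter deltas for one thread over one sampling window."""
    cycles: int
    instructions: int
    memory_stall_cycles: int  # cycles spent waiting for loads/stores
    pause_instructions: int   # spin-wait hints such as x86 PAUSE
    vector_instructions: int  # AVX/AVX2/VNNI-style instructions


def classify(sample, stall_threshold=0.5, spin_threshold=0.3, vector_threshold=0.1):
    """Return a coarse label for the sampled thread.

    The thresholds are illustrative guesses, not values published by Intel.
    """
    if sample.cycles == 0 or sample.instructions == 0:
        return "idle"
    if sample.pause_instructions / sample.instructions >= spin_threshold:
        return "spin_wait"     # busy loop: candidate for a small core
    if sample.memory_stall_cycles / sample.cycles >= stall_threshold:
        return "memory_bound"  # clock could drop with little lost speed
    if sample.vector_instructions / sample.instructions >= vector_threshold:
        return "vector_heavy"  # candidate for a large core
    return "general"


# Made-up samples: (cycles, instructions, stalls, pauses, vector ops)
samples = {
    "spinner": CounterSample(1_000_000, 400_000, 10_000, 200_000, 0),
    "db_scan": CounterSample(1_000_000, 300_000, 700_000, 0, 0),
    "encoder": CounterSample(1_000_000, 2_000_000, 50_000, 0, 600_000),
}
for name, sample in samples.items():
    print(name, "->", classify(sample))
```

The point of the sketch is that all three example threads would look equally “busy” to a scheduler that only sees utilization, while the counter ratios separate them into different cases.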
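How the same hint can lead to different placements depending on power state can be sketched as follows. This is a hypothetical Python illustration: `choose_core_type`, the 0 to 1 `perf_benefit` score and every threshold are assumptions made for the example, not documented behavior of Windows 11 or Thread Director.

```python
def choose_core_type(perf_benefit, on_battery, temperature_c,
                     ac_threshold=0.3, battery_threshold=0.7, hot_limit_c=90.0):
    """Pick "big" or "small" for a thread.

    perf_benefit is an assumed 0..1 score of how much the thread gains from
    a large core. All thresholds are made-up illustration values.
    """
    threshold = battery_threshold if on_battery else ac_threshold
    if temperature_c >= hot_limit_c:
        # When the chip is hot, demand a stronger case for a large core.
        threshold = max(threshold, battery_threshold)
    return "big" if perf_benefit >= threshold else "small"


# The same thread lands on different cores depending on the power state.
print(choose_core_type(0.5, on_battery=False, temperature_c=60.0))  # plugged in, cool
print(choose_core_type(0.5, on_battery=True, temperature_c=60.0))   # on battery
print(choose_core_type(0.5, on_battery=False, temperature_c=95.0))  # plugged in, hot
```

In this toy policy a moderately demanding thread gets a large core on mains power, but is sent to a small core on battery or when the chip is hot.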
![Alder Lake processor with eight Golden Cove cores and eight Gracemont energy-saving cores Alder Lake processor with eight Golden Cove cores and eight Gracemont energy-saving cores](https://i0.wp.com/www.cnews.cz/wp-content/uploads/2021/08/Procesor-Alder-Lake-s-osmi-jádry-Golden-Cove-a-osmi-úspornými-jádry-Gracemont.jpg?resize=400%2C492&ssl=1)
How does the system distinguish between large cores, small cores and Hyper-Threading?
In addition to this feedback, the Alder Lake processor also tells the operating system the topology of its cores, i.e. which core (thread) is of which type. This is additional information that the OS needs in order to make the right decisions when assigning programs to cores.
Intel states that, when assigning programs (processes) to individual cores or threads of an Alder Lake processor, there is a performance hierarchy with three levels. The highest performance is achieved when one thread runs on a Performance Core / P-Core (i.e. a large Golden Cove core) and the core’s other thread is unused. In second place is not a large core with HT and both threads active, but an Efficient Core / E-Core, a “small” Gracemont core. Only third in the hierarchy is one thread of a large P-Core when both of its threads are loaded.
Thus, according to Intel, one Gracemont core delivers higher performance than a P-Core delivers on each of its two HT threads (which also implies that a pair of Gracemont cores should offer higher multithreaded performance than one large core).
Therefore, when the processor is gradually loaded by multithreaded programs, the default behavior is that Windows 11 should occupy the large cores first, but with only one thread each, so that single-thread performance does not drop. Once there are no completely free large cores left, tasks are added to the small cores first, and the second threads of the large cores remain empty. Only after the small cores are used up will HT be used, i.e. the second thread of the large cores. This is because although HT increases total multithreaded performance, performance per thread decreases when HT is active (with HT, the performance of both threads is roughly symmetrical, so instead of 100% in single-threaded mode, you get 2 × 60% of single-threaded performance).
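The fill order described above can be modeled in a few lines of Python. This is a simplified sketch under assumptions: the `LogicalCpu` record and the helper functions are hypothetical, the chip is taken to be the 8 + 8 configuration with HT only on the large cores, and a static priority stands in for what is really a dynamic scheduling decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalCpu:
    core_type: str    # "P" (Golden Cove) or "E" (Gracemont)
    core_id: int
    smt_sibling: int  # 0 = first thread of the core, 1 = second HT thread


def build_topology(p_cores=8, e_cores=8):
    """List every logical CPU of a hypothetical chip; only P-cores have HT."""
    cpus = []
    for core in range(p_cores):
        cpus.append(LogicalCpu("P", core, 0))
        cpus.append(LogicalCpu("P", core, 1))
    for core in range(e_cores):
        cpus.append(LogicalCpu("E", p_cores + core, 0))
    return cpus


def placement_priority(cpu):
    """Lower value = filled earlier, following the three-level hierarchy."""
    if cpu.core_type == "P" and cpu.smt_sibling == 0:
        return 0  # one thread alone on a large core
    if cpu.core_type == "E":
        return 1  # small core
    return 2      # second HT thread of a large core


def place_threads(n_threads, cpus):
    """Return the logical CPUs that n_threads busy threads would occupy."""
    ordered = sorted(cpus, key=placement_priority)  # stable sort keeps core order
    return ordered[:n_threads]


cpus = build_topology()
for n in (4, 12, 20):
    used = place_threads(n, cpus)
    summary = {
        "P": sum(c.core_type == "P" and c.smt_sibling == 0 for c in used),
        "E": sum(c.core_type == "E" for c in used),
        "HT": sum(c.smt_sibling == 1 for c in used),
    }
    print(n, "threads ->", summary)
```

With 12 busy threads, all eight large cores run one thread each and four small cores are in use, while no HT sibling is touched; the second threads of the large cores only come into play beyond 16 threads. Using the article’s rough figures, enabling HT on one large core trades one thread at 100% for two at about 60% each, i.e. roughly 120% in total.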
![Intel introduces Alder Lake 05 processors Intel introduces Alder Lake 05 processors](https://i0.wp.com/www.cnews.cz/wp-content/uploads/2021/08/Intel-představuje-procesory-Alder-Lake-05-1024x576.png?resize=900%2C506&ssl=1)
Alder Lake comes in three variants. All chips have eight small cores; the number of large cores is eight, six, or two. Source: Intel
Intel Thread Director at first only in Windows 11
This advanced feedback for the operating system’s CPU scheduler will, at least initially, be used only in Windows 11, so you will need that system for optimal performance and behavior of Alder Lake processors. For Windows 10, a simpler version called Hardware Guided Scheduling might be available, in which case the operating system probably works with less information from the firmware and processor and will therefore not make decisions that are as well-founded as those of Windows 11.
While developing the feature, Intel reportedly focused on getting it working, together with Microsoft, on Windows 11, which will be released this fall. However, at least part of this functionality is planned for Linux as well, so that platform will probably not be at a major disadvantage, at least in the long run.
However, “upstreaming” the support into Linux can take several development cycles while the patches are reviewed, revised and merged. It could take several months, perhaps more than a year, depending on how well Intel’s patches are received by the developers and whether they request changes.