Windows 10 is here, and among its many features and improvements it brings a real revolution to the PC gaming world. The DirectX 12 libraries will help the PC graphics industry continue down the path of low-overhead graphics APIs.
There are two especially significant features in DirectX 12 that PC gamers will certainly appreciate: “Multi-Threaded Command Buffer Recording” and “Async Shaders”. These features can improve performance in future DirectX 12 games by as much as 200 percent in some cases. This article describes how they work.
Multi-Threaded Command Buffer Recording
You can think of a command buffer as a list of tasks that the CPU must organize and send to the graphics card for further processing by the GPU. These tasks include, for example:
- Loading and storing textures
- Generating reflections
- Placing objects on 3D scene
- Lighting and shadowing
Nowadays, almost every gaming PC is equipped with a four-, six- or even eight-core CPU (such as AMD FX or Intel Core i7), but in DirectX 11 games the graphics driver can’t use all of these cores optimally. The reason is that DirectX 11 is poorly suited to breaking a game’s command buffer into small parallel chunks for processing on all the available CPU cores (and their threads). Apart from weak utilization of CPU cores, the DirectX 11 programming model has another shortcoming: much of the CPU time is consumed by the graphics driver and API interpretation (often called overhead), so the CPU has less time to run the game code and the PC displays fewer frames per second.
With DirectX 12, command buffer handling is greatly improved in many ways, for example:
- overhead is greatly reduced, and the remaining work can run on any available CPU core or thread
- the absolute time required to complete complex CPU tasks is notably reduced
- game workloads can be reasonably shared across more than four CPU cores or CPU threads
- additional CPU compute power allows for a higher peak number of draw calls, which makes it possible to create more detailed and immersive game worlds
- all CPU cores can communicate with GPU simultaneously
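The idea behind the list above can be sketched in a few lines of Python. This is only a conceptual model, not the Direct3D 12 API: the function names (record_command_list, build_frame) and the "draw ..." strings are made up for illustration. In a real DirectX 12 game each worker thread would typically record into its own ID3D12GraphicsCommandList, and the finished lists would be handed to the GPU with ID3D12CommandQueue::ExecuteCommandLists. Note also that Python threads do not give a real speedup for CPU-bound work, so the sketch shows the structure only.

```python
from concurrent.futures import ThreadPoolExecutor


def record_command_list(chunk):
    """Record the draw commands for one slice of the scene (runs on a worker thread)."""
    return [f"draw {obj}" for obj in chunk]


def build_frame(scene_objects, worker_count):
    """Record one command list per worker, then submit them all in a fixed order."""
    # Give every worker its own strided slice of the scene.
    chunks = [scene_objects[i::worker_count] for i in range(worker_count)]
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        # map() returns results in the order of the chunks, so the
        # submission order is deterministic even though recording is not.
        command_lists = list(pool.map(record_command_list, chunks))
    # Single submission point: flatten the lists in order, as a queue would see them.
    submitted = [command for command_list in command_lists for command in command_list]
    return command_lists, submitted


# Hypothetical scene with eight objects, recorded by four workers.
scene = [f"object_{n}" for n in range(8)]
command_lists, submitted = build_frame(scene, worker_count=4)
```

The design point is that recording happens in parallel on separate lists, so the workers never have to lock a shared buffer, while the submission stays a single ordered step.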
Take a look at the following CPU time charts showcasing the rendering process of the same scene in DirectX 11 and DirectX 12.
With DirectX 11, the frame is rendered in 29 ms and two CPU cores are left completely unused. Core 1 is heavily overloaded by API (DirectX 11) work.
With DirectX 12, the frame is rendered in 15 ms and all the CPU cores are used. API (DirectX 12) work is evenly distributed across all the available CPU cores, so the frame is rendered in nearly half the time. In simple words, the same PC can render 34 frames per second using DirectX 11 and 66 frames per second using DirectX 12, which is almost twice as fast. The conclusion is that PC gamers can only benefit from upcoming DirectX 12 titles such as the RTS game “Ashes of the Singularity”.
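The frame rates quoted above follow directly from the frame times. Here is the arithmetic as a short sketch; the variable names are mine, and the only inputs are the 29 ms and 15 ms figures from the charts.

```python
def frames_per_second(frame_time_ms):
    """Convert the time needed for one frame (in milliseconds) to frames per second."""
    return 1000.0 / frame_time_ms


dx11_ms = 29.0  # frame time quoted for DirectX 11
dx12_ms = 15.0  # frame time quoted for DirectX 12

dx11_fps = frames_per_second(dx11_ms)  # about 34.5, quoted as 34
dx12_fps = frames_per_second(dx12_ms)  # about 66.7, quoted as 66

time_saved = 1 - dx12_ms / dx11_ms  # about 0.48, i.e. the frame takes roughly half the time
speedup = dx12_fps / dx11_fps       # about 1.93, i.e. almost twice the frame rate
```

Halving the frame time and doubling the frame rate are the same result seen from two sides, which is why both "nearly 50%" and "about 100%" show up for the same measurement.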
Game developers can choose between much better performance, better image quality with a more detailed game world, or both, in the DirectX 12 version of a game compared to the DirectX 11 version of the same application.
For example, in Futuremark’s API overhead test the same graphics card with the same CPU can process up to 16 times more draw calls per second using DirectX 12 than using DirectX 11.
Async Shaders
The GPU execution pipeline in DirectX 11 is serial. GPU tasks are executed one by one, even if they use different GPU resources (for example stream processors, texture units and memory). This is how it looks:
Plenty of time is wasted, because tasks that use different graphics card resources could be executed simultaneously without a performance drop. Take a look at the very same scenario using the DirectX 12 API.
The GPU pipeline is no longer limited by the DirectX 11 API, so it can run more GPU threads at the same time. Compute, lighting and memory tasks can be grouped into separate threads and executed simultaneously. This saves render time, resulting in higher frame rates in DirectX 12 games and better responsiveness for the gamer.
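A toy model helps to see where the saved time comes from. The sketch below compares a serial timeline with an overlapped one. The task names and durations are invented for illustration, and the model assumes the tasks are independent and that the queues never compete for the same hardware. Real GPUs share some resources, so the actual gain is usually smaller. The three queue names loosely mirror the direct (graphics), compute and copy queue types that Direct3D 12 exposes.

```python
from collections import defaultdict

# (task name, queue it runs on, duration in ms); illustrative numbers only
tasks = [
    ("shadow maps", "graphics", 4.0),
    ("lighting", "compute", 3.0),
    ("texture upload", "copy", 2.0),
    ("geometry", "graphics", 5.0),
    ("post-processing", "compute", 2.0),
]


def serial_time(tasks):
    """Serial model: every task waits for the previous one to finish."""
    return sum(duration for _, _, duration in tasks)


def overlapped_time(tasks):
    """Overlapped model: each queue runs its own tasks back to back,
    all queues run side by side, and the busiest queue sets the frame time."""
    per_queue = defaultdict(float)
    for _, queue, duration in tasks:
        per_queue[queue] += duration
    return max(per_queue.values())


serial_ms = serial_time(tasks)     # 4 + 3 + 2 + 5 + 2 = 16.0
async_ms = overlapped_time(tasks)  # graphics 9.0, compute 5.0, copy 2.0 -> 9.0
```

In this idealized case the frame drops from 16 ms to 9 ms simply because the compute and copy work hides behind the graphics work instead of waiting in line.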
To manage GPU threads efficiently, all AMD Graphics Core Next GPUs are equipped with specialized hardware units called ACEs (Asynchronous Compute Engines). Multiple ACEs serve as fundamental building blocks in modern AMD graphics hardware, and they are specifically tuned to parallelize complex jobs with the highest possible performance.
Supported DirectX 12 hardware
The DirectX 12 API is supported only on Windows 10 with the latest AMD Catalyst drivers.
List of supported AMD graphics cards and APUs:
- AMD Radeon™ R9 Series graphics
- AMD Radeon™ R7 Series graphics
- AMD Radeon™ R5 240 graphics
- AMD Radeon™ HD 8000 Series graphics for OEM systems (HD 8570 and up)
- AMD Radeon™ HD 8000M Series graphics for notebooks
- AMD Radeon™ HD 7000 Series graphics (HD 7730 and up)
- AMD Radeon™ HD 7000M Series graphics for notebooks (HD 7730M and up)
- AMD A4/A6/A8/A10-7000 Series APUs (codenamed “Kaveri”)
- AMD A6/A8/A10 PRO-7000 Series APUs (codenamed “Kaveri”)
- AMD E1/A4/A10 Micro-6000 Series APUs (codenamed “Mullins”)
- AMD E1/E2/A4/A6/A8-6000 Series APUs (codenamed “Beema”)
DirectX 12, included in Windows 10, looks very promising for PC gamers. Almost all gaming systems equipped with four-, six- and eight-core CPUs will get a performance boost and probably more detailed worlds in games that support DirectX 12. As mentioned before, this applies only to games that support DirectX 12. Those are not yet available on the market, but by the end of the year we will most certainly see some of them, e.g. Gears Ultimate, Fable Legends, Elite Dangerous or Deus Ex Mankind Divided. Some other titles, such as Batman: Arkham Knight or The Witcher 3, should also get patches with DirectX 12 support. I can’t wait to see The Witcher 3 with increased view distance and a more detailed world!