Hello yuz-ers! How are you all doing?
In this monthly episode of “yuzu — Trials and Tribulations,” we offer you: major rewrites, massive performance gains, stability improvements, bug fixes and graphical corrections. More after the commercial break.
Project Prometheus has its own article, and it had been guessed to hell and back before the official announcement.
Project Prometheus is a proper multithreaded emulation of the 4 CPU cores the Nintendo Switch offers.
This brings a massive performance boost to users with CPUs with 4 physical cores or more, but for this to happen, a lot of groundwork was needed.
Besides the changes discussed in past reports, old external libraries (which yuzu needs to operate) needed to be updated, and with that, some changes were needed for our Linux users.
With the previous VMM rewrite reducing memory use, the dependencies updated, and all the groundwork done, Blinkhawk pressed the metaphorical nuclear launch button and released Project Prometheus. yuzu can now use up to 6 or 7 CPU threads (in ideal conditions) compared to the previous 2 or 3. You should expect a performance boost in a lot of games, but some titles will still perform mostly the same because they were coded to use only a single thread.
Now, some clarifications are needed for this change. Multicore support can’t be merged into our Mainline release for now due to incompatibilities between Multicore and the Master branch of yuzu. Work is being done to resolve the conflicts, but please have patience. Additionally, users with 2 cores, and either 2 or 4 threads, should not enable multicore as it will most likely result in a performance loss for them due to the lack of physical cores on their CPUs. Our hardware recommendations have been updated accordingly.
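As a rough illustration of the recommendation above, here is a minimal sketch of the decision, not yuzu's actual code. The helper name `should_enable_multicore` and the constant `GUEST_CORES` are made up for this example; the threshold of 4 physical cores comes straight from the guidance in this article.

```python
# Hypothetical helper illustrating the article's guidance; not part of yuzu.
GUEST_CORES = 4  # the Nintendo Switch exposes 4 CPU cores


def should_enable_multicore(physical_cores: int) -> bool:
    """Return True when the host has at least as many physical cores as the guest.

    Only physical cores are counted: a 2-core/4-thread CPU still has
    just 2 physical cores and is expected to lose performance.
    """
    return physical_cores >= GUEST_CORES


if __name__ == "__main__":
    for cores in (2, 4, 8):
        print(cores, "physical cores ->", should_enable_multicore(cores))
```

Note that the check deliberately ignores logical threads, since SMT/Hyper-Threading does not make up for missing physical cores here.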
Rodrigo implemented rendering more than one slice of 3D textures, fixing the most glaring issue in Unreal Engine 4 games, known as “the Rainbow”. This also improves the previous implementation that was used in Xenoblade games.
Your Excellency (OCTOPATH TRAVELER)
Terrain borders in Animal Crossing: New Horizons were fixed in Vulkan by implementing constant attributes. This is not a native extension; constant attributes have to be emulated in Vulkan, as there is currently no official support for them.
Beautiful beaches, now in Vulkan too (Animal Crossing: New Horizons)
bunnei implemented time zone support, and Windows users will find that yuzu automatically detects their time zone. Those who are not on Windows (or who want to spice up their life) can manually change their system time via the “Custom RTC” option in the System settings. Previously, yuzu always assumed the user was located in the GMT+0 time zone.
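To make the difference concrete, here is a small sketch of how a guest clock could be derived from the host's UTC time. It is an illustration only, not yuzu's implementation: the function names are invented, and modelling “Custom RTC” as a simple time delta is an assumption made for this example.

```python
from datetime import datetime, timedelta, timezone


def host_utc_offset() -> timedelta:
    """Ask the host OS for its current UTC offset (illustrative helper)."""
    return datetime.now().astimezone().utcoffset()


def guest_local_time(utc_now: datetime,
                     utc_offset: timedelta = timedelta(0),
                     custom_rtc_delta: timedelta = timedelta(0)) -> datetime:
    """Convert an aware UTC timestamp into the time the guest should see.

    utc_offset=0 reproduces the old behaviour (everyone lives in GMT+0);
    custom_rtc_delta is an assumed model of a user-chosen clock shift.
    """
    return (utc_now + custom_rtc_delta).astimezone(timezone(utc_offset))


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("old behaviour:", guest_local_time(now))
    print("host zone    :", guest_local_time(now, host_utc_offset()))
```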
Rendering bugs are abundant in Xenoblade games due to the complexity of their engine, and they are not trivial to solve.
However, with the help of gdkchan and a pull request from Ryujinx, Rodrigo fixed one of the major rendering issues in Xenoblade Chronicles 2, which was related to front face flipping. Improvements to texture depth sampling resolved some rendering glitches, such as the clouds and the start menu. In addition, better handling of mipmap overlaps solved the constantly moving textures the games previously had. You can see the results below.
Who said yuzu can’t run JRPGs? (Xenoblade Chronicles 2)
Rodrigo also optimized performance in Xenoblade games. One method was profiling the texture cache line by line to find where it bottlenecks. Improving that code shortens frame times, which translates to better performance.
Another way, and not an expected one, was to log less information. This avoids saturating the GPU thread, leaving more room for actual processing and rendering.
ogniK wrote a new Macro JIT (Just-in-Time) to improve the performance of games that spend too much time in the macro interpreter. This should be a global performance boost independent of GPU vendor or API.
When Rodrigo improved yuzu’s ASTC decoding, he also added a rule to use native hardware decoding whenever possible. The Nvidia driver tells yuzu it supports ASTC decoding, but as it turns out, it actually uses an internal software decoder that is much slower than our own implementation. Ignoring the Nvidia driver-level software decoder produced a massive performance improvement when facing the dreaded ASTC texture format in games. Decoding will still be fastest on Intel GPUs, as no software optimization will beat a dedicated hardware decoder.
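The selection rule described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in yuzu: the function name, the vendor strings, and the return values are all hypothetical.

```python
def pick_astc_decoder(vendor: str, driver_reports_astc: bool) -> str:
    """Choose between the driver's ASTC path and the emulator's own decoder.

    Returns "native" or "emulator" (labels invented for this sketch).
    """
    # Nvidia's driver reports ASTC support but decodes in slow software,
    # so its claim is ignored and the emulator's decoder is used instead.
    if vendor.lower() == "nvidia":
        return "emulator"
    # Other vendors: trust the driver only when it reports support.
    return "native" if driver_reports_astc else "emulator"


if __name__ == "__main__":
    print(pick_astc_decoder("Intel", True))   # hardware decoder available
    print(pick_astc_decoder("NVIDIA", True))  # driver claim ignored
```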
Vulkan development is an ongoing process in yuzu, and it has stability problems as expected of a relatively new and complex feature. Blinkhawk made a couple of critical changes to Vulkan and Asynchronous GPU, improving stability considerably.
Speaking of Vulkan, many 2D games had their sprites flipped or completely wrong, and once again, we currently lack the
extension required to fix this. Therefore, Rodrigo implemented support for
This Nvidia-exclusive extension is the only way to solve this problem in a clean manner for now, but a universal method is being
Quack (Duck Game)
The updated libraries (that the migration to Conan brought us) also gave us a new version of the cubeb audio engine which adds support for 6 channel audio, allowing ogniK to add support for surround sound.
Thank you Toxa for the screenshot (The Walking Dead: The Final Season)
Although objectively a small issue, the mouse cursor didn’t hide when running yuzu in full screen, causing a subjectively significant annoyance. Thankfully, Tobi implemented an option to automatically hide the mouse cursor after it has been inactive for some time.
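An inactivity-based auto-hide like this usually boils down to remembering when the cursor last moved. The sketch below shows that idea in isolation; the class name `CursorHider` and the 2.5 second default are assumptions for illustration and are not taken from yuzu.

```python
class CursorHider:
    """Tracks mouse activity and reports when the cursor should be hidden."""

    def __init__(self, timeout_s: float = 2.5):  # timeout value is an assumption
        self.timeout_s = timeout_s
        self.last_move = 0.0

    def on_mouse_move(self, now: float) -> None:
        """Record the time (in seconds) of the latest mouse movement."""
        self.last_move = now

    def should_hide(self, now: float) -> bool:
        """True once the cursor has been idle for at least timeout_s seconds."""
        return (now - self.last_move) >= self.timeout_s


if __name__ == "__main__":
    hider = CursorHider(timeout_s=2.5)
    hider.on_mouse_move(now=10.0)
    print(hider.should_hide(now=11.0))  # still recently active
    print(hider.should_hide(now=13.0))  # idle long enough
```

Passing the current time in explicitly keeps the logic independent of any particular UI toolkit or timer API.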
Recently released in the Early Access build, and coming soon to Mainline, is support for
assembly shaders (
referred to as
A couple of decades ago, there was no common language for the then newly added programmable shading units in GPUs, so the
OpenGL Architecture Review Board decided to create a proper standardised shading language they called
GLASM. In broader terms, this is an assembly language used to communicate with the GPU. This makes it very difficult to work with, and the difficulty is only exacerbated by the limited set of debugging tools available. Furthermore, the language was developed with the hardware limitations of the time in mind.
In the present,
GLASM has been mostly deprecated in favour of easier-to-work-with, high-level shader representations like
While this means faster results for game developers due to less time spent looking at the code, it also has the disadvantage of being far slower for emulators that have to constantly intercept, decode, and recompile shaders on the fly.
In the beginning, support for
GLASM started as just an experiment. Armed with apitrace as his only debug tool, Rodrigo set to his task.
Luckily, and for no apparent sane reason, Nvidia still maintains support for such an old feature, even on the latest OpenGL versions. As such, support for
GLASM soon became a reality, and with this initial assembly shading support in place, Nvidia OpenGL users can enjoy extremely fast shader compilation times.
Due to being closer to the native hardware of the Nintendo Switch, we can also expect some precision fixes, with more coming in the future.
GLASM has some limitations. To list some of them:
This is an Nvidia- and OpenGL-only feature. Other vendors (AMD and Intel) only offer support for the specific assembly shaders that old games require, and this is highly unlikely to change in the future.
Currently, some games experience bugs that will need to be ironed out, such as:
Luigi’s Mansion 3,
Astral Chain, and
The Legend of Zelda: Link’s Awakening.
There are architecture-specific bugs; a Pascal GPU may face different issues than a Turing or Kepler GPU.
You can see the progress from simple things… (Cave Story)
To more complex tests (Fire Emblem Warriors)
I can’t say much here, but there is something going on with both
Project Viper and
That’s all for now, folks! See you in the June article!
Special thanks to BSoD Gaming for the comparative
GLASM video, and Toxa for providing some screenshots.