Progress Report September 2021

Written by GoldenX86 and Honghoa on October 11 2021


Hi yuz-ers! Welcome to the latest entry of our monthly progress reports. We have even more GPU rendering fixes, TAS support, 8 player mayhem, input and UI changes, some preliminary work for future big changes, and more!

Yet more AMD specific changes and other graphical fixes

Certain AMD and Intel GPUs were unable to utilize yuzu’s unlock FPS feature with the Vulkan API, due to the lack of driver support for the VK_PRESENT_MODE_MAILBOX_KHR presentation mode. They, however, support VK_PRESENT_MODE_IMMEDIATE_KHR, another mode that allows Vulkan to present at a higher framerate than the screen refresh rate, so epicboy made the necessary changes in order to unlock the FPS on these GPUs. Due to the nature of this presentation mode, this may cause visible tearing on the screen, so bear that in mind if you try this out.
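As a rough illustration of the fallback described above, here is a minimal sketch of choosing a presentation mode. The `PresentMode` enum and `ChoosePresentMode` function are hypothetical stand-ins (so the example builds without the Vulkan SDK), and the preference order is an assumption, not necessarily what yuzu does.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for VkPresentModeKHR, so the sketch needs no Vulkan headers.
enum class PresentMode { Immediate, Mailbox, Fifo };

// Picks a presentation mode from the list the driver reports as supported.
PresentMode ChoosePresentMode(const std::vector<PresentMode>& supported, bool unlock_fps) {
    const auto has = [&](PresentMode mode) {
        return std::find(supported.begin(), supported.end(), mode) != supported.end();
    };
    if (unlock_fps) {
        // Assumed preference: mailbox first (uncapped, no tearing) ...
        if (has(PresentMode::Mailbox)) return PresentMode::Mailbox;
        // ... then immediate (uncapped, but may tear).
        if (has(PresentMode::Immediate)) return PresentMode::Immediate;
    }
    // FIFO (vsync) is the one mode Vulkan requires every implementation to support.
    return PresentMode::Fifo;
}
```

With a driver that only reports FIFO and immediate, asking for unlocked FPS would pick immediate, which is where the possible tearing comes from.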

 And this is with just an RX 550 (Metroid Dread)

Booting a title in Linux with the Vulkan API using the Intel Mesa driver resulted in a crash due to a device loss error. The problem was in the synchronization between the rendering and subsequent presentation of frames.

Previously, yuzu would issue the Vulkan Present command, then wait for the frame to be rendered before continuing with the process. While this was fine for other drivers and vendors, ANV (Intel’s Vulkan driver) expected to have the frame already rendered before this command, causing this error.

epicboy fixed the synchronization behaviour so that yuzu now waits until the frame is fully rendered and ready before presenting it.

With the release of AMD’s Windows driver version 21.9.1, and its equivalent AMDVLK and AMDGPU-PRO Vulkan Linux counterparts, users started noticing crashes in most games right at boot. We rushed once again to blame AMD for breaking another extension, as it wouldn’t be the first time. We even singled out Int8Float16 as the culprit, providing an alternative path that reduced performance on all AMD GPUs running non-RADV drivers.

We were wrong.

Turns out, it was our fault. epicboy found out that during the process of initializing Vulkan, the emulator assigned Int8Float16’s values after its memory was freed. Surprisingly, this only started affecting official AMD drivers recently, after their periodic Vulkan version update. So we had to lay down the pitchforks this time. Performance returned to normal after the introduction of this PR.

AMD Windows users are also familiar with certain stages in Super Smash Bros. Ultimate turning completely white or ghosting, akin to when applications would freeze back in the Windows XP era. Those were the good days.

Ahem, anyway, AMD Radeon GPUs lack support for fixed point 24-bit depth textures, or D24 for short, a relatively common texture format. To bypass this hardware limitation, yuzu uses D32 textures instead, which can cause precision issues during the conversion process. By adjusting the Depth Bias and Polygon Offset of yuzu’s D24 emulation, Blinkhawk solved the issue for good.
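To make the precision mismatch a bit more concrete, here is a small sketch of how a 24-bit fixed point depth maps to and from a floating point value. It assumes the D32 replacement stores 32-bit floats; the function names are hypothetical and this is not yuzu's actual conversion code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Largest value a 24-bit fixed point depth can hold.
constexpr std::uint32_t kD24Max = (1u << 24) - 1;

// Encodes a depth in [0, 1] as 24-bit unsigned normalized fixed point.
std::uint32_t EncodeD24(double depth) {
    return static_cast<std::uint32_t>(std::lround(depth * kD24Max));
}

// Decodes a D24 value into the 32-bit float a floating point depth texture would hold.
float DecodeToD32(std::uint32_t d24) {
    return static_cast<float>(static_cast<double>(d24) / kD24Max);
}

// Converts a depth bias counted in D24 steps into the equivalent depth delta.
double D24BiasToDepth(double bias_in_d24_steps) {
    return bias_in_d24_steps / kD24Max;
}
```

D24 steps are uniform across the whole range, while float steps shrink towards zero, so a bias tuned for one representation does not automatically mean the same thing in the other. That is the kind of mismatch a depth bias adjustment has to compensate for.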

Fair play, please (Super Smash Bros. Ultimate)

Yet another AMD Radeon specific issue is visible when playing The Legend of Zelda: Breath of the Wild. Terrain textures were colourful and corrupted, like when a PC gamer dials up the RGB to 11.

This issue affected our regular suspects, GCN4 devices (Polaris, RX 400/500 series) and older, running on the Windows and Linux proprietary Vulkan drivers. GCN5 (Vega), RDNA1, and RDNA2 devices were unaffected. The problem resided in how we guessed the textures were being handled by the game.

Some information first: there are several ways to handle textures, and in this particular example we need to focus on two, Cube Maps and Texture Arrays.

Cube maps are a cube with its six faces filled with different textures. The coordinate used to fetch the data, unlike the regular X and Y values, is a single direction vector (a versor, or unit vector) originating from the center and pointing to the surface of the cube.

Texture arrays on the other hand are just as the name implies, an ordered array of textures one after the other, with X and Y used for positioning information inside the texture, and a Z axis used to determine which texture of the array is in use.

TL;DR, one is a sphere and the other is a list.
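The difference in addressing can be sketched in a few lines. The face selection below follows the conventional cube map rule (the dominant axis picks the face), with faces numbered +X, -X, +Y, -Y, +Z, -Z. The struct and function names are hypothetical, and real hardware adds filtering and edge handling on top of this.

```cpp
#include <cassert>
#include <cmath>

struct CubeCoord { int face; float s; float t; };   // face: 0..5 = +X, -X, +Y, -Y, +Z, -Z
struct ArrayCoord { float x; float y; int layer; };

// Cube map lookup: the dominant axis of the (non-zero) direction picks the face,
// and the two remaining components become 2D coordinates on that face.
CubeCoord SampleCube(float x, float y, float z) {
    const float ax = std::fabs(x), ay = std::fabs(y), az = std::fabs(z);
    int face;
    float sc, tc, ma;
    if (ax >= ay && ax >= az) {
        face = x >= 0 ? 0 : 1; sc = x >= 0 ? -z : z; tc = -y; ma = ax;
    } else if (ay >= az) {
        face = y >= 0 ? 2 : 3; sc = x; tc = y >= 0 ? z : -z; ma = ay;
    } else {
        face = z >= 0 ? 4 : 5; sc = z >= 0 ? x : -x; tc = -y; ma = az;
    }
    // Map from [-1, 1] on the face to [0, 1] texture coordinates.
    return {face, 0.5f * (sc / ma + 1.0f), 0.5f * (tc / ma + 1.0f)};
}

// Texture array lookup: X and Y address the texel, Z only selects the layer
// (rounded to the nearest layer and clamped to the valid range).
ArrayCoord SampleArray(float x, float y, float z, int layer_count) {
    int layer = static_cast<int>(std::floor(z + 0.5f));
    layer = layer < 0 ? 0 : (layer >= layer_count ? layer_count - 1 : layer);
    return {x, y, layer};
}
```

Feeding the same three numbers into both functions generally lands on different texels, which is why sampling an array as a cube map can read the wrong data.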

Vulkan allows textures to be marked for conversion into cube maps if later needed, but the sampling (reading) is determined by the texture type specified by the game’s shader instructions. This type is then passed to the graphics API. We do just this, and the game chooses to keep its textures as arrays. However, the AMD driver decides that the textures should be sampled as cube maps, ignoring what the texture view determined just before.

While this should not be a problem on its own, as coordinates can still be pulled out from the wrong texture type, the driver can end up reading the wrong texel. This can result in happy rainbow ground in The Legend of Zelda: Breath of the Wild, or dark and evil terrain in Hyrule Warriors: Age of Calamity.

By disabling Cube Compatibility on GCN4 and older devices running the official AMD proprietary drivers, epicboy returned proper sense to the devastated land of Hyrule.

 I prefer no RGB, thanks (The Legend of Zelda: Breath of the Wild)

 But not THAT dark! (Hyrule Warriors: Age of Calamity)

Speaking of RGB, as discussed back in February, yuzu has to use compute shaders to convert most BGR texture formats in OpenGL to avoid mismatched colours. While this can work fine on most current GPUs, there’s a performance cost that can affect older and slower products.

Users of Kepler series Nvidia GPUs (usually GTX 600/700 series, with several renamed 800 and 900 series too) could experience those performance penalties while also producing rendering corruptions. Instead of using compute shaders to swizzle textures, epicboy figured we could just use Pixel Buffer Objects (or PBO for short) for all affected texture formats instead. This has many benefits: it solves Kepler BGR issues, improves performance on weak devices from any GPU vendor, and is also a required change for A.R.T. (the resolution scaler in development).
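For readers wondering what "swizzling" a BGR texture actually involves, here is a CPU-side sketch that swaps the blue and red channels of tightly packed 8-bit, 4-channel pixels. It only illustrates the data transformation; yuzu performs the conversion on the GPU, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns a copy of the pixel data with the first and third channel of every
// 4-byte pixel swapped (BGRA becomes RGBA, and vice versa).
std::vector<std::uint8_t> SwapRedBlue(std::vector<std::uint8_t> pixels) {
    for (std::size_t i = 0; i + 3 < pixels.size(); i += 4) {
        std::swap(pixels[i], pixels[i + 2]);
    }
    return pixels;
}
```

Green and alpha stay where they are, so applying the function twice gives back the original data.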

A Hat in Time

On the subject of changes needed for the resolution scaler, Blinkhawk implemented fixes to queries and indexed samplers. The result is fewer crashes while playing Luigi’s Mansion 3 on Intel and AMD GPUs, be it on Windows or Linux. This PR helps improve stability for A.R.T. as well.

Another issue affecting Luigi’s Mansion 3 is related to its use of Tessellation Shaders on Vulkan. The Vulkan specification requires the input-assembler topology to be PATCH_LIST in the tessellation stages. Not all games follow this, so manually forcing it solves crashes experienced in some drivers, more specifically, as you may have guessed, AMD’s proprietary ones. All thanks to our fishy epicboy.
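The override itself is conceptually tiny. A minimal sketch, using a hypothetical `Topology` enum in place of Vulkan's `VkPrimitiveTopology` and a hypothetical helper name:

```cpp
#include <cassert>

// Hypothetical stand-in for VkPrimitiveTopology, trimmed to a few values.
enum class Topology { TriangleList, TriangleStrip, PatchList };

// If the pipeline contains tessellation stages, ignore what the game asked for
// and use patch lists, as the Vulkan specification requires.
Topology EffectiveTopology(Topology requested, bool has_tessellation_stages) {
    return has_tessellation_stages ? Topology::PatchList : requested;
}
```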

epicboy has also fixed some minor bugs with the StencilOp, the operation applied to the stencil buffer, a type of data buffer intended to help limit the size of the rendering area. Thanks to this, WarioWare: Get It Together! properly renders its models.

Waa! (WarioWare: Get It Together!)

vonchenplus added support for the legacy GLSL gl_Color and gl_TexCoord attributes into our Vulkan backend, so that any game that uses them can render properly when using this API.

Both these attributes are part of a set of attributes with specific definitions and uses, but they were deprecated in newer versions of OpenGL in favour of “generic” attributes that the programmer can freely define as they want, based on their needs.

While OpenGL is still able to run shaders that use these legacy attributes for the sake of backwards compatibility, they were already considered obsolete by the time Vulkan was created, which means that this API lacks a fallback.

What vonchenplus did is use generic attributes in Vulkan to emulate these features, so that they behave exactly as the legacy GLSL attributes.

After that, vonchenplus corrected the definition of the values in an enum used for blending textures.

Both these changes affect DRAGON QUEST III: The Seeds of Salvation, fixing the graphical bugs present in this game.

DRAGON QUEST III: The Seeds of Salvation

Tool-assisted speedrun

MonsterDruide1 has added TAS support to yuzu! This means precise input commands can be recorded and replayed in-game. The format used to store them is the one TAS-nx implemented, and we have a guide on how to enable and use this feature here.
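To give an idea of what such a script looks like to a program, here is a small parser sketch. It assumes each line has the layout `frame buttons left_x;left_y right_x;right_y`, with buttons separated by `;` and `NONE` meaning no buttons pressed. That layout is our reading of the TAS-nx format, so double-check it against the guide; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

struct TasFrame {
    int frame = 0;
    std::vector<std::string> buttons;
    int lx = 0, ly = 0, rx = 0, ry = 0;
};

// Splits "a;b;c" into {"a", "b", "c"}.
std::vector<std::string> SplitSemicolons(const std::string& text) {
    std::vector<std::string> parts;
    std::stringstream stream(text);
    std::string part;
    while (std::getline(stream, part, ';')) parts.push_back(part);
    return parts;
}

// Parses one script line; returns false if it does not match the assumed layout.
bool ParseTasLine(const std::string& line, TasFrame& out) {
    std::istringstream stream(line);
    std::string buttons, left, right;
    if (!(stream >> out.frame >> buttons >> left >> right)) return false;
    const auto l = SplitSemicolons(left);
    const auto r = SplitSemicolons(right);
    if (l.size() != 2 || r.size() != 2) return false;
    try {
        out.lx = std::stoi(l[0]); out.ly = std::stoi(l[1]);
        out.rx = std::stoi(r[0]); out.ry = std::stoi(r[1]);
    } catch (const std::exception&) {
        return false; // a stick value was not a number
    }
    // Assumption: "NONE" marks a frame with no buttons held.
    out.buttons = buttons == "NONE" ? std::vector<std::string>{} : SplitSemicolons(buttons);
    return true;
}
```

For example, under this assumed layout the line `43 KEY_X;KEY_A 32767;0 0;0` would describe frame 43 with two buttons held and the left stick pushed fully along one axis.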

You can access TAS configuration by going to Tools > Configure TAS…

 TAS Configuration window

Other input changes

Let’s start with a nice addition by german77 that will make Super Smash Bros. Ultimate players happy, and Parsec users especially so. There’s now an option to enable 8 player support for XInput devices, at the cost of disabling the Web Applet. A small price to pay for epic fights with your friends.

You can find the option in Emulation > Configure… > Controls > Advanced > Enable XInput 8 player support (disables web applet).

 yuzu Controls configuration window

v1993 later hid the option for non-Windows OSes, as this limitation doesn’t apply outside the Windows SDL builds.

Linux kernel drivers for Joy-Cons use a different naming convention than the ones we use on Windows. Properly following this convention makes the Dual Joy-Con input show up in the device list. german77 thinks of the penguins.

UI changes

With the release of Project Hades, yuzu started using a full Pipeline cache instead of single stages of the graphics pipeline, both in Vulkan and OpenGL. This means parts of our UI were outdated, so your degenerate writer decided to simply update the context menu entries from Shader cache to Pipeline cache.

Following suit, Moonlacer helped replace Use disk shader cache with Use disk pipeline cache. ¡Gracias!

Later on, Moonlacer removed the toggle for Enable audio stretching from the audio settings, as it no longer had any purpose. As a general rule, the fewer options available, the better.

Morph decided to eliminate a 2-year-old feature, boxcat.

BCAT is a network service used by the Nintendo Switch to add content to its games without needing constant updates. Our old BCAT implementation only added some “gifts” our developers placed into games that were playable at the time. It was unable to support real use cases, like the game updates Animal Crossing: New Horizons regularly pushes.

While the plan is to add support for this in the future, major changes to the file system emulation need to come first.

behunin implemented much needed cleanups to our debug configuration window. View the results below:

Before and After, Debug configuration settings

General bugfixes

epicboy noticed a memory leak that would grow progressively after stopping and restarting the emulation, which was caused by yuzu’s main_process not being cleaned up. By destroying this process when stopping emulation, the resources are properly freed now, fixing the leak.

Additionally, epicboy also mitigated the crashes that happened when emulation was stopped by using std::jthread for worker threads.

std::jthread is a thread class introduced in C++20. Unlike std::thread, it automatically requests a stop and joins when it is destroyed, which eases thread management and simplifies some of the synchronization challenges inherent to multithreading.

With this change, the number of crashes caused by race conditions between worker threads upon shutdown was supposed to decrease, but it also introduced a new bug that would cause yuzu to hang when the emulation was stopped.

This problem was caused by the order in which objects were being destroyed, which epicboy fixed in a follow-up PR.
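As a reminder of why destruction order matters in C++, members of an object are destroyed in the reverse order of their declaration. The sketch below makes that order visible with hypothetical `Tracer` and `Emulation` types; it illustrates the language rule and is not the actual fix.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Appends its name to a shared log when destroyed, making destruction order visible.
struct Tracer {
    std::string name;
    std::vector<std::string>* log;
    ~Tracer() { log->push_back(name); }
};

struct Emulation {
    Tracer shared_state; // declared first, destroyed last
    Tracer worker;       // declared last, destroyed first
};

// Builds and destroys an Emulation, returning the order in which members died.
std::vector<std::string> DestructionOrder() {
    std::vector<std::string> log;
    {
        Emulation emu{{"shared_state", &log}, {"worker", &log}};
    }
    return log;
}
```

If a thread member is declared before the data it uses, that data is torn down while the thread may still be running or waiting, which can lead to a crash or a hang. Declaring the thread last means it is stopped and joined first.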

bunnei also introduced std::jthreads into the cpu_manager, in favour of using this more efficient implementation of the class for yuzu’s host threads.

He also made changes so that the KEvents used in the nvflinger service and queue are owned by these services, instead of being owned by the process for the emulated game, which makes the implementation more accurate.

We’ve been trying to focus on improving our homebrew support, as this isn’t just a powerful tool that only developers use. For example, modders have very powerful homebrew apps that the Switch community enjoys. One important example is UltimateModManager, or UMM for short, which refuses to work on yuzu for now.

To counter this, ogniK allowed homebrew running in yuzu to also create subdirectories instead of just the parent directory, resulting in UMM managing to at least start. This is a temporary solution until our much needed filesystem rewrite is finished. Additionally, Morph pushed a partial implementation of the GetFileTimeStampRaw function, removing several warnings.

This isn’t enough to allow full UMM compatibility, but we’re getting there.

Moving on to a quality-of-life change, some games pop up a confirmation window when trying to stop emulation.

 Like this, End emulation confirmation window

This kind of redundant question is generated by the game itself and while we always had a toggle to skip it, it wasn’t working properly. epicboy comes to the rescue, fixing the toggle for good and saving us precious seconds in quitting our games.

If you wish to change this behaviour, the option is in Emulation > Configure… > General > Confirm exit while emulation is running.

v1993 moved all QtWebEngine data to a more organized centralized folder, improving consistency and reducing clutter from the user’s storage. Instead of a separate folder in %localappdata%, information is now saved in yuzu’s directory, %appdata%\yuzu\qtwebengine by default.

toastUnlimited performed his first stubbing surgery with the audio input services Start, RegisterBufferEvent, and AppendAudioInBufferAuto. This way, Splatoon 2 can now be played via LAN without requiring the use of auto-stub. Happy splatting!
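For those unfamiliar with the term, "stubbing" a service function means accepting the call, logging that nothing real happened, and reporting success so the game carries on. A minimal sketch with hypothetical names (`Result`, `g_warnings`, `StubbedStart`), not yuzu's actual service code:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type: code 0 means success.
struct Result {
    int code;
    bool IsSuccess() const { return code == 0; }
};

// Hypothetical warning log, standing in for the emulator's logger.
std::vector<std::string> g_warnings;

// A stubbed handler: does no real work, leaves a trace in the log, reports success.
Result StubbedStart() {
    g_warnings.push_back("(STUBBED) Start called");
    return Result{0};
}
```

The logged warning is what lets developers later spot which stubs a game actually relies on.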

german77 stubbed SetTouchScreenConfiguration and implemented GetNotificationStorageChannelEvent to make Dr Kawashima's Brain Training for Nintendo Switch playable.

 Dr Kawashima's Brain Training for Nintendo Switch

He has also stubbed Match to make Cruis'n Blast playable. This game experiences some crashes, so there’s more work to do.

 Cruis'n Blast

ogniK implemented the EnsureTokenIdCacheAsync function, making Death Coming go in-game, albeit with some graphical bugs that we have to sort out in the future.

 Death Coming

Morph has been working on implementing what is needed to get Diablo II: Resurrected working. Initially, the Read socket service was implemented, but this mandates also implementing more complex services like Select and EventFD. EventFD is particularly tricky as there is no native support for it on Windows, so a considerable amount of work is needed to properly emulate it in the most popular OS.

As a temporary alternative, Read was just stubbed, allowing the game to boot.
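To show what an EventFD-like object has to do, here is a portable sketch of its counter semantics (the default, non-semaphore mode): writes add to a 64-bit counter, and a read blocks until the counter is non-zero, then returns it and resets it to zero. The `EventCounter` class is hypothetical and covers only the counter. The genuinely hard part, making it usable with Select alongside sockets, is not addressed here.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Portable counter with eventfd-like semantics (default, non-semaphore mode).
class EventCounter {
public:
    // Like write(): adds to the counter and wakes any blocked reader.
    void Signal(std::uint64_t value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            counter_ += value;
        }
        cv_.notify_all();
    }

    // Like read(): blocks until the counter is non-zero, then returns it and resets it.
    std::uint64_t Wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return counter_ != 0; });
        const std::uint64_t value = counter_;
        counter_ = 0;
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::uint64_t counter_ = 0;
};
```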

 Diablo II: Resurrected

Future projects

For anyone wondering about Project A.R.T., the following image speaks for itself.

 Xenoblade Chronicles Definitive Edition

Regarding works in progress, there are more rendering fixes underway, and we’re already starting plans on what to focus on after A.R.T. is finished.

That’s all folks! Thank you for your attention and see you next month!

 

Please consider supporting us on Patreon!
If you would like to contribute to this project, check out our GitHub!

