Wednesday, 30 June 2021

An Initial Inspection of FidelityFX Super Resolution

As I noted in an addendum to last month's post, I really expected AMD to announce that their new upscaling technology (which supplements FidelityFX Contrast Adaptive Sharpening + Upscale) would use temporal accumulation to compete with upcoming technologies like Unreal Engine 5's Temporal Super Resolution. It seemed like the obvious pivot after a couple of years of offering CAS, with their previous tech advertised as "designed to help increase the quality of existing Temporal Anti-Aliasing (TAA) solutions". AMD already have a branded option for tweaking and upscaling already-anti-aliased image buffers so to respond to nVidia's DLSS (offering close to or even beyond anti-aliased native res rendering quality at lower GPU loads due to upscaling significantly lower res aliased internal frames) the natural step would be integrating anti-aliasing, upscaling, and sharpening - something likely best achieved using a temporal buffer, to go significantly beyond the limits of previous spatial-only techniques.

Last month I linked to a few examples of where enthusiastic sharpening can have a quite poor effect on image quality (from effectively wiping out anti-aliasing to classic halo artefacts that any digital photographer well knows from trying to recover additional detail with a careful manual tweaking of Lightroom settings). This has generally limited my desire for CAS in any game where it has been offered (or turning on nVidia Image Sharpening) - when the effect strength is configurable then I'll generally apply it so lightly as to not be worth any performance cost; when I'm not able to tweak strength then it usually seems too much and I've seen some issues during combined upscaling (which do not seem inherent to the tech but an implementation failure that still managed to ship, although I did say at the time "the tech should be rebranded if fixed to work well in the future"). What we have from the new FidelityFX Super Resolution is something that could be considered CAS-Plus - it's the latest version of CAS (with what seems like a less aggressive default strength, still configurable either by the developer or passed on to a user option) along with a more involved integrated upscaler than the old implementation, one that promises to enable much higher upscaling factors without major quality loss.

Although FSR is not yet fully 1.0 and public, what we have already received is, like CAS, purely an upscaling and sharpening solution (with instructions that make it sound like this will not change) so it expects the game to have already applied anti-aliasing. We will be able to poke it in more detail soon ("The source code for FidelityFX Super Resolution 1.0 will be coming to GPUOpen in mid July") but with some games shipping implementations last week, we can give the output a first examination using our version 1.0 eyeballs. My expectations were tempered from not being blown away by CAS before and wondering how the spatial-only upscaling would deal with any aliasing, but it's pretty clear that AMD would not open-source a simple rebranding exercise so this was going to be at least a completely new generation of the ideas originally proposed via CAS and so worthy of examining on their merits rather than previous experiences.

I am actually ideally situated to take advantage of FSR, being one of the many many people (according to May's Steam survey) who have not made the jump from a GTX card to an RTX upgrade or AMD alternative (even if DLSS was offered for any of the titles currently shipping with FSR support). Were it not for shortages leading to terrible availability and ridiculous prices when there is any stock, many of us would likely have upgraded by now (this GTX 1070 shipping note is over five years old); instead we just need a bit more longevity to wait out supply catching up with demand. Unlike most of the other people on a Series 10 GPU, I am trying to drive a (desk-mounted, not living room) 49" 4K panel which benefits from both quality anti-aliasing and as many pixels as possible.

This blog has always been written with an intended audience of indie teams and enthusiastic amateurs with an interest in rendering: me and a few thousand visitors. Unfortunately the commentary around FSR's launch has seemed a bit toxic and divisive (especially questioning some press analysis). While occasionally forthright, I hope readers understand the aim here is to evaluate, give context with how things fit into the wider rendering landscape, and to make an occasional light-hearted jab at shipping flaws from the perspective of people who have & will continue to see that stuff in our own work because rendering is difficult (big publisher funded or not) with some hard choices being mutually exclusive.

The questions about FSR can broadly be split into two: how does this new generation of sharpening with an integrated upscaler compare in performance cost & quality to the basic fallback upscaler in the games that integrate it; and how does the combination of existing anti-aliasing solutions with FSR applied broadly hold up when other games are shipping with temporal anti-aliasing upscaling solutions either integrated into various game engines or via AI acceleration from nVidia (previously discussed last month)? But ultimately it can all somewhat collapse down to: how can developers offer the best subjective quality (be that headroom to guarantee perfect frame pacing, less flickering aliasing, or just a more pleasing or detailed final scene) on every hardware platform?

Dota 2, FSR 50%
The Riftbreaker, FSR Bal (59%)
The Riftbreaker, CAS 75%

Example Implementations

Everyone appears to have used Godfall as their primary example due to a recent marketing push combined with that being a relatively "next gen" game using some of the latest ray tracing effects available under UE4 - it's well-covered by a wealth of existing analysis (inner surfaces, sharply textured and somewhat noisy in the native presentation, get progressively blurry while edge detail can hold up but sometimes makes the underlying lower resolution apparent via stair-step artefacts; clearly beats basic upscaling at like for like framerates). I'm going to poke at two free titles (F2P or in open beta) both using slightly more bespoke rendering pipelines. Dota 2 currently uses the Source 2 engine but I'm not sure if the MLAA it uses has been much updated for years & years while The Riftbreaker uses a custom engine that just moved to a TAA solution they liked so much they completely removed the previous MLAA-optional "raw" rendering choice but, just like the stock configuration of Godfall, this does not offer an integrated upscaler with that TAA - when you use the basic upscaler it does not use the additional information from a jittered sample location in the frame history buffer to more precisely reconstruct the final high resolution image, rather it does a TAA resolve to whatever internal res you specify then upscales that as a spatial-only step likely using a cheap bilinear resample. Both games have internal framerate overlays (baking the numbers into screenshots) and offer a common "camera in the sky" not-truly-isometric perspective while using very different AA techniques as a point of contrast.
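To make the "cheap bilinear resample" concrete, here is a minimal sketch of a spatial-only upscale in pure Python. It is only an illustration under my own assumptions: neither game documents its fallback upscaler, the function name `bilinear_upscale` is made up for this post, and a real implementation would simply be a GPU texture fetch with linear filtering. The point is that each output pixel is a weighted blend of the four nearest internal-resolution pixels and nothing else; no history buffer or jitter information is consulted.

```python
def bilinear_upscale(src, out_w, out_h):
    """Resample a 2D grid of floats (rows of luminance values) to out_w x out_h."""
    in_h, in_w = len(src), len(src[0])
    out = []
    for y in range(out_h):
        # Map the output pixel centre back into source pixel coordinates.
        sy = (y + 0.5) * in_h / out_h - 0.5
        sy = min(max(sy, 0.0), in_h - 1.0)  # clamp at the image border
        y0 = int(sy)
        y1 = min(y0 + 1, in_h - 1)
        fy = sy - y0
        row = []
        for x in range(out_w):
            sx = (x + 0.5) * in_w / out_w - 0.5
            sx = min(max(sx, 0.0), in_w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, in_w - 1)
            fx = sx - x0
            # Blend horizontally on the two source rows, then vertically.
            top = src[y0][x0] * (1 - fx) + src[y0][x1] * fx
            bottom = src[y1][x0] * (1 - fx) + src[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


# A hard vertical edge rendered at a low internal resolution...
low_res = [[0.0, 0.0, 1.0, 1.0]] * 4
# ...picks up in-between grey values when stretched to twice the size.
high_res = bilinear_upscale(low_res, 8, 8)
print(high_res[0])
```

The edge that was one pixel wide internally becomes a ramp in the output. That ramp is the softness the basic upscaler captures show.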

I have uploaded all the png files (to a service that may use compressed jpeg previews for the web viewer but allows you to easily download the genuine bit-identical files), including every 4K capture used for crops. These act as visual aids to the wider points I noted while the games were in motion and I recommend anyone wanting more than this summary, throw up a Dota 2 replay or check out the Prologue for The Riftbreaker to see it running on your own hardware. Accept no (highly compressed video) substitute; everyone ranks fine visual details in subtly different ways.

100% top, FSR 50% bottom
from TL: 70-80-90%; FSR70-80%, 100%
100% top, FSR 50% bottom

Dota 2 offers a simple toggle between FSR and a basic upscaler when the internal rendering resolution is scaled (40-99%) down from (100%) native. There is no option to tweak the sharpness applied and what becomes immediately apparent (centre image above) is that the sharpness Valve has chosen is significantly stronger than other implementations (where FSR is noted as softening flat textured surfaces compared to 100% resolution). Here, the large flat ground of the Dota map leaps off the screen, with 70% (image top right) and 80% scale FSR (centre right) offering almost equal perceived texture detail due to an aggressive sharpen that makes much of the very low contrast textures pop more than their native resolution presentation. The basic upscaler (image left) shows how linearly interpolating between the fewer samples into the underlying texture due to the lower internal resolution applies a blur that smears what soft detail there is available at 100% so that even 90% scale (image bottom left) is washed out. Moving to the leftmost image just above, even scaling FSR down to 50% (that is only using a 1080p internal resolution and no temporal reconstruction of any sort in this FXAA title) then we see an impressive retention of perceived texture detail that even zoomed up to 200% (quad pixels to retain original sharpness - this is the only image used that is not at original output pixel-scale) only just makes clear the sharpening artefacts and some lack of genuine detail from the 100% resolution original that rendered four times as many pixels. The grass texture detail and the dappling on the path in the top render is now more clearly absent in the bottom render and objects like the yellow flowers gain telltale dark halos while the transparent texturing of the tree leaves is clearly losing its clean edge.

I applied some generic (non-AMD branded) image sharpening to some of the unsharpened sub-native resolution captures and a lot of this texture detail can absolutely be recovered by any basic competent algorithm so I would avoid calling the CAS a secret sauce but it is at least doing the job required of it (working against the softening of using a lower internal resolution) well enough without a major performance cost. I also pushed the mip bias values way out and took a few screenshots of that, which captures how FSR compares to native resolution on edge detail retention when all the inner texture detail is blurred away with much smaller mipmaps. Some of the fine edge detail is starting to visibly break down at FSR 75% but lots of the wider edges are being extremely well retained, if rather darkened like a pencil was sketching over the edges, as long as the AA pass caught them. The strong sharpening is starting to grasp for detail not there, so causing mild posterisation in spots. The increased shadow/AO evident may be a side effect of the internal resolution being lowered (or could be an interaction with the mip bias tweaking).
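For anyone wondering what "generic" sharpening means here, the sketch below is a plain unsharp mask on a one-dimensional strip of pixels. It is not AMD's CAS and not the exact filter I applied to the captures; the function name and the test strip are invented for illustration. It adds back the difference between each pixel and a small blur of its neighbourhood, which boosts local contrast and also shows where the dark halos around bright objects come from.

```python
def unsharp_mask_1d(signal, amount=1.0):
    """Sharpen a row of 0..1 luminance values with a 3-tap box-blur unsharp mask."""
    n = len(signal)
    out = []
    for i, v in enumerate(signal):
        left = signal[max(i - 1, 0)]       # repeat the edge pixel at the borders
        right = signal[min(i + 1, n - 1)]
        blurred = (left + v + right) / 3.0
        # Push the pixel away from its local average, then clamp to the displayable range.
        sharpened = v + amount * (v - blurred)
        out.append(min(max(sharpened, 0.0), 1.0))
    return out


# A bright "flower" (0.9) sitting on mid-grey "grass" (0.5).
flower_on_grass = [0.5, 0.5, 0.5, 0.9, 0.9, 0.9, 0.5, 0.5, 0.5]
sharpened = unsharp_mask_1d(flower_on_grass)
print([round(v, 3) for v in sharpened])
```

The grass pixels touching the flower come out darker than the surrounding grass, and the flower's edge pixels clip to white: a halo in miniature. As I understand it, the contrast-adaptive part of CAS exists to scale the sharpening back where local contrast is already high, which a fixed-strength filter like this one does not do.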

When we move to a closer camera in the rightmost image above and more 3D elements that require anti-aliasing, we continue to see this clear softening on edges and evidence of the enlarging and softening of spots where the FXAA has not sufficiently cleaned up an edge in the internal resolution render. In static screenshots, I find the soft edges with sharpened interior detail to often work in favour of this technique, even if it can verge towards a dithered posterisation at points (even with textures left as intended). In motion, it inherits the issues with any MLAA technique in that elements that are unable to be anti-aliased sufficiently flicker enough to draw attention and the soft upscale here ends up drawing added attention to them not entirely unlike a more basic blur applied over the top of aliased edges (in fact, some of these captures catch artefacts very similar to the ones I noted when discussing that original release of No Man's Sky). Dota 2 will never be at the top of my list of rendering greats, and FSR can only do so much with what it is given (as we know it is not designed in any way to provide anti-aliasing itself), but I was pleasantly surprised with how, looking at paused game replays, FSR significantly increased the framerate with only a mild increase in edge shimmer (when in motion) and virtually no softening of inner detail.

Unfortunately, I then looked at the framerate counter as I unpaused from taking screenshots of a frozen moment in time. My initial impression had been that FSR turned my modest GPU (by 2021 standards) into something capable of making a new generation of 4K144 gaming screens sing with this classic title, pushing the final step up from the ~100fps with max settings it was previously limited to (in all three of the 100% captures I cropped and discussed above). FSR 50% was able to hit ~165fps with 70% FSR giving about a 30% boost and 80% FSR a 15% boost with that exceptional image quality. But once my Ryzen 2700X has to process the extra load of running replays, which is more typical of actual gameplay, the GPU utilisation dropped. Not for running 100% scale, which sticks exactly where it was before, but even basic upscaler 80% drops from 150fps to 140fps and, more significantly, 50% FSR loses that 165fps for figures between 120-140fps. Higher internal resolution FSR squeezed in below and so was barely paying for the overhead of the FSR pass over native res. As it affects the basic upscale too, this is clearly something common to not having enough GPU load at lower res or some single-threaded weakness of the older Ryzen CPUs with Dota's workload. It's not a dealbreaker but it's why I haven't embossed the paused-time framerates onto all of these clipped shots (they are all printed onto the original so they're not hidden) to show how much framerate improves as image quality changes. Simply put, in actual motion the gains are not nearly as great as the first impression from static scenes. I hope Valve continue to tweak this implementation (as an e-sport, I'm sure their engine is constantly being tweaked to ensure it can hit those highest refresh rates on select machines) so it can saturate the GPU in motion.

My ideal implementation would allow the user to dial in a desired framerate, with Dota 2 dynamically changing the FSR factor to maintain a constant performance (as many console dynamic resolution implementations do, usually backed by a temporal component). The way FSR is implemented here, with a static percentage chosen and framerates changing based on how much is going on onscreen, seems like it would play best on a VRR/G-Sync display. Unfortunately, as you change the setting in real-time in the menus, the edge shimmer can be seen to "bubble" as the percentage scale changes. Although you can only see around the edge of the settings menu into the game itself, that was enough to make me think that the crawling edges of a dynamic FSR in Dota 2 would not be a good experience, at least unless some temporal solution was used to control the edges reshaping as internal resolutions moved around.
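As a rough sketch of that "dial in a framerate" idea, a dynamic resolution controller can be very small. Everything here is hypothetical: `next_render_scale` and its parameters are names I made up, Dota 2 exposes nothing like it, and the model assumes the frame is GPU bound with cost roughly proportional to pixel count (so proportional to the square of the per-axis scale). The CPU-limited replay behaviour described above breaks exactly that assumption.

```python
def next_render_scale(current_scale, gpu_frame_ms, target_fps,
                      min_scale=0.5, max_scale=1.0, damping=0.5):
    """Pick the per-axis render scale for the next frame from the last GPU frame time."""
    target_ms = 1000.0 / target_fps
    # If cost scales with pixel count (scale squared), the scale that would
    # exactly hit the budget changes with the square root of the time ratio.
    ideal = current_scale * (target_ms / gpu_frame_ms) ** 0.5
    # Only move part of the way each frame to avoid oscillating around the target.
    blended = current_scale + damping * (ideal - current_scale)
    return min(max(blended, min_scale), max_scale)


# Native res at 10 ms per frame (100fps) while aiming for 144fps: step the scale down.
scale = next_render_scale(1.0, gpu_frame_ms=10.0, target_fps=144)
print(round(scale, 3))
```

Damping keeps the scale from lurching, but it does nothing about the edge "bubbling" as the scale moves. That would still need a temporal component to hide.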

from TL: B-Q-UQ-100%CAS; P-75%-75%CAS-100%
from L: Bal, 75%CAS, Ultra-Qual, 100%

The Riftbreaker uses four named FSR levels AMD have suggested but also offers a basic upscaler you can use in 25% increments that allows for CAS to be enabled - this appears to be visually quite similar to enabling FSR, presumably as the game implements the very latest revision of CAS that is based on the same sharpening pass as FSR uses. Those named levels are: Ultra-Quality (77%), Quality (67%), Balanced (59%), and Performance (50%). I would prefer more granular control (or even fixing a desired framerate and a dynamic internal resolution managed by the engine) but this gives us a few fixed points to focus on and compare to the fallback basic upscaler and even using that upscaler but applying CAS. As mentioned earlier, The Riftbreaker uses TAA but does not use TAAU so using a basic upscaler from 50% will not be able to recover all of the texel information via a jitter (looking back four frames to each pixel in the 4K output from four 1080p internal renders), unlike more advanced temporal solutions. (Four frames at 60fps is a remarkably short span of time so even if you think that motion vectors would need to be very good to recover the sub-pixel jitter texel reading, there are likely to be quite a lot of places where TAAU is basically sampling the same spot so doesn't even need great motion vectors.)
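To put pixel counts on those names, the sketch below converts each mode into an internal resolution for a 4K output. The post only quotes percentages; the per-axis divisors used here (1.3, 1.5, 1.7, 2.0) are my assumption, taken from AMD's published FSR 1.0 mode descriptions, and they round to the 77/67/59/50% figures above. The exact rounding a given game applies to the internal width and height may differ by a pixel.

```python
# Assumed per-axis scale divisors for the four FSR 1.0 quality modes.
FSR_SCALE_FACTORS = {
    "Ultra Quality": 1.3,
    "Quality": 1.5,
    "Balanced": 1.7,
    "Performance": 2.0,
}


def internal_resolution(out_w, out_h, factor):
    """Internal render size for a given output size and per-axis divisor."""
    return round(out_w / factor), round(out_h / factor)


for name, factor in FSR_SCALE_FACTORS.items():
    w, h = internal_resolution(3840, 2160, factor)
    pct_axis = 100.0 / factor
    pct_pixels = 100.0 / (factor * factor)
    print(f"{name}: {w}x{h}, {pct_axis:.0f}% per axis, {pct_pixels:.0f}% of the pixels")
```

The per-axis percentage flatters the lower modes: Performance renders only a quarter of the output pixels, and even Ultra Quality shades under 60% of them.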

This lack of TAAU's recovery of static texture information is quickly apparent when comparing (left image above) the detailed ground texture as the game starts (as our mech basks in the scenery while given orders). The 100% render (bottom right) shows excellent fine grass texturing and the geometry edge detail indicates this TAA errs on the side of sharp with slight aliasing from bright glints unable to be completely cleaned up. This comes at the cost of only just beating the screen refresh, hitting 64fps in this least demanding scene (with the ray tracing effects switched off on this old GTX card). Applying CAS to this 100% native render (bottom left) does make everything pop that tiny bit extra but the overhead drops us about 8% to 59fps.

Working up the left side of the image we have quite a different choice made (again, not user configurable) on the strength of the FSR sharpening (and how high contrast the texture work started out) with FSR Ultra-Quality (that's 77% scale) losing quite a lot of that sharply-authored ground detail (while Dota 2 at similar internal resolutions was competitive with native). There could also be a difference in AA solutions at play as Dota 2 just gives FSR the lower res but otherwise barely touched texture detail while TAA could be softening everything before FSR gets involved. The edge detail (eg mech & crystals) gives hints at the lower internal resolution where the TAA couldn't quite suppress artefacts even at native resolution, but is otherwise clean (compare the sword between all clipped captures). It looks good in motion and boosts us to 75fps. Above that FSR Quality (67%) shows incremental softening and texture detail loss but in motion (now 85fps) much of this is less apparent than the direct comparison. At the very top left, Balanced (59%) is where the fine line detail is starting to break into visible stair-stepping in the screenshot and flickering in motion. 93fps also shows it's a point of slightly diminishing returns (although still far from CPU bottlenecked in this engine, which doesn't let you take screenshots of the game when paused so I avoided making a similar discovery to the one in Dota 2). Finally for FSR, at top right is Performance (50%) which is doing well given that it's actually only dealing with a 1080p internal resolution but I'm not sure I'd play a game for extended periods of time looking like this as I'd rather scale back effects to avoid the shimmer that appears in motion and lack of texture detail (wasting the pixel count of the screen) rather than chase that 105fps.

Moving down the right side of that image, we have the basic upscaler and 75% internal res upper right. I would say this broadly shares elements of FSR Balanced and Quality - both of which are using significantly fewer internal pixels to reach their final output. Everything seems a bit softer than it should be when surrounded with all these sharpened and native resolution alternatives and the only real positive point is the 88fps, which puts it somewhere between Balanced and Quality - perceptual quality lining up quite well with rendering cost rather than raw internal resolution. Finally the lower right clip is from CAS applied to the 75% basic upscaled option and here we are given an interesting comparison point - this is effectively almost identical to Ultra-Quality in internal resolution and enjoying a sharpening pass, the only difference is the FSR upscaling (assuming CAS does genuinely use a different code path and so still uses the basic upscaler). I would suggest opening the full sized captures and flipping between them if you really want to assess the differences and why this is running at 80fps when UQ sat at a flat 75fps (with only a tiny increase in pixel count). To my eye, CAS on top of this 75% internal res basic upscale is visibly (if subtly) worse at dealing with edge detail. It's also slightly behind on bringing out that ground texture. Much better than the 75% without CAS, but also losing 10% performance to pay for the sharpening pass. The palm tree fringes, the detail both internal to surfaces and at their edge: I think UQ at 75fps is showing that FSR is more than just the latest generation of CAS (CAS-Plus) and worth paying for on top of the existing CAS performance cost. It's not competing with native res but then that's sitting at 64fps (and when things get more taxing, it takes a big hit).
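Framerate differences are awkward to compare because the same millisecond cost looks bigger at higher framerates, so it can help to convert the overlay numbers quoted above into frame times. This is only back-of-envelope arithmetic on single readings from an in-game counter, and it assumes the scene was GPU bound in every capture; the helper names are mine.

```python
def frame_ms(fps):
    """Frame time in milliseconds for a given framerate."""
    return 1000.0 / fps


def added_cost_ms(fps_without, fps_with):
    """Extra frame time paid when going from one configuration to another."""
    return frame_ms(fps_with) - frame_ms(fps_without)


cas_at_native = added_cost_ms(64, 59)   # 100% render, CAS off -> on
cas_at_75 = added_cost_ms(88, 80)       # 75% basic upscale, CAS off -> on
uq_vs_cas_75 = added_cost_ms(80, 75)    # 75% + CAS -> FSR Ultra-Quality (77%)

print(f"CAS at 100%: {cas_at_native:.2f} ms")
print(f"CAS at 75%: {cas_at_75:.2f} ms")
print(f"UQ over 75% + CAS: {uq_vs_cas_75:.2f} ms")
```

On these readings the sharpening pass costs a little over a millisecond on this GTX 1070, and Ultra-Quality costs a bit under a millisecond more than 75% with CAS, part of which is simply the slightly higher internal resolution.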

The image above on the right compares four versions of the main base, from leftmost: Balanced, 75% basic upscale with CAS, Ultra-Quality, and 100% native (no CAS). The thin geometric detail quickly makes plain the difference in underlying internal resolution and is why I like the idea of a next generation temporal solution that could, at least when the scene isn't too busily moving, have a good chance of recovering all this detail at a much lower per frame rendering cost. There's nothing "wrong" with the middle two results (again, I think that you can make out the difference in FSR vs just CAS in how those thin edges are preserved) but they are clearly on a progression towards the leftmost option, which is starting to show breakup of fine detail into aliased blobs and mild posterisation of the texture detail.

75% CAS traditional shadows
Perf (50%) RT shadows Medium

Another way of looking at FSR is that it unlocks new quality settings at the same output resolution and framerate. Above I managed to get RT shadows (at the lowest quality) enabled via the Performance profile and have compared it directly to the more primitive traditional shadows offered (RT does also use more dynamic lights, but these seem to mainly have an added cost when in the scene rather than at daytime with a single dominant lightsource) while using CAS to tweak a 75% internal resolution. Both scenes have more aliasing than I'd ideally like but the RT shadows rendered at 1080p and not the more detailed quality setting combined with the loss of texture detail makes the scene look significantly worse to my subjective evaluation. It is nice to be able to drop all the way down to 50% internal resolution (where a basic upscale would be significantly worse) but the trade-offs are not where I would go to try and unlock new effects, some of which need at least a bit more resolution than is being fed to them by picking low settings at low internal resolutions. Sometimes the best answer is new hardware after five years using something as your daily workhorse. And I'm left with an open question of if that aliasing and softness could both be sorted out (and even unlock lower internal resolutions, without leaning on FSR) if an integrated jittering TAA with Upscaler was offered - especially in scenes like the one above that contain a lot of stationary or slowly moving elements.

As I played through this beta of The Riftbreaker using a range of settings (and experiencing the quite different performance of different sections), I definitely appreciated being able to claw back performance with better image quality than the basic upscaler could provide on top of the mainly-clean TAA presentation. Right now, it offers the ability to at least look at the new ray tracing options at interactive framerates or to get much the same feeling via UQ to a native render even if it doesn't quite look the same under detailed inspection. In motion the blurring of marching ants wasn't ideal but it also softens the intensity of what would otherwise have already been a visible TAA failure. The sharpening here seems quite subtle and rarely something to negatively note adding extra artefacts. In fact, the main issue with dropping down the quality scale into the lower resolutions is my personal preference against the visual result of the FSR pass having to reconstruct a lot of data and producing slightly weird smoothing - fine in motion but something I'd like a VRS-like or temporal solution to be able to spend extra rendering budget on avoiding starving for crunchy detail when it might otherwise be available.

Dota 2, 100%
The Riftbreaker, 100%

In Conclusion

I have had some concerns over FidelityFX Super Resolution, including holding somewhat of an unflattering mirror up to these two implementations we've explored today, but my summation is actually quite positive. As I've mentioned before, I've seen more than a couple shipping sharpening and upscaling solutions that seem to actively work against the underlying renderer's quality. FSR here has performed admirably on two similar canvases (top down terrains filled with creeps) which use completely different engines (with different feature levels) and totally different anti-aliasing solutions. As internal resolution dropped, both showed increased shimmer but it seemed to be driven by underlying aliasing issues not lack of temporal stability of the spatial-only FSR technique - my leading concern going into this. Beyond a certain point the internal resolution simply doesn't have enough information to avoid some slight weirdness (often mild posterisation) in how it recovers detail without using additional samples (like a history buffer) and I've seen plenty of worse examples than anything I've seen so far with FSR - DLSS 1.0 certainly had more than a bit of weirdness to it.

It seems from my inspection that this is a good future for evolving FidelityFX Contrast Adaptive Sharpening + Upscale and that, especially if more developers provide the power for end users to tweak their own preference for sharpening strength within the bounds the developers consider reasonable, this offers performance without major sacrifices for image quality (until dropping far from the "Quality"-named end of the scale). And, as you can tweak which internal resolution FSR operates at, users can make very informed decisions about which subjective quality they are more interested in boosting. When GPU bottlenecked, the performance cost of FSR is more than reasonable, only slightly increasing the price of the latest CAS pass, and handily goes beyond the blurred result of offering a basic upscale (when comparing at the same output resolution and framerate - ie the lower internal resolution to pay for the FSR pass more than pays for itself vs simply using the cheapest upscale option). The sharpening is mainly adding local contrast where it improves detail while only mildly increasing the visibility of aliasing issues, which are actually just as much of an issue for the upscaling part of the process - often stretching them over more final pixels with somewhat of a blur and not able to reconstruct fine lines the internal resolution couldn't capture properly.

Should you integrate this into your hobby engine? We may have to wait on the source code release to see exactly how easy it is to integrate (I would guess: very easy) but if you've not currently got a good upscaling option and you're not looking at this to replace adding a good anti-aliasing solution (because it is not that) then FSR will definitely be easier than hooking up a complete TAAU solution (or DLSS 2) and tweaking the temporal jitteriness that they all seem to have early on. We will have to see how the next generation of TAAU and DLSS (or competing AI-enhanced anti-aliasing, upscaling, and sharpening algorithms) progress. In the long term, I think we will all join that future. Maybe by version 2.0 of FSR, there will be an optional temporal component that evolves what is possible if you can feed it a history buffer.
