Sunday, 30 May 2021

Fewer Samples per Pixel per Frame

In my VR roundup, it turned into a bit of an impromptu comparison between various anti-aliasing techniques inside one of the most challenging environments we currently have. VR restricts acceptable (input to photons) latency, so can limit pipeline/work buffer design; uses relatively extreme field of view (close inspection of pixel-scale details) combined with ever-increasing raw pixel counts of screens; and demands more than 60 fps with good frame pacing. Add in lens distortion and a temporal reprojection emergency stage (to avoid dropped frames) and it means even without TAA, you’ve got distortion and potentially an extra reprojection stage exaggerating artefacts in the frames you do render.

I think we’re at another rather interesting point for anti-aliasing techniques, as demand for offline-render quality real-time graphics at high resolutions with fewer compromises (like screen-space effect artefacts), enabled via ray tracing acceleration, becomes mainstream. Per-pixel shader calculation costs are going to jump just as we saw during the adoption of HDR/physically-based materials and expensive screen-space approximations like real-time SSAO. Samples per pixel per frame may not be forced to drop as quickly as when consoles jumped from targeting 1080p to targeting 4K but we are going to need some new magic to ensure a lack of very uncinematic aliasing and luckily it looks like we’re getting there.

Sampling History

It is 1994 and I’m playing Doom on my PC. The CRT is capable of displaying VGA’s 640x480 but due to colour palette limitations most DOS games run 320x200 and Doom’s 3D area is widescreen aspect due to the status bar taking up the bottom area. To make matters worse, those of us without the processor required to software render 35 frames per second (Doom’s cap, half refresh for a VGA CRT’s 70Hz) would often shrink the 3D window to improve framerates. All of this is very common for earlier 3D games (I remember playing Quake 1 two years later similarly), which often had difficulties consistently staying in the “interactive framerate” category. For most it was a dream to output near the maximum displayable image while calculating an individual output value for every pixel of every scan-out and that limitation was not primarily due to early framebuffer limitations.

It is 2004 and I’m playing Half-Life 2. Rapid advancement then convergence under a couple of API families for hardware acceleration has meant most of the last decade provided amazing 3D games that grew with hardware capabilities (even if many earlier examples contain somewhat arbitrary resolution limitations). Even 1998’s Half-Life 1 has quickly jumped past low resolution 3D consoles like the PS2. Super-sampling (SSAA) where every final pixel was internally rendered several times then blended (used extensively for offline rendering) was usually too expensive, especially as screen resolutions continued to increase (initially for 4:3 CRT then LCDs moving to 16:9). But by this point, it was standard to use MSAA to blend samples from different polygons that partially covered a single pixel (the saving being that if multiple coverage points were covered by the same triangle, the shader for the final value was only run once, unlike SSAA). Two years later, nVidia would introduce CSAA to allow more coverage sample points than cached values, making it even cheaper to provide very accurate blending between polygon edges. It was even possible to mix in SSAA for transparent textures, where the edge of the triangle is not where the aliasing happens. Note how those 2006 benchmarks are already showing PC games running at the equivalent of 1080p120 with limited MSAA or 60 fps with many many samples per pixel.
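To make the MSAA saving concrete, here is a toy model (a sketch only, with a hypothetical function name and a made-up four-point coverage pattern, not how any real GPU schedules work) that counts shader runs for a single pixel under each scheme:

```python
def shader_invocations(coverage, mode):
    """Count shader runs for one pixel in a toy model.

    coverage: list of triangle ids, one per coverage point in the pixel.
    mode: "ssaa" shades every point; "msaa" shades once per distinct triangle.
    """
    if mode == "ssaa":
        return len(coverage)
    if mode == "msaa":
        return len(set(coverage))
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical 4-point patterns: a pixel fully inside triangle A,
# and a pixel straddling the edge between triangles A and B.
interior = ["A", "A", "A", "A"]
edge = ["A", "A", "B", "B"]

for name, pixel in (("interior", interior), ("edge", edge)):
    print(name, shader_invocations(pixel, "ssaa"), shader_invocations(pixel, "msaa"))
```

In this model the interior pixel costs four shader runs under SSAA but only one under MSAA, and even the edge pixel only costs two, which is where the saving comes from when most pixels sit inside a single triangle.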

It is 2014 and I’m playing the recent reboot of Tomb Raider. MSAA continued to get faster and better in the intervening decade but unfortunately the move to deferred rendering made it extremely difficult to implement efficiently in newer engines (it is not possible in Tomb Raider, although some deferred renderers did get hacked by nVidia drivers that injected MSAA at an acceptable performance cost). The answer to major aliasing, which had been developed during the Xbox 360 generation of consoles, was to run a (MLAA) post-processing pass that looks for high contrast shapes typical of aliased lines and then employs a blur to ease the sudden gradient. This technique requires very clear aliasing telltale line segments so smaller detail like foliage systems becomes a huge issue, which really stands out in the sequel, Rise of the Tomb Raider. It also completely fails if you apply the pass after doing some other image manipulation that distorts the telltale shapes or edge gradients.

In this 2014 era, the use of HDR intermediate values later tonemapped down to the output range, which was just emerging after HL2, also means that internal calculations can output a much wider range of values. With only one sample per triangle per pixel, a new sort of temporal aliasing becomes dominant as the sampled locations move enough for slightly different angles to be calculated grazing incredibly bright light sources in sequential frames. Surfaces sparkle and flicker in regular patterns that become at least as distracting in motion as classic polygon edge aliasing, as I mention in my Dragon Age retrospective. A combination of the two aliasing types is easily recognisable where an angle creates a strong lighting highlight along the silhouette of a surface that may be less than a pixel wide, creating light ants crawling along those polygon edges which are too thin for MLAA to catch. A better solution was required. (And you may note the journey isn’t over as I just linked that to a trailer for a 2021 game with an engine that already uses...)

Temporal Accumulation

The problem is clear. By 2014 we are generally using one (complex) sample per pixel per frame and due to fine geometric detail (older games lacked) plus an extreme range of possible lighting values (not to mention potential ordering issues in how various stages of calculating light and darkness components are blended) this is creating pixel-scale aliased elements that are also often not temporally stable. The screenshots look relatively good but in motion anyone with flicker-sensitivity is immediately distracted by aliasing. By this time the shaders have also become complex enough that various motion vectors (showing how far the object under each pixel has moved in the previous frame) are starting to be calculated to enable somewhat accurate motion blur to be added (very important on consoles targeting 30 fps, where this provides extra temporal information missing when not using higher framerate output - it’s also “more cinematic” because most people are used to 24 fps movies with a 180 degree shutter so accumulating all light that hits the lens for 1/48th of a second before closing the shutter for another 1/48th of a second).

Those motion vectors, if they are sufficiently accurate, can point to the pixel location of the object in the previous frame. So expensive effects like real-time ambient occlusion estimation (checking the local depth buffer around a pixel to see how occluded the point is by other geometry that would limit how much bounce lighting it would likely receive) become an area of experimentation for temporal accumulation buffers. Sample less in each frame, create a noisy estimation of the ground truth, and filter for stability while reprojecting each frame along the motion vectors. Here’s a good walkthrough blog from this time period and subsequent refinements have worked to deal with edge cases like an incremental buffer not handling geometry arriving from off-screen (causing some early examples to obviously slowly darken geometry as it appeared along the edge of the screen).
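As a minimal sketch of that accumulate-and-reproject loop (a one-dimensional "image" with integer motion vectors; the function name, the blend weight and the fallback rule are all assumptions for illustration, not taken from any shipping renderer):

```python
def accumulate(history, current, motion, alpha=0.1):
    """Blend a noisy current frame into a reprojected history buffer.

    history, current: lists of floats, one value per pixel of a 1D image.
    motion: per-pixel integer offsets; pixel i was at i - motion[i] last frame.
    alpha: weight given to the new sample (smaller = more stable, more lag).
    """
    out = []
    for i, value in enumerate(current):
        prev = i - motion[i]
        if 0 <= prev < len(history):
            # Fetch history along the motion vector and blend in the new sample.
            out.append((1 - alpha) * history[prev] + alpha * value)
        else:
            # No valid history (e.g. geometry arriving from off-screen):
            # fall back to the new sample rather than blending with nothing.
            out.append(value)
    return out


# Everything moved one pixel to the right since the previous frame.
history = [0.0, 1.0, 1.0]
current = [5.0, 1.0, 1.0]
print(accumulate(history, current, motion=[1, 1, 1], alpha=0.5))
```

The explicit fallback branch is the interesting design choice: treating missing history as zero instead is one plausible route to the slow darkening along screen edges described above.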

As seen shipping in 2013’s Crysis 3, temporal accumulation for reducing aliasing not only presents the answer to MLAA’s limitations but also can operate after a cheap MLAA pass to rapidly reduce all aliasing. If you consider a slightly jittered pixel centre location (a common enhancement) then a static scene under TAA effectively generates SSAA-quality images, only spreading the samples per pixel out over time. It was popularised further by nVidia with their branding of the process as TXAA, shipping in games in 2014. Some early implementations had major ghosting issues from motion vector precision and understanding when to reject a previous frame’s data as not contributing to this new location. The actual complexity of this problem becomes apparent when you consider how objects in a scene may have changing visibility (especially during motion and animation) or output values (consider a flickering light and the subsequent illumination between frames). Progress has not always been uniform and a couple of times I've stumbled upon an anti-aliasing fail state that's hard to even explain (Dishonored 2 doesn't have very satisfying TAA due to ghosting thin elements and I don't know what the MLAA is doing here to achieve what's visible in this capture). It is a process under constant refinement but in today’s best temporal accumulation implementations it is often relatively rare to see obvious issues. As mentioned, it also errs on the side of a softer final frame so can be combined with a sharpening filter. Unfortunately this can be handled poorly, effectively paying the computational cost of TAA while then also reintroducing exactly the obvious aliasing that it was meant to remove. It also doesn’t help if your TAA implementation is broken on a platform.
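The claim that jittered TAA on a static scene approaches SSAA can be checked with a toy example. This is a sketch under stated assumptions: a single pixel, a hard edge at a made-up position of 0.3 across it, a base-2 Halton sequence for the jitter (a common choice, but an assumption here), and a plain average rather than the exponential history blend a real implementation would use.

```python
def halton(index, base=2):
    """Radical-inverse (Halton) value in [0, 1) for a 1-based index."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def edge_sample(x, edge=0.3):
    """One point sample: white (1.0) left of the edge, black (0.0) right of it."""
    return 1.0 if x < edge else 0.0


def accumulated_pixel(frames):
    """Average one jittered sample per frame over a static scene."""
    total = sum(edge_sample(halton(i)) for i in range(1, frames + 1))
    return total / frames


for n in (1, 4, 16, 64):
    print(n, accumulated_pixel(n))
```

A single frame gives a hard 0.0 or 1.0 (classic aliasing), while the accumulated value drifts towards the true 30% coverage as more jittered frames are blended, which is the same answer many samples per pixel in one frame would have given.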

Ray Tracing with DLSS and The Future

In the last couple of years, the new hotness that really explodes the computational costs of working out a stable final value of each pixel in a frame of a modern game is real-time ray tracing. Thanks to nVidia looking to brand the future, they have shipped all RTX GPUs with dedicated silicon to accelerate BVH intersection tests and machine learning tensor operations (big matrix multiplies, often with sparse data) and at least the former part of that is now also available on current AMD GPUs and consoles plus upcoming Intel discrete GPUs. If you thought the aliasing issues from rasterisation going to physically-based materials and HDR were a concern, welcome to a problem so far beyond that that if you look at the underlying data from a single frame using around one sample per pixel, it looks more like white noise than a coherent scene - accumulation with temporally reliable motion vectors is a must and site of ongoing research. The addition of Tensor cores to RTX GPUs was initially proposed as the place to run AI denoising on that ray tracing output, although most games today still denoise in the general purpose shaders. Luckily, another branch of research was to use those Tensor units to AI-accelerate all anti-aliasing and it has been wildly successful with many reviewers now noting that DLSS 2 outperforms native resolution TAA.

DLSS 1 was a bit of a mixed bag as the AI had to be trained on each game and took an aliased lower resolution image from the game then applied the classic AI Super Resolution techniques to “dream” or “hallucinate” the missing details and softened edges. However, DLSS 2 changed the inputs (this presentation originally convinced me AMD would add AI cores to RDNA2) and so required a buffer of previous low resolution input frames (including depth buffers and motion vectors) while removing the previous individual training requirement, effectively giving the AI the power of temporal accumulation information to generate the final output. So each new frame generated by the game can be run at a much lower resolution than the output, reducing the samples per output pixel, and yet will retain the look of a cleanly anti-aliased native resolution render. We are back to 1994 but rather than peering into a small box, the games look almost as good as offline rendering and output fullscreen. Even when not trained to give the exact same result as native processing, the AI seems to be quite stable and creates pleasing results in motion. It’s a game changer when targeting new screens that can accept 4K frames at or above 120Hz.

But nVidia do not have a monopoly on upscaling while anti-aliasing and more significant upscaling without compromises will be the new normal if my reading of the tea leaves (on samples per pixel per frame) is correct. Reusing information from previous frames is clearly a smart efficiency saving as long as we can reliably determine what information is useful and what isn’t (avoiding failures that create significant artefacts which are as distracting as the aliasing we’re trying to move beyond or the framerate drops we’re trying to avoid). The target of 4K on the PS4Pro forced engines to pivot to smart upscaling strategies such as the use of checkerboarding and a rotated tangram resolve in Horizon: Zero Dawn, reducing GPU costs of each new frame by alternating which pixels in a checkerboard were rendered (then blending on the diagonals for that frame while adding in contributions from the previous frame). Recent years have seen an excellent execution of targeting the fixed scan-out time of non-VRR displays by managing the rendering load around modifying the internal render resolution then upscaling for the final presentation (usually with native UI compositing over the top for maximum text clarity). Even when dynamic resolution scaling is not available on PC, it has forced renderers to provide visually pleasing upscaling that gracefully handles even fine texture transparency and pixel-wide polygon details.
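The dynamic resolution idea can be sketched as a tiny feedback controller. This is an illustration only: the function name, the 16.6 ms budget (a 60Hz scan-out), the clamp range and the assumption that GPU time scales with pixel count are all mine, and real engines add headroom and smoothing on top.

```python
def next_scale(scale, gpu_ms, budget_ms=16.6, lo=0.5, hi=1.0):
    """Pick the next frame's per-axis resolution scale from the last GPU time.

    Assumes GPU cost is proportional to pixel count, i.e. to scale squared,
    so the per-axis correction is the square root of the time ratio.
    """
    target = scale * (budget_ms / gpu_ms) ** 0.5
    return max(lo, min(hi, target))


# A frame that took twice the budget at full resolution drops to roughly 71% per axis.
print(next_scale(1.0, gpu_ms=33.2))
# A frame with time to spare is clamped at native resolution.
print(next_scale(1.0, gpu_ms=8.0))
```

The square root matters because the scale is per axis while the cost is per pixel; halving the pixel count only needs about a 29% cut on each edge.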

The Medium, TAA 50%
The Medium, TAA 75%
The Medium, TAA 100%

The last few years of Unreal Engine 4 have had quite a clean TAA with integrated upscaler (sometimes called TAAU) for dynamic internal resolution (it tracks the sub-pixel jitter so the samples can be correctly distributed even when changing the ratio of internal res to output res; primarily used on consoles, where the APIs for precise frame time calculation and estimation have existed for longer and the fixed platform makes it easier to define an ideal internal resolution window for reliable results that still come close to maximising GPU throughput - the skill is not underutilising the GPU by being too conservative and so being ready for scan-out milliseconds before needed). In the best cases, I am completely happy to run UE4 around 80% resolution (just under 1800p) and let the TAA upscaler reconstruct a soft and clean final image on my 4K PC big screen (getting close to home cinema levels of consuming my vision so making aliasing issues more apparent than someone looking at a distant TV or small monitor). It doesn’t compete with DLSS (in Performance mode that is a 50% resolution so 1080p internal renders when the output is 4K) but then head to heads show DLSS 2 reaches close to image quality parity with UE4 TAA running at 100% internal resolution on PC so clearly dropping down to 1800p is under 70% of the actual sample count (previous percentages are edges vs sample count is area) and ensuring a relatively aliasing free result without AI will err on the side of softer than DLSS Perf. The above captures from The Medium show a clear quality loss at 50% while the differences at 75% are more subtle compared to native internal resolution. The captures from Man of Medan below are where I think TAA with some upscaling is showing quality levels that you would not even imagine possible in the MLAA era (especially noting these captures have significantly fewer samples per pixel per frame than those games from a decade ago).
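Since resolution percentages are quoted per edge while sample counts scale with area, the arithmetic is worth spelling out. A quick sketch (the helper name is hypothetical; the output size is a standard 3840x2160 4K frame):

```python
def internal_resolution(output_w, output_h, edge_scale):
    """Return (width, height, fraction of output samples) for a per-edge scale."""
    w = round(output_w * edge_scale)
    h = round(output_h * edge_scale)
    return w, h, (w * h) / (output_w * output_h)


for pct in (1.0, 0.8, 0.75, 0.5):
    w, h, area = internal_resolution(3840, 2160, pct)
    print(f"{pct:.0%} edge -> {w}x{h}, {area:.0%} of the samples")
```

So 80% on the slider is 3072x1728 and only 64% of the native samples per frame, while a 50% setting (1920x1080, as with DLSS Performance mode at 4K output) renders just a quarter of them.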

Man of Medan, TAA 85%
Man of Medan, TAA 85%
Man of Medan, TAA 85%

With the public release of Unreal Engine 5’s beta shipping with default-enabled Temporal Super Resolution, we are looking at the beginning of non-AI (or at least not running on Tensor cores) TAA plus upscaling that aims to hit the same milestones as DLSS when it comes to low internal resolution. The PR for the UE5 release announces 1080p internal render resolution, aiming to hit the quality bar of 4K native. That is an ambitious target and running the editor (which also uses UE5 TSR by default) there is a lot to appreciate about this beta’s visual quality, well beyond the 50% screenshot above from UE4’s technique (and that was already significantly above some previous branded sharpen plus upscale techniques as implemented in shipping games). We are approaching a point where continued refinement of this path of research will be able to pick away at the final issues and retain detail without turning the results into a mess of sharpening halos or lingering aliasing. From there we have a far more interesting future in which some games will be able to explore the artistic choice to reject such smoothing, rather than fall into them via broken PC releases, or even take the performance wins of significant upscaling while tweaking output to retain more of the underlying grainy component of ray tracing or other contributions (while adding noise to areas where it does not naturally occur and so approach something close to movie film grain that actually looks good but reduces render cost rather than increasing it slightly).

Edit (June 2021): This was written on the assumption that the imminent reveal of AMD's FidelityFX Super Resolution would confirm a very similar technique to UE5's Temporal Super Resolution, directly chasing after DLSS's impressive results at similarly low internal rendering resolutions (using fewer samples than checkerboarding and far fewer than where other TAA upscaling, such as in UE4, shines). It has since been announced that AMD are zagging where others have zigged and will not be using a temporal solution. Worryingly this has come with rather weak results on the one pre-release promotional image used to sell the technology. As I mentioned above, DLSS 1 did not come out of the gates a winner so AMD have plenty of time to iterate or to provide an open equivalent that replicates what Epic are doing with UE5.

Friday, 30 April 2021

VR Review Roundup 1

Since last month, I've had some time enjoying my new PC VR setup [I also switched domain hosts so if anyone has had any problems with this site or any of my other subdomains, let me know]. I can definitely feel the growing room that exists for really pushing the fidelity of this VR display (and with displays only getting higher resolution from here, that will continue), so all of this is currently being given with the caveat that we are fast-approaching the fifth birthday of my GPU - at some point I'm going to be able to get more games looking nicer or enjoying far less time viewing reprojected alternating frames, maybe even at the highest 144Hz refresh rates that this headset can do. One of the difficulties I have in VR is always being able to be as analytical as I'd like while wrapped inside the virtual space and the default tools for capturing moments are not raw grabs (into the actual deformed view fed to the headset) while adding actual DirectX frame capture into the rendering chain might mess with latencies etc (I've yet to look into it). Let's run down some notable things I've played recently and what my perceptions are of the rendering going on:

No Man's Sky

I just couldn't get this working correctly. Not sure if I'm still finding my "PC expert" legs on how to set things up correctly for VR but the flying-through-space loading screen (along with very unstable movement to photons delay) was enough to make me feel slightly unwell & the framerate once I'd landed on a planet simply wasn't where it needed to be (even after tweaking the Index scaling option well below the automatic value). Maybe I needed to poke more at the in-game settings or wipe my previous config file (from before VR was patched into the game) because aiming for 2D 4K60 and aiming for VR numbers are not remotely similar optimisation processes and the game isn't reading the SteamVR requested resolution correctly. It's important to not take this initial post as being my final decree on modern PC VR (from the perspective of someone who previously has mainly been configuring console VR experiences) - it is still early days and I'm still finding my legs for tweaking VR games.


Star Wars: Squadrons

The scale is wrong. Digital Foundry noted something similar during their stream of Doom VR for PS VR last month and it's immediately very noticeable as soon as you get into this game. When sat down, the floor in the game is about at the level of your actual floor but everything is scaled as if you were standing up. If you do stand up and reset the position (your screen initially going to black as soon as you move out of the sweet spot it expects you to be in, avoiding letting you walk and clip through too much geometry) then the virtual floor is clearly at the wrong depth, as you might decide in a game designed only for seated play.

Getting into the game design choices, the cockpits may be accurate to the fictional universe but the often limited front-only (or slit window) view into space removes the field of view advantage of VR while the instruments feel insufficient to give a good sense of where things are around you (again, this may be true to the source material but I'd much rather they offer "upgraded" in-world interfaces rather than lean on an optional floating HUD). There was a later mission around setting off floating reactor cores as large ships passed them (while also skirmishing with fighters) and I realised I basically didn't have a good idea of the 3D space while playing for the majority of that mission. That seems like a failure to really utilise what VR can do. It didn't help to find plenty of threads of others swearing out that mission design, despite being something that theoretically should be cool if it was easy to judge 3D relationships - ultimately I restarted the mission rather than keep banging my head against the third checkpoint, just so I could swap to a ship with a somewhat better canopy and by then I had almost learned it rote (fly here, shoot this, then fly there, shoot that at this timing, etc) so if that had involved a more dynamic setting then I might well have just given up.

The Frostbite temporal anti-aliasing is surprisingly good (considering FoV & pixel density requirements) here. Zero ghosting issues, even with the added difficulty of regular reprojected frames because I couldn't get a high res VR output at a stable frame time budget close to what you'd want, even with lots of settings dropped and including the new [lighting: Low] forward renderer mode that was patched in precisely to try and offer higher framerates for VR. The way fine detail starts to flicker out of reality at certain distances can become visible (even in only one eye at a time for a real headache) but is generally very rare and it's a lot better than constant "army of ants" edge aliasing (especially how that works the other side of lens inverse distortion to be even more distracting than in 2D). As we get higher and higher res panels in VR, we will need to find a better solution than brute force (very high res super-sampled internal rendering) for cleaning object edges and DLSS or TAA (without introducing significant latency) seems like something that's going to be the future (not just for 2D). I was also recently playing Battlefield V (with settings trying to hit a stable 4K60) and the TAA there caused significantly more issues with thin objects fading out of existence so something in the TAA used here (with a decent 'TAA sharpness' slider that's not 90% way way too much sharpening for anyone to want) felt like the best of what EA are doing.

I was definitely far more aware of polish issues than any aliasing flicker or eye discrepancy. Plenty of walk animations seemed to have not been sorted to actually plant feet on the ground so ended up with very obvious skating feet. By no means is this just a tick-box "IK enabled" fix but it's a lot closer to a solved problem, slightly weird to see not working correctly, than it was a decade plus ago (when I was doing some light animation work for video games). Reflections on the black gloss floor of Imperial bases constantly showed that they were not well aligned with cubes for the static positions from where the player would be observing them (it seems like they could have generated enough static cubes as you teleport between very few view locations and with no free movement or room-scale VR due to the screen fading to black if you moved around).

SuperHot VR

Now here is where I wish I knew how to really prod a game (maybe one of the Unity tweaker/console tools could provide some aid). This style would be perfect for some MSAA anti-aliasing that did more than mild super-sampling around the edges (which wastes shader perf on repeatedly sampling inside basically flat-shaded triangles while undersampling at the edges where the sharp contrast demands the best). The game as shipped doesn't even seem to be able to offer the post-AA (FXAA) that the non-VR games from this team have integrated. And you can't inject FXAA at the driver layer because the inverse warp for the lenses will remove the clean aliased lines that the morphological pass is looking for (assuming the nVidia driver doesn't detect VR titles and disable such tweaking entirely). While moving the Index scaling option clearly affected framerates, the aliasing never cleaned up significantly.

The game itself is still just as fun as it was when first released but, even with tweakable internal res on PC, it's still a long way short of where I would hope it could get to visually (and will likely not be getting any more updates that could add better anti-aliasing as the team are almost finished with the 3rd game in the series, which is not in VR, then probably moving on to new things). Even a lot of brute-force super-sampling will possibly only go so far to fixing those incredibly sharp aliased edges accentuated by the game's style - something where you're wondering if something in the pipeline explodes if you push beyond 8K rendering so it'll never be viable to do so even with GPUs several generations out. You can definitely get immersed in the experience and have it bother you slightly less over time (especially if you push up refresh rate so you're getting more temporal data rather than letting aliased frames linger, something faster GPUs certainly help with) but quite a few games I've sampled seem to have decided that AA, even a cheap post-AA pass before distortion, isn't in their performance budget and I really think it's not paying off vs targeting a lower internal res but with an AA method enabled. Of course, on PS VR you often had the combination of a low internal res and no AA so at least on PC things are always less bad.

Tetris Effect

All the games I'm talking about this month provide a contrast of different techniques and rendering challenges. I talked about this on PS VR several years ago. It was one of the best games of the year in 2018 and the multiplayer mode is a nice addition in 2021 (but not really why I come to Lumines-style games) so it's still great today. The fidelity here is clearly better than on PS VR, although I found that super-sampling can push down the framerate below where it might be (even with only a 90Hz target rather than pushing towards 144Hz) without ever really making it feel like every sharp edge is anti-aliased (in combination with the FXAA the game uses). Much of the amazing particle work doesn't need AA (despite the High setting defaulting to 150% super-sampling the entire scene) and those semi-transparent particles probably cause major issues with trying to enable a turn-key AA solution so it's a shame someone hasn't built a more bespoke solution that merges the various different techniques each element of the scene needs while maximising performance (to hit the high framerates and native resolution needed for this generation of VR headset).

As with Rez (which I have not yet tried on PC, waiting for a sale to buy a second copy for a second system), there is something I find deeply pleasing about the audio-visual combination here and the soundtrack brings out the in-built speakers on the Index when cranked all the way up. There is certainly not the same deep bass you'd get from a subwoofer (it would be interesting, if not ideal for those living in apartment complexes, to be able to feed the LFE channel to a separate audio device in a game that bothers to enable a second rumble device for additional haptic feedback) but it's not bad. This isn't tinny (which is always the fear when doing something like off-ear small speakers) and is at least as good as a quality set of in-ear canalphones, but with the potential here for better positional audio because it doesn't ever feel like the audio is originating from inside your head.

I couldn't get the Index controls working exactly how I wanted them (anything linked to the right analogue stick is locked out despite the VR mode not using the right stick for anything so I couldn't rebind it; for some reason the individual buttons & pressable surfaces on the Index controllers did not all seem to turn up in the menus, which seemed designed assuming Vive or Oculus layouts and even recommending you not use those but rather plug in an Xbox controller) and this is a bit of a recurring theme in games that I have poked at. The Index controllers are a bit of a variation on the Vive designs, which changes the angle it thinks "forward" is from them but also shuffles the inputs around so that you're sometimes wondering exactly what the game is expecting when an icon from the Vive pops up. It's something that likely won't get ported back into older VR games and hopefully Valve will provide free engineering time to assist VR developers integrating prompts and defaults into their current or upcoming releases. This game is totally fine with a very old 360 controller (as long as you map things off the d-pad, because you can't drop and move Tetris pieces with a d-pad that poor at reading precise inputs) but I'd really like it to be pick-up-and-play with the Index controllers.

Half-Life: Alyx

And here we reach the culmination of a lot of VR work. Valve created an updated version of their engine and built the next entry in the Half-Life series around the development of a new headset and controller update from their earlier cooperation on the Vive ecosystem. That is the Index headset and controllers I'm currently using. This is exactly the game you expect it to be from the developer who have infinite money and time (but seemingly far fewer developers than studios that scaled out when AAA asset creation demanded it) to iterate on their previous design ethos: constant innovation during play. When I discussed Killzone: Shadow Fall in 2014, Half-Life 2 was the obvious title to compare it to when talking about combining a narrative progression with first-person gameplay variety. And that's exactly what we get here in VR, a slow development of new tools and ways of interacting with the world that also slowly eases between several genres from action to horror. Very early on, when you're introduced to the power of 'gravity gloves' to point at an object and pull it towards you, it becomes obvious, "surely everyone should be doing this!" That's the Valve magic: making something that feels like it's the only answer and something everyone else must adopt because it so cleanly solves a problem (you don't want to have to physically slowly move over to pick up every little thing while keeping the action pace up moving through a gaming environment).

On a technical side, this engine is doing exactly what all the early best practices notes (which came from engineers pushing VR like the team at Valve) said you should. Get back to forward rendering (use forward+ or similar clustered options if you want many real-time light sources that your deferred renderer was enabling at high framerates), go back to classic MSAA, and try to get a lot of pixels rendered while maintaining modern geometry and texture detail. Step back to less dynamic lighting if you have to, which is already something HL2 was excellent at mixing to hide just how much wasn't part of some unified real-time lighting solution. The end result: a very sharp result and something that I fully expect to really sing on future hardware (both higher res headsets than the Index and the future GPUs that can drive them at high resolution while hitting 144Hz native). The only thing that actually feels extremely outdated is the level loads between sections, something a level streaming solution could surely have completely alleviated.

As to how it looks on my older GPU driving the Index? At points it's a touch too sharp for me. The textures can crawl and alias a bit in spots and the edge anti-aliasing is good but not perfect. I'd prefer a softer output that manages to deal with shader aliasing, even if it might have more issues around transparencies and thin edges (here using super-sampling on texture transparency, the old classic that we don't see so much of in 2021 but really made the chainlink fences pop in 2005 in games like Half-Life 2). But beyond some mild criticisms, it holds together really well. That's why I think it'll work very well in the future (selling an entirely new generation of headsets on PC and presumably even console). Unlike some of the other games, I think you could pump up the internal res and maybe integrate some VRS or even DLSS to boost output resolution without linearly increasing GPU load (spending your fidelity more smartly with VRS Tier 2 or simply letting AI magic clean aliasing defects while chasing a fixed frame time with DLSS 2.1) and so remove those small criticisms without demanding a radically more powerful GPU.

Thursday, 4 March 2021

My Present/Presence in Virtual Reality

I did not get to experience the CRT-based early consumer VR hype, because that stuff basically failed to make it to market in any real sense and was at trade shows before my time covering the industry (let alone working somewhere that got sent dev kits). But I did enjoy the early success of consumer stereoscopic 3D gaming. I jumped into both the 1999- (Elsa Revelator) and 2008- nVidia 3D Vision ecosystems (first on a CRT and then on a high refresh rate LCD) and while the second push also came with some 3D movies (as cinema chains tried to find a new reason to spend too much to go to the movies), the main draw for me was always interactive 3D experiences. As long as you kept your head still & tweaked the 3D settings, you could get an impressively convincing window into a miniature 3D world. Things just feel different when you can use your vergence to focus on different elements in the scene (because things close to you & distant cannot both be in focus at the same time so you have to pick which is double - although this tech does not simulate the soft-focus that reality provides to the out of focus depth) and have successfully trained yourself to disable your accommodation-convergence reflex (current VR has the same limitation but we're on the edge of consumer eye-tracking that could allow renderers to apply depth-of-field based on gaze tracking).

When consumer VR was back in the "maybe this could work at consumer prices" stage of Kickstarting in 2013, I started paying attention. Take a high resolution consumer phone screen, add some lenses, and read from the ok quality gyroscope/accelerometer package that phones also now include, and you're just some better motion tracking away from a real VR setup. Without that tracking, there isn't quite enough precision for rotation and you've got a major issue with drift (which you can see with your phone, if you've ever tried to do something fancy with that sensor package) plus the accelerometers are far short of what you need for sub-mm position tracking (as you'd expect from having to do a double integration from acceleration to velocity to position with no external validation).
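To put a rough number on that double integration problem, here is a minimal sketch that integrates a small constant accelerometer bias twice for a headset that is actually sitting still. The bias value (0.01 m/s², roughly one milli-g), the sample rate and the function name are all illustrative assumptions, not measurements from any real phone sensor.

```python
def position_drift(bias_m_s2, duration_s, sample_rate_hz=1000):
    """Position error (metres) from integrating a constant accelerometer
    bias twice, with no external reference to correct it."""
    dt = 1.0 / sample_rate_hz
    velocity = 0.0
    position = 0.0
    for _ in range(int(round(duration_s * sample_rate_hz))):
        velocity += bias_m_s2 * dt   # first integration: acceleration -> velocity
        position += velocity * dt    # second integration: velocity -> position
    return position


if __name__ == "__main__":
    for seconds in (1, 10):
        drift_mm = position_drift(0.01, seconds) * 1000
        print(f"{seconds:>2} s of uncorrected integration: {drift_mm:.1f} mm of drift")
```

The error grows with the square of time (about half the bias times t squared), so even this tiny assumed bias is several millimetres out after one second and around half a metre out after ten, which is why some external validation is needed.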

In 2014 I got the Oculus Rift DK2 to try out a few projects for myself. This fuses the sensor package readings with an external IR camera looking for IR LEDs on the headset. The low persistence displays (you can't leave the image up until the next frame because the headset will be in motion and this smear can cause huge issues - I believe the good series of blogs on this by Michael Abrash all got purged from the Valve servers at some point last year but Archive.org remembers) offer up to 960x1080 PenTile per eye (half a 1080p screen, assuming the lenses go right up to overlapping views with your eye position in the headset) at a maximum of 75Hz. It's dev hardware, but it was only $350 and kinda works. The real issue for me was the PenTile pixel layout, which was a major thing for OLED phone panels at the time and means that for each input pixel you only get two colour elements rather than the three of RGB. To me, while we're often talking about the bandwidth limits of higher refresh rates and 4K displays or the GPU load of calculating each sub-pixel's value, effectively throwing away a full third of the information when it hits the display (because the red and blue channels are half resolution on the actual screen) seems like a waste. It also means that the number of individual dots of light in the headset you're looking at is only twice the pixel count (skewing comparisons with RGB layout panels in other devices). Some early consumer games played on the DK2, although I seriously doubt everything released in the last couple of years would work (even if you can accept the quality limitation) as I don't think the current SDK still supports the very early dev hardware.

In 2016, I got a real consumer VR headset with PlayStation VR. $300 got you Sony's spin on their existing line of personal 3D viewers (which I'd always seen advertised as a way of looking at a movie on a plane in a virtual cinema) and the big upgrade from my DK2 was an RGB OLED layout at the same resolution (so that's 50% more individual points of light from the sub-pixel count increase) and up to 120Hz. The camera used visible, not IR, light to track things and reused the PS3 motion controllers if you wanted to play something not designed to work with the motion-enabled DualShock 4 (default PS4) controller. The big setback: it was released around the time of the PS4 to PS4 Pro transition and most software was made assuming the rather paltry GPU inside the 2013 PS4 (which, even at release, was not a particularly high-end customised AMD part). An external box added support for virtual 3D audio and 2D pass-through to a TV (some games even made social experiences where the players on TV saw something completely different to the person in PSVR). A lot of games seemed to rely heavily on reprojection to double the effective framerate and the tracking was not great. That was especially true for controllers, which were either actual PS3 motion controllers repurposed & never intended for exact tracking or a standard controller that likewise was not originally designed for sub-mm tracking, because it was just bringing forward the legacy support from the Sixaxis "we were fighting a haptics patent so couldn't include rumble in the PS3 controller so I guess have some motion sensors" controller.
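The "50% more points of light" figure is easy to check with a few lines of arithmetic. This sketch takes the per-eye resolutions quoted above and uses the same simplification as the text, counting PenTile as two colour elements per pixel and RGB as three; the function name is just for illustration.

```python
def subpixels_per_eye(width, height, elements_per_pixel):
    """Count the individual colour elements one eye is looking at."""
    return width * height * elements_per_pixel


# Per-eye figures as quoted in the text; PenTile treated as 2 elements per pixel.
dk2 = subpixels_per_eye(960, 1080, 2)    # Oculus Rift DK2, PenTile OLED
psvr = subpixels_per_eye(960, 1080, 3)   # PlayStation VR, RGB OLED
increase = (psvr - dk2) / dk2

if __name__ == "__main__":
    print(f"DK2:  {dk2:,} sub-pixels per eye")
    print(f"PSVR: {psvr:,} sub-pixels per eye")
    print(f"PSVR has {increase:.0%} more")
```

The same function applies to any other headset once you know its per-eye resolution and panel layout.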

In the before-pandemic times, I also had access to (but never had at home) the commercial first revision of the Rift and HTC Vive. Both 2016 headsets, both 1080x1200 per eye PenTile OLEDs (two actual panels, not one screen with lenses aiming to almost overlap) at 90Hz. At two and a half million sub-pixel elements per eye that's actually a lower dot count than the PSVR (about three million) but the advantage is everything expects a higher resolution and modern PCs can really drive those rendered pixel counts up (even using anti-aliasing) as the GeForce 10 Series was out by 2016. The Vive is interesting because it doesn't use a traditional camera for drift correction or sensor fusion; rotating lasers in base stations provide a moving slice of light for objects to orient & position themselves within (synchronising with a wide IR pulse to know the timing of when in the rotation the laser hit them). For the last year of lockdown, I've had no access to this kit and I'd not really used it for a year before that. So I've basically been PSVR-only and while the exclusives have been good, stuff like Resident Evil 7 sure does seem like it'd be better if it wasn't tied to that console GPU. Both the PC headsets have been superseded by higher spec updates but I've not seen anything of them up to this point.
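The timing trick behind that base station tracking can be sketched in a few lines: if a sensor knows when the wide sync pulse arrived and when the rotating laser swept over it, the gap between the two is a fraction of one rotation, and so an angle. This is only an illustration of the idea; the 60Hz rotor speed is an assumed figure and the function name is hypothetical, and a real system also has to deal with calibration, two sweep axes and multiple base stations.

```python
import math


def sweep_angle_rad(t_sync_s, t_hit_s, rotation_hz=60.0):
    """Angle of the laser plane when it crossed the sensor, measured from
    where it was pointing at the sync pulse. rotation_hz is an assumption."""
    period = 1.0 / rotation_hz
    # Fraction of a full rotation elapsed between the sync pulse and the hit.
    fraction = ((t_hit_s - t_sync_s) % period) / period
    return 2.0 * math.pi * fraction


if __name__ == "__main__":
    # A hit a quarter of a rotation after the sync pulse is a quarter turn.
    angle = sweep_angle_rad(0.0, 1.0 / 240.0)
    print(f"{math.degrees(angle):.1f} degrees")
```

With one angle per sweep axis for each sensor, and several sensors at known positions on the device, there is enough to solve for where the device sits and how it is oriented.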

That is, until a week ago. Thanks to the incredible generosity of someone reaching out and offering to ship me their Valve Index VR kit, I now have a modern PC VR headset at home. The Vive was codeveloped by Valve so they decided to take the lead in 2019 and release their own branded kit. It uses the same base station tracking tech but here paired with a headset that offers 1440x1600 per eye RGB from 80Hz to 144Hz (and a somewhat higher field of view than any of the other kit I've used). The audio uses portable "ultra near-field" speakers, which sound surprisingly good (considering I normally use in-ear or closed headphones which provide good conduction) and don't block out sound from the outside world (otherwise it can feel a bit like you're extremely vulnerable when immersed in the presence of VR). I'm glad I can stand in a quiet room and so get all the benefit of off-ear sound (you don't need to simulate the distortion of your ear shape because that process still happens - preventing the sound from feeling like it comes "from inside your head") and it continues to be immersive.

The other huge update is the controllers. My limited time with the Vive was using their motion controllers (a lot of my time with the Rift & PSVR was using traditional wireless gamepads) and the Index controllers are certainly a refinement of that basic idea but, rather than holding onto two sub-mm tracked devices, these you tie to your palms and so can entirely let go. The importance of precise tracking can be seen in how you put on the VR kit: with PSVR you need to know where the controller is before you put on the headset; with an Index the controllers need to be switched on but once you put on the headset you can easily walk over to the controllers and put them on using their 3D rendered virtual versions. I'm almost ready for the future where we go into VR by putting gloves on. Yes, you'll never beat the haptics of a real button press or trigger pull but, for a lot of VR experiences, actually having some virtual hands is all you need. This has opened my eyes to how VR gaming isn't just traditional gaming with fully-immersive environments and extra input from head tracking. With the next generation of devices, gaze tracking should provide even more efficient rendering (only render the highest resolution where you're looking) but also entirely new interfaces that are controlled with a look and a hand gesture.

Up next (after maybe a couple more weeks of dipping into all the PC VR experiences I've been missing out on): what are my actual impressions of playing various things?