Friday, 31 August 2018

Rust: Fail Fast and Loudly

So recently I was chatting to some Rustaceans about library code and their dislike of a library that can panic (panic! being the Rust macro that unwinds the stack or aborts, depending on your build options). The basic argument put forth was that a library should always pass a Result up to the calling code because it cannot know if the error is recoverable. The error handling chapter of the Rust Programming Language book even lays out this binary: Unrecoverable Errors panic! while Recoverable Errors return a Result. As a researcher in debugging, I reached the point where I basically banned these terms from my lectures because they can lead to the thinking that libraries cannot know they're in an unrecoverable state and so can only defer to what calls them.

Terminology

Throughout CompSci literature, some terms relating to debugging are not used consistently. I'll start with the words I use (so I never have to write this in a blog post again). To illustrate the scale of the terminology issue, enjoy this quote from the 2009 revision to the IEEE Standard Classification for Software Anomalies:
The 1993 version of IEEE 1044 characterized the term “anomaly” as a synonym for error, fault, failure, incident, flaw, problem, gripe, glitch, defect, or bug, essentially deemphasizing any distinction among those words.
A defect (also called a fault, error, coding error, or bug) in source code is a minimal fragment of code whose execution can generate an incorrect behaviour. This is for some input (which includes the environment of execution), against whatever specification exists to declare what is and is not correct behaviour for the program. A defect is still a defect even if it is not exercised or does not cascade into error / failure during a test case. Defects can be repaired by substituting in a replacement block of code into the block reported as defective; this returns the program to executing in a way that does not violate the specifications.

An error (sometimes also called a fault or infection) in program execution or modelled / simulated execution is the consequence of executing a defective block of code and the resulting creation of an erroneous state for the program. This effect may not be visible and not all errors will be exposed by surfacing as a failure. An error is the result of a defect being exercised by an execution which is susceptible to that defect.

A failure in program execution or modelled / simulated execution is the surfacing of an error state by the observation of behaviour in violation of the program specification. It is therefore correct to say that a failure was experienced for a given test case due to a chain of erroneous states that originated with the execution of a defect that caused the error.

Setting a Trap

Having muttered about the language choices made in the Rust book at the top, I'm going to also praise how they actually resolve that chapter. The final section goes into detail about the pros and cons of calling panic from your code. It defines bad state in a way I might write myself in a practical programming guide. It even offers up the type system as a way to ensure your specifications for input aren't violated, with as much of the burden placed on compile time as possible.
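As a rough sketch of what leaning on the type system can look like (the Percent type and the remaining function are hypothetical names invented for this illustration, not taken from the book): validity is checked once, at construction, and every function that accepts the type can then rely on it.

```rust
/// Hypothetical type: a percentage known to be in 0..=100.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Percent(u8);

impl Percent {
    /// The only public way to build a `Percent`; out-of-range values are
    /// rejected here, once, rather than in every function that takes one.
    pub fn new(value: u8) -> Option<Percent> {
        if value <= 100 {
            Some(Percent(value))
        } else {
            None
        }
    }

    /// Returns the wrapped value, which is always in 0..=100.
    pub fn get(self) -> u8 {
        self.0
    }
}

/// Because the argument is a `Percent`, there is no invalid input left to
/// handle: the subtraction below cannot underflow.
pub fn remaining(done: Percent) -> Percent {
    Percent(100 - done.get())
}
```

The burden moves to the compiler: code that tries to pass a bare u8 to remaining simply does not build.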

It's good writing but it can also potentially be read by those who really wish panics didn't exist as saying you just need to make sure every possible input into your library is valid. My stance is that an occasional small slice of invalid input being possible is actually important for code quality when writing in a language that can fail (it can also make some things a lot easier to write in practice). However, it must be clearly labelled as such, with no question about when you might panic. This is the contract you're writing and every good library should fully document the interface so there is no possibility of an unexpected panic.
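To make that contract concrete, here is a minimal sketch of a library function that leaves a small slice of invalid input possible and labels it. The function leading_average is a hypothetical example of mine; the "# Panics" doc section is the convention the standard library documentation uses.

```rust
/// Returns the average of the first `count` samples.
///
/// # Panics
///
/// Panics if `count` is zero or greater than `samples.len()`.
pub fn leading_average(samples: &[f64], count: usize) -> f64 {
    // Both checks are part of the documented contract, so a caller that
    // violates it fails immediately, with a message that says why.
    assert!(count > 0, "leading_average: count must be non-zero");
    assert!(
        count <= samples.len(),
        "leading_average: count {} exceeds the {} samples provided",
        count,
        samples.len()
    );
    samples[..count].iter().sum::<f64>() / count as f64
}
```

Nothing here is hidden from the caller: the doc comment states exactly when the panic fires, and the assertion messages name the violated condition.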

To give an example from the Rust standard library (which is totally just a library and we should expect other libraries to conform to the same standards it uses - this is even more true of Rust than in other languages as Rust splits out the really core library code into the Rust core library): when you've got a vector and you need to divide it in two, split_off(at) is what you need.
Splits the collection into two at the given index.
Returns a newly allocated Self. self contains elements [0, at), and the returned Self contains elements [at, len).
Note that the capacity of self does not change.
Panics if at > len.
Here we have a clearly defined operation that does exactly what we want and comes with some important guarantees about how it operates. One of those details is that if we ask to split beyond the end of the array then it will panic.

Why does this panic rather than returning a Result and letting us decide if the error is recoverable or not? I can imagine many places where trying to split an array may not be the only thing a program can do to continue; a backup path could be constructed to continue operating under some circumstances if that failed, but this library decision means the calling code cannot decide that. If you ask for a split at an invalid point then you get a panic.

It is because the library set a trap. It asks the calling code to know something about the object it wants to be manipulated. Because there is no reasonable way of asking for the array to be split in two beyond the end of the array, the only conclusion that the library can make about such a request is that it is unreasonable. We are past the point of executing a defect, we are swimming through an erroneous state, and it is time to fail so this can be caught and fixed. That also means no room to let the erroneous state accidentally ask to zero the entire storage medium it has access to and trash a last known-good state that might be used to recover later (or debug the defect). Any calling code that wishes to avoid this should catch the erroneous state and recover (if possible) before calling over the library boundary. The potential to get unreasonable requests allows our blocks of code to keep each other more honest by surfacing errors as failures.
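A minimal sketch of that caller-side check, assuming a hypothetical helper named try_split_off (it is not part of the standard library). The calling code tests the documented precondition itself, so it can take a backup path instead of walking into the trap.

```rust
/// Hypothetical helper, not in the standard library: checks the documented
/// precondition of `Vec::split_off` before crossing the library boundary.
fn try_split_off<T>(v: &mut Vec<T>, at: usize) -> Option<Vec<T>> {
    if at > v.len() {
        // The request makes no sense for this vector. The vector is left
        // untouched and the caller decides how (or whether) to recover.
        return None;
    }
    // The precondition holds, so this call cannot hit the documented panic.
    Some(v.split_off(at))
}
```

Code that knows its index is valid keeps calling split_off directly; if that knowledge ever turns out to be wrong, the panic surfaces the erroneous state instead of letting it spread.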

It is a restriction that provides a higher chance of fixing a defect before we ship a product. We must strive to fail fast and sometimes that means using some small gaps between what is possible and what is permitted as traps to catch when errors have occurred. A library can be poorly constructed to panic when not expected (and declared) but the existence of panics should not itself be used as a sign that a library is of poor quality or to be avoided.

Saturday, 28 July 2018

Empty Rust File to Game in Nine Days

I've been doing Rust coding for a bit now. Recently that's involved briefly poking at the Core Library (a platform-agnostic, dependency-free library that builds some of the foundations on which the Standard Library is constructed) to get a feel for the language under all the convenience of the library ecosystem (although an impressive number of crates offer a no_std version mainly for use on small embedded platforms or with OS development). I'm taking a break from that level of purity but it inspired me to try writing a game just calling to the basic C APIs exposed in Windows.

So I'm going to do something a bit different for this blog: this post is going to be an incremental post over the next nine days. I'm going to make a very small game for Windows (10 - but hopefully also seamlessly on previous versions as long as they have a working Vulkan driver), avoiding using crates (while noting which ones I'd normally call to when not restricted to doing it myself). I'm not going to rewrite the external signatures of the C APIs I have to call but I'll only be importing the crates that expose those bare APIs.

Day 1

The day of building the basic framework for rendering. Opening with a Win32 surface (normally something you'd grab from eg Winit) and then implementing a basic Vulkan connection (which would normally be done via a host of different safe APIs from Vulkano to Ash, Gfx-rs to Dacite).

The Win32 calls go via Winapi, which is a clean, feature-gated listing of most of the Win32 APIs. The shellscalingapi is a bit lacking as it doesn't expose all the different generations of the Windows HiDPI API (and Rust programs don't include a manifest by default so you typically declare your program DPI-aware programmatically) which means you have to declare a few bindings yourself to support previous editions of Windows. But generally it makes calling into Win32's C API as quick as if you were writing a native C program including the required headers. You could probably generate it yourself via Bindgen but the organisation here is good and it's already been tested for potential edge cases.

Vulkano exposes the underlying C binds via the Vk-sys crate. It has no dependencies (so it's what we want: just something to avoid having to write the signatures ourselves without obfuscating anything going on) and while it's not updated to the latest version of Vulkan (1.1), we're only doing a small project here (so it shouldn't matter at all). The function pointers are all grabbed via a macro, which is a bit cleaner than my previous C code that called vkGetInstanceProcAddr individually whenever a new address was required (to be cached). Of course, other areas are down to just the barest API which means looking up things like the version macro.
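For reference, the version macro in the Vulkan C headers (VK_MAKE_VERSION) packs three numbers into one 32-bit value. A minimal Rust sketch of the same bit layout follows; the function names are my own for illustration and are not claimed to be what Vk-sys exposes.

```rust
/// Packs a version the way Vulkan's C macro VK_MAKE_VERSION does:
/// major from bit 22 upwards, minor in bits 12..22, patch in bits 0..12.
pub const fn make_version(major: u32, minor: u32, patch: u32) -> u32 {
    (major << 22) | (minor << 12) | patch
}

/// Mirrors VK_VERSION_MAJOR: everything from bit 22 upwards.
pub const fn version_major(version: u32) -> u32 {
    version >> 22
}

/// Mirrors VK_VERSION_MINOR: the ten bits starting at bit 12.
pub const fn version_minor(version: u32) -> u32 {
    (version >> 12) & 0x3ff
}

/// Mirrors VK_VERSION_PATCH: the low twelve bits.
pub const fn version_patch(version: u32) -> u32 {
    version & 0xfff
}
```

A value such as make_version(1, 0, 0) is what would go into the apiVersion field of VkApplicationInfo when creating the instance.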

So at the end of day 1, we've got a basic triangle on the screen working (with a 40kB compressed executable, most of which is Rust runtime/stdlib as Rust libraries default to static linking).

Thursday, 28 June 2018

Mini-Review: Slay the Spire

So early last year, it was already clear that we were getting a lot of extremely good games (and going on the number of February release dates announced at E3, 2019 looks like it's going to be similar). This year has started with fewer tent-pole releases (most notably, God of War) and far less focus on RPGs overflowing with content (which gave early 2017 a very specific feel) but there certainly have been some great games like Mashinky building up in Early Access and BattleTech getting a full release. Into the Breach is another game from earlier in the year that I've not written about yet but is very nice. There's something in the strategy/tactical water this year and it tastes like roguelike-likes. The genres have always been somewhat mingled, what with 4X games (or even solitaire games) being about semi-random runs which build their own story through the mechanics (and that's where Mashinky fits in), but much of 2018's output (They Are Billions entered Early Access at the very tail of 2017, I'm counting it) feels explicitly part of the current roguelike-like wave. Sometimes it's unclear which side of the line games are aiming for (Frostpunk is probably going for more scenario-based rather than the endless replayability of rogue).

Slay the Spire, currently in Early Access with plans for a release sometime this Summer, is a deckbuilder game. If you're not familiar with the genre, it's the assembling of a card deck from CCGs (like draft format) without that pesky monetisation of acquiring the cards required. You fight battles (here entirely PvE against clockwork enemies who have predictable patterns and compositions rather than branching AI) and work out how everything synergises with the simple core mechanics, but without having to buy hundreds of dollars of cardboard or, in our terrible digital future, virtual cardboard.


In order to ensure the game doesn't devolve into simply selecting the best deck from the current meta discussions and throwing it at the enemies, the format here is solidly a roguelike-like: semi-randomised runs where the expectation is to eventually be weakened to the point of death and have to restart with a new random seed from the very beginning. As you work through a run, you'll be offered various card choices (as well as handed out limited potions and rule modifiers in the form of relics) from which to build your deck. One of the key things here is that card removal is actually hard (not often offered and rarely for free) so building a deck is very much about what you don't select. The only cards I've seen that you can't turn down are used to good effect in a curses category. Negative outcomes can add cards to your deck you don't want and as they are hard to remove, they will stay with you and mess with your flow. Slay the Spire is very clean in how everything works like this - full of smart decisions to keep the game compact without feeling stale.

Unfortunately, compared to one of my previous favourites (FTL), what is also absent here is much story development. There is a pool of random events with flavour text but not to the extent that it felt like FTL assembled a story. Even the Magic: the Gathering standard of flavour text for cards is missing here with only artwork and name working beyond mechanics as narrative. But what you do get from a standard run is 50 events, mainly fights, as you scale up through three main bosses and a few elites (with your exact path somewhat flexible, so you can pick when to fight an elite or rest at a campsite to replenish your health). As with all enemies in the game, each individual boss is clockwork so part of the learning curve is internalising their moves, but there is some variety in which boss you encounter (so the final boss is randomly selected from a pool of three and you can see who it is during the final third of the run to help build your deck towards beating them).

So far the Early Access is going well, with now three different characters (changing the starting relic, some core mechanics, and card availability) all feeling sufficiently different. Beyond the standard roguelike-like, there is some permanent unlocking of extra cards/relics that will randomly appear in the game to expand your options over time as well as a difficulty staircase called Ascension that adds new difficulty modifiers once you've grokked the mechanics and how to steer yourself towards a synergistic deck despite the RNG offering you new cards. Spelunky fans will recognise the Daily Challenge, here adding three daily modifiers to a daily seed and then offering a leaderboard scoring how far you got and what feats you achieved during your first run. There is also now an Endless mode, although I am yet to even try it.

Currently the buzz is positive and the sections of the game with 'coming soon' written over them are almost all swapped out for new features (sitting at Weekly Patch 30 at time of writing) so a final release is probably on track. I do hope the game becomes a living game after 1.0 with plenty of balance tweaks but also a slowly expanding enemy and elite selection (as they are clockwork and so it can feel like you're eventually solving them for almost all competent decks you may have when you encounter them). Expansions for new cards, new bosses, and even a new character also seem like a good long-term future for the game. For now, not even at 1.0, it's an extremely easy way to burn through hours of play and feel like you're getting a deep appreciation for the various mechanics and synergies available.

Tuesday, 29 May 2018

Evolving Rust

At the start of the year I talked about using Rust as a tool to write code that was safe, easy to understand, and fast (particularly when working on code with a lot of threads, which is important in the new era of mainstream desktops with up to 16 hardware threads).

Since then I've been working on a few things with Rust and enjoying my time - especially in some cases where I just wanted to check basic parallelism performance (taking advantage of a language where you can go in and do detailed work but also just call to high-level conceptual stuff for a fast test). If you're looping through something and want to know the minimum benefit of threading it, just call to Rayon and you'll get a basic idea. In practice, that usually means changing the iterator from an .into_iter() to .into_par_iter() and that's it.
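To keep the sketch below free of external crates, it hand-rolls the same idea with scoped threads from the standard library (std::thread::scope, which only exists in newer Rust releases than this post) rather than calling Rayon. It is a stand-in for the kind of quick test described above, not Rayon's implementation, and the function names are hypothetical.

```rust
use std::thread;

/// Sequential baseline: sum of squares over a slice.
fn sum_of_squares(values: &[u64]) -> u64 {
    values.iter().map(|v| v * v).sum()
}

/// The same work split into at most `workers` chunks, one scoped thread each.
fn par_sum_of_squares(values: &[u64], workers: usize) -> u64 {
    if values.is_empty() {
        return 0;
    }
    let workers = workers.max(1);
    // Ceiling division so every element lands in exactly one chunk.
    let chunk_len = (values.len() + workers - 1) / workers;
    thread::scope(|scope| {
        // Spawn every worker first, then join them, so the chunks really
        // do run concurrently.
        let handles: Vec<_> = values
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || sum_of_squares(chunk)))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}
```

With Rayon in the dependency list the whole second function collapses to the one-word iterator change mentioned above, which is exactly why it is so handy for a first estimate.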

I finally upgraded my old i5-2500K desktop (on a failing motherboard from 2011) to a new Ryzen 7 so it's been very useful to quickly flip slow blocks of code to parallel computation. When you're just building some very basic tool programs, I'd probably not even think about threading in C, but here it is so easy that I've been quick to drop a (for example, typically) 30ms loop down to 3.5ms. One of the things I've been somewhat missing is easy access to SIMD intrinsics, but this brings me to something else I've been enjoying this year: Rust is evolving.

I'm used to slowly iterating standards with only slight upgrades between them as tools like compilers improve and the std lib slowly grows. Clang warnings and errors were a massive step forward that didn't rely on a new C standard and libraries can offer great features (you'd otherwise not have time to code yourself) but when I think of C features then I generally think of language features that are fixed for quite some time (about a decade).

Rust is currently working on the next big iteration (we're in the Rust-2015 era, which is what Mozilla now calls 1.0 onwards, with Rust-2018 planned before the end of the year) but that's via continuous updates. Features are developed in the nightly branch (or even in a crate that keeps it in a library until the design is agreed as a good fit for integration into the std lib) and only once they're ready are they deployed into stable. But that's happening all the time, even if a lot of people working with Rust swear by nightly as the only way to fly (where you can enable anything in development via its associated feature gate rather than waiting for it to hit stable).

For an example of that, SIMD intrinsics are currently getting ready to hit stable (probably next release). That's something I'm extremely eager to see stabilised, even if I'm going to say the more exciting step is when a Rayon-style library for it exists to make it easier for everyone to build for, maybe even an ispc-style transformation library.

The recent Rust 1.26 update is a great example of how the language is always evolving (without breaking compatibility). 128-bit integers are now in the core types; inclusive ranges mean you can easily create a range that spans the entire underlying type (without overflow leading to unexpected behaviour); main can return an error with an exit code; match has elided some more boilerplate and works with slices; and the trait system now includes existential types.
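A short sketch touching each of those 1.26 features in turn. All the function names are hypothetical ones picked for the illustration, and the last function stands in for main, which can use the same Result-returning shape.

```rust
use std::num::ParseIntError;

/// 128-bit integers: the square of a large `u64` no longer needs a bignum.
fn square_wide(x: u64) -> u128 {
    (x as u128) * (x as u128)
}

/// An inclusive range can cover every value of the type; the exclusive
/// spelling `0..256` would not fit in a `u8`.
fn count_all_bytes() -> usize {
    (0u8..=255).count()
}

/// Slice patterns in `match`, with no explicit `&` or `ref` boilerplate.
fn describe(values: &[i32]) -> &'static str {
    match values {
        [] => "empty",
        [_] => "one",
        [_, _] => "two",
        _ => "many",
    }
}

/// `impl Trait` in return position hides the concrete iterator type.
fn evens(limit: u32) -> impl Iterator<Item = u32> {
    (0..=limit).filter(|n| n % 2 == 0)
}

/// `main` may return a `Result` like this one; returning an `Err` from
/// `main` ends the process with a non-zero exit code.
fn check_port(input: &str) -> Result<(), ParseIntError> {
    let _port: u16 = input.trim().parse()?;
    Ok(())
}
```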

Monday, 30 April 2018

BattleTech: Just One More Mission

So, this has rapidly taken over all of my free time. Who knew that almost 30 years after I was playing those early BattleTech computer games (including some very early Westwood Studios titles), there would be a tactics game that captures the magic of detailed combat between 'Mech miniatures, simplified down without losing the charm and weight of those mechanics?

When X-Com (originally UFO: Enemy Unknown to me) was rebooted into a new tactics game, I just could not get into the simplified systems. Maybe this was made worse by my continuing to go back to that original and throwing dozens of hours into the campaign every few years but something about moving from action points to move & fire phases didn't click with me. I knew how this game worked and a steady shot came from moving less and having more time to aim properly. It was all a complex set of choices that set the pace of progression and the chances of coming back with most of your squad in good health (or at least alive). Without that as the backbone of the tactics game, I just couldn't get into the larger strategic layer.


For whatever reason, I don't feel similarly constrained by that in Harebrained Schemes' latest game. Maybe it's the secondary systems like heat management and armour facing or that all of that stuff comes from detailed loadout decisions made in the strategic layer but the simplifications here feel necessary and improve the flow of each mission (which can sometimes finish in minutes but normally run closer to an hour). There was never going to be a time when 'Mechs could shoot more often (if it used APs) because managing the heat generated already restricts your actions as much as the turn counter. I've also not been going back to a different BattleTech tactics game and getting my fix there in the years up to this release so each mission feels like fresh air, every dodge and answering body-block feels like the taste of metal behemoths becoming mangled for my enjoyment.

I could probably play just the tactical layer for another 40 hours without anything else to draw me in. Keep that random scenario generator running to build missions and some fresh 'Mech loadouts to keep things interesting & my playbook changing and I'd be set. But here we get a full set of scripted story missions and universe building which situates you inside the world some of us have been diving into for decades.

As a mercenary, you're responsible for making payroll every month and ensuring your equipment is replaced after every mission. It can genuinely feel desperate when you're trying to make enough from contracts to keep going and you know that the damage you take can eat through your profits. Far worse, injuries and repairs are going to prevent you jumping into another contract for some time and that payroll is only getting closer. Time is money and even if you win a scenario, you could still come out with a loss. That's where the fiction continues to meet the mechanics: unless you're on a story mission then you are encouraged to consider cutting your losses and abandoning a contract. Optional objectives can increase your pay but none of that is worth it if you're stuck for a month repairing the damage you took completing it. Even before you're done with the core objectives, sometimes it's time to evac and write it off. There are dozens of little things that mesh the narrative and the mechanics like this.

The production values are somewhat mixed (there is a bit of the "Kickstarter budget constraints" visible in spots) with some functional-if-Telltale-Games(ish) characters for a lot of the dialogue between story missions giving way to the occasional but far more evocative animated painting cutscenes backed by excellent music. In the tactical layer some of the lighting, atmospheric effects, and 'Mechs look excellent but then it's also easy to note some rather variable detail levels, dodgy action camera shots, the odd framerate canyon, and something seems straight up broken about the loading system (it hasn't crashed, it's just trying to load the loading screen). I grabbed a new Ryzen this month and BattleTech is possibly the only place where I've not noticed the improvement (something is going on during those load screens, if they even render in, but it's not taxing CPU cores doing it). But these are minor blemishes on what is often a gorgeous game that oozes a coherent style.

This is an exceptional tactics game that simplifies the miniatures without stripping the character of huge 'Mech combat in a crumbling universe of fiefdoms. Come for the tactical mission encounters, stay for playing as mercenaries trying to make ends meet while pawns in much larger events.

Sunday, 18 March 2018

The Asset Fidelity Arms Race

So there has been a lot of discussion about the cost of game development recently. Unfortunately a lot of that has been used to defend questionable business practices (there is another gaming industry and I have absolutely no interest in ever being part of it) or extremely short-term views of economic expansion (eg increasing new-release unit prices for a medium that's already one of the most expensive ways of purchasing a single piece of mass produced entertainment and has been shrinking unit costs and value [loss of resale/lending etc] with the successful transition to digital).

Of course, while there are billions in revenue to be made from a single project, massive corporations will continue to greenlight projects whose scopes grow to a decent percentage of the potential rewards. So really the biggest budgets will always grow to fill the potential maximum returns, which means a growing hit-driven industry trends towards budget growth. This gives me a rather fatalist view of that original discussion (and concern about the "solutions" proposed which point at gambling mechanics and increasing unit prices as if they could not lead to a market crash or reverse decades of market growth).

But let's step back a second. Asset costs are going up and games are getting bigger (if not longer - not a bad trend as we balance the endless replayability of something like chess with the expectation that you can tell most stories in much less than 100 hours - be that in a book, movie, or TV series). We've been talking about this for as long as I've been involved in video games (~1999 onwards, first as press then adding indie).

We're about to watch another GDC where there should be a great selection of technical talks, often ones that propose paths out of an increasingly expensive asset fidelity arms race. But are we going to listen and then go back and just use these techniques to build even more detailed worlds? Even on an indie project (where the project decisions are usually made by an in-the-trenches dev), we tend to scope for the most that we think we can do. Doesn't that say something about how this arms race only exists because we aren't threatened by it? That we're already engaged in a careful process of ensuring the incline is just right for stable growth?

Forza Motorsport 4 - Xbox 360 (2011)

Seven years ago, this was the detail level for Forza, except this used an offline renderer (photo mode) to really make the most of those assets. To my eye, this asset stands up a generation and a half of consoles later. When I look back at some titles no longer considered cutting edge on game photography sites like DeadEndThrills, there is a lot to like about the actual assets even when just tweaking the real-time renderer to try and push the limits of what it can offer. And the cost of making assets at that fidelity level (as our tools advance) is only going down with time. Not to mention, the potential for reuse grows (especially with more component-based design from workflows promoted by stuff like PBR).

When I'm working on level-of-detail systems, it's really only an incremental improvement in the potential density of the very local area that chasing asset fidelity is bringing us today - the rest of the scene is managing way more assets/detail than we have the ability to render in 16ms. Is the asset fidelity arms race over if we want it to be? Long term, are we looking towards one-off costs (R&D: new rendering technology and hardware advances) and larger budgets building bigger worlds (for the projects that need it) rather than major increases in the fidelity of assets? Not to say there is no point in increasing fidelity but how quickly will this look like diminishing returns? So much of the very recent increase in visual fidelity seems to come from rendering advances that provide things like rich environmental lighting or better utilisation of existing assets (combining high pixel counts with good actually super-sampled anti-aliasing).

Sometimes I feel like we're being sold a false choice: between sustainable development costs and expensive looking games. As we slowly ride the silicon advances (the rendering potential of a $150 to $500 device, quite a narrow window that is constantly throwing extra FLOPS at us) and develop new real-time rendering algorithms, it is far from as clear-cut as it can sometimes sound. When we look at the photo modes that have come to games, which often produce extremely clean and detailed versions of what the game actually looks like in action, we should remember that this is already the potential visual detail of current game assets. We’re just a bit of hardware performance and a few real-time techniques away from realising it. These are long-term advances that lift all projects up, sometimes with major increases in asset-creation productivity (eg integrating various procedural assists and more recently the potential from moving to PBR). In addition, expecting users to buy new hardware for a few hundred dollars every four to seven years is a lot more reasonable (and sustainable, as we chase the affordable silicon cutting edge) than pushing unit prices to $100 or even beyond.

GT Sport - PlayStation 4 Pro (2017)

So, as I look to GDC, I'm looking forward to hearing about a load of exciting advances. I always look forward to SIGGRAPH for the same reason. Even if the budget to expand asset fidelity dries up tomorrow, we should be able to continue to make amazing things. Video games are built on innovation. Let's not allow our concerns about the asset fidelity arms race to lead us down a path of thinking the people who buy games are a resource to be strip-mined as rapidly as possible. Sustainability is just as much about ensuring we can offer something at a price everyone can afford and which enriches their lives, providing delight rather than cynically tapping into gambling-like addictions or experiences that feel hollowed out.