Saturday, 20 January 2018

Why I'm Trying Rust in 2018

Last year, when considering a long overdue upgrade for my home desktop, I pondered the end of quad-core desktop dominance. Since then we've seen eight hardware threads on mainstream "thin" laptop CPUs, so even at 15 watts (including the iGPU) you can no longer expect everyone to be limited to two or four (possibly SMT) threads. Intel are still charging extra to enable 2-way SMT on desktop but even there the mainstream is now solidly six cores, possibly 12 threads via SMT. AMD are running away with 12 hardware threads for pennies and 16 for not much more, with a slight speed boost expected in April to eat into that single-threaded gap (while Intel are busy reacting to slightly bad news for their real-world performance).

At this point, if we want to maximise the performance of our code (ok, not the sole focus), I think it goes without saying that a single-threaded process is typically not going to cut it. If we're statically slicing our workload into two to four blocks ("main loop here, audio subsystem and this on a second thread...") then we're also not building something that scales to the mainstream consumer hardware we expect to be running on in the next few years. Even ignoring SMT (maybe our tasks leave very few unused execution units in each core, so running two threads per core just adds overhead and halves the cache per thread), we're going to have to start expecting six to eight cores that can all run fast enough that using them all is always much better than running fewer cores faster (due to boost/TDP-limited processor design, which doesn't even appear to be a factor on desktop). When thrown 32 hardware threads, we should be laughing with joy, not worried about having the single-threaded performance to run our code. We have to work under the assumption that basically everything we do has to be thread-safe and able to split into chunks all working on different cores. Yes, some tasks have to be worked on sequentially, but in areas like game development we are juggling a lot of tasks and many of them are not so constrained, so we've got to adapt.
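As a minimal sketch of what that chunked approach looks like in Rust (the workload, numbers, and thread count here are mine, purely for illustration - in practice a library like rayon would do the work-splitting for you), slicing a task into per-core pieces is not much code:

    use std::thread;

    fn main() {
        // Illustrative workload: sum the squares of a large range, split
        // across a fixed number of worker threads instead of one big loop.
        let data: Vec<u64> = (1..=1_000_000).collect();
        let workers = 8; // imagine this matching the hardware thread count
        let chunk_size = (data.len() + workers - 1) / workers;

        let handles: Vec<_> = data
            .chunks(chunk_size)
            .map(|chunk| {
                let chunk = chunk.to_vec(); // each thread owns its slice of the work
                thread::spawn(move || chunk.iter().map(|x| x * x).sum::<u64>())
            })
            .collect();

        let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
        println!("sum of squares: {}", total);
    }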

It's 2018 and when I think about some of the most successful threaded code I've written in recent years, it's mainly in Python. Yes, GIL-containing, dynamically-typed (static analysis nightmare) Python. It was never going to be the fastest but I had expressiveness and a great set of libraries behind me. I also have no doubt that a subtle defect is probably still sitting in that code which we never found. But if I were to rewrite it in C, those odds only go up, to an almost absolute certainty. At this point, I'd say my ability to debug most defects is ok but, even assuming I catch everything, the time lost to that task is a significant percentage of my development budget. So I started looking around for something that retained the systems programming power I expect from C (knowing that I'd still be doing FFI binds into C ABIs) but with better tools for writing threaded code I could feel more confident was correct.

Enter, stage left, Rust. A systems programming language that's just about old enough to have gotten the first round of real-world testing out of the way and started to build an ecosystem, while also having one feature you'll not find in most other multi-paradigm, high-performance languages: a borrow checker as part of the compiler. It has the expressiveness you'd expect from a modern high-level language (e.g. Python or Scala), with enough bits of functional and OO style, but it has no time for handing out mutable references to memory like candy. The compiler requires you to always consider how you interact with your memory, which I found very useful for threaded work, while also ensuring you have no GC overheads (and only count references for memory where you really need it). Once I'd gotten my head around it, the constraints no longer felt onerous.
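A hedged example of what the compiler actually enforces (the shared counter here is mine, not from any project in this post): sharing mutable state between threads has to be spelled out with Arc for shared ownership and Mutex for synchronised access, because handing a bare mutable reference to another thread simply will not compile. The same ownership rules that make this explicit also make it impossible to quietly forget the lock.

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // Shared, mutable state: Arc gives shared ownership across threads,
        // Mutex guards the access. Neither can be skipped and still compile.
        let counter = Arc::new(Mutex::new(0u32));

        let handles: Vec<_> = (0..4)
            .map(|_| {
                let counter = Arc::clone(&counter);
                thread::spawn(move || {
                    for _ in 0..1_000 {
                        *counter.lock().unwrap() += 1;
                    }
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }
        println!("count = {}", *counter.lock().unwrap()); // always 4000
    }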

This is certainly not the only way of doing things, but Rust feels just close enough to C for me to feel comfortable (and know what my code will most likely actually do) while offering almost everything I expect from the integration of various modern/functional features, all wrapped in low-overhead (often completely free at run-time) safety. You can still shoot yourself in the foot, but there's not a book of undefined behaviour you have to carefully avoid and which the compiler will often not warn you about. So I'm changing my traditional wait-and-see approach: 2018 will be a year where I explore a new language and iterate on projects while the tools are very much developing around it. Rust may not quite be here yet, but it's getting close enough that I can't see myself jumping into a large independent project in C and not regretting it in a year. I'm also currently between major contracts and probably looking to relocate, so now is an ideal time to use a bit of that freedom to investigate new ideas. Even if it doesn't work out, operating with this ownership model will probably push how I think about (and aim for safety in) concurrent programming in the future.

How is the current state of play? They've got a (mostly) working Language Server for Visual Studio Code etc, with advancing integration into some other IDEs (JetBrains stuff, vim). I've spent quite a lot of time in PyCharm recently so it feels natural to stick with that (using an early plugin that's developed enough to offer the standard niceties, which is handy for the type hints in a language that doesn't require many explicit type declarations). The installer is also the library manager (a kind of turbo pip), which is also the solution manager/build system (and installs optional tools so they can tightly integrate into the default workflow). If you've got the C runtime libraries (there is an option to build without ever calling into the CRT, but the default standard library is built on it) then you're basically ready to go.
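For anyone who hasn't touched it, the whole loop from a fresh toolchain to a running, tested project is a handful of commands (nothing below is specific to this post, just the standard rustup/Cargo workflow; the project name is made up):

    rustup update stable          # keep the toolchain current
    cargo new hello_rust --bin    # new binary project: Cargo.toml plus src/main.rs
    cd hello_rust
    cargo build                   # fetch dependencies and compile
    cargo run                     # build if needed, then run
    cargo test                    # run unit tests and documentation examples
    cargo doc --open              # build HTML docs for the project and its dependencies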

The ecosystem makes a lot of stuff seamless. You add an external library by name (or git repo), the solution manager finds the latest version in the central repository, downloads it, builds it, and notes the version (so you don't get caught out by an update unless you explicitly ask it to upgrade). Documentation auto-builds, examples in documentation automatically get included in the test suite, and there's just some light TOML config plus conventions on file locations to keep your solution managed. There's a default style and a decent source formatter (with customisation options) to enforce it if desired. Once you're up and running, VS Code's C/C++ debugger works fine (I've not had much luck debugging with JetBrains, but that's the Community edition and apparently retail CLion is where you need to be), or try your favourite GDB- or LLDB-based tools. Want to share your library code with the community? The library manager just needs your user key and will package it up and upload it (so others can include it in their projects).
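To make the dependency and doc-test points concrete, a hedged sketch (the crate and function names are mine, invented for illustration): a one-line entry such as rand = "0.4" under [dependencies] in Cargo.toml pulls a library in, and a documented function like the one below gets its example compiled and run every time you invoke cargo test.

    // Hypothetical library code in src/lib.rs of a crate named `my_crate`.

    /// Doubles a value.
    ///
    /// The example below isn't just documentation: `cargo test` extracts it,
    /// compiles it, and runs it as part of the test suite.
    ///
    /// ```
    /// assert_eq!(my_crate::double(4), 8);
    /// ```
    pub fn double(x: i32) -> i32 {
        x * 2
    }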

There is still work ongoing (by the sounds of it the Language Server was pretty iffy a year ago, and even now it's flagged as preview for a reason) but there's enough working that you'll not feel like you're working with researchware. 2017 was the year of getting a lot of stuff basically there (the community went through and cleaned up a lot of the important libraries to SemVer 1.0.0 releases). 2018 is a year for sanding off some more rough edges (adding a few convenience features to the language, cutting away verbosity that the compiler can reason around in most cases, polishing the development experience) and it sounds like 2019 will be the next time a major revision happens. I've got a few things I'd like to see (faster compilation times would be extremely valuable - I don't want to feel like I'm back in C++ land) but so far I've yet to find a deal-breaker and the potential is evident. I'm not sure I entirely agree with all these executables carrying stdlib code with them (static linking is the norm) - I've got 110MB made up of 14 binaries just for the Rust CLI package/build manager and I'm betting a lot of that is duplicated stdlib code they really could be calling from DLLs. Maybe this is the C coder talking, but it feels like the compiler could be emitting smaller binaries in general, even ignoring static linking.
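To be fair, there are already knobs for this; the following is a sketch under my own assumptions about what matters for size, not a recommendation from anyone else: release-profile settings in Cargo.toml that trade a little flexibility for smaller binaries, plus the rustc flag that links the standard library dynamically instead of statically.

    # Cargo.toml
    [profile.release]
    lto = true            # link-time optimisation removes unused code across crates
    codegen-units = 1     # a single codegen unit gives the optimiser the whole picture
    panic = "abort"       # drop the stack-unwinding machinery if you don't need it

    # Dynamic linking against the Rust standard library is possible, but ties
    # the binary to the exact compiler version that built it:
    #     cargo rustc --release -- -C prefer-dynamic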

So far I've been getting used to the language, the tools, and the ecosystem; building lots of smaller projects so I can see how things are done and what performance looks like compared to the same work done (idiomatically) in other languages. I'm expecting to move to Ryzen in April or May (based on availability, price, and whether the revised parts are worth getting or just make the current parts even more affordable), so that'll be an interesting jump in performance potential. It's time to see what developing a large project feels like in Rust; it's time to start from...



Starting resources

The Official Documentation (includes links to all the books, a couple highlighted below)
The Rust Programming Language (2nd ed intro book - draft being finalised)
The Rust FFI Omnibus (detailed notes on ensuring you can call into your Rust code)
Cookin' with Rust (various common tasks using major bits of the ecosystem's libraries)
Rust by Example (what it says on the tin)
Rustlings (small exercises for getting used to writing Rust)
The Unstable Book (for documentation of what's not yet in the stable feature set)
The Rustonomicon (for those deeper questions about the language)
