Better Compile Times

#1

tldr: Amethyst’s change->compile->run loops are unreasonably long right now. I have data / examples showing we can do much better.

This is a mirror of a GitHub issue I just opened: https://github.com/amethyst/amethyst/issues/2064

I think it’s worth discussing here as well.

Problem Statement

First, as many of you don’t know me, I want to make it clear that this comes from a place of love and respect for what is being done in the Amethyst project. I think amazing work has been done and this is already a really cool project.

That being said, Amethyst compile times are currently sub-optimal. In order for a game engine to be productive, change->compile->run loops need to be as tight as possible. In an ideal world this is less than a second (although this is very hard), 1-4 seconds is still comfortable, 4-10 seconds is painful but workable, and anything higher is a non-starter. I’m speaking for myself here, but I’m sure others will fall into similar ranges.

The Evoli showcase game currently takes ~31 seconds to recompile after a single newline insertion in the game logic. The “space menace” example takes almost the same amount of time (~28 seconds). This will be a deal breaker for most people. Amethyst can largely cancel out this problem by adding scripting support and moving 100% of game logic there, but that won’t help the people who want to use Rust directly. I personally want to use a Rust game engine and write my logic in Rust for the safety and performance benefits. Otherwise I would be using other, more established engines.

Some might just write this off as “Rust has slow compile times and there is nothing we can do about it”. The “rust is slow” meme is true to an extent, but that is not what is preventing Amethyst from falling into the “comfortable” compile time range. The problem is almost entirely generics. And more specifically how Amethyst uses them.

My assertion: It is possible to build multi-platform game engines in Rust with change->compile->run loops that fall into the “comfortable” category defined above. In fact, we can have a cross platform game engine with < 1 second compile times. I have data to prove it!

Existing rust game engines with good compile times

  1. coffee
    • coffee is a 2D cross platform rust game engine built on wgpu and winit.
    • Changes to the coffee example apps can be compiled in 2-3 seconds when using the LLD linker
  2. my custom game engine:
    • uses legion, wgpu, winit
    • Changes to an example app without app-specific legion systems can be compiled in ~0.8 seconds when using the LLD linker
    • Changes to an example app similar to Evoli using legion systems can be compiled in ~2.5 seconds when using the LLD linker. Legion incurs a surprisingly high compile time cost.
    • I haven’t released this anywhere, so I understand that this amounts to “this is possible because I said it is”. I’m using approximately the same stack as coffee, which is open source. You can verify the performance by cloning that repo.

Right now if I need to choose between Amethyst (which has more features) and my game engine (which has reasonable compile times) I will choose my game engine. I can take the time I’d wait for each compile and invest it in reaching feature parity with Amethyst. Over the course of a medium-sized game’s development, I might actually come out on top.

That being said, that’s not what I want! There are so many smart people working on Amethyst and nobody wants yet-another-game-engine. Fortunately there is hope!

Improving Evoli Compile times

By factoring out parts of Evoli’s Amethyst initialization code into a separate crate, Evoli goes from ~31 seconds to ~13 seconds.

You can test out my branch here

I built a crate called “amethyst_precompile” that consists of the following pieces:

  • PrecompiledRenderBundle: saves ~13 seconds
    • registers the Evoli rendergraph as a RenderSystem
  • PrecompiledDefaultsBundle: saves ~3 seconds
    • registers: InputBundle, TransformBundle, AudioBundle, WindowBundle, UiBundle
  • PrecompiledSpriteBundle: saves less than 1 second
  • start_game function: saves ~2 seconds
    • Creates the Application<GameData> instance and runs the game.

PrecompiledRenderBundle saves the most time. The new Rendy renderer makes heavy use of generics in both public and private interfaces. As a result, we are basically recompiling the entire renderer each time we rebuild a user’s game crate. Making the final render system a concrete type we can reference yields massive wins.

In general the problem is that Amethyst’s use of generics requires its consumers to compile large portions of the engine. This is a problem inherent to Rust’s implementation of generics. It almost certainly won’t be fully solved by compiler optimizations.
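
To make the “separate crate” idea concrete, here is a minimal single-file sketch. The modules `engine` and `precompiled` and every name inside them are hypothetical stand-ins, not Amethyst or Rendy APIs; in a real project each module would be its own crate. The idea is that the generic `Renderer<P>` is instantiated once inside `precompiled`, so the game code only ever names a concrete type.

```rust
// Hypothetical stand-in for a generic-heavy engine crate.
mod engine {
    pub trait Pass {
        fn draw(&self, frame: u32) -> String;
    }

    // Generic over the pass type: each distinct `P` is a separate instantiation.
    pub struct Renderer<P: Pass> {
        pass: P,
    }

    impl<P: Pass> Renderer<P> {
        pub fn new(pass: P) -> Self {
            Renderer { pass }
        }

        pub fn render(&self, frame: u32) -> String {
            self.pass.draw(frame)
        }
    }
}

// Hypothetical stand-in for a "precompiled" crate that sits between
// the engine and the game.
mod precompiled {
    use super::engine::{Pass, Renderer};

    struct SpritePass;

    impl Pass for SpritePass {
        fn draw(&self, frame: u32) -> String {
            format!("sprites@{}", frame)
        }
    }

    // Concrete, non-generic wrapper: the generic parameter is fixed here,
    // so `Renderer<SpritePass>` is instantiated in this module.
    pub struct PrecompiledRenderer(Renderer<SpritePass>);

    impl PrecompiledRenderer {
        pub fn new() -> Self {
            PrecompiledRenderer(Renderer::new(SpritePass))
        }

        pub fn render(&self, frame: u32) -> String {
            self.0.render(frame)
        }
    }
}

// Game code: touches only the concrete wrapper, never `Renderer<P>` directly.
pub fn game_frame(frame: u32) -> String {
    let renderer = precompiled::PrecompiledRenderer::new();
    renderer.render(frame)
}
```

With this layout, editing `game_frame` should not require re-instantiating the renderer, as long as the wrapper lives in a crate that is not being rebuilt.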

Those were the easy non-breaking wins. We can shave off another 4 seconds from the compile time by commenting out initialize_prefabs(&mut data.world). The prefab system also heavily uses generics. This seems like a much harder problem to solve because the high level interfaces do need to be generic. However, it’s worth looking into whether the lower level types could expose a concrete interface. Without the current prefab compile times, we’re at ~10 seconds. That’s actually workable! And I’m sure there are more optimizations to be had.
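
One general Rust technique for keeping a generic high level interface over a concrete lower level one is a thin generic shim that delegates to a non-generic inner function. The sketch below is only an illustration: `load_prefab` and `load_prefab_inner` are hypothetical names, not the Amethyst prefab API, and the body is a placeholder for the expensive work.

```rust
use std::path::{Path, PathBuf};

// Non-generic inner function: compiled once, in the crate that defines it.
fn load_prefab_inner(path: &Path) -> PathBuf {
    // Placeholder for the expensive body (parsing, asset setup, ...).
    let mut resolved = PathBuf::from("assets");
    resolved.push(path);
    resolved
}

// Generic shim: each caller type only instantiates this one-line conversion.
pub fn load_prefab<P: AsRef<Path>>(path: P) -> PathBuf {
    load_prefab_inner(path.as_ref())
}
```

Callers can still pass `&str`, `String`, or `&Path`, but only the one-line shim is duplicated per type; the heavy body stays in a single non-generic function.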

LLD

Additionally we can use LLD to improve our times even further. However whenever I compile Amethyst with LLD I get these runtime errors:

Error { inner: Inner { source: Some(Error { inner: Inner { source: None, backtrace: None, error: Message("unknown variant `TTF`, there are no variants") } }), backtrace: None, error: StringError("Failed parsing Ron file") } }
caused by: unknown variant `TTF`, there are no variants
Error { inner: Inner { source: None, backtrace: None, error: Message("unknown variant `TTF`, there are no variants") } }

It makes sense to prioritize solving those problems in Amethyst because the LLD linker wins don’t require any form of rearchitecture.
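
For reference, one common way to opt in to LLD on Linux is through a Cargo configuration file. This is a sketch that assumes an x86_64 Linux target, an installed `lld`, and a C compiler driver that accepts `-fuse-ld=lld`; it is not necessarily the setup used for the measurements in this post.

```toml
# .cargo/config.toml (project-local Cargo configuration)
# Assumption: x86_64 Linux target; adjust the target triple for other platforms.
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=lld"]
```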

Evoli Iterative Compile times: Inserting a newline in game logic

  • base: ~31 seconds
  • base_lld: ~28 seconds
  • refactored: ~13 seconds
  • refactored_lld: ~11 seconds
  • refactored_no_prefab: ~9 seconds
  • refactored_no_prefab_lld: ~7 seconds
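
To put these numbers in perspective, here is a small calculation sketch. The helper names are made up for illustration, and the rebuild count in the note that follows is an assumption, not a measurement.

```rust
/// Percentage of compile time removed when going from `base` to `new` seconds.
pub fn percent_saved(base: f64, new: f64) -> f64 {
    (base - new) / base * 100.0
}

/// Minutes spent waiting per day for a given iteration time and rebuild count.
pub fn minutes_waiting(seconds_per_build: f64, rebuilds_per_day: u32) -> f64 {
    seconds_per_build * f64::from(rebuilds_per_day) / 60.0
}
```

Going from ~31 to ~13 seconds removes roughly 58% of the wait, and ~31 to ~7 seconds removes roughly 77%. Assuming 200 rebuilds in a day, that is about 103 minutes of waiting at ~31 seconds per build versus about 43 minutes at ~13 seconds.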

Conclusion

Amethyst’s current design forces users to recompile large parts of the game engine when making minor changes to game code.

With some minor refactors to user code, Amethyst is capable of ~11 second compile times for Evoli (provided lld gets fixed, otherwise add ~2 seconds to the number). We can probably make this even better.

Other game engines with less reliance on generics have significantly better compile times. The examples provided both use wgpu (which, like rendy, is an abstraction on top of gfx-rs). The wgpu graphics library doesn’t suffer from the same compile time problems as rendy because it presents concrete types to the user. Its compile time cost is close to zero.

I also observed that Legion has a measurable impact on compile times in my own engine. I go from ~0.7 seconds without Legion to ~1.9 seconds with 7 simple systems and 5 component types. This is largely a separate issue because the cost comes from users calling Legion APIs from their game code.

I think Amethyst should re-think the interfaces it presents to users to encourage good compile times by default. Many of the current interfaces have large hits on user code if they aren’t put into a separate crate. And others are just plain impossible to use without paying the compile time tax. None of the examples or showcase apps use the “separate crate” method. And users shouldn’t need to contort themselves to get reasonable compile times in the first place.

Generics aren’t inherently bad, but they can easily get out of hand.

(Zicklag) #2

Hi @cart, thanks a lot for that breakdown. That is some great data to have. I definitely agree that the compile times are very difficult to deal with without a fast computer that can bring those times down. It gave me quite a bit of trouble just going through the “Pong” example on my laptop.

It is actually very promising to see that the compile speed issue isn’t fully Rust itself, but how we are using it. Also great to see that you have non-breaking examples of how to improve the times! I can’t speak a lot to the API or what we can change about it because I don’t have a lot of experience with it yet, but your tests make some valuable points.

Our scripting RFC should actually facilitate writing Rust “scripts” for game logic without any performance disadvantage, because it will be using the C FFI, so that might help a lot.

I know that LLD used to work with Amethyst, because I’ve used it that way before. That definitely seems like something we should look into.
