Legion ECS Discussion

(Zicklag) #4

I just read through the whole conversation, and while I don’t understand all of the lower-level points, here are my thoughts:

If we are going to switch the ECS that Amethyst is built on, the sooner that we do it the better. We don’t want to keep working on something that we are going to end up needing to throw away later. We need to do the investigation necessary to adequately compare the two ECSs so that we can make an educated decision on what to do.

While rewriting the engine to handle a different ECS would be a lot of work, my usual stance on things like this is that I would rather put in the work now and make it great than go with a sub-optimal solution that might come back to bite me later. If Legion really is going to bring Amethyst performance that it wouldn’t be able to get otherwise, then I think it might be worth it. We’re going to have to do the testing to find out for sure.

The change would require updates to pretty much every portion of the engine, though, and that would probably make it difficult to do development on anything else in Amethyst while that is in-progress. For example, I’m intending on working with @Moxinilian to get some work done on scripting in Amethyst, but that is very closely tied to the ECS. Does it make sense to start new development on a scripting system if the ECS is going to change and change the way that we have to build a lot of the scripting system?

This whole thing is a big deal because of both the potential gain and the potential cost of switching or not switching. I think it deserves serious consideration, though.

It seems like a good benchmark or set of benchmarks is the first step.

(Kae) #5

This seems absolute. No implementation of GlobalAlloc that I know of invokes a syscall on every allocation. On Linux with libc the default GlobalAlloc calls malloc which only invokes syscalls in certain situations. On Windows, the default GlobalAlloc calls HeapAlloc. They all have their own platform-dependent behaviours, though I won’t pretend like they are Good Allocators.

Syscalls are only necessary when changing the virtual memory mapping, which is usually done only for larger allocations.

And you will need to dynamically allocate. Perhaps you mean it should aim to re-use allocated memory instead of delegating this to the global allocator?
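
A minimal sketch of what "re-use allocated memory" can look like in plain Rust, using only the standard library. The helper names `fill` and `second_fill_reuses_allocation` are made up for illustration and are not part of Specs or Legion. It relies on the documented behaviour that `Vec::clear` keeps the existing capacity.

```rust
/// Fills `buf` with `n` values, reusing whatever capacity it already has.
fn fill(buf: &mut Vec<u32>, n: u32) {
    buf.clear(); // drops the elements but keeps the heap allocation
    buf.extend(0..n);
}

/// Returns true if the second fill reused the first fill's allocation.
/// (Hypothetical helper, only here to make the behaviour observable.)
fn second_fill_reuses_allocation() -> bool {
    let mut buf = Vec::new();
    fill(&mut buf, 1024); // the first fill has to allocate
    let ptr_before = buf.as_ptr();
    let cap_before = buf.capacity();
    fill(&mut buf, 512); // fits in the existing capacity, no new allocation
    buf.as_ptr() == ptr_before && buf.capacity() == cap_before
}

fn main() {
    println!("reused: {}", second_fill_reuses_allocation());
}
```

An ECS storage that keeps its buffers around like this only goes back to the global allocator when it actually has to grow.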

Seems ok :slight_smile:

Takes a lot of resources to implement and test reasonable use-cases and related benchmarks, but yeah - real use-cases would be best. An ECS doesn’t have that many possible operations, so just showing the pathological and best-case behaviours for both would also be acceptable IMO.
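
As a rough sketch of "best-case versus pathological" measurements, the snippet below times a linear pass and a strided pass over the same data. It is not an ECS benchmark and uses no Specs or Legion API; `time_it`, `sum_linear` and `sum_strided` are hypothetical helpers, and a real comparison would want a proper benchmark harness and many repeated runs.

```rust
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall time.
fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Best case: touch the data in memory order.
fn sum_linear(data: &[u64]) -> u64 {
    data.iter().sum()
}

/// Pathological case: visit every element exactly once, but jump around.
/// `stride` must be coprime to `data.len()` so that every index is hit.
fn sum_strided(data: &[u64], stride: usize) -> u64 {
    let n = data.len();
    (0..n).map(|i| data[(i * stride) % n]).sum()
}

fn main() {
    let data: Vec<u64> = (0..1u64 << 16).collect();
    let (a, t_linear) = time_it(|| sum_linear(&data));
    let (b, t_strided) = time_it(|| sum_strided(&data, 7919));
    assert_eq!(a, b); // same work, different access pattern
    println!("linear: {t_linear:?}, strided: {t_strided:?}");
}
```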

No, it currently does not meet the allocation goals you set up with any of its storage implementations.

Specs cannot meet the concurrency goals with the current API because it dispatches on system-level and not on a component or chunk level.
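
To illustrate the difference between system-level and chunk-level dispatch, here is a sketch in plain Rust, assuming component data is stored in contiguous slices. It uses only `std::thread::scope`; `Position` and `integrate_parallel` are hypothetical names, not the Specs or Legion API. One "system" is split across threads, each working on its own chunk.

```rust
use std::thread;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Position {
    x: f32,
    y: f32,
}

/// Runs one logical system in parallel, one scoped thread per chunk.
/// A system-level dispatcher would instead run this whole loop on one thread.
fn integrate_parallel(positions: &mut [Position], dx: f32, dy: f32, chunk_len: usize) {
    let chunk_len = chunk_len.max(1); // chunks_mut panics on a length of zero
    thread::scope(|s| {
        for chunk in positions.chunks_mut(chunk_len) {
            // Each thread gets exclusive access to a disjoint chunk.
            s.spawn(move || {
                for p in chunk.iter_mut() {
                    p.x += dx;
                    p.y += dy;
                }
            });
        }
    });
}

fn main() {
    let mut positions = vec![Position { x: 0.0, y: 0.0 }; 10];
    integrate_parallel(&mut positions, 1.0, 2.0, 3);
    println!("{:?}", positions[0]);
}
```

Spawning a thread per chunk is only for illustration; a real scheduler would hand chunks to a worker pool.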

Specs cannot guarantee linear iteration for multi-component queries due to its isolated ComponentStorage design.

Can specs meet the goals? Not without extensively changing its API.

For simple add/remove, Legion does well with allocation behaviour (at least on my fork :wink: ) but it does not use a custom allocator for its fixed-size blocks. It could do better with custom allocator APIs though, as hashmaps and vecs for chunk metadata still use GlobalAlloc. Also, the blocks are not actually guaranteed to be of an exact byte size. And structural mutations for entities are not optimally implemented currently, requiring multiple individual allocations.

Legion does not have a concurrency story, so it does not meet any of those goals.

Legion has optimal cache coherency and data locality for multi-component queries. It goes beyond optimal in the classical ECS sense with the Tags system that shares certain data across all entities in a Chunk as well. Single-component queries have linear behaviour within a Chunk, but may need to touch many chunks depending on the existing archetypes. I’d say Legion shines most on the iteration goals, which is the most important aspect for high performance considering modern CPU architecture details.
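
A toy model of the chunked layout described above, assuming each chunk stores the components of one archetype as parallel arrays. `Chunk`, `Archetype` and `integrate` are illustrative names and do not reflect Legion's actual types. The point is that a multi-component query becomes a linear walk over two arrays per chunk.

```rust
/// One fixed group of entities that all have the same component set.
struct Chunk {
    positions: Vec<f32>,
    velocities: Vec<f32>, // velocities[i] belongs to the same entity as positions[i]
}

/// All chunks holding entities with exactly (position, velocity).
struct Archetype {
    chunks: Vec<Chunk>,
}

/// A two-component "query": both arrays are read front to back.
fn integrate(archetype: &mut Archetype, dt: f32) {
    for chunk in &mut archetype.chunks {
        for (p, v) in chunk.positions.iter_mut().zip(&chunk.velocities) {
            *p += *v * dt;
        }
    }
}

fn main() {
    let mut archetype = Archetype {
        chunks: vec![Chunk { positions: vec![0.0, 1.0], velocities: vec![2.0, 4.0] }],
    };
    integrate(&mut archetype, 0.5);
    println!("{:?}", archetype.chunks[0].positions);
}
```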

(Kae) #6

Another thought: providing multiple levels of granularity for data storage and parallelism seems quite appealing. I can imagine a dispatcher that is aware of data on the following levels:

  • Global
  • Per world
  • Per entity
(Erlend Sogge Heggen) #7

Earlier this week I discussed with @jaynus how Evoli might be extended to serve as a real-world-application benchmark for the ECS features we’re most interested in measuring.

  1. More moving entities
  2. Syncing from specs-nphysics
  3. 3D culling
  4. Userless simulation mode

All can be tracked in the ECS Benchmark milestone: https://github.com/amethyst/evoli/milestone/3

(Justin LeFebvre) #8

I know I don’t have a ton of skin in the game on this subject since I haven’t spent much time digging into the internals of the ECS (specs/shred) but I would say that there should be no world where we replace Specs with Legion or any other ECS. However, we can and should borrow (steal) the best ideas from another ECS and implement them in Specs whenever possible in order to get our library to where we would like it to be.

(Kel) #9

Just to be clear, the primary question in discussion isn’t “should we switch out ECS”. In any case, Legion is in comparison very young and missing lots of features that would be required. This discussion is first and foremost about the decisions in Legion’s design, and in Specs’, and measuring what we could do better and where.

(Zicklag) #10

Just a thought, to help on allocations, there are a couple alternative allocators that are supposed to be faster like mimalloc and jemalloc. The sled project performance page says:

jemalloc can dramatically improve performance in some situations, but you should always measure performance before and after using it, because maybe for some use cases it can cause regressions.

Might be something to try out for Amethyst/Specs.
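
For reference, alternative allocators are plugged into a Rust program through the `#[global_allocator]` attribute. The sketch below avoids external crates and instead wraps the standard `System` allocator with a counter, which is also a cheap way to "measure before and after". `Counting` and `allocations_during` are hypothetical names; swapping in mimalloc or jemalloc would replace the `Counting` static with the type exported by the corresponding crate.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Forwards to the system allocator and counts calls to `alloc`.
struct Counting;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// This is the hook where a different allocator would be installed.
#[global_allocator]
static GLOBAL: Counting = Counting;

/// Number of allocations made (by any thread) while `f` runs.
fn allocations_during(f: impl FnOnce()) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    f();
    ALLOCATIONS.load(Ordering::Relaxed) - before
}

fn main() {
    let n = allocations_during(|| {
        let v: Vec<u64> = Vec::with_capacity(1024);
        std::hint::black_box(&v); // keep the allocation from being optimised away
    });
    println!("allocations: {n}");
}
```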

(Zicklag) #11

What is the feasibility of implementing a wrapper of sorts around Legion so that you could use it inside of an Amethyst game? Assuming you got Legion up-to-speed feature-wise.

@Moxinilian and I have been discussing how to get scripting support into Amethyst with Specs, but @kabergstrom’s fork of Legion already has a C API and external component types like we need for scripting.

For my project, I don’t care about any differences from Specs in the Legion API, it actually seems like Legion would be nicer, but Amethyst is built on Specs. Would there be any way to wrap Legion in a Specs API so that the core Amethyst systems could run on Legion and I could use the Legion API for any non-Amethyst-core systems?

If there is really no chance that Amethyst is going to use Legion, is there any possibility of me providing some sort of integration layer myself, or would that be incredibly difficult?

(Kel) #12

So ECS wrappers were discussed in the chat logs (I think…), which I know are a bit lengthy. In summary, the big problem with such a thing is that the benefits of the different ECSs are not in their implementation details but in their different APIs, which allow those libraries to optimize for different cases.

I wouldn’t say that! What I meant above is that an API change this massive to our ECS, whether by switching or by adapting our current library, would first require discussing the definite technical requirements and scope, and then implementing and evaluating an MVP. Once these things happen, it becomes possible to say definitively whether or not these sorts of API changes will happen. This is generally a process that needs to happen across the entire engine architecturally IMO, but this is such a substantial (while scope-able) change that I think there’s no better place to make sure we stick to such a process.

@jaynus and @kabergstrom among others have already begun some parts of that process in this thread!

(Kae) #13

My understanding is that it’ll be possible to make a specs-compatible API with a legion backend. As Kel mentioned, @jaynus started looking into realizing it.

(Zicklag) #14

Ah, OK. I had misunderstood your previous comment. I definitely agree with that.

In that light, my team decided that for Arsenal it makes more sense for us to work on proving out Legion in Amethyst than it does to work on establishing scripting in Specs. Is there anything that we can do to help with that?

(Erlend Sogge Heggen) #15

The discussion about Legion seemed to conclude with “Before we can justify putting a lot of work into this, we need good benchmarks to tell us what problems actually need solving, if any”.

Creating these benchmarks would move this forward.

(Zicklag) #16

OK, I’ll look into that, then. If anybody has any suggestions or ideas on what kind of benchmarks to make just tell me.

I’m thinking I could create Legion equivalents of Amethyst components like Transform and then run through some scenarios and compare them. Maybe I’ll look at some existing Amethyst games like Evoli and Space Menace and test some situations similar to those.

(Zicklag) #17

I created a repo for the benchmarks:

Right now all it has is an entity creation benchmark, but I’ll be working on more. If anybody has ideas for benchmarks to include you can create an issue on that repository.

(Erlend Sogge Heggen) #18

Could you elaborate on that?

I wanna get clarity on whether specs is somehow unfit for your scripting implementation, or if it just seems a little less attractive.

As has been pointed out, it is highly unlikely that we’ll find ourselves moving wholesale to Legion any time soon. Whichever path we do go down it’s going to be a long and incremental one, so it’s best to work on the assumption that specs is here to stay.

(Paweł Grabarz) #19

I think the main goal with switching to Legion is not at all about performance. We lack flexibility with the existing specs API, and it’s very unlikely we can lift those annoying restrictions without changing the underlying specs architecture.

There are two main features I really miss in specs that are unlikely to ever be implemented, but are already there or easy to implement in legion:

  • fast coarse change detection
  • composable or runtime defined component queries
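
One common way to get coarse change detection in a chunked storage is a version stamp per chunk rather than per component. The sketch below assumes that approach; `Chunk`, `write` and `changed_since` are hypothetical names and this is not a description of how Legion implements it.

```rust
/// A block of component values plus the world version of its last write.
struct Chunk {
    values: Vec<f32>,
    version: u64,
}

impl Chunk {
    /// Handing out mutable access marks the whole chunk as changed.
    /// That is what makes the detection coarse but cheap.
    fn write(&mut self, world_version: u64) -> &mut [f32] {
        self.version = world_version;
        &mut self.values
    }
}

/// Indices of chunks written after the version a system last observed.
fn changed_since(chunks: &[Chunk], last_seen: u64) -> Vec<usize> {
    chunks
        .iter()
        .enumerate()
        .filter(|(_, chunk)| chunk.version > last_seen)
        .map(|(index, _)| index)
        .collect()
}

fn main() {
    let mut chunks = vec![
        Chunk { values: vec![0.0; 4], version: 0 },
        Chunk { values: vec![0.0; 4], version: 0 },
    ];
    chunks[1].write(5)[0] = 9.0;
    println!("{:?}", changed_since(&chunks, 3));
}
```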

The second point is particularly important, as we can’t really make any workaround for this. It’s extremely important to be able to query things dynamically for scripting to ever work, and it’s also very beneficial to allow adding extra component type constraints into existing systems through either a value or a generic type. This feature is what I’m currently missing to implement a culling system that will adapt itself to current rendering needs.
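
To make "composable or runtime defined queries" concrete, here is a small sketch, assuming component sets are represented as bitmasks that are matched against archetypes at runtime. All names (`ComponentMask`, `Query`, `with`, `without`, the component constants) are invented for this example and are not the legion API.

```rust
type ComponentMask = u64;

// Hypothetical component ids; a scripting layer could assign these at runtime.
const POSITION: ComponentMask = 1 << 0;
const VELOCITY: ComponentMask = 1 << 1;
const VISIBLE: ComponentMask = 1 << 2;

struct Archetype {
    mask: ComponentMask,
    entity_count: usize,
}

/// A query is plain data, so it can be built and extended while the game runs.
#[derive(Clone, Copy)]
struct Query {
    all: ComponentMask,
    none: ComponentMask,
}

impl Query {
    fn new() -> Self {
        Query { all: 0, none: 0 }
    }
    /// Adds a required component.
    fn with(self, component: ComponentMask) -> Self {
        Query { all: self.all | component, ..self }
    }
    /// Adds an excluded component.
    fn without(self, component: ComponentMask) -> Self {
        Query { none: self.none | component, ..self }
    }
    fn matches(&self, archetype: &Archetype) -> bool {
        (archetype.mask & self.all) == self.all && (archetype.mask & self.none) == 0
    }
}

/// Counts the entities a query would visit.
fn count_matches(archetypes: &[Archetype], query: Query) -> usize {
    archetypes
        .iter()
        .filter(|a| query.matches(a))
        .map(|a| a.entity_count)
        .sum()
}

fn main() {
    let archetypes = [
        Archetype { mask: POSITION | VELOCITY, entity_count: 3 },
        Archetype { mask: POSITION | VELOCITY | VISIBLE, entity_count: 2 },
    ];
    let moving = Query::new().with(POSITION).with(VELOCITY);
    // An existing system's query gains an extra constraint passed in as a value.
    let moving_and_visible = moving.with(VISIBLE);
    println!("{}", count_matches(&archetypes, moving_and_visible));
}
```

A culling system could take the extra constraint as a parameter instead of hard-coding it in its type signature.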

So, again, this is not (only) about performance. I would actually treat the performance gains as a nice side effect. It’s mostly about having fewer problems with Rust lifetimes and being able to define a nice FFI API.

Saying that it’s a lot of work is maybe accurate, but the bulk of that work is already done. The ECS implementation alone is done, FFI is done, and we do have partially implemented scheduling. The part of integrating it with amethyst is actually comparable in scope to many changes we did over the last few months, like updating the specs version or the transform refactor. Once legion is there, changing amethyst won’t be HARD. It will be a WIDE change, but a very repetitive one, likely very easy to review and validate.

I see this change as a very important step for amethyst that will open the doors for implementing key features like an editor or scripting, and will allow other existing (or soon-existing) systems to grow without requiring significant code complexity.

(Zicklag) #20

I don’t have an in-depth enough understanding of whether or not Specs is fit or unfit for a scripting solution. All of the work I’ve done in that direction was piggy-backing on any knowledge and investigation that was done by @Moxinilian. What @frizi just pointed out makes it sound like Specs isn’t going to lend itself to a scripting solution, though:

Either way, even if there was a good way to get scripting into Specs, I’m really not the one that would be suited to do it and it could take me months of work, even with help from people who do know more about it.

If it were 9 months before we moved towards Legion, and it took me 5 months of work to get scripting into Specs, we would only have scripting support for 4 months before we completely replaced it with Legion. I can’t afford to put in that much work with the potential that it could all be wasted. Maybe that timeline is dramatized, but I really don’t know what the timeline would look like because FFI and Specs internals are far outside of my experience. I would be fully willing to learn if I was sure that the effort would be useful work, but at this point it looks like there is a good chance it wouldn’t be.

Also, I’m not an expert on the subject, but from everything that I’ve read in this discussion, including the Discord one, it seems like, if it weren’t for the work of migrating to it, Legion is pretty easily the better design decision. It just seems more suited to the needs of a high-performance and flexible game engine.

My team and I would rather put work into a future, better solution than spend work now on patching something that will be replaced later, even if it means that it will be longer before we can use the solution. My team’s decision to use Amethyst itself is not based on the fact that I can get a working solution now, but on the fact that it will be a better solution than the alternatives when it is ready.


BTW here’s an update on the benchmarks that I started work on:

While it probably wouldn’t be difficult to start testing different focused performance metrics, like iterating over components of different layouts, I really wanted to test the performance of Legion and Specs in situations that were as close as possible to the real-world use-cases that we would be putting them in. I wanted to start with the transform system because it is foundational to every Amethyst game.

After I started trying to mimic the transform bundle in the benchmarks repo I had created, I realized that it might just be better to use Amethyst itself and start porting amethyst_core to Legion so that we could get a fully accurate comparison. At this point I think I’m going to experiment to see if I can port amethyst_core to Legion under a feature flag so that we can test out games, even if it is without graphics, audio, networking, etc, and run some closer-to-life tests with Legion inside of Amethyst. ( A feature flag might make the code too messy, in which case I’ll just keep it in its own branch. )
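
A rough sketch of what the feature-flag approach could look like, assuming a Cargo feature named `legion` (declared as `legion = []` under `[features]` in `Cargo.toml`). The feature name, the `backend` module and its contents are all hypothetical placeholders, not existing amethyst_core code.

```rust
// Compiled only when building with `--features legion` (assumed feature name).
#[cfg(feature = "legion")]
mod backend {
    pub const NAME: &str = "legion";

    /// Placeholder for creating entities through the Legion-backed world.
    pub fn spawn_entities(count: usize) -> usize {
        count
    }
}

// Default build: the existing Specs-backed implementation.
#[cfg(not(feature = "legion"))]
mod backend {
    pub const NAME: &str = "specs";

    /// Placeholder for creating entities through the Specs-backed world.
    pub fn spawn_entities(count: usize) -> usize {
        count
    }
}

fn main() {
    // The rest of the crate only talks to `backend`, so the same test game
    // can be built and benchmarked against either ECS.
    let spawned = backend::spawn_entities(1_000);
    println!("spawned {spawned} entities using {}", backend::NAME);
}
```

If the two backends diverge too much for a shared interface like this, that is a sign the separate-branch route is the cleaner one.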

I’m not 100% sure how much work that is going to turn into, but it would be a great way for me to get more acquainted with Amethyst’s internals and it would be a good way to feel out how difficult a Legion port might be. If it turns into too much work for one reason or another, I’ll probably just do some more focused benchmarks in a separate repo. I would like to know whether or not that approach makes sense to everybody else.

(Théo Degioanni) #21

Excellent idea!

Scripting would work in specs without too much work. What Frizi is describing is how once your components are statically declared in specs, you cannot access them as flexibly as you could in their development-time form. This is not an issue for scripting itself as dynamic dispatcher rebuild and conditional dependencies are a development-time thing. I would need to know more about Frizi’s engine use case to give an opinion on how feasible specs is for their specific use case, but it is not related to scripting.

However I do agree that legion would give more advantages on those aspects, especially on that first point which to me is extremely important and extremely difficult to achieve without locking all the time in specs. But this is outside the scope of scripting.

TL;DR: scripting can work in specs if we want it to.

(OvermindDL1) #22

I just took a look at legion and I have a question:

This sounds exceptionally bad for an ECS library. I add and remove components very rapidly, all the time, all over the place, and this seems backwards from how it is traditionally done… Just how much ‘significantly slower’ is it and why? If it is baking component ‘sets’ into arrays then that sounds more like a dataflow system than ECS, which is fine, but it’s not ECS.

(Zicklag) #23

From the Discord conversation:

Ayfid

It is optimised for entity creation, deletion and iteration speed. Dynamically adding and removing components from existing entities is supported but slow. You can generally design game code around avoiding the need to do component addition/removal outside of entity creation, but you cannot avoid creating, deleting or iterating through entities; those are the core functions of an ECS.

I do not think adding/removing components to existing entities is nearly as high a priority as creation and iteration speed

As far as I understand it, it is slower because of the chunked design of the entity storage, which is part of where it gets its speed everywhere else.
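
A toy illustration of why that is, assuming entities are grouped by their exact component set: adding a component changes the set, so the entity's data has to be moved out of one group and into another. `World` and `add_velocity` are made-up names and the layout is heavily simplified compared to Legion's real chunks.

```rust
/// Two archetypes: entities with only a position, and entities with both.
#[derive(Default)]
struct World {
    only_pos: Vec<f32>, // archetype A: (position)
    pos: Vec<f32>,      // archetype B: (position, velocity)
    vel: Vec<f32>,
}

impl World {
    /// Adds a velocity to the entity at `index` in archetype A.
    /// Returns its new index in archetype B.
    fn add_velocity(&mut self, index: usize, velocity: f32) -> usize {
        // Every existing component is moved out of the old storage...
        let position = self.only_pos.swap_remove(index);
        // ...and into the new one, next to the newly added component.
        self.pos.push(position);
        self.vel.push(velocity);
        self.pos.len() - 1
    }
}

fn main() {
    let mut world = World { only_pos: vec![1.0, 2.0, 3.0], ..World::default() };
    let new_index = world.add_velocity(0, 9.0);
    println!("moved to index {new_index}, remaining: {:?}", world.only_pos);
}
```

The cost grows with the number of components the entity already has, since each of them is moved, and `swap_remove` also relocates another entity to fill the gap.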

If you needed to add and remove components on an entity I would imagine you could add a boolean enabled field to the component and ignore any non-enabled components in the systems that use it.
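
Something along these lines, as a sketch in plain Rust rather than the Legion API. `Burning` and `apply_burning` are hypothetical names for an example component and the system that consumes it.

```rust
/// A component that stays attached but can be switched off.
struct Burning {
    enabled: bool,
    damage_per_tick: u32,
}

/// Applies burn damage; `health[i]` and `burning[i]` belong to the same entity.
fn apply_burning(health: &mut [u32], burning: &[Burning]) {
    for (hp, burn) in health.iter_mut().zip(burning) {
        if !burn.enabled {
            continue; // treated as if the component had been removed
        }
        *hp = hp.saturating_sub(burn.damage_per_tick);
    }
}

fn main() {
    let mut health = [10, 10];
    let burning = [
        Burning { enabled: true, damage_per_tick: 3 },
        Burning { enabled: false, damage_per_tick: 3 },
    ];
    apply_burning(&mut health, &burning);
    println!("{health:?}");
}
```

The trade-off is that disabled components are still iterated over, so this fits best when the flag flips often and most entities keep the component anyway.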