Legion ECS Discussion

(Zicklag) #14

Ah, OK. I had misunderstood your previous comment. I definitely agree with that.

In that light, my team decided that for Arsenal it makes more sense for us to work on proving out Legion in Amethyst than it does to work on establishing scripting in Specs. Is there anything that we can do to help with that?

(Erlend Sogge Heggen) #15

The discussion about Legion seemed to conclude with “Before we can justify putting a lot of work into this, we need good benchmarks to tell us what problems actually need solving, if any”.

Creating these benchmarks would move this forward.

2 Likes
(Zicklag) #16

OK, I’ll look into that, then. If anybody has any suggestions or ideas on what kind of benchmarks to make just tell me.

I’m thinking I could create some Amethyst-equivalent components like Transform in Legion and then run through some scenarios and compare them. Maybe I’ll look at some existing Amethyst games like Evoli and Space Menace and test some situations similar to those.

1 Like
(Zicklag) #17

I created a repo for the benchmarks:

Right now all it has is an entity creation benchmark, but I’ll be working on more. If anybody has ideas for benchmarks to include you can create an issue on that repository.

2 Likes
(Erlend Sogge Heggen) #18

Could you elaborate on that?

I wanna get clarity on whether specs is somehow unfit for your scripting implementation, or if it just seems a little less attractive.

As has been pointed out, it is highly unlikely that we’ll find ourselves moving wholesale to Legion any time soon. Whichever path we do go down it’s going to be a long and incremental one, so it’s best to work on the assumption that specs is here to stay.

(Paweł Grabarz) #19

I think the main goal of switching to Legion is not at all about performance. We lack flexibility with the existing specs API, and it’s very unlikely we can lift those annoying restrictions without changing the underlying specs architecture.

There are two main features I really miss in specs that are unlikely to ever be implemented, but are already there or easy to implement in legion:

  • fast coarse change detection
  • composable or runtime defined component queries

The second point is particularly important, as we can’t really work around it. It’s extremely important to be able to query things dynamically for scripting to ever work, and it’s also very beneficial to allow adding extra component type constraints to existing systems through either a value or a generic type. This is the feature I currently miss for implementing a culling system that adapts itself to current rendering needs.
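To make the "runtime defined queries" idea concrete, here is a minimal sketch of a query whose constraints are ordinary values, so a script or another system can extend it after the fact. Everything in it (`ScriptWorld`, `Query`, string component ids) is hypothetical and for illustration only; it is not the Legion or specs API.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical component identifier. A real ECS would more likely use
/// `TypeId` or a registry-assigned integer than a string.
type ComponentId = &'static str;

/// Toy world: each entity only records which component ids it has.
#[derive(Default)]
pub struct ScriptWorld {
    entities: HashMap<u32, HashSet<ComponentId>>,
}

impl ScriptWorld {
    pub fn spawn(&mut self, id: u32, components: &[ComponentId]) {
        self.entities.insert(id, components.iter().copied().collect());
    }
}

/// A query whose constraints are plain data, so it can be built or
/// extended at runtime instead of being fixed in a type signature.
#[derive(Clone, Default)]
pub struct Query {
    required: Vec<ComponentId>,
    excluded: Vec<ComponentId>,
}

impl Query {
    pub fn with(mut self, c: ComponentId) -> Self {
        self.required.push(c);
        self
    }

    pub fn without(mut self, c: ComponentId) -> Self {
        self.excluded.push(c);
        self
    }

    /// Returns the matching entity ids in ascending order.
    pub fn run(&self, world: &ScriptWorld) -> Vec<u32> {
        let mut hits: Vec<u32> = world
            .entities
            .iter()
            .filter(|(_, set)| {
                self.required.iter().all(|c| set.contains(c))
                    && !self.excluded.iter().any(|c| set.contains(c))
            })
            .map(|(id, _)| *id)
            .collect();
        hits.sort_unstable();
        hits
    }
}
```

A culling system along these lines could start from a base query such as `Query::default().with("Transform").with("Mesh")` and let the renderer append `.without("Hidden")` when it needs to.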

So, again, this is not (only) about performance. I would actually treat the performance gains as a nice side effect. It’s mostly about having fewer problems with Rust lifetimes and being able to define a nice FFI API.

Saying that it’s a lot of work is maybe accurate, but the bulk of that work is already done. The ECS implementation alone is done, FFI is done, and we do have partially implemented scheduling. The part of integrating it with amethyst is actually comparable in scope to many changes we did over the last few months, like updating the specs version or the transform refactor. Once legion is there, changing amethyst won’t be HARD. It will be a WIDE change, but a very repetitive one, likely very easy to review and validate.

I see this change as a very important step for amethyst that will open the doors for implementing key features like an editor or scripting, and will allow other existing (or soon-existing) systems to grow without requiring significant code complexity.

4 Likes
(Zicklag) #20

I don’t have an in-depth enough understanding of whether or not Specs is fit or unfit for a scripting solution. All of the work I’ve done in that direction was piggy-backing on any knowledge and investigation that was done by @Moxinilian. What @frizi just pointed out makes it sound like Specs isn’t going to lend itself to a scripting solution, though:

Either way, even if there was a good way to get scripting into Specs, I’m really not the one that would be suited to do it and it could take me months of work, even with help from people who do know more about it.

If it were 9 months before we moved towards Legion, and it took me 5 months of work to get scripting into Specs, we would only have scripting support for 4 months before we completely replaced it with Legion. I can’t afford to put in that much work with the potential that it could all be wasted. Maybe that timeline is dramatized, but I really don’t know what the timeline would look like because FFI and Specs internals are far outside of my experience. I would be fully willing to learn if I was sure that the effort would be useful work, but at this point it looks like there is a good chance it wouldn’t be.

Also, I’m not an expert on the subject, but from everything that I’ve read in this discussion, including the Discord one, it seems like, if it weren’t for the work of migrating to it, Legion is pretty easily the better design decision. It just seems more suited to the needs of a high-performance and flexible game engine.

My team and I would rather put work into a future, better solution than spend work now on patching something that will be replaced later, even if it means that it will be longer before we can use the solution. My team’s decision to use Amethyst itself is not based on the fact that I can get a working solution now, but on the fact that it will be a better solution than the alternatives when it is ready.


BTW here’s an update on the benchmarks that I started work on:

While it probably wouldn’t be difficult to start testing different focused performance metrics like iterating over components of different layouts and things like that, I really wanted to test the performance of Legion and Specs in situations that were as close as possible to the real-world use-cases that we would be putting them in. I wanted to start with the transform system because it is foundational to every Amethyst game.

After I started trying to mimic the transform bundle in the benchmarks repo I had created, I realized that it might just be better to use Amethyst itself and start porting amethyst_core to Legion so that we could get a fully accurate comparison. At this point I think I’m going to experiment to see if I can port amethyst_core to Legion under a feature flag so that we can test out games, even if it is without graphics, audio, networking, etc, and run some closer-to-life tests with Legion inside of Amethyst. (A feature flag might make the code too messy, in which case I’ll just keep it in its own branch.)

I’m not 100% sure how much work that is going to turn into, but it would be a great way for me to get more acquainted with Amethyst’s internals and it would be a good way to feel out how difficult a Legion port might be. If it turns into too much work for one reason or another, I’ll probably just do some more focused benchmarks in a separate repo. I would like to know whether or not that approach makes sense to everybody else.

3 Likes
(Théo Degioanni) #21

Excellent idea!

Scripting would work in specs without too much work. What Frizi is describing is how once your components are statically declared in specs, you cannot access them as flexibly as you could in their development-time form. This is not an issue for scripting itself as dynamic dispatcher rebuild and conditional dependencies are a development-time thing. I would need to know more about Frizi’s engine use case to give an opinion on how feasible specs is for their specific use case, but it is not related to scripting.

However I do agree that legion would give more advantages on those aspects, especially on that first point which to me is extremely important and extremely difficult to achieve without locking all the time in specs. But this is outside the scope of scripting.

TL;DR: scripting can work in specs if we want it to.

2 Likes
(OvermindDL1) #22

I just took a look at legion and I have a question:

This sounds exceptionally bad for an ECS library. I add and remove components very rapidly, all the time, all over the place, and this seems backwards from how it is traditionally done… Just how much ‘significantly slower’ is it and why? If it is baking component ‘sets’ into arrays then that sounds more like a dataflow system than ECS, which is fine, but it’s not ECS.

1 Like
(Zicklag) #23

From the Discord conversation:

Ayfid

It is optimised for entity creation, deletion and iteration speed. Dynamically adding and removing components from existing entities is supported but slow. You can generally design game code around avoiding the need to do component addition/removal outside of entity creation, but you cannot avoid creating, deleting or iterating through entities; those are the core functions of an ECS.

I do not think adding/removing components to existing entities is nearly as high a priority as creation and iteration speed

As far as I understand it, it is slower because of the chunked design of the entity storage, which is part of where it gets its speed everywhere else.

If you needed to add and remove components on an entity I would imagine you could add a boolean enabled field to the component and ignore any non-enabled components in the systems that use it.
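A minimal sketch of that workaround, assuming a hypothetical `Burning` status-effect component and a plain slice-based system rather than any real Legion or specs API:

```rust
/// Hypothetical status-effect component carrying an `enabled` flag, so the
/// component itself never has to be added to or removed from the entity.
#[derive(Debug, Clone, PartialEq)]
pub struct Burning {
    pub enabled: bool,
    pub damage_per_tick: u32,
}

/// Toy system: index i in both slices refers to the same entity.
/// Disabled components are skipped instead of being removed.
pub fn apply_burning(health: &mut [u32], burning: &[Burning]) {
    for (hp, effect) in health.iter_mut().zip(burning) {
        if !effect.enabled {
            continue;
        }
        *hp = hp.saturating_sub(effect.damage_per_tick);
    }
}
```

Toggling the effect becomes a single field write, at the cost that every entity in the iteration still carries the component's memory whether or not it is enabled.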

(OvermindDL1) #24

That wouldn’t really work. In my old engine I add/remove components rapidly for determining things such as status effects, query information, etc… etc… and they tend to number in the hundreds of components. Having all of those enabled on every possible entity that might have them (which tend to number in the hundreds of thousands of entities or far more) sounds like an exceptionally large waste of memory and processing.

(OvermindDL1) #25

The ‘chunked’ design that keeps being referenced looks like a classic dataflow pattern rather than ECS, and dataflow is exceptionally useful and fast, but it is not ‘dynamic’ in the realm that ECS is (ECS is quite literally a rapidly dynamic dataflow pattern).

Perhaps there should be two types of components, those that get chunked, ala the current dataflow design, and the other types that are stored out-of-band, ala normal ECS style (specs style for example). Generally the components that are rapidly added and removed tend to be fairly few in number (though not always). The few cases that tend to be large in number tend to change less often as well in the ‘bulk’ cases with a few (few hundred that is) entities changing fairly rapidly (boundary components are common here). This seems like it would satisfy both worlds, and it is possible they could work via the same interface as well, or at the very least the same query interface.

(Kae) #26

That comment about changing entity structure being significantly slower is misleading IMO. The existing legion implementation may be slow because it does heap allocations for a number of things, but all that is required is to swap-remove the components from their current chunk and add them to a new chunk.

I don’t know what you mean with chunked component storage being somehow “not ECS” or “not dynamic”. Is there a “no true Scotsman” argument being made?

2 Likes
(Kel) #27

All an ECS is, is a way to relate an index of items to the data for those items, while decoupling procedures operating on that data. Legion satisfies these properties, and is an ECS.

1 Like
(OvermindDL1) #28

Not at all. I think this is just a lack of definitions of terms. I use definitions that I learned ~30 years ago, with ECS having come out ~20-25 years ago, so these are essentially the definitions I’ve been taught (just to make things unambiguous):

  • Dataflow is a set of structs in one or more arrays where an array holds a single ‘type’ of struct. It is easy to add and remove entities to the system, but changing the data they hold involves removing them from one array and putting them onto another, if made well (POD in other words) that’s as simple as a memcpy of the different struct elements from one layout to another.
  • ECS is Dataflow, except that instead of the components of an entity being held in the same allocation, each component is held in its own array, with the same index across them referring to the same entity.
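Using the two definitions above as stated, the layouts could be sketched like this. The names (`Particle`, `ComponentArrays`) and the use of `Option` to mark a missing component are assumptions made for illustration, not how specs or legion actually store data.

```rust
/// "Dataflow" in the sense defined above: one array per entity type, with
/// each element holding all of that entity's components together.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Particle {
    pub position: f32,
    pub velocity: f32,
}

pub fn step_dataflow(particles: &mut [Particle], dt: f32) {
    for p in particles.iter_mut() {
        p.position += p.velocity * dt;
    }
}

/// "ECS" in the sense defined above: one array per component, where index i
/// in every array refers to the same entity. `None` marks an entity that
/// does not have that component.
#[derive(Default)]
pub struct ComponentArrays {
    pub positions: Vec<Option<f32>>,
    pub velocities: Vec<Option<f32>>,
}

pub fn step_ecs(c: &mut ComponentArrays, dt: f32) {
    for (pos, vel) in c.positions.iter_mut().zip(&c.velocities) {
        // Only entities that have both components are updated.
        if let (Some(p), Some(v)) = (pos.as_mut(), vel) {
            *p += *v * dt;
        }
    }
}
```

In the second layout, adding or removing a component is a write to one slot of one array. In the first, it means moving the whole element to an array of a different struct type.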

Most games tend to use dataflow in a large variety of areas; particle systems are the most common. Many games actually use dataflow for their game entities in full. Take Factorio as an example, where each ‘type’ of entity is a different dataflow array: changing components involves destroying the old entity and recreating it in another array with a different set (which is how legion works, it seems).

They are both accessed via indices; what differs is the memory layout and access patterns. The Dataflow pattern existed before the ECS pattern by a good couple of decades, and ECS is considered a subpattern of Dataflow. In other words, all ECSs are Dataflow, but not all Dataflows are ECS.

This is not at all saying that Dataflow is bad, it is absolutely more common than ECS (quite literally near every engine has parts that are dataflow, especially particle systems), just that the access patterns are different since Dataflow focuses more on static speed of small collections by combining the components into singular arrayed structs and ECS focuses more on rapid adding and removal of components.

Non-ECS Dataflow isn’t always faster than ECS either: when there are many components and the struct sizes become too large, the jumps between objects become too large when you are trying to access only small amounts of data.

This is also why some ECS engines combine both patterns into one engine, where certain components are marked as batched and get combined based on their batching tag, for example:

  • Component Transformation has batch tag 0
  • PhysicsLink has batch tag 0
  • Renderable has batch tag 0
  • Inventory has batch tag 1
  • A has batch tag 1 (I’m running out of ideas for names, I just woke up…)
  • B has batch tag 1
  • C has batch tag 1
  • D has no batch tag
  • E has no batch tag
  • Jump has no batch tag
  • etc…

So component Transformation, if it exists for a given entity, will be combined into dataflow batches with others also tagged 0 (PhysicsLink and Renderable in this example). Inventory will be combined with A, B, and C anytime any of them exist. D, E, and Jump are untagged and will never be batched with other components whatsoever. This is similar to how specs works now. Untagged storage is best for components that are added and removed rapidly, whereas batching is best for components that are used less often or only used together.
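As a rough illustration of the batching-tag idea, the sketch below takes the tag assignments from the example list and groups one entity's components by the storage they would share. The `Storage` enum and both functions are hypothetical; no existing engine API is implied.

```rust
use std::collections::BTreeMap;

/// Where a component's data would live under the hypothetical batching scheme.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Storage {
    /// Stored together with every other component sharing this tag.
    Batch(u8),
    /// Stored in its own per-component array.
    Standalone,
}

/// Tag assignments copied from the example list above.
pub fn storage_for(component: &str) -> Storage {
    match component {
        "Transformation" | "PhysicsLink" | "Renderable" => Storage::Batch(0),
        "Inventory" | "A" | "B" | "C" => Storage::Batch(1),
        _ => Storage::Standalone,
    }
}

/// Groups one entity's components by the storage they would share:
/// batches first (ordered by tag), then one group per untagged component.
pub fn plan_layout<'a>(components: &[&'a str]) -> Vec<Vec<&'a str>> {
    let mut batches: BTreeMap<u8, Vec<&'a str>> = BTreeMap::new();
    let mut standalone: Vec<Vec<&'a str>> = Vec::new();
    for &c in components {
        match storage_for(c) {
            Storage::Batch(tag) => batches.entry(tag).or_default().push(c),
            Storage::Standalone => standalone.push(vec![c]),
        }
    }
    batches.into_values().chain(standalone).collect()
}
```

For an entity with Transformation, Jump, Inventory, Renderable, and D, this puts Transformation and Renderable in one group, Inventory in a second, and Jump and D each in their own.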

This is generally considered the most optimal ECS pattern where it appears dynamic via the API but there are optimized paths using non-ecs style dataflow, though I never got around to adding batching in my old engine (other than manually in a few cases), it is what I’ve always wanted to experiment with.

I’ve not actually heard ECS being defined with such a definition in over 20 years that I’ve been using it?
That style could describe even the old Unity system, which is exceedingly and extremely inefficient and is not dataflow in any way whatsoever (with significant performance detriments because of it), and it is not generally considered an ECS even though it has entities, components, and calls to operate over that data. It’s always best to define things very precisely.

2 Likes
(Kae) #29

Thanks for the clarification! Then in these terms I think legion's storage design is a sort of combination?

  • Component data is partitioned into fixed size (16kB) allocations called “chunks” based on each entity’s set of component types. Each partition is called an Archetype.
  • Components within a chunk are stored in separate arrays for each component type.
  • An array of entity IDs is stored within a chunk. An index across component and entity ID arrays refers to data belonging to the same entity.
  • A hashmap is maintained in the World with the location of each entity for point lookups.

So adding/removing components from an entity would require the following steps,

  • Find/allocate a new chunk for the entity’s new Archetype
  • Copy component data and entity ID into the new chunk
  • Remove components and entity ID from current chunk, probably by replacing it with the last entity in the chunk
  • Update entity location entries

The runtime of this will increase relative to the total size of the entity’s component data, and does include 1-2 hashmap lookups, but there’s nothing inherently slow about it, at least by my definition. It’s all O(1) relative to the number of entities in the World.
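The steps above can be sketched roughly as follows. This is a simplified, hypothetical model and not legion's actual code: there is one unbounded chunk per archetype instead of fixed 16kB chunks, every component is an `f32`, and the row is removed before it is re-inserted because that is simpler to write with Rust borrows.

```rust
use std::collections::{BTreeMap, HashMap};

/// Sorted list of component names; stands in for an archetype's type set.
type ArchetypeKey = Vec<&'static str>;

#[derive(Default)]
struct Chunk {
    entities: Vec<u32>,
    /// One column per component type, indexed by the same row as `entities`.
    columns: BTreeMap<&'static str, Vec<f32>>,
}

#[derive(Default)]
pub struct ChunkedWorld {
    chunks: HashMap<ArchetypeKey, Chunk>,
    /// Entity id -> (archetype, row), used for point lookups.
    locations: HashMap<u32, (ArchetypeKey, usize)>,
}

impl ChunkedWorld {
    /// Finds or allocates the chunk for `values`' archetype, appends the row,
    /// and records the entity's new location.
    fn push_row(&mut self, id: u32, values: BTreeMap<&'static str, f32>) {
        let key: ArchetypeKey = values.keys().copied().collect();
        let chunk = self.chunks.entry(key.clone()).or_default();
        chunk.entities.push(id);
        for (name, value) in values {
            chunk.columns.entry(name).or_default().push(value);
        }
        let row = chunk.entities.len() - 1;
        self.locations.insert(id, (key, row));
    }

    /// Swap-removes the entity's row and returns its component values.
    fn take_row(&mut self, id: u32) -> Option<BTreeMap<&'static str, f32>> {
        let (key, row) = self.locations.remove(&id)?;
        let chunk = self.chunks.get_mut(&key)?;
        chunk.entities.swap_remove(row);
        let values: BTreeMap<&'static str, f32> = chunk
            .columns
            .iter_mut()
            .map(|(name, col)| (*name, col.swap_remove(row)))
            .collect();
        // The entity that was last in the chunk (if any) now sits at `row`.
        if let Some(&moved) = chunk.entities.get(row) {
            self.locations.insert(moved, (key, row));
        }
        Some(values)
    }

    pub fn spawn(&mut self, id: u32, components: &[(&'static str, f32)]) {
        self.push_row(id, components.iter().copied().collect());
    }

    /// Adds a component by moving the entity to the chunk of its new archetype.
    /// Returns false if the entity does not exist.
    pub fn add_component(&mut self, id: u32, name: &'static str, value: f32) -> bool {
        match self.take_row(id) {
            Some(mut values) => {
                values.insert(name, value);
                self.push_row(id, values);
                true
            }
            None => false,
        }
    }

    pub fn get(&self, id: u32, name: &str) -> Option<f32> {
        let (key, row) = self.locations.get(&id)?;
        self.chunks.get(key)?.columns.get(name)?.get(*row).copied()
    }
}
```

In this sketch the work per move scales with the number of components on the entity and does not depend on how many entities the world holds, which is the point being made about O(1) behaviour.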

3 Likes
(OvermindDL1) #30

That’s how it was seeming to me, I’d only seen that in a closed source library in the past, not an open source, so it was a nice surprise. ^.^

It might be fine as it is; benchmarks will tell for sure! :)

2 Likes
#31

@kabergstrom - curious, if I’m building a scene graph renderer thing for wasm (single-threaded), should I use your fork or the official repo?

Or generally, just kinda wondering what the roadmap is - I see your repo is 16 commits ahead… are you planning to diverge completely or just working towards a milestone before making a PR?

(Kae) #32

IMO you should use my fork, as I believe future work will be based off it. I should probably make a PR or discuss the maintainer situation with @Ayfid

3 Likes
#34

Another ECS to consider: shipyard

FYI @kabergstrom - I made a tiny PR on the original repo to allow passing a name in. It’s actually required for using legion in wasm environments since the randomized name thing breaks there. It hasn’t been merged yet, so not sure what the right approach here is… maybe I should revoke the PR and make it against your fork instead?