Scripting: Advanced primitives in the Component Schema Language

(Théo Degioanni) #1

While natural C primitives like integers or char should be exposed to the programmer, it is probably important that we also provide more advanced primitives such as String or Vec, which rely on heap-allocated memory. (Alternatively, we could expose heap allocation mechanisms and references, but this might be too hard for a first implementation.)

Regarding the interface definition, even if the schema is turned into a C header definition, we should be able to abstract it behind opaque types (types whose size is provided but whose internal layout is not) and function definitions. Natural primitives are, after all, opaque types with operators.

The hard part is managing the allocated memory: the engine would naturally be the one to allocate and free it, so we need to make sure the engine can run logic whenever such a situation arises.
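A minimal sketch of what such an engine-owned type could look like from the Rust side. All names here (`AmString`, `am_string_new`, `am_string_push`, `am_string_len`, `am_string_free`) are hypothetical and not part of any existing Amethyst API. The sketch also uses the simpler pointer-handle flavour of opaque types, where not even the size is exposed, rather than the sized variant described above.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical engine-side string. Scripts only ever hold a pointer to it,
/// so its layout never has to appear in the generated C header.
pub struct AmString {
    inner: String,
}

// A real export would also mark each function below as `no_mangle`.

/// Creation: the engine allocates and hands back an opaque handle.
pub extern "C" fn am_string_new() -> *mut AmString {
    Box::into_raw(Box::new(AmString { inner: String::new() }))
}

/// Mutation: appends a NUL-terminated UTF-8 string; any reallocation happens
/// inside the engine. Returns false on a null pointer or invalid UTF-8.
///
/// # Safety
/// `s` must come from `am_string_new` and `text` must be a valid C string.
pub unsafe extern "C" fn am_string_push(s: *mut AmString, text: *const c_char) -> bool {
    if s.is_null() || text.is_null() {
        return false;
    }
    let string = unsafe { &mut *s };
    match unsafe { CStr::from_ptr(text) }.to_str() {
        Ok(valid) => {
            string.inner.push_str(valid);
            true
        }
        Err(_) => false,
    }
}

/// Length in bytes, or 0 for a null handle.
///
/// # Safety
/// `s` must be null or a live handle from `am_string_new`.
pub unsafe extern "C" fn am_string_len(s: *const AmString) -> usize {
    if s.is_null() {
        return 0;
    }
    let string = unsafe { &*s };
    string.inner.len()
}

/// Deletion: the engine reclaims the memory it allocated.
///
/// # Safety
/// `s` must be null or a handle from `am_string_new` that was not freed yet.
pub unsafe extern "C" fn am_string_free(s: *mut AmString) {
    if !s.is_null() {
        drop(unsafe { Box::from_raw(s) });
    }
}
```

Because every allocation and deallocation goes through these functions, the engine gets a chance to run its own logic at each step, which is the property the paragraph above asks for.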

Let’s take the example of a String: when does it need to work with allocation?

  • On creation: we would simply have a method returning a new instance of a String.
  • On component deletion: cleanup is handled by the engine upon deletion request without issues.
  • On mutation: most statically typed languages such as Java or Rust handle String mutation through methods, and languages like Lua let us overload operators so that our custom String type interoperates correctly with native strings. In both cases, we control the internal value and the needed reallocation.
  • On overwrite: the trickiest part. A scripting language cannot simply overwrite the stored data without causing a memory leak; the previous String needs to be freed first. This is a big issue, and how to solve it would probably depend heavily on the language.
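The four situations above can be sketched on the engine side as follows. This is only an illustration under assumptions: `EngineString`, `NameTag`, `assign` and the `LIVE_STRINGS` counter are hypothetical names, and the counter merely stands in for whatever bookkeeping the engine's allocator would really do.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of live engine strings; stands in for the engine's allocator
/// bookkeeping so the sketch can show that nothing leaks.
static LIVE_STRINGS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical engine-owned string.
pub struct EngineString {
    buf: String,
}

impl EngineString {
    /// Creation: a function returning a new instance.
    pub fn new(text: &str) -> Self {
        LIVE_STRINGS.fetch_add(1, Ordering::SeqCst);
        EngineString { buf: text.to_owned() }
    }

    /// Mutation: the engine controls any reallocation.
    pub fn push_str(&mut self, text: &str) {
        self.buf.push_str(text);
    }

    /// Overwrite: copy the new content into the existing buffer instead of
    /// replacing the stored bytes, so the old allocation is never orphaned.
    pub fn assign(&mut self, text: &str) {
        self.buf.clear();
        self.buf.push_str(text);
    }

    /// Read-only view of the content.
    pub fn as_str(&self) -> &str {
        &self.buf
    }
}

/// Deletion: cleanup runs when the owning component is dropped.
impl Drop for EngineString {
    fn drop(&mut self) {
        LIVE_STRINGS.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Hypothetical component holding one string field.
pub struct NameTag {
    pub name: EngineString,
}

/// Current number of live engine strings.
pub fn live_strings() -> usize {
    LIVE_STRINGS.load(Ordering::SeqCst)
}
```

Rust's own assignment already drops the previous value, so the leak described above mainly concerns scripting languages that would write over the stored bytes directly; routing every overwrite through something like `assign` is one way to keep the engine in control.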

For example, in Lua, we can make our custom String behave correctly with native strings. component.a_string .. "test" (concatenation) would return a native string, and component.a_string = a_native_string would trigger the component's assignment event, which could replace the content of the internal string with that of the native string. This obviously requires a copy, but these are cross-language strings we are working with (the user could call in-place mutation methods to avoid it, at the cost of being less idiomatic). At no point can you interact with the string data directly, as it is protected by the metamethod (the “assignment event handler”). This could also work recursively, for example when overwriting a field of a type containing a String, by passing the events down to the internal types.

In Rust, component.a_string would be an AmethystString (name pending) that cannot be instantiated by the user. It would, however, implement AsRef<str> and have methods for mutation similar to std’s String. A similar solution could be achieved in Java and any other safe, statically typed language.
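A rough sketch of that Rust-side type, assuming a private field and a crate-private constructor are enough to keep scripts from instantiating it. The `engine` module, `from_engine`, `set` and `shout` are hypothetical names introduced only for illustration.

```rust
mod engine {
    /// Hypothetical engine-owned string (the name is still pending).
    pub struct AmethystString {
        // Private field: code outside this module cannot build one with a
        // struct literal.
        buf: String,
    }

    impl AmethystString {
        /// Crate-private constructor: only engine code creates instances.
        pub(crate) fn from_engine(text: &str) -> Self {
            AmethystString { buf: text.to_owned() }
        }

        /// In-place append, mirroring `String::push_str`.
        pub fn push_str(&mut self, text: &str) {
            self.buf.push_str(text);
        }

        /// Replaces the content while keeping the same engine-owned value.
        pub fn set(&mut self, text: &str) {
            self.buf.clear();
            self.buf.push_str(text);
        }
    }

    /// Cheap read access: no copy, just a borrowed view of the content.
    impl AsRef<str> for AmethystString {
        fn as_ref(&self) -> &str {
            &self.buf
        }
    }
}

use engine::AmethystString;

/// Any function taking `impl AsRef<str>` accepts the engine string directly.
pub fn shout(text: impl AsRef<str>) -> String {
    text.as_ref().to_uppercase()
}
```

Script code would then read the field through `as_ref()` and change it through `push_str` or `set`, never by constructing a replacement value itself.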

In C#, properties could be used to control assignment in a way similar to Lua.

I have never worked with C++; it would be interesting to know how this would behave there.

The general result would be that mutating strings can be somewhat expensive depending on the context, but reading is fairly cheap. In my opinion this is perfectly fine.

@zicklag, you seem to be familiar with Python. Do you believe its immutable strings could be made to work with this sort of solution?

I believe we could apply a similar method to Vec and maybe other heap-allocated primitives. What do you think?

(Optimistic Peach) #2

One thing to note is that C++ interop is difficult at best, due to the variety of compilers and ABIs out there. This was mentioned and discussed on the Rust users forum:

(Zicklag) #3

I don’t know much about how Python handles strings, but from what I’ve seen while using PyO3, it seems reasonable to assume that we could create our own string-like API and present it to Python in a way that allows easy conversion to and from real Python str objects.

Also, any performance hit from memory copying seems acceptable, because you don't usually need to update hundreds of strings every frame in a game. With vectors it is more likely that you would make lots of frequent mutations, but if you are storing repr(C) types in the vector-like object and using exposed methods like “push” to mutate the vector, I think that would be just as efficient as doing it natively.
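To make the vector case concrete, here is a small sketch under assumptions: `Waypoint`, `WaypointVec` and the `waypoint_vec_*` functions are hypothetical names, not an existing API. The exposed push is a thin wrapper around `Vec::push`, so apart from the call across the C boundary it does the same work as a native push.

```rust
/// Hypothetical element type with a C-compatible layout.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Waypoint {
    pub x: f32,
    pub y: f32,
}

/// Hypothetical engine-owned vector; scripts only hold a pointer to it.
pub struct WaypointVec {
    items: Vec<Waypoint>,
}

// A real export would also mark each function below as `no_mangle`.

/// Allocates an empty vector on the engine side.
pub extern "C" fn waypoint_vec_new() -> *mut WaypointVec {
    Box::into_raw(Box::new(WaypointVec { items: Vec::new() }))
}

/// Appends one element; growth is amortized exactly as for a native `Vec`.
///
/// # Safety
/// `v` must be a live handle from `waypoint_vec_new`.
pub unsafe extern "C" fn waypoint_vec_push(v: *mut WaypointVec, item: Waypoint) {
    let vec = unsafe { &mut *v };
    vec.items.push(item);
}

/// Number of stored elements.
///
/// # Safety
/// `v` must be a live handle from `waypoint_vec_new`.
pub unsafe extern "C" fn waypoint_vec_len(v: *const WaypointVec) -> usize {
    let vec = unsafe { &*v };
    vec.items.len()
}

/// Pointer to the contiguous elements, valid until the next mutation.
///
/// # Safety
/// `v` must be a live handle from `waypoint_vec_new`.
pub unsafe extern "C" fn waypoint_vec_data(v: *const WaypointVec) -> *const Waypoint {
    let vec = unsafe { &*v };
    vec.items.as_ptr()
}

/// Frees the vector and all of its elements.
///
/// # Safety
/// `v` must be a handle from `waypoint_vec_new` that was not freed yet.
pub unsafe extern "C" fn waypoint_vec_free(v: *mut WaypointVec) {
    drop(unsafe { Box::from_raw(v) });
}
```

Since the elements are repr(C), a script can read them in place through the data pointer without any per-element conversion.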

(Théo Degioanni) #4

This is very interesting. If anything, it shows we should probably consider C++ later down the line, and maybe focus on a Clang/LLVM-only backend, for example.

I am sure you are right. This should do well.

If we can overload the index operator ([]), our custom type might be convincing enough to act as a Python list. This should even make it compatible with existing software thanks to Python's dynamic typing.

(Zicklag) #5

Since we are providing a pure C API, anybody who wants to script Amethyst with C++ should be able to do so through the C API, similarly to how you would do it in Rust (that is my under-educated guess, anyway). We probably don’t want to put a lot of effort into proving it out and making it work unless somebody has a need for it, though. We want to focus on Rust and Lua first, as long as we still make the design universal enough to support others later.

(Théo Degioanni) #6

Absolutely. I just meant that quality of life improvements tailored for C++ are a bit early to elaborate on.
