Scripting: What Do We Need to Get There

(Zicklag) #21

If the components are stored as bytes in Specs, then to use that component in a Rust script, for example, you have to read the bytes into a struct that can be used to access the component’s fields. If you used something like Serde and Bincode to deserialize those bytes into a Rust struct, it would make a copy of all the data before you could use it.

I’m trying to figure out how we go from byte data to Rust struct, hopefully without copying data.

PS: I know Bincode wouldn’t work in practice because its byte representation is specific to Rust.

(Théo Degioanni) #22

In Rust’s case, you need to convert your schema into an appropriate (repr(C), probably) Rust type definition, then unsafely cast the byte array into that type. This, or something similar, should work and be zero copy.
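As a rough sketch of that cast (not a definitive implementation), the snippet below borrows a byte slice as a repr(C) struct without copying. `Position`, `view_position`, and `position_bytes` are hypothetical names made up for illustration, and the length and alignment checks are an assumption about what a careful version would need.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical component; `repr(C)` pins the field order and padding.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Position {
    pub x: f32,
    pub y: f32,
}

/// Borrows `bytes` as a `Position` without copying.
/// Returns `None` if the slice has the wrong length or is misaligned.
pub fn view_position(bytes: &[u8]) -> Option<&Position> {
    if bytes.len() != size_of::<Position>() {
        return None;
    }
    if (bytes.as_ptr() as usize) % align_of::<Position>() != 0 {
        return None;
    }
    // SAFETY: length and alignment were checked above, and every bit
    // pattern is a valid `f32`, so any 8 bytes form a valid `Position`.
    Some(unsafe { &*(bytes.as_ptr() as *const Position) })
}

/// Views an existing `Position` as its raw bytes (the opposite direction).
pub fn position_bytes(p: &Position) -> &[u8] {
    // SAFETY: `Position` is two `f32`s with no padding, so all
    // `size_of::<Position>()` bytes are initialized.
    unsafe {
        std::slice::from_raw_parts(p as *const Position as *const u8, size_of::<Position>())
    }
}
```

Calling `view_position(position_bytes(&p))` should hand back a reference to the same data as `p`, while a slice of the wrong length yields `None`.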

For something like LuaJIT, you would create a C header that would describe your struct and pass it to the FFI interface, and use that to invoke your instances. You could use the “augmented FFI” idea from the RFC to make it more idiomatic, of course.

The way you expose the schema would have to be language-specific, I believe. But the byte representation obviously should be common to all languages so it is easy to pass around.

(Zicklag) #23

Ah, you can cast it! I didn’t know you could do that. That makes sense. For the Arsenal glTF exporter I copied some code that exported a repr(C) struct to get a binary representation, but I didn’t really understand it. In this case it would be essentially the opposite of that.

Do you have any minimal code examples of casting to and from a byte array?


I found a way to serialize to bytes: Rust Playground Link.

(Théo Degioanni) #24

The most idiomatic way is probably using std::mem::transmute:

let my_component = unsafe { std::mem::transmute::<[u8; n], MyComponentStruct>(my_component_byte_data) };

with n the size of your component in bytes. As the docs warn, this is extremely unsafe because memory representation has lots of intricate behaviors in Rust. But if you use repr(C), it will work just like what you might be used to in C.
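A minimal by-value sketch of that transmute call, under some assumptions: `Velocity`, `VELOCITY_SIZE`, and the two helper functions are hypothetical names, and the struct is chosen so that it has no padding and accepts every bit pattern. The constant plays the role of n.

```rust
use std::mem::{size_of, transmute};

/// Hypothetical component with a C-compatible layout.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Velocity {
    pub dx: f32,
    pub dy: f32,
}

/// Size of the component in bytes; this is the `n` in the array type.
pub const VELOCITY_SIZE: usize = size_of::<Velocity>();

/// Reinterprets an owned byte array as a `Velocity`.
pub fn velocity_from_bytes(bytes: [u8; VELOCITY_SIZE]) -> Velocity {
    // SAFETY: `transmute` refuses to compile unless both types have the
    // same size, and every bit pattern is a valid pair of `f32`s.
    unsafe { transmute::<[u8; VELOCITY_SIZE], Velocity>(bytes) }
}

/// Reinterprets a `Velocity` as its raw bytes.
pub fn velocity_to_bytes(v: Velocity) -> [u8; VELOCITY_SIZE] {
    // SAFETY: `Velocity` has no padding, so every byte is initialized.
    unsafe { transmute::<Velocity, [u8; VELOCITY_SIZE]>(v) }
}
```

Because the array is passed by value, alignment is not a concern here, but the bytes are moved rather than viewed in place.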

(Théo Degioanni) #25

Do not forget to care about endianness too (if needed)!

(Zicklag) #26

OK I found in the transmute() docs that there is a safer alternative for doing what we need which looks like this:

let value_bytes = unsafe {
    &mut *(&mut my_struct_instance as *mut MyStruct as *mut [u8; std::mem::size_of::<MyStruct>()])
};

Now I’ve got a full example of going from a Rust struct to a &mut byte array and back to a &mut Rust struct that can be modified in place. Perfect. It works fine with generics too. I’m assuming that with repr(C) you should be able to do something similar for the same type in C.
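For anyone following along, here is a hedged sketch of that round trip. `Health`, `as_bytes_mut`, and `from_bytes_mut` are hypothetical names, and the struct is assumed to have no padding so that viewing it as bytes is sound.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical component with a C-compatible layout and no padding.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Health {
    pub current: u32,
    pub max: u32,
}

pub const HEALTH_SIZE: usize = size_of::<Health>();

/// Views the struct as a mutable byte array, in place.
pub fn as_bytes_mut(h: &mut Health) -> &mut [u8; HEALTH_SIZE] {
    // SAFETY: `Health` has no padding, `u8` has alignment 1, and any
    // bytes written through the array still form valid `u32` fields.
    unsafe { &mut *(h as *mut Health as *mut [u8; HEALTH_SIZE]) }
}

/// Views a mutable byte array as the struct again, in place.
/// Returns `None` if the array is not suitably aligned for `Health`.
pub fn from_bytes_mut(bytes: &mut [u8; HEALTH_SIZE]) -> Option<&mut Health> {
    if (bytes.as_ptr() as usize) % align_of::<Health>() != 0 {
        return None;
    }
    // SAFETY: the size matches by construction, alignment was checked
    // above, and every bit pattern is a valid `u32`.
    Some(unsafe { &mut *(bytes.as_mut_ptr() as *mut Health) })
}
```

Writes through either view land in the same memory, so changing the bytes changes the struct and vice versa.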

When would I need to care about endianness? Is that different on different systems? Does repr(C) take care of that?

So now we need to create a schema that we make serializable with Serde so that we can load it from RON. That should be pretty simple.

I was thinking that for scripting in Rust we could create a Rust macro that will generate the component struct from the schema for you with an associated function for creating an instance of the component from a byte array. That would make statically typed Rust scripting easy.
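A very rough sketch of what such a macro might look like, using plain macro_rules! rather than a real schema loader. The `component!` macro, the `Transform` struct, and the `from_bytes` function are all hypothetical; nothing here reads RON, and the constructor is marked unsafe because the macro cannot know whether every field type accepts arbitrary bytes.

```rust
/// Hypothetical macro: declares a `repr(C)` component plus a checked
/// constructor that borrows the component from raw bytes.
macro_rules! component {
    (struct $name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        #[repr(C)]
        #[derive(Debug, Clone, Copy, PartialEq)]
        pub struct $name {
            $(pub $field: $ty),*
        }

        impl $name {
            /// Borrows `bytes` as this component without copying.
            ///
            /// # Safety
            /// `bytes` must hold a valid value for every field type
            /// (for example, a `bool` field must be 0 or 1).
            pub unsafe fn from_bytes(bytes: &[u8]) -> Option<&Self> {
                let size_ok = bytes.len() == ::std::mem::size_of::<Self>();
                let align_ok =
                    (bytes.as_ptr() as usize) % ::std::mem::align_of::<Self>() == 0;
                if size_ok && align_ok {
                    // SAFETY: size and alignment were checked; validity of
                    // the field values is the caller's responsibility.
                    Some(unsafe { &*(bytes.as_ptr() as *const Self) })
                } else {
                    None
                }
            }
        }
    };
}

// Example use of the hypothetical macro.
component! {
    struct Transform {
        x: f32,
        y: f32,
        rotation: f32,
    }
}
```

A script would then call `unsafe { Transform::from_bytes(bytes) }` on the storage slice and get `None` back if the slice has the wrong size or alignment.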

(Théo Degioanni) #27

Well, to be frank, in terms of readability I’d prefer to use transmute myself. Besides, it isn’t really safer; it’s just not using transmute.

I was thinking about the step to generate C headers and stuff like that. But actually my mind wasn’t very clear, it shouldn’t be an issue.

Are you sure that making a schema language would be best? Maybe a simple custom-made syntax similar to the way languages register types would feel nicer to the programmer. Something made with a simple parser. Otherwise yes, a macro or code generation could work.

(Zicklag) #28

Sounds good, in the docs it said:

Don’t despair: many uses of transmute can be achieved through other means. Below are common applications of transmute which can be replaced with safer constructs.

That made me think it was safer somehow. If it isn’t, the transmute call is definitely more readable.

Yeah, I was just messing around with RON and it seems like it would end up pretty verbose, especially when you add generics. Maybe we could actually just use Rust syntax and parse it with Syn.

(Théo Degioanni) #29

Yeah I’m not quite sure what they mean here. Maybe I am wrong, we should probably ask more experienced unsafe Rust programmers on Reddit or community forums. Do you want to do it or should I?

Syn definitely would work, although it might be a bit too powerful and complex (and slow?) for our use in my opinion. Considering how simple our syntax is, we could just write a dumb parser. Here’s the general structure:

struct MyComponent {
    field_name: TypeName<Generic1, Generic2, ...>,
}

It’s not a very complicated state machine. Even clever regex could suffice (but probably wouldn’t be the best fit).

(Zicklag) #30

I can ask on the community forum.

OK, that should work.

(Théo Degioanni) #31

If you need a hand for some of the implementation feel free to contact me.

(Zicklag) #32

OK, thanks for the help!

(Zicklag) #33

It looks like somebody already asked the same thing. The response was:

Yes, these are pretty much equivalent. However, transmute can also change the types of non-pointer types and it can also change lifetimes. You can’t do that with as. Since transmute has such power (and therefore potential for misuse), its use is generally not recommended.

It looks like using as can do less damage when used incorrectly, which is probably a good thing.

(Théo Degioanni) #34

Right, but here we want to transmute actual data, not references. So we do need to use transmute. Unless you want to do it differently?

(Zicklag) #35

Nope, this is the first time that I’ve done most of this, so if you think transmute is good then let’s go with it.

(Zicklag) #36

Do you think that nom would be overkill for a parser, too? I haven’t written a parser from scratch, and the only one that I’ve read through was the Haxe XML parser. If we wanted to do it from scratch, would that be a suitable design to model it after?

(Théo Degioanni) #37

No, maybe nom would be suitable; I’ve just never used it. But the usual method to write a parser is to make a state machine: in our case, we would start in the initial state; when we find the “struct” string, we go to a state where we gather the name of the struct and wait for {, then go to another state where we wait for a field, etc…
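To make that concrete, here is a small sketch of such a state machine. The `Schema` type, the state names, and the decision to keep each type as an unparsed string are assumptions for illustration, not a settled design.

```rust
/// Parsed result: struct name plus (field name, type text) pairs.
#[derive(Debug, PartialEq)]
pub struct Schema {
    pub name: String,
    pub fields: Vec<(String, String)>,
}

#[derive(Clone, Copy)]
enum State { ExpectStruct, ExpectName, ExpectOpenBrace, ExpectField, ExpectColon, InType, Done }

/// Splits the source into words and single punctuation characters.
fn tokenize(src: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    for c in src.chars() {
        if c.is_alphanumeric() || c == '_' {
            word.push(c);
        } else {
            if !word.is_empty() {
                tokens.push(std::mem::take(&mut word));
            }
            if !c.is_whitespace() {
                tokens.push(c.to_string());
            }
        }
    }
    if !word.is_empty() {
        tokens.push(word);
    }
    tokens
}

fn is_ident(t: &str) -> bool {
    t.chars().next().map_or(false, |c| c.is_alphabetic() || c == '_')
}

pub fn parse_schema(src: &str) -> Result<Schema, String> {
    let mut state = State::ExpectStruct;
    let mut schema = Schema { name: String::new(), fields: Vec::new() };
    let (mut field, mut ty) = (String::new(), String::new());
    let mut depth = 0usize; // nesting level inside `<...>`

    for tok in tokenize(src) {
        let t = tok.as_str();
        state = match (state, t) {
            (State::ExpectStruct, "struct") => State::ExpectName,
            (State::ExpectName, _) if is_ident(t) => {
                schema.name = t.to_string();
                State::ExpectOpenBrace
            }
            (State::ExpectOpenBrace, "{") => State::ExpectField,
            (State::ExpectField, "}") => State::Done,
            (State::ExpectField, _) if is_ident(t) => {
                field = t.to_string();
                State::ExpectColon
            }
            (State::ExpectColon, ":") => State::InType,
            // A comma or brace outside `<...>` ends the current field.
            (State::InType, "," | "}") if depth == 0 => {
                if ty.is_empty() {
                    return Err(format!("field `{}` has no type", field));
                }
                schema.fields.push((std::mem::take(&mut field), std::mem::take(&mut ty)));
                if t == "}" { State::Done } else { State::ExpectField }
            }
            (State::InType, _) => {
                match t {
                    "<" => depth += 1,
                    ">" if depth > 0 => depth -= 1,
                    ">" => return Err("unbalanced `>`".to_string()),
                    _ => {}
                }
                ty.push_str(t);
                State::InType
            }
            (_, _) => return Err(format!("unexpected token `{}`", t)),
        };
    }
    if matches!(state, State::Done) {
        Ok(schema)
    } else {
        Err("unexpected end of input".to_string())
    }
}
```

The only subtlety is the depth counter, which keeps commas inside generic arguments from being mistaken for field separators. Whitespace inside a type is dropped, so `Map<String, u32>` is stored as `Map<String,u32>`.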

But honestly you can do whatever you want here, this is just what I would do because I am used to doing it like this.

(Joël Lupien) #38

If you save your data in one endianness, and then load it on a computer with another endianness, you’ll have a lot of issues. Be careful to use a crate to change the endianness to the right one when saving or loading.
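As a small illustration, the sketch below pins the saved format to little-endian using the standard library's `to_le_bytes` and `from_le_bytes` instead of a crate. The `Score` struct and the function names are hypothetical.

```rust
/// Hypothetical component that gets saved to disk or sent over the network.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Score {
    pub player_id: u32,
    pub points: f32,
}

/// Encodes each field as little-endian, regardless of the host CPU.
pub fn encode_le(s: &Score) -> [u8; 8] {
    let mut out = [0u8; 8];
    out[..4].copy_from_slice(&s.player_id.to_le_bytes());
    out[4..].copy_from_slice(&s.points.to_le_bytes());
    out
}

/// Decodes the little-endian layout written by `encode_le`.
pub fn decode_le(bytes: &[u8; 8]) -> Score {
    let mut id = [0u8; 4];
    let mut pts = [0u8; 4];
    id.copy_from_slice(&bytes[..4]);
    pts.copy_from_slice(&bytes[4..]);
    Score {
        player_id: u32::from_le_bytes(id),
        points: f32::from_le_bytes(pts),
    }
}
```

Unlike a raw cast, this gives the same bytes on every machine, at the cost of copying each field. repr(C) alone does not do this: it fixes field order and padding, not byte order.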

(Théo Degioanni) #39

Intermediate component representation should not be stored in the first place in my opinion. It could create desync issues.

(Zicklag) #40

If you want to send the data over the network for multiplayer or you want to save replays that might be a concern, though.