Refactoring audio, seeking input

(Nolan) #1

Hey folks,

I’ve been slowly refactoring the spatial audio systems and am seeking feedback on next steps. Current challenges I’m facing:

  1. I’m not clear on how to loop audio. There are Rodio methods for looping, but those methods aren’t exposed.
  2. I’m not clear on how to set individual sound volumes, or indeed how to control individual sounds.

But, more fundamentally:

  1. I’m unclear on the purpose of the line:

It looks like that adds sources to audio emitters, but I’m unclear on what the 4 is. A limit on how many sources can be added to each emitter? I’m also unclear on the purpose of the bool.

  2. It doesn’t seem possible to access the underlying Rodio source to implement looping and individual sound control.
  3. I don’t know that anyone is even using this API. There are no examples beyond the simple non-spatial sounds in the Pong demos. There is no documentation, either.

I’m thinking of stripping this API down to its bare essentials, then building it alongside a game that actually uses spatial sounds to figure out what is needed. Specific changes I have in mind:

  1. Emitters map to sink instances one-to-one. If an entity makes multiple simultaneous sounds, you’ll need multiple emitters.
  2. Emitters contain methods for setting volume, looping, etc. So if you have a vehicle that emits an engine sound, along with occasional ambient sounds that don’t loop, the vehicle would contain an emitter set to loop, and another emitter with a picker containing the playback logic.
  3. I may get rid of the DjSystem entirely, in favor of an emitter with a picker. If an emitter has a transform, it gets a spatial sink hooked into the listener. If it doesn’t have a transform, it just plays without spatialization. So your music would just be an entity with an emitter, no transform, and a picker on the emitter that selects new music when the first track ends. No need to have two separate routes for audio, just a unified system that adapts to the presence of a transform.
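
To make point 3 concrete, here’s a minimal sketch of the dispatch I have in mind, using made-up stand-in types (Transform, Emitter, and route are all hypothetical, not the real Amethyst components):

```rust
// Hypothetical sketch of the "one audio path" idea: stand-in types, not the
// real Amethyst components. The only logic here is the dispatch on the transform.
struct Transform {
    x: f32,
    y: f32,
    z: f32,
}

struct Emitter {
    sound: &'static str,
}

// Pick a playback route based on whether the entity has a position.
fn route(emitter: &Emitter, transform: Option<&Transform>) -> String {
    match transform {
        // Positioned entity: a spatial sink hooked into the listener.
        Some(t) => format!(
            "spatial sink for '{}' at ({}, {}, {})",
            emitter.sound, t.x, t.y, t.z
        ),
        // No transform (e.g. background music): plain, non-spatialized playback.
        None => format!("plain sink for '{}'", emitter.sound),
    }
}

fn main() {
    let engine = Emitter { sound: "engine" };
    let music = Emitter { sound: "theme" };
    println!("{}", route(&engine, Some(&Transform { x: 1.0, y: 0.0, z: 2.0 })));
    println!("{}", route(&music, None));
}
```

One system would run this check per entity, so music and effects share a single code path.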

Anyhow, it’d be hard to create a full design doc, because we’re not evolving a working spatial audio system into a better one. I’m also not sure what’s needed, and won’t know until I build the game I’m working on now.

All this is my long way of checking in. Is anyone currently using the spatial audio system extensively, and if so, can you show me what magic you’re doing to make it work? I don’t want to completely gut and refactor an API if it’s working fine for folks and I just don’t know how. :slight_smile:

Sorry, not trying to sound harsh with this. Just seems like this API was built without any consumers, and it’s been a bit tough to work with something I’m not even sure works. :slight_smile:

Thanks.


(Jacob Kiesel) #2

Go for it. :+1: I trust your judgement on this and would really appreciate anything you can do. We’ve talked elsewhere about some of your questions so I won’t reiterate that here. If there’s anything you need from me specifically feel free to ping me on discord.


(Nolan) #3

Following up on this:

I started testing a few designs for an audio refactor. Essentially, I found that using the picker-based design to implement looping was very unintuitive and required a bit of extra code, and either it wouldn’t have worked or I wasn’t smart enough to figure it out. I think that looping a sound should be a one-liner, so I set about trying to implement that with a new design before working it into a PR. I also wanted multiple named sound sources per emitter, and eventually I’d like a single audio path without the DjSystem, such that music is just an AudioEmitter without a transform.

But I’m having some struggles with the type system and need help. Essentially, Rodio requires that each Source have an associated Item type describing the kind of samples it yields. I’m trying to store sources in a HashMap<String, Source> so each emitter can have independent, named sounds. I can’t paste a single compiler error because in some cases I’m getting many, but I have a couple of branches with different approaches that I’m trying. The core issue is that a source, when first loaded, is a Buffered source. If you want it to loop, you call the .repeat_infinite() method on Source, which changes it to a Repeat source. Each wrapper defines its own Item associated type, though, so making these generic is proving difficult.
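
To illustrate the problem outside of rodio, here’s a reduced stand-in: a Source trait that extends Iterator, like rodio’s does, and one possible escape hatch (pinning the sample type on a trait object). All names here are made up for the demo:

```rust
use std::collections::HashMap;

// Stand-in for rodio's Source trait: like the real one, it extends Iterator
// and therefore drags Iterator's associated `Item` type along with it.
trait Source: Iterator {}

// A stand-in "buffered" source yielding i16 samples.
struct Buffered {
    samples: Vec<i16>,
    pos: usize,
}

impl Iterator for Buffered {
    type Item = i16;
    fn next(&mut self) -> Option<i16> {
        let s = self.samples.get(self.pos).copied();
        self.pos += 1;
        s
    }
}
impl Source for Buffered {}

// `Box<dyn Source>` alone is rejected with E0191; naming the sample type
// makes the trait object usable, so differently-wrapped sources can share
// one map as long as they agree on the sample format.
type BoxedSource = Box<dyn Source<Item = i16>>;

fn demo() -> i16 {
    let mut sounds: HashMap<String, BoxedSource> = HashMap::new();
    sounds.insert(
        "cockpit".into(),
        Box::new(Buffered { samples: vec![1, 2, 3], pos: 0 }),
    );
    // The boxed trait object still iterates, so we can drain the samples.
    sounds.remove("cockpit").unwrap().sum()
}

fn main() {
    println!("{}", demo()); // 6
}
```

Whether the sample type can reasonably be fixed to one format engine-wide is exactly the design question, of course.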

https://github.com/ndarilek/amethyst/tree/enum-source-type tries to capture both Buffered and Repeat types in a generic SourceHolder enum. This gives the error which initially kicked off this whole mess:

error[E0191]: the value of the associated type `Item` (from the trait `std::iter::Iterator`) must be specified
  --> amethyst_audio/src/components/audio_emitter.rs:15:19
   |
15 |     Repeat(Repeat<RSource>),
   |                   ^^^^^^^ associated type `Item` must be specified

https://github.com/ndarilek/amethyst/tree/refactor-audio tries to make Source a dyn trait, but that gives something like 14 compiler errors, so clearly I’m not doing something right.

Anyhow, I’d appreciate some help with this. If I can create an AudioEmitter.set_loop(name: String, loop: bool) method, then I’ll have found a design that gives adequate access to the underlying source, and I can submit an RFC/PR with a more fleshed-out design. But right now I’m having a hard time just exposing the underlying data structure in a way that gives the control we’ll need for an easier audio subsystem.
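
For reference, here’s the rough shape I’m imagining, sketched with stand-in types (CachedSound, Playing, and the set_loop signature are all hypothetical, not real Amethyst/rodio types): keep the original decoded sound cached per name, and rebuild the playing wrapper whenever the loop flag flips, rather than trying to unwrap it in place.

```rust
use std::collections::HashMap;

// A sketch only: stand-in for a cached, cloneable decoded sound.
#[derive(Clone)]
struct CachedSound(Vec<i16>);

// What an emitter holds per named sound: either a one-shot or a looped play,
// each rebuilt from the cached original rather than unwrapped in place.
enum Playing {
    Once(CachedSound),
    Looped(CachedSound),
}

struct AudioEmitter {
    sounds: HashMap<String, Playing>,
}

impl AudioEmitter {
    fn new() -> Self {
        AudioEmitter { sounds: HashMap::new() }
    }

    fn play(&mut self, name: &str, sound: CachedSound) {
        self.sounds.insert(name.to_string(), Playing::Once(sound));
    }

    // The one-liner I'm after: toggle looping by rebuilding from the cache.
    fn set_loop(&mut self, name: &str, looped: bool) {
        if let Some(p) = self.sounds.get_mut(name) {
            let cached = match p {
                Playing::Once(c) | Playing::Looped(c) => c.clone(),
            };
            *p = if looped { Playing::Looped(cached) } else { Playing::Once(cached) };
        }
    }

    fn is_looping(&self, name: &str) -> bool {
        matches!(self.sounds.get(name), Some(Playing::Looped(_)))
    }
}

fn main() {
    let mut emitter = AudioEmitter::new();
    emitter.play("cockpit", CachedSound(vec![0; 4]));
    emitter.set_loop("cockpit", true);
    println!("{}", emitter.is_looping("cockpit")); // true
}
```

The point of the cache is to sidestep the wrapper-unwrapping question entirely: toggling never needs to know what the current playing type is.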

I’m also wondering if something like Alto/OpenAL would be a better fit for us. At least with Alto, I know how to loop a source without changing its underlying data type. :slight_smile: I seem to recall complaints about OpenAL, though, so I’m not sure if we lose anything significant by abandoning Rodio. Happy to keep it if we can make it work–I just thought that implementing a loop function would be very easy, and it’s turning out to be much more difficult.


(Nolan) #4

Because this is text, I want to set the tone. I’m not trying to hate on the audio system or Rodio, and if the problem is that I’m not smart enough to figure this stuff out, I’m OK with that. :slight_smile:

I feel like I’m fighting Rodio pretty significantly in this rewrite:

  • Changing an aspect of the source (speeding it up, looping it, changing the volume) gives you an entirely new implementation of Source, which seems to just wrap the previous source. So if I decode bytes into a source, I get Decoder. If I then buffer the decoder so I can clone or otherwise manipulate the bytes to use a sound more than once, I get Buffered<Decoder>. If I then call .repeat_infinite() on the source to get a loop, I get Repeat<Buffered<Decoder>>. It isn’t clear to me what happens if I want to stop looping a source. Do I then have Buffered<Repeat<Buffered<Decoder>>>? Do I have to check the outer type in the loop-setting function and unwrap it if I want it to stop looping? What happens if I call functions out of order and the outer type isn’t Repeat? Then, when new effects are added, checking the outer type won’t even be relevant. Note that I’m taking some liberties with the type names above, because I’m not looking them up and am still on my first cup of coffee. :slight_smile:
  • All of this is to say nothing about the fact that nested in each source is an Item associated type that makes creating generics very difficult–at least, for me. See yesterday’s post for two different attempts to achieve this.
  • Our current implementation seems to work with raw buffers of bytes. It seems to implement EndSignal, which I found is apparently a clever workaround for this Rodio issue. Yet it seems to me that, if we keep this design, we’re going to have to implement bunches of audio algorithms ourselves. I.e., looping will presumably involve restarting the buffer when the sound ends. What about pitch-shifting? Speed changes? This is all implemented by Rodio, but we’re going to have to either abandon those implementations or do some bit-flinging back into sources to use them.
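
The type growth in the first bullet is the same pattern std’s iterator adapters follow, which makes it easy to demonstrate without rodio at all:

```rust
// Std's iterator adapters compose the same way rodio's source adapters do:
// each call returns a new wrapper type instead of mutating the old one.
fn first_seven() -> Vec<i32> {
    let base = vec![1, 2, 3].into_iter(); // std::vec::IntoIter<i32>
    let looped = base.cycle();            // Cycle<IntoIter<i32>>
    let capped = looped.take(7);          // Take<Cycle<IntoIter<i32>>>
    // There's no adapter that peels `Cycle` back off; to stop looping you
    // rebuild from the original data, which is why hanging on to the cached
    // Buffered source seems necessary.
    capped.collect()
}

fn main() {
    println!("{:?}", first_seven()); // [1, 2, 3, 1, 2, 3, 1]
}
```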

Unless I’m fundamentally misunderstanding something and making things harder than they actually are, I think I’m going to roll my own audio implementation in Alto. It’s cross-platform to Linux, Wasm, macOS and Windows for sure, and probably to Android and iOS. I understand that an Alto-based audio implementation is unlikely to be accepted into the engine because it fails if more than one version of Alto is used in the same project. I guess I can respect that from the perspective of not wanting to produce binaries that fail, and in the end if that’s the project’s decision then I won’t fight it. But it feels highly unlikely to me that, given a rich audio subsystem, a game developer will pull in their own version of Alto, crash their game, and not immediately recognize the issue and remove the conflicting dependency. I.e., practically speaking, in what scenarios is a game likely to have two versions of Alto in its dependency tree? And if it’s that niche of an app, where it is both a game and some sort of advanced audio processing setup, couldn’t the creator feature-gate the second Alto dependency such that it is only included by folks wishing to compile out the gamification features?
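
For the feature-gating idea, I’m picturing something like this illustrative Cargo.toml fragment (crate names, feature names, and the version are made up):

```toml
# Illustrative Cargo.toml for the hypothetical crate that is both a game and
# an audio-processing tool: the second alto dependency only enters the
# dependency tree when the opt-in feature is enabled.
[features]
default = ["game"]
game = []                  # gamification features, audio via the engine
audio-tools = ["alto"]     # opt-in advanced audio path

[dependencies]
alto = { version = "3", optional = true }
```

Users who want the engine’s audio simply never enable audio-tools, so no conflict ships by default.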

In closing, I’ll leave you with the Saga of the Loop. :slight_smile: I’m building an audio-based Asteroids game, and am trying to loop the cockpit sound using the current picker-based approach. I can’t get anything to work. Here’s my first attempt:

        let cockpit = {
            let storage = world.read_resource::<AssetStorage<Source>>();
            let sounds = world.read_resource::<Sounds>();
            storage.get(&sounds.cockpit).unwrap().clone()
        };
        let mut emitter = AudioEmitter::new();
        emitter.play(&cockpit)?;
        emitter.set_picker(Box::new(|emitter: &mut AudioEmitter| {
            emitter.play(&cockpit);
            true
        }));

That won’t work because I’m using cockpit in the closure and the borrow-checker cries. I then tried reinitializing cockpit inside the picker:

        emitter.set_picker(Box::new(|emitter: &mut AudioEmitter| {
            let cockpit = {
                let storage = world.read_resource::<AssetStorage<Source>>();
                let sounds = world.read_resource::<Sounds>();
                storage.get(&sounds.cockpit).unwrap().clone()
            };
            emitter.play(&cockpit).unwrap();
            true
        }));

This didn’t work either; the compiler wanted me to add a 'static lifetime on World. I suppose I could try that, but who knows what rabbit hole that would lead me down. :slight_smile:

Next I tried prepending move to the closure. This required me to move my attempt to start looping after the code that adds the emitter; at least, it reduced my error count to fewer than the fingers on one hand:

        world
            .create_entity()
            .with(Player)
            .with(start_transform)
            .with(AudioListener::default())
            .with(emitter)
            .build();
        emitter.set_picker(Box::new(move |emitter: &mut AudioEmitter| {
            let cockpit = {
                let storage = world.read_resource::<AssetStorage<Source>>();
                let sounds = world.read_resource::<Sounds>();
                storage.get(&sounds.cockpit).unwrap().clone()
            };
            emitter.play(&cockpit).unwrap();
            true
        }));

This throws five errors: borrow-checker errors, plus errors about Copy not being implemented for various things.

And that’s just trying to loop a sound. I know there’s work ongoing to make Amethyst easier to use overall, implying that it’s somewhat difficult as is. I’d agree with that assessment. But even so, I’m able to at least understand how and why I need to do things the way I do throughout the rest of Amethyst, and none of it feels as incomprehensible as the audio subsystem. And please don’t read that as a slam on anyone who built it, but I do think an audio subsystem should emerge organically and through use, much like the rest of Amethyst seems to be. And as someone trying to rewrite it with the joint constraints of keeping Rodio and sticking close to the current API, I’m genuinely not sure how to stay true to the former. Alto is the only crate I know of that does spatial, cross-platform audio. If anyone knows of another, I’m open to evaluating it.

If you’ve read this far, thanks. :slight_smile:


(Thomas Schaller) #5

Just one quick thought: have you considered cpal? It's the low-level library underneath rodio, so if rodio is too inflexible, that might be an option.

However, it probably won't help with spatial audio. I do see that advantage of rodio, but C dependencies come with many drawbacks:

  • more work for environment setup
  • versioning issues, as you wrote
  • testing complications (only one test may run in parallel)
  • deployment becomes harder

These are my thoughts on that, but I'm not involved in audio in any way, so it may well be reasonable to choose alto.


(Thomas Schaller) #6

At a first glance, it seems like it should be

Repeat(Repeat<Item = RSource>),