I am creating an adaptive music system for a game I am working on and everything works now, except for precise synchronisation of the tracks. I have four layers in my music (drums, drone, strings, brass) that together form the soundtrack. Each layer can fade between three different versions to create adaptive music. Now the problem is that I do not know how to make sure these four (or eight if all are transitioning at the same time) layers are always exactly synchronized.
I am currently using a float timePosition, and when I start a new track I set its position to this timePosition. However, this gives very bad synchronisation (usually 100ms off, sometimes even 10000ms off), so I guess this is not the right thing to do. I load the sound paused and only unpause it after setting its time using setPosition((unsigned int)(1000 * timePosition), FMOD_TIMEUNIT_MS).
So the question is: what is the right approach to synchronize four to eight streaming music tracks?
- Oogst asked 10 years ago
I had a similar situation a couple of years ago before I started using FMOD. My solution was to stream all the tracks all the time. Instead of stopping a stream after it fades out, set the volume to zero and let it continue so that it stays synchronized if you have to fade it back in later.
One of the reasons that I liked this approach is that it forced me to deal with the worst case scenario all the time, even when other resources were being loaded at the same time. This gave me a great deal of confidence in the file handling code. (This was a PS2 game, and you have to do everything yourself on the PS2.)
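The keep-everything-playing idea above can be illustrated with a small model. This is only a sketch and not FMOD code: the Layer struct and the mixBlock function are made up for illustration, and the audio is assumed to be already decoded to mono floats of equal length. The point it shows is that every layer is read at the same shared cursor, so a layer at zero gain cannot drift.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One music layer, fully decoded to mono floats (an assumption for brevity).
struct Layer {
    std::vector<float> samples;
    float gain = 0.0f;  // 0 = silent, but still advancing with the others
};

// Mix `count` samples starting at the shared `cursor`, looping at the end.
// Every layer is read at the same index, so a silent layer stays in step.
std::vector<float> mixBlock(const std::vector<Layer>& layers,
                            std::size_t cursor, std::size_t count) {
    assert(!layers.empty());
    const std::size_t length = layers.front().samples.size();
    assert(length > 0);
    std::vector<float> out(count, 0.0f);
    for (const Layer& layer : layers) {
        assert(layer.samples.size() == length);  // layers must be equal length
        for (std::size_t i = 0; i < count; ++i) {
            out[i] += layer.gain * layer.samples[(cursor + i) % length];
        }
    }
    return out;
}
```

Fading a layer in or out then only means changing its gain over time. With FMOD the equivalent would presumably be Channel::setVolume on channels that were all started together.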
If you don’t want to keep all the streams active, you might be able to use sync points. However, because the callback comes in on a different thread, you might not get sample-accurate synchronization that way, though you could get it within one frame.
Something else you could try is calling setPosition twice. The first time you call it, set the position about a quarter of a second ahead of the current position of the active streams. This gives FMOD time to load the first chunk of the file. When the other streams get close to where the queued stream is, call setPosition again to make the synchronization sample-accurate, and unpause the channel. As long as the second call to setPosition puts the play cursor inside the loaded chunk, it shouldn’t reload anything. But I could be wrong about that; the FMOD gods know better than I do.
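The bookkeeping for that two-step idea might look roughly like the sketch below. It contains no FMOD calls and has not been checked against FMOD's streaming behaviour. The helper names prerollTarget and shouldStart are hypothetical, positions are counted in PCM samples (what FMOD_TIMEUNIT_PCM expresses), and the tracks are assumed to loop at a common length.

```cpp
#include <cassert>
#include <cstdint>

// Position, in PCM samples, to seek the paused stream to in the first call.
// It sits `leadSeconds` ahead of the playing streams and wraps at the loop end.
std::uint64_t prerollTarget(std::uint64_t masterPos, std::uint32_t sampleRate,
                            double leadSeconds, std::uint64_t lengthSamples) {
    assert(sampleRate > 0 && lengthSamples > 0 && leadSeconds >= 0.0);
    const auto lead = static_cast<std::uint64_t>(leadSeconds * sampleRate);
    return (masterPos + lead) % lengthSamples;
}

// True once the playing streams are within `windowSamples` before the target,
// i.e. the moment to issue the second, exact setPosition and unpause.
bool shouldStart(std::uint64_t masterPos, std::uint64_t target,
                 std::uint64_t windowSamples, std::uint64_t lengthSamples) {
    assert(lengthSamples > 0);
    // Distance from the master forward to the target, allowing for loop wrap.
    const std::uint64_t ahead =
        (target + lengthSamples - masterPos % lengthSamples) % lengthSamples;
    return ahead <= windowSamples;
}
```

With a quarter-second lead at an assumed 44100 Hz, the first seek lands 11025 samples ahead of the playing streams. How wide to make the start window is a tuning choice that this sketch does not settle.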
- jcole answered 10 years ago
I would be a bit disappointed if just playing all twelve tracks at the same time, instead of only the four to eight relevant tracks, were really the best solution, because that would scale very badly to larger numbers and is already a waste of resources. This does raise the question, though, of how much this costs in framerate. I have heard audio is quite cheap, but is that true? And does it matter much whether I use WAV (big, so bad) for this or OGG (small, so good)?
The setPosition solutions sound a lot better, although framerate precise is not really good enough. At 25fps, that already means each track can be 40ms off, which is way too much for good music (I would kill my drummer if he did that during a gig).
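The arithmetic behind that 40ms figure, as a small sketch. The function names are made up for illustration, and the 44100 Hz sample rate in the note below is an assumption, since the thread does not state one.

```cpp
#include <cassert>

// Worst-case timing error, in milliseconds, when a track can only be
// started on a frame boundary.
double maxOffsetMs(double framesPerSecond) {
    assert(framesPerSecond > 0.0);
    return 1000.0 / framesPerSecond;
}

// The same error expressed in PCM samples at a given sample rate.
double maxOffsetSamples(double framesPerSecond, double sampleRate) {
    assert(framesPerSecond > 0.0 && sampleRate > 0.0);
    return sampleRate / framesPerSecond;
}
```

At 25fps this gives 40ms, which at 44100 Hz is 1764 samples, so frame-level timing is a long way from sample-accurate.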
I suspect that what you’re trying to do is very doable, but I don’t know what the right set of API calls is. Hopefully one of the FMOD dudes can shed some light on that side of it.
As to the cost, the five resources you have to play around with are programmer time, CPU, memory, disk space, and disk IO (both bandwidth and amount of seeking). Going with a compressed format (like OGG) decreases disk space and disk bandwidth at the cost of CPU. Streaming spends disk performance but saves memory. jcole’s suggestion saves programmer time but spends a lot of disk IO, etc. Whether any of this is "cheap" or not depends entirely on how the rest of your game makes demands of these shared resources.
Some other things you might consider:
* Interleave all 12 channels into a single file, stream just that one file, and then turn individual channels on and off. As you mention, it’s an approach that doesn’t scale well as you add more variation. If it sounds interesting, though, do a search for "interleaved" in the forums.
* Load the relevant tracks into memory. If the tracks are short or you have memory to spare, then it’s a pretty easy way to go.
* Transition between pieces of music rather than between tracks, which is what most games do. You can still use interleaved audio to control which tracks the player hears, but it’s probably easier to implement and is quite efficient.
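To make the interleaving idea a bit more concrete, here is a minimal sketch of mixing an interleaved buffer with per-channel gains. It does not use FMOD. The function name mixInterleaved is made up for illustration, and the layout (one sample per channel per frame, mixed down to mono) is an assumption. Because all channels come out of the same frame, they cannot drift apart.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mix an interleaved buffer (frame 0: ch0, ch1, ...; frame 1: ch0, ...)
// down to mono, scaling each channel by its gain. A gain of 0 mutes it.
std::vector<float> mixInterleaved(const std::vector<float>& interleaved,
                                  const std::vector<float>& gains) {
    const std::size_t channels = gains.size();
    assert(channels > 0 && interleaved.size() % channels == 0);
    const std::size_t frames = interleaved.size() / channels;
    std::vector<float> out(frames, 0.0f);
    for (std::size_t f = 0; f < frames; ++f) {
        for (std::size_t c = 0; c < channels; ++c) {
            out[f] += gains[c] * interleaved[f * channels + c];
        }
    }
    return out;
}
```

Turning a track "off" is then just a gain of 0 for its channel. The cost is that every channel is still read and decoded all the time.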
- audiodev answered 10 years ago
I specifically do not want to do the transitioning between pieces of music (I have heard that called "wagons in a train"), because in this project I am deliberately trying something different from what most games are doing.
The length of each track is two minutes (short because this is a prototype; for a real product my method would call for tracks of about 10 minutes). So that means 12*2 = 24 minutes of audio in total.
As for costs: what I meant is, how many tracks can a CPU decode? If I play twelve tracks at the same time on an average modern-day PC (I know, that is vague, but specific enough for a rough estimate), what percentage of my CPU would that use?
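For a rough sense of scale next to the earlier suggestion of loading tracks into memory, the uncompressed size can be estimated as below. The format is an assumption (44100 Hz, stereo, 16-bit), since the thread does not state it, and pcmBytes is a made-up helper.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of uncompressed PCM needed to hold `seconds` of audio.
std::uint64_t pcmBytes(std::uint64_t seconds, std::uint32_t sampleRate,
                       std::uint32_t channels, std::uint32_t bytesPerSample) {
    assert(sampleRate > 0 && channels > 0 && bytesPerSample > 0);
    return seconds * sampleRate * channels * bytesPerSample;
}
```

Under those assumptions, pcmBytes(12 * 2 * 60, 44100, 2, 2) comes to 254,016,000 bytes, roughly 254 MB for all twelve tracks, or about 21 MB per two-minute track.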
I will look for the interleaving idea tomorrow when I have time for it. It sounds interesting for my project, although I do not know exactly what it is right now.
I have implemented this now with just playing all twelve tracks at the same time and it works. Thanks for the help with that idea! 8)
I do wish to know, though, how I could have done this neatly, i.e. without playing all layers at all times. Does anyone know?