Hi everyone,

Following my previous thread about gapless playback and MP3 files, I managed to create a new playback engine based on SetDelay (inspired by the SoundManager example on the FMOD Wiki). I also created a binary reader which extracts the Xing header information from MP3 files (encoder padding/delay), since this information is vital for gapless playback.

It works perfectly for FLAC and OGG files, just like when I was using sentences/subsounds. However, I still get a very slight delay between MP3 files.

Please correct me if I’m wrong:

The encoder delay is at the start of the file. It’s usually 576 samples. Thus, to skip this frame, I need to setPosition each channel to 576 (using FMOD_PCM). Did I get that right? I implemented it and it seems to work… unless FMOD already does that?

The encoder padding is at the end of the file. It fills the remaining samples with zeroes. The value usually ranges from 500 to 2000 samples. To skip this padding, I need to subtract the encoder padding from the setDelay value of the second channel (DSPCLOCK_START).
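For readers building a similar reader: a minimal sketch of how the two values can be unpacked, assuming the file carries the LAME extension of the Xing/Info header. There the encoder delay and padding are stored as two 12-bit fields packed into three bytes. The class and method names are made up for illustration, and finding the three bytes inside the header is left out (the field is commonly documented at byte offset 21 from the start of the encoder version string, but check that against the tag documentation).

```csharp
using System;

public static class LameGaplessInfo
{
    // Unpacks the 24-bit "encoder delays" field of the LAME tag.
    // Upper 12 bits: encoder delay (samples added at the start of the file).
    // Lower 12 bits: encoder padding (samples added at the end of the file).
    public static (int Delay, int Padding) Unpack(byte b0, byte b1, byte b2)
    {
        int delay = (b0 << 4) | (b1 >> 4);
        int padding = ((b1 & 0x0F) << 8) | b2;
        return (delay, padding);
    }
}
```

For example, the bytes 0x24 0x05 0x5C unpack to a delay of 576 samples and a padding of 1372 samples.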

So the setDelay value of the first channel uses this formula:
[Existing delay value (GetDelay)] + [Minimum DSPCLOCK_START delay] + [Encoder delay of sound 1]

Second channel:
[Existing delay value (GetDelay)] + [Minimum DSPCLOCK_START delay] + [Encoder delay of sound 1] + [Sound 1 Length (PCM)] – [Encoder padding of sound 1]
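A rough sketch of this arithmetic, not a confirmed answer to the question: it assumes that the reported PCM length includes both the encoder delay and the padding, that the delay is skipped with setPosition, and that all counts are already expressed at the mixer's sample rate. Under those assumptions each track is heard for length minus delay minus padding samples (the code posted later in this thread subtracts both values as well). `GaplessSchedule`, `PlayableSamples` and `StartClocks` are hypothetical names; a plain 64-bit integer stands in for the hi/lo pair.

```csharp
using System;

public static class GaplessSchedule
{
    // Samples actually heard from one track once delay and padding are skipped.
    public static ulong PlayableSamples(ulong lengthPcm, uint encoderDelay, uint encoderPadding)
    {
        ulong trimmed = (ulong)encoderDelay + encoderPadding;
        return trimmed >= lengthPcm ? 0 : lengthPcm - trimmed;
    }

    // Absolute mixer clock at which each track should start.
    // firstStart: clock for the first track (existing delay + minimum delay).
    // playable[i]: audible samples of track i, at the mixer rate.
    public static ulong[] StartClocks(ulong firstStart, ulong[] playable)
    {
        var starts = new ulong[playable.Length];
        ulong clock = firstStart;
        for (int i = 0; i < playable.Length; i++)
        {
            starts[i] = clock;     // this track starts where the previous one ends
            clock += playable[i];  // advance by what is actually heard
        }
        return starts;
    }
}
```

With a first start clock of 2048 and a first track of 1,000,000 samples (delay 576, padding 1372), the second track would be scheduled at 2048 + 998,052 = 1,000,100.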

Can someone tell me if I’m doing the right calculations? I’m going to post the source code tonight when I get home.

Thanks to anyone who can help me out!

Hi peter,

I have good news and bad news. The good news is that decoding multiple streams into one master stream was definitely the way to go. I created an awesome prototype which finally puts MP3 gapless playback to rest.

The bad news is… the other audio library makes it much easier to do this than FMOD. In the end, I think FMOD is more powerful for game developers but not the best audio library for what I’m trying to do. That library has its own weak points but in the end it’s a better fit for me. So I think I’m about to drop FMOD from my application. :( It’s sad because I have been using FMOD for the last three years!

Anyway, I would like to thank you guys for your great support, I really had a lot of fun working with FMOD. It’s also a great audio library, but maybe more for game developers in the end. 😉

Hi GunnSgtHartman,

You’re welcome, we’re glad to be of assistance where possible. It’s unfortunate that FMOD wasn’t a good fit for your latest project. Best of luck in your future endeavours.

Hi Gunn,

I have found that the best way to validate sample-accurate seamless stitching is to render the audio stream out to a file, then analyze it in an audio program such as Audacity. That allows you to zoom in to the exact samples and validate the output. Sorry, I don’t know the exact values for the encoder delay, but if it is sounding good then you’re probably getting close. :)

[quote="peter":36yq5l7r]Hi Gunn,

I have found that the best way to validate sample-accurate seamless stitching is to render the audio stream out to a file, then analyze it in an audio program such as Audacity. That allows you to zoom in to the exact samples and validate the output. Sorry, I don’t know the exact values for the encoder delay, but if it is sounding good then you’re probably getting close. :)[/quote:36yq5l7r]

Hi Peter,

Thanks for replying to my post. That’s a great idea, now I understand why it was part of the SoundManager example on the FMOD wiki.

It’s weird because it seems to work half of the time for the same MP3 files… if I don’t see a silence between both songs in Sound Forge maybe that means I don’t have to cut that encoder delay (i.e. FMOD starts playback after this delay), but that would not explain why it’s intermittent. I’ll post the solution when I find it.

Anyway, here is the piece of code I wanted to post:

[code:36yq5l7r]
// Create the Sound objects for the first two files
m_audioFiles[0].Sound = m_soundSystem.CreateSound(filePaths[0], true);
m_audioFiles[1].Sound = m_soundSystem.CreateSound(filePaths[1], true);

// Lock the DSP mixer (make sure the delay values stay in sync)
m_soundSystem.LockDSP();

// Start playing each sound file
for (int a = 0; a < m_audioFiles.Length; a++)
{
    // Create channel
    m_audioFiles[a].Channel = new Channel(m_soundSystem);
    m_audioFiles[a].Channel.SoundEnd += new Channel.SoundEndHandler(Channel_SoundEnd);

    // Start playback (in paused mode)
    m_soundSystem.PlaySound(m_audioFiles[a].Sound, true, m_audioFiles[a].Channel);

    // Set initial volume
    m_audioFiles[a].Channel.Volume = m_volume;
}

// ----------------------------------------------------------------
// Set channel 1 delay (minimum delay)

// Get DSP clock start delay
Fmod64BitWord wordDelay = m_audioFiles[0].Channel.GetDelay(FMOD.DELAYTYPE.DSPCLOCK_START);

// Add the minimum delay to this value
AudioTools.FMOD_64BIT_ADD(ref wordDelay.hi, ref wordDelay.lo, 0, m_minimumDelay);

// Set the new DSP clock start delay value
m_audioFiles[0].Channel.SetDelay(FMOD.DELAYTYPE.DSPCLOCK_START, wordDelay.hi, wordDelay.lo);

// ----------------------------------------------------------------
// Set channel 2 delay (minimum delay + sound 1 length)

// Get DSP clock start delay
wordDelay = m_audioFiles[1].Channel.GetDelay(FMOD.DELAYTYPE.DSPCLOCK_START);

// Calculate the length in PCM (converted from the file frequency to the mixer sample rate)
uint length_pcm = (uint)((m_audioFiles[0].Sound.LengthPCM * m_outputFormatMixer.sampleRate / m_audioFiles[0].Channel.Frequency) + 0.5f);

// Check if the file is an MP3 file with a Xing/Info header and encoder delay/padding information
if (m_audioFiles[0].FileType == AudioFileType.MP3 && m_audioFiles[0].XingInfoHeader != null && m_audioFiles[0].XingInfoHeader.EncoderDelay != null && m_audioFiles[0].XingInfoHeader.EncoderPadding != null)
{
    // Minimum delay + First audio file PCM length - Encoder delay - Encoder padding
    AudioTools.FMOD_64BIT_ADD(ref wordDelay.hi, ref wordDelay.lo, 0, m_minimumDelay + length_pcm - (uint)m_audioFiles[0].XingInfoHeader.EncoderDelay.Value - (uint)m_audioFiles[0].XingInfoHeader.EncoderPadding.Value);
}
else
{
    // Minimum delay + First audio file PCM length
    AudioTools.FMOD_64BIT_ADD(ref wordDelay.hi, ref wordDelay.lo, 0, m_minimumDelay + length_pcm);
}

// Set the new DSP clock start delay value
m_audioFiles[1].Channel.SetDelay(FMOD.DELAYTYPE.DSPCLOCK_START, wordDelay.hi, wordDelay.lo);

// Unlock the DSP mixer (the delays have been set)
m_soundSystem.UnlockDSP();

// Skip the encoder delay of each of the two files.
// Each file is checked on its own, since only one of them may be an MP3 with a Xing/Info header.
for (int a = 0; a < 2; a++)
{
    if (m_audioFiles[a].FileType == AudioFileType.MP3 && m_audioFiles[a].XingInfoHeader != null && m_audioFiles[a].XingInfoHeader.EncoderDelay != null)
    {
        // Set position (skip encoder delay)
        m_audioFiles[a].Channel.SetPosition((uint)m_audioFiles[a].XingInfoHeader.EncoderDelay.Value, FMOD.TIMEUNIT.PCM);
    }
}

// Unpause all channels
for (int a = 0; a < m_audioFiles.Length; a++)
{
    // Check if the channel exists
    if (m_audioFiles[a].Channel != null)
    {
        // Start playback
        m_audioFiles[a].Channel.Pause(false);
    }
}
[/code:36yq5l7r]

CreateSound uses the following parameters: FMOD.MODE.SOFTWARE | FMOD.MODE.CREATESTREAM | FMOD.MODE.ACCURATETIME

m_minimumDelay is calculated this way:

[code:36yq5l7r]
// Get DSP buffer size
DSPBufferSize dspBufferSize = m_soundSystem.GetDSPBufferSize();

// Set minimum delay
m_minimumDelay = dspBufferSize.bufferLength * 2;

[/code:36yq5l7r]

Does anyone else have an idea? Thanks!

The intermittent nature sounds like it might be a threading issue. Keep in mind that the mixer could run at any point in time, even between function calls or during a function call.

so if you do something like this:
getDSPClock
setDelay

the mixer could have performed a mix between those two operations causing your delay to be out by one whole mix block. You can use System::lockDSP to lock the mixer.

lockDSP
getDSPClock
setDelay
unlockDSP

Use this with caution: do not lock the mixer too frequently or for too long.
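One way to keep the locked section short and make sure the unlock always happens is a small wrapper like the one below. This is only a sketch: `MixerLock.Run` is a hypothetical helper, not part of FMOD, and it takes the lock and unlock calls as delegates so it does not assume anything about the wrapper's types.

```csharp
using System;

public static class MixerLock
{
    // Runs `work` between lockDsp() and unlockDsp().
    // The unlock runs even if `work` throws, so the mixer is never left locked by accident.
    public static void Run(Action lockDsp, Action unlockDsp, Action work)
    {
        lockDsp();
        try
        {
            work();
        }
        finally
        {
            unlockDsp();
        }
    }
}
```

Only the clock read and the setDelay call should go inside `work`; anything slow, such as file loading, belongs outside.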

[quote="peter":29lwm27a]The intermittent nature sounds like it might be a threading issue. Keep in mind that the mixer could run at any point in time, even between function calls or during a function call.

so if you do something like this:
getDSPClock
setDelay

the mixer could have performed a mix between those two operations causing your delay to be out by one whole mix block. You can use System::lockDSP to lock the mixer.

lockDSP
getDSPClock
setDelay
unlockDSP

Use this with caution: do not lock the mixer too frequently or for too long.[/quote:29lwm27a]

Hi Peter,

Once again, thanks for your help. I actually added LockDSP/UnlockDSP yesterday but didn’t have the chance to test it out. The code I posted tonight contains those two methods.

I did a new, more rigorous round of audio tests tonight, using 5 sets of 2 songs that were perfectly synchronized in WAV form and converted into FLAC, OGG and MP3 files.

The great news is that FLAC, OGG and WAV playback is now perfectly synced 100% of the time with LockDSP/UnlockDSP.

I also tested the MP3 files, this time listening to them in my player, Foobar, Winamp and iTunes. To my surprise, some of the files work perfectly in every player and others have a very slight delay, even though they have Xing headers. I reproduced the same delay in Winamp and Foobar. It’s slightly less apparent than in FMOD, but it’s still there.

I used the wave output like you suggested to take a look at the problem, and I detected a slight delay between both files (WaveLab says it’s about 3 ms). I didn’t manage to find that silence in the wave output from Winamp, but the click was clearly audible.

So I guess the encoder padding value isn’t always exact. I’m not sure what the other players are doing to make it less apparent (removing additional silence?), but I guess this is acceptable considering people should know MP3 is a "lesser" sound format and other players have the same behavior.

Just a last question if you have time, what would be the best practice to load the next set of sound/channel objects using SetDelay and without stopping playback? Should I LockDSP/UnlockDSP in the Channel sound end callback and load the next file, compensating the current position in the SetDelay value? As long as it’s done quickly, it shouldn’t affect playback?

Thanks again for your great help!

Glad to hear it is coming along.

[quote:3j0hmxzm]Should I LockDSP/UnlockDSP in the Channel sound end callback[/quote:3j0hmxzm]
I would suggest being careful with what you do inside this callback: the channel and sound object are ‘in use’ and cannot be cleaned up in here. If you are just using the callback for loading more data that should be fine, but I feel that you may not have enough time to load the next sound if you do it inside the callback. I would try to initiate the loading of the next sound as early as possible and definitely avoid loading any sounds inside a DSP lock. Loading can be quite a slow operation, so it’s common to use async loading to load the file in the background; you have to know in advance which file comes next and begin loading early to do this.

[quote="peter":2tm6jfb4]Glad to hear it is coming along.

[quote:2tm6jfb4]Should I LockDSP/UnlockDSP in the Channel sound end callback[/quote:2tm6jfb4]
I would suggest being careful with what you do inside this callback: the channel and sound object are ‘in use’ and cannot be cleaned up in here. If you are just using the callback for loading more data that should be fine, but I feel that you may not have enough time to load the next sound if you do it inside the callback. I would try to initiate the loading of the next sound as early as possible and definitely avoid loading any sounds inside a DSP lock. Loading can be quite a slow operation, so it’s common to use async loading to load the file in the background; you have to know in advance which file comes next and begin loading early to do this.[/quote:2tm6jfb4]

Hi Peter, I have great news today!

I tried putting the code inside the channel callback just to see what it would do and sometimes the GetDelay would be ignored and the next song would start immediately. Instead I created a timer for loading the next song a few seconds after starting the current song, just like you suggested. I was struggling at first, because I forgot to add the existing delay of the channel, so the song was always starting too soon. Then it was missing just a few milliseconds to sync it properly like the first two songs.

That’s when I realized that I was converting the length PCM value from 44100 Hz to 48000 Hz (i.e. the audio file is 44100 Hz but the output mixer frequency is 48000 Hz) but I wasn’t doing the same with the position value. When I added this conversion, it started working perfectly.

Then I realized that I forgot to apply that same conversion to the MP3 encoder padding/delay I mentioned earlier. I was really glad to hear that it removed the remaining silence; my player is now syncing MP3 files exactly like Foobar and Winamp… I’d even say it sounds a little bit better! 😆
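For anyone hitting the same problem, the conversion can be sketched as one helper that is applied to every sample count that ends up in a setDelay value: length, position, encoder delay and encoder padding. `SampleRateConversion.ToMixerSamples` is a hypothetical name. The sketch uses integer arithmetic on purpose, because a single-precision float cannot represent every integer above 2^24 (about 16.7 million), which is only a few minutes of audio at 44100 Hz.

```csharp
using System;

public static class SampleRateConversion
{
    // Converts a sample count from the file's rate to the mixer's rate,
    // rounding to the nearest whole sample.
    public static ulong ToMixerSamples(ulong samples, uint sourceRate, uint mixerRate)
    {
        if (sourceRate == 0)
            throw new ArgumentOutOfRangeException(nameof(sourceRate));

        // Adding half the divisor before dividing gives round-to-nearest.
        return (samples * mixerRate + sourceRate / 2) / sourceRate;
    }
}
```

With a 44100 Hz file and a 48000 Hz mixer, an encoder delay of 576 samples becomes 627 mixer samples and a padding of 1372 becomes 1493, so leaving them unconverted shifts the next track by roughly 170 mixer samples (about 3.6 ms).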

Anyway, thanks for your awesome help Peter, it put me in the right direction, I really appreciate it. Have a great day!

You’re most welcome, I’m glad to hear it’s all working for you. :)

[quote="peter":15oi99nd]You’re most welcome, I’m glad to hear it’s all working for you. :)[/quote:15oi99nd]

Actually, I stopped working on this prototype for a while, and when I came back to it, I realized that the sync wasn’t perfect when syncing a new song during playback. It’s perfect 90% of the time, regardless of the hardware (I tested this on a Pentium 4 and on a Core i7). By that, I mean that I get a very slight delay between tracks, just a few milliseconds (I checked using WAVWRITER). By the way, I only used FLAC and WAV files for these tests, so my MP3 Xing implementation isn’t to blame.

The channels keep in sync perfectly if I set the delays in advance for multiple channels BEFORE starting playback. However, when I try to set the delay of the next channel during playback using a timer and lock/unlock, this works 90% of the time. Here is the piece of code I’m using:

[code:15oi99nd]
// Lock the DSP mixer (make sure the delay values stay in sync)
m_soundSystem.LockDSP();

// Start playback (in paused mode)
m_soundSystem.PlaySound(m_audioFiles[m_currentAudioFileIndex + 1].Sound, true, m_audioFiles[m_currentAudioFileIndex + 1].Channel);

// Get existing delay
Fmod64BitWord wordDelay = m_audioFiles[m_currentAudioFileIndex + 1].Channel.GetDelay(FMOD.DELAYTYPE.DSPCLOCK_START);

// Calculate position (with frequency conversion)
uint position = (uint)((m_audioFiles[m_currentAudioFileIndex].Channel.PositionPCM * m_outputFormatMixer.sampleRate / frequency) + 0.5f);

// Subtract the current position from the total length (samples left in the current sound)
AudioTools.FMOD_64BIT_ADD(ref wordDelay.hi, ref wordDelay.lo, 0, length_pcm - position);

// Set the new DSP clock start delay value
m_audioFiles[m_currentAudioFileIndex + 1].Channel.SetDelay(FMOD.DELAYTYPE.DSPCLOCK_START, wordDelay.hi, wordDelay.lo);

// Start playback
m_audioFiles[m_currentAudioFileIndex + 1].Channel.Pause(false);

// Unlock the DSP mixer (the delays have been set)
m_soundSystem.UnlockDSP();

[/code:15oi99nd]

Just a little explanation on how this works: the first two channels are synced before playback (this works 100% of the time). When the sound of the first channel ends, the timer is started (its interval is 2 seconds and it runs just once). This starts the playback and syncs the next channel. I need to get the position of the second channel to make sure the next channel sound starts exactly when the second sound stops. I really tried to reduce the code as much as possible between lock/unlock.

So I’m wondering if the second channel position value is to blame (sometimes slightly inaccurate for whatever reason), or if I just take too much time between lock and unlock. Do you have any idea?

I’m also REALLY curious to know how subsounds are implemented in FMOD and if I could reproduce this approach in my audio player, adding the support for MP3 padding/delay. I used subsounds for gapless playback for FLAC, WAV and OGG for my current player version and the sync is perfect 100% of the time. It seems that subsounds are also using one channel, which seems to be more optimal than using multiple channels with SetDelay. Does it work by feeding a stream with audio content manually? Can FMOD be used to decode audio file data from different formats and feed it manually through streams, or do I have to do the decoding myself?

Any help would be really appreciated. Thanks!

Hi,

I have made another gapless playback prototype using another sound library and I realized that the channel synchronization solution wasn’t the best one. This prototype decodes the streams manually (using their decoder) and feeds the data into a single channel, which is a LOT more efficient since it’s using only one channel. I’m a bit surprised because this prototype took me only a few hours to build since there were a few examples around.

I guess that’s how FMOD subsounds are implemented too. I’m going to look into how to implement this solution using FMOD, since I have been using this library for a few years and I don’t want to switch audio libraries unless it gets too complicated.

I really wish I had thought of this solution before… this means starting over prototypes, but I’ll finally end up with something rock-solid, I hope 😀
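The single-stream idea can be illustrated with a small library-independent sketch. It assumes each track has already been decoded to interleaved 16-bit PCM and that its encoder delay and padding are known in frames; `DecodedTrack` and `MasterStream` are hypothetical names, and a real player would decode and feed the data incrementally instead of holding whole tracks in memory.

```csharp
using System;
using System.Collections.Generic;

public static class GaplessConcat
{
    // One decoded track: interleaved PCM plus the frame counts to drop at each end.
    public sealed class DecodedTrack
    {
        public short[] Samples = Array.Empty<short>();
        public int Channels = 2;
        public int EncoderDelay;   // frames to skip at the start
        public int EncoderPadding; // frames to drop at the end
    }

    // Yields the audible part of every track back to back, as one continuous stream.
    public static IEnumerable<short> MasterStream(IEnumerable<DecodedTrack> tracks)
    {
        foreach (var t in tracks)
        {
            // Clamp both ends so a bad header can never index outside the buffer.
            int start = Math.Min(t.EncoderDelay * t.Channels, t.Samples.Length);
            int end = Math.Max(start, t.Samples.Length - t.EncoderPadding * t.Channels);
            for (int i = start; i < end; i++)
                yield return t.Samples[i];
        }
    }
}
```

Because the trimming happens before the samples reach the output, no channel scheduling is involved and the join between tracks is sample-accurate by construction.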
