Hello,

I have been using the FMOD subsounds feature for some time; I have added gapless playback support to my music player for FLAC, OGG and WAV files. It works great with these file types, with no delay or sync problems. I’m using the exact same code for every file type.

However, when I try subsounds with MP3 files, I get a very slight delay between sound files. I’ve tried changing modes (ACCURATETIME, MPEG, etc.) but I can’t seem to make it work properly.

Here is the C# code I’m using:

[code:2lowtc66] // Create extended info for FMOD real-time stitching
FMOD.CREATESOUNDEXINFO exinfo = new FMOD.CREATESOUNDEXINFO();
exinfo.cbsize = Marshal.SizeOf(exinfo);
exinfo.defaultfrequency = soundFormat.Frequency;
exinfo.numsubsounds = filePaths.Count;
exinfo.numchannels = soundFormat.Channels;
exinfo.format = soundFormat.Format;

        SetStreamBufferSize(256 * 1024, FMOD.TIMEUNIT.RAWBYTES);

        result = baseSystem.createStream(string.Empty, FMOD.MODE.LOOP_NORMAL | FMOD.MODE.OPENUSER, ref exinfo, ref data.sound);
        CheckForError(result);

        // Create streams for each file
        for (int a = 0; a < filePaths.Count; a++)
        {
            result = baseSystem.createStream(filePaths[a], FMOD.MODE.DEFAULT, ref data.subSounds[a]);
            CheckForError(result);
        }

        // Set subsounds for each file
        for (int a = 0; a < filePaths.Count; a++)
        {
            // Set subsound
            result = data.sound.setSubSound(a, data.subSounds[a]);
            CheckForError(result);
        }

        // Set up the gapless sentence
        int[] soundlist = new int[filePaths.Count];
        for (int a = 0; a < filePaths.Count; a++)
        {
            soundlist[a] = a;
        }
        result = data.sound.setSubSoundSentence(soundlist, soundlist.Length);
        CheckForError(result);

        // Play the master sound
        result = baseSystem.playSound(FMOD.CHANNELINDEX.FREE, data.sound, false, ref mainChannel.baseChannel);            
        CheckForError(result);

[/code:2lowtc66]

I’m not swapping any subsounds during playback, the sentence always stays the same. I load all the songs of an album in a sentence and then start playback.

Can anyone point me in the right direction? I must not be far from the solution!

Many thanks to anyone who can help me.

EDIT: I was using FMOD 4.30 and I upgraded to 4.34; the results are the same. FLAC/WAV/OGG files work perfectly, but I get a slight delay between MP3 files.

I thought I’d share this link; it discusses the problem at length:

http://www.compuphase.com/mp3/mp3loops.htm (see Part 2 – Theory of operation)

[quote="Adiss":1kvngq96]This is definitely an esoteric issue. Try searching for "mp3 gapless playback" or "mp3 1152". Also, in the FMOD documentation, there’s a good description of this under Tutorials->Compression Quality, multichannel, and looping with lossy audio formats.

If you want to do slightly better, you could look for the MP3 info tag ([url:1kvngq96]http://gabriel.mp3-tech.org/mp3infotag.html[/url:1kvngq96]) using Sound::getTag(), which actually includes the encoder delays as part of the data. You could then use Channel::setPosition() and Channel::setDelay(FMOD_DELAYTYPE_DSPCLOCK_END) (along with FMOD_ACCURATETIME). Of course, this is dependent upon the existence of the mp3 info tag, which is not guaranteed to be there.

The sound sentencing feature that you’re using is handy for playlist playback, but I think that it is not designed to deal with the subtleties of gapless playback. It’s really designed to simplify some features of videogames, such as dynamically-generated voice commentary and the like. Other music players will have to go through similar contortions to create a gapless playback experience for MP3 files.

FMOD folks, this is an interesting question: would it be possible (or worth the trouble) for sound sentences containing MP3 files to automatically perform this sort of analysis? That is, FMOD looks for the MP3 info tag. If it exists, then it uses that information to automatically start/stop the MP3 at a sample-accurate location. If not, then it overlaps the playback of the last frame or two of the previous MP3 in the sentence with the next one.

(Actually, it occurs to me that the mp3 info setup could be used to create gapless looping mp3s using the loop points API. And, of course, all of this is dependent upon the sound being opened with FMOD_ACCURATETIME).[/quote:1kvngq96]

Thank you for that additional information. I’ve looked at the setDelay sample on the wiki (http://52.88.2.202/wiki/index.php5?tit … h_setDelay) and I’m pretty sure I successfully reproduced it in C#. It took a while before I found the FMOD_64BIT_ADD method, which I changed slightly. If anyone is interested in the C# code, here it is:

[code:1kvngq96]using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;

namespace SetDelayPrototype
{
/// <summary>
/// http://www.fmod.org/wiki/index.php5?title=Sample-accurate_sequencing_with_setDelay
/// </summary>
public class SoundManager
{
private FMOD.System system = null;
FMOD.Sound[] sounds = null;
FMOD.Channel[] channels = null;

    string filePath;        
    int numbuffers = 0;
    int outputrate = 0;
    int numoutputchannels = 0;
    int maxinputchannels = 0;
    int bits = 0;

    bool paused = false;
    uint min_delay = 0;
    int first_index = 0;
    int numsounds = 0;

    uint hipause_time = 0;
    uint lopause_time = 0;

    public SoundManager(int numsounds, string filePath)
    {
        this.numsounds = numsounds;
        this.filePath = filePath;

        uint version = 0;
        FMOD.RESULT result;

        result = FMOD.Factory.System_Create(ref system);
        ERRCHECK(result);

        result = system.getVersion(ref version);
        ERRCHECK(result);
        if (version < FMOD.VERSION.number)
        {
            throw new Exception("Error!  You are using an old version of FMOD " + version.ToString("X") + ".  This program requires " + FMOD.VERSION.number.ToString("X") + ".");
        }

        result = system.init(32, FMOD.INITFLAGS.NORMAL, IntPtr.Zero);
        ERRCHECK(result);

        sounds = new FMOD.Sound[numsounds];
        channels = new FMOD.Channel[numsounds];

        for (int i = 0; i < numsounds; ++i)
        {
            result = system.createSound(filePath, FMOD.MODE.SOFTWARE, ref sounds[i]);
            ERRCHECK(result);
        }

        result = system.getDSPBufferSize(ref min_delay, ref numbuffers);
        ERRCHECK(result);
        min_delay *= 2;                       

        FMOD.DSP_RESAMPLER resamplemethod = FMOD.DSP_RESAMPLER.MAX; // dummy values
        FMOD.SOUND_FORMAT soundFormat = FMOD.SOUND_FORMAT.AT9; // dummy values
        result = system.getSoftwareFormat(ref outputrate, ref soundFormat, ref numoutputchannels, ref maxinputchannels, ref resamplemethod, ref bits);
        ERRCHECK(result);
    }

    public void ScheduleChannel(int i)
    {
        FMOD.RESULT result = FMOD.RESULT.OK;

        int prev = (i + numsounds - 1) % numsounds;

        bool playing = false;
        result = channels[prev].isPlaying(ref playing);
        ERRCHECK_CHANNEL(result);

        uint delayhi = 0;
        uint delaylo = 0;            

        if (playing)
        {
            uint length_pcm = 0;
            float frequency = 0.0f;
            float volume = 0.0f;
            float pan = 0.0f;
            int priority = 0;

            result = channels[prev].getDelay(FMOD.DELAYTYPE.DSPCLOCK_START, ref delayhi, ref delaylo);
            ERRCHECK(result);

            result = sounds[prev].getDefaults(ref frequency, ref volume, ref pan, ref priority);
            ERRCHECK(result);

            result = sounds[prev].getLength(ref length_pcm, FMOD.TIMEUNIT.PCM);
            ERRCHECK(result);                

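            // convert the length to output samples; computed in double because float
            // loses sample accuracy once a track exceeds roughly 16 million samples
            length_pcm = (uint)((length_pcm * (double)outputrate / frequency) + 0.5);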
            FMOD.DELAYTYPE_UTILITY.FMOD_64BIT_ADD(ref delayhi, ref delaylo, 0, length_pcm);                
        }
        else
        {
            result = system.getDSPClock(ref delayhi, ref delaylo);
            ERRCHECK(result);

            FMOD.DELAYTYPE_UTILITY.FMOD_64BIT_ADD(ref delayhi, ref delaylo, 0, min_delay);
        }

        result = channels[i].setDelay(FMOD.DELAYTYPE.DSPCLOCK_START, delayhi, delaylo);
        ERRCHECK(result);
    }

    public void Start()
    {
        FMOD.RESULT result = FMOD.RESULT.OK;

        for (int i = 0; i < numsounds; ++i)
        {
            result = system.playSound(FMOD.CHANNELINDEX.FREE, sounds[i], true, ref channels[i]);
            ERRCHECK(result);
        }

        first_index = 0;

        uint delayhi = 0;
        uint delaylo = 0;            

        result = channels[0].getDelay(FMOD.DELAYTYPE.DSPCLOCK_START, ref delayhi, ref delaylo);
        ERRCHECK(result);

        FMOD.DELAYTYPE_UTILITY.FMOD_64BIT_ADD(ref delayhi, ref delaylo, 0, min_delay);            

        result = channels[0].setDelay(FMOD.DELAYTYPE.DSPCLOCK_START, delayhi, delaylo);
        ERRCHECK(result);

        for (int i = 0; i < numsounds; ++i)
        {
            ScheduleChannel(i);
        }

        for (int i = 0; i < numsounds; ++i)
        {
            result = channels[i].setPaused(false);
            ERRCHECK(result);
        }
    }

    public void Update()
    {
        FMOD.RESULT result = FMOD.RESULT.OK;

        result = system.update();
        ERRCHECK(result);

        if (paused)
        {
            return;
        }

        // start any channels that have stopped and schedule them at the end of the sequence
        bool playing = false;

        result = channels[first_index].isPlaying(ref playing);
        ERRCHECK_CHANNEL(result);

        while (!playing)
        {
            result = system.playSound(FMOD.CHANNELINDEX.FREE, sounds[first_index], true, ref channels[first_index]);
            ERRCHECK(result);

            ScheduleChannel(first_index);

            result = channels[first_index].setPaused(false);
            ERRCHECK(result);

            first_index = (first_index + 1) % numsounds;

            playing = false;
            result = channels[first_index].isPlaying(ref playing);
            ERRCHECK_CHANNEL(result);
        }
    }

    public void SetPaused(bool paused)
    {
        FMOD.RESULT result = FMOD.RESULT.OK;

        if (paused == this.paused)
        {
            return;
        }

        this.paused = paused;

        if (this.paused)
        {                
            result = system.getDSPClock(ref hipause_time, ref lopause_time);
            ERRCHECK(result);                               

            FMOD.DELAYTYPE_UTILITY.FMOD_64BIT_ADD(ref hipause_time, ref lopause_time, 0, min_delay);                

            for (int i = 0; i < numsounds; ++i)
            {
                bool playing = false;
                result = channels[i].isPlaying(ref playing);
                ERRCHECK_CHANNEL(result);

                if (playing)
                {
                    // we use FMOD_DELAYTYPE_DSPCLOCK_PAUSE instead of setPaused to get
                    // sample-accurate pausing (so we know exactly when the channel will pause)                                        
                    result = channels[i].setDelay(FMOD.DELAYTYPE.DSPCLOCK_PAUSE, hipause_time, lopause_time);
                    ERRCHECK_CHANNEL(result);
                }
            }
        }
        else
        {
            uint hiCurrentTime = 0;
            uint loCurrentTime = 0;
            result = system.getDSPClock(ref hiCurrentTime, ref loCurrentTime);
            ERRCHECK(result);

            FMOD.DELAYTYPE_UTILITY.FMOD_64BIT_ADD(ref hiCurrentTime, ref loCurrentTime, 0, min_delay);

            // calculate how long it's been since we paused; this is how much we need
            // to offset the delays of all the channels by
            uint hiDelta = hiCurrentTime;
            uint loDelta = loCurrentTime;
            FMOD.DELAYTYPE_UTILITY.FMOD_64BIT_SUB(ref hiDelta, ref loDelta, hipause_time, lopause_time);

            for (int i = 0; i < numsounds; ++i)
            {
                bool playing = false;
                result = channels[i].isPlaying(ref playing);
                ERRCHECK_CHANNEL(result);

                if (playing)
                {                        
                    uint hiunpause_time = 0;
                    uint lounpause_time = 0;
                    result = channels[i].getDelay(FMOD.DELAYTYPE.DSPCLOCK_START, ref hiunpause_time, ref lounpause_time);
                    ERRCHECK(result);

                    uint position = 0;                        
                    float frequency = 0.0f;
                    float volume = 0.0f;
                    float pan = 0.0f;
                    int priority = 0;

                    result = channels[i].getPosition(ref position, FMOD.TIMEUNIT.PCM);
                    ERRCHECK(result);

                    result = sounds[i].getDefaults(ref frequency, ref volume, ref pan, ref priority);
                    ERRCHECK(result);

                    // get the channel's position in output samples; the channel will start
                    // playing from this position as soon as the start time is reached, so
                    // we need to offset the start time from the original by this amount
                    // (computed in double so that long tracks keep sample accuracy)
                    uint position_output = (uint)((position * (double)outputrate / frequency) + 0.5);

                    uint hiDeltaPositionOutput = hiDelta;
                    uint loDeltaPositionOutput = loDelta;
                    FMOD.DELAYTYPE_UTILITY.FMOD_64BIT_ADD(ref hiDeltaPositionOutput, ref loDeltaPositionOutput, 0, position_output);

                    FMOD.DELAYTYPE_UTILITY.FMOD_64BIT_ADD(ref hiunpause_time, ref lounpause_time, hiDeltaPositionOutput, loDeltaPositionOutput);

                    result = channels[i].setDelay(FMOD.DELAYTYPE.DSPCLOCK_START, hiunpause_time, lounpause_time);
                    ERRCHECK_CHANNEL(result);
                }
            }

            for (int i = 0; i < numsounds; ++i)
            {
                result = channels[i].setDelay(FMOD.DELAYTYPE.DSPCLOCK_PAUSE, 0, 0);
                ERRCHECK_CHANNEL(result);

                result = channels[i].setPaused(false);
                ERRCHECK_CHANNEL(result);
            }
        }
    }

    public int GetChannelsPlaying()
    {
        FMOD.RESULT result = FMOD.RESULT.OK;
        int playingchannels = 0;

        result = system.getChannelsPlaying(ref playingchannels);
        ERRCHECK_CHANNEL(result);

        return playingchannels;
    }

    private void ERRCHECK(FMOD.RESULT result)
    {
        if (result != FMOD.RESULT.OK)
        {
            throw new Exception(result.ToString());
        }
    }

    private void ERRCHECK_CHANNEL(FMOD.RESULT result)
    {
        if (result != FMOD.RESULT.OK && result != FMOD.RESULT.ERR_INVALID_HANDLE && result != FMOD.RESULT.ERR_CHANNEL_STOLEN)
        {
            throw new Exception(result.ToString());
        }
    }
}

}
[/code:1kvngq96]
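
For anyone who does not have a 64-bit add helper at hand, here is a minimal sketch of the idea, assuming the DSP clock is simply a 64-bit sample count split into a high and a low 32-bit half. The class name `DspClock64` and its methods are hypothetical and are not part of FMOD; the sketch combines the two halves into a `ulong`, does the arithmetic there, and splits the result again.

```csharp
using System;

// Hypothetical helper (not part of FMOD): arithmetic on a 64-bit clock
// that is stored as two 32-bit halves.
public static class DspClock64
{
    // Joins the high and low halves into one 64-bit value.
    public static ulong Combine(uint hi, uint lo)
    {
        return ((ulong)hi << 32) | lo;
    }

    // Splits a 64-bit value back into its high and low halves.
    public static void Split(ulong value, out uint hi, out uint lo)
    {
        hi = (uint)(value >> 32);
        lo = (uint)(value & 0xFFFFFFFFUL);
    }

    // Adds (addHi, addLo) to (hi, lo) in place; a carry out of the low half
    // moves into the high half, and the sum wraps around on 64-bit overflow.
    public static void Add(ref uint hi, ref uint lo, uint addHi, uint addLo)
    {
        ulong sum = unchecked(Combine(hi, lo) + Combine(addHi, addLo));
        Split(sum, out hi, out lo);
    }

    // Subtracts (subHi, subLo) from (hi, lo) in place, borrowing from the
    // high half when the low half underflows.
    public static void Sub(ref uint hi, ref uint lo, uint subHi, uint subLo)
    {
        ulong diff = unchecked(Combine(hi, lo) - Combine(subHi, subLo));
        Split(diff, out hi, out lo);
    }
}
```

The point of going through `ulong` is that the carry between the two halves is handled by the language rather than by hand.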

I’m going to study the example and play around with the code to have a better understanding of setDelay. Then I’ll try to make a gapless playback prototype with the information you gave me. I’ll post the solution here if I get it working.

Thanks again!

This is definitely an esoteric issue. Try searching for "mp3 gapless playback" or "mp3 1152". Also, in the FMOD documentation, there’s a good description of this under Tutorials->Compression Quality, multichannel, and looping with lossy audio formats.

If you want to do slightly better, you could look for the MP3 info tag ([url:3pkkpaq5]http://gabriel.mp3-tech.org/mp3infotag.html[/url:3pkkpaq5]) using Sound::getTag(), which actually includes the encoder delays as part of the data. You could then use Channel::setPosition() and Channel::setDelay(FMOD_DELAYTYPE_DSPCLOCK_END) (along with FMOD_ACCURATETIME). Of course, this is dependent upon the existence of the mp3 info tag, which is not guaranteed to be there.
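
As a rough illustration of what that tag data looks like once you have it: per the info tag spec linked above, the encoder delays are stored as a 3-byte field, with the upper 12 bits holding the delay at the start and the lower 12 bits holding the padding at the end. The sketch below only decodes those three bytes and works out the trimmed length. The class name `LameDelayField` is hypothetical, and how you get the raw bytes (through Sound::getTag() or by reading the file yourself) is left out because I have not verified it.

```csharp
using System;

// Hypothetical helper: decodes the 3-byte "encoder delays" field of the
// MP3 info tag. Obtaining the three bytes is not shown here.
public static class LameDelayField
{
    // Layout: [dddddddd][ddddpppp][pppppppp]
    // d = samples of delay at the start, p = samples of padding at the end.
    public static void Decode(byte b0, byte b1, byte b2, out int startDelay, out int endPadding)
    {
        startDelay = (b0 << 4) | (b1 >> 4);
        endPadding = ((b1 & 0x0F) << 8) | b2;
    }

    // Length of the real audio once both ends are trimmed from the
    // decoded length (never negative).
    public static long TrimmedLength(long decodedSamples, int startDelay, int endPadding)
    {
        long trimmed = decodedSamples - startDelay - endPadding;
        return trimmed < 0 ? 0 : trimmed;
    }
}
```

With those two numbers, the start delay is the position to seek to and the trimmed length tells you where the real audio ends. Some decoders add a delay of their own on top of this, which the sketch does not account for.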

The sound sentencing feature that you’re using is handy for playlist playback, but I think that it is not designed to deal with the subtleties of gapless playback. It’s really designed to simplify some features of videogames, such as dynamically-generated voice commentary and the like. Other music players will have to go through similar contortions to create a gapless playback experience for MP3 files.

FMOD folks, this is an interesting question: would it be possible (or worth the trouble) for sound sentences containing MP3 files to automatically perform this sort of analysis? That is, FMOD looks for the MP3 info tag. If it exists, then it uses that information to automatically start/stop the MP3 at a sample-accurate location. If not, then it overlaps the playback of the last frame or two of the previous MP3 in the sentence with the next one.

(Actually, it occurs to me that the mp3 info setup could be used to create gapless looping mp3s using the loop points API. And, of course, all of this is dependent upon the sound being opened with FMOD_ACCURATETIME).

[quote="Adiss":39ib5ymq]This problem is endemic to the MP3 format itself. MP3 audio is stored in frames of 1152 samples, so an MP3 file always decodes to an exact multiple of 1152 samples. If the source audio isn’t exactly a multiple of 1152 samples (which is true in most cases), then silence is added to the end, and that silence is the slight delay that you’re hearing.

FMOD has a special MP3 encoder which it uses for the FSB format which solves this problem by time-stretching the last MP3 frame up to exactly 1152 samples and then tweaking to make sure that it would loop cleanly. However, this operation must happen at encode-time, and FMOD only provides the encoder for its FSB-building tool.

Your code will work fine for any non-MP3 format because those formats make no assumptions about the length of the sound. Since your product is a music player, you can’t expect people to encode FSBs with the gapless-MP3 encoder. What you’ll have to do is, if the file format is MP3, start the next sound just before the end of the previous sound in the sequence (you can do this very accurately using setDelay()), and (optionally) do a very quick crossfade to the next sound.

Sorry to be the bearer of bad news, but, as I said, it’s a "feature" of the MP3 format itself.

Hope that helps![/quote:39ib5ymq]

Hi Adiss,

First of all, thanks a ton for taking the time to reply to my post.

I didn’t know that fact about MP3 file length. I searched for a while around the web for the gapless problem, but I couldn’t find this information. I guess I wasn’t searching for the right keywords.

Anyway, if I understood your explanation well, I’ll have to use setDelay instead of the subsound features in order to start the playback of the next file slightly before the other one ends.

It’d be nice if FMOD could do this behind the scenes with the subsounds feature.

Thanks again for taking the time to answer my question. Have a great day!

EDIT: I’ve done some testing using iTunes and Foobar2k with MP3 files, and in both cases the time display is a little screwed up. iTunes displays -0:02, -0:01, 0:00, then goes back to -0:02, -0:01 and starts the next song, although not every time. Foobar2k seems to change songs one second before the displayed length. In both programs, the gapless playback works fine. I’m guessing it must be related to the information you gave me.

This problem is endemic to the MP3 format itself. MP3 audio is stored in frames of 1152 samples, so an MP3 file always decodes to an exact multiple of 1152 samples. If the source audio isn’t exactly a multiple of 1152 samples (which is true in most cases), then silence is added to the end, and that silence is the slight delay that you’re hearing.
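
To put a number on it, here is a minimal sketch of the padding arithmetic, assuming 1152 samples per frame (MPEG-1 Layer III; lower sample rates use a different frame size). The class name `Mp3FramePadding` is made up for illustration, and the sketch ignores any extra delay that the encoder adds at the start of the file.

```csharp
using System;

// Hypothetical helper: how much silence is needed to fill the last frame.
public static class Mp3FramePadding
{
    // Samples per frame, assuming MPEG-1 Layer III.
    public const int SamplesPerFrame = 1152;

    // Number of whole frames needed to hold sampleCount samples (rounded up).
    public static long FramesNeeded(long sampleCount)
    {
        return (sampleCount + SamplesPerFrame - 1) / SamplesPerFrame;
    }

    // Samples of silence added so that the audio fills whole frames.
    public static long PaddingSamples(long sampleCount)
    {
        return FramesNeeded(sampleCount) * SamplesPerFrame - sampleCount;
    }

    // The same padding expressed in milliseconds at the given sample rate.
    public static double PaddingMilliseconds(long sampleCount, int sampleRate)
    {
        return PaddingSamples(sampleCount) * 1000.0 / sampleRate;
    }
}
```

For example, a track of exactly three minutes at 44100 Hz has 7,938,000 samples, which needs 6891 frames and leaves 432 samples (just under 10 ms) of padding from this effect alone.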

FMOD has a special MP3 encoder which it uses for the FSB format which solves this problem by time-stretching the last MP3 frame up to exactly 1152 samples and then tweaking to make sure that it would loop cleanly. However, this operation must happen at encode-time, and FMOD only provides the encoder for its FSB-building tool.

Your code will work fine for any non-MP3 format because those formats make no assumptions about the length of the sound. Since your product is a music player, you can’t expect people to encode FSBs with the gapless-MP3 encoder. What you’ll have to do is, if the file format is MP3, start the next sound just before the end of the previous sound in the sequence (you can do this very accurately using setDelay()), and (optionally) do a very quick crossfade to the next sound.
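
The timing side of that can be sketched as plain arithmetic, assuming you track the DSP clock as a single 64-bit sample count. The class name `GaplessSchedule` and the overlap parameter are hypothetical; the result is the clock value you would then hand to setDelay() as the start time of the next sound.

```csharp
using System;

// Hypothetical helper: works out when the next sound should start so that
// it overlaps the padded tail of the previous one.
public static class GaplessSchedule
{
    // Converts a length in source samples to output (mixer) samples,
    // rounded to the nearest sample. Uses double to keep sample accuracy.
    public static ulong ToOutputSamples(ulong sourceSamples, double sourceRate, int outputRate)
    {
        return (ulong)(sourceSamples * (double)outputRate / sourceRate + 0.5);
    }

    // DSP clock value at which the next sound should start: the previous
    // start time plus the previous length, minus the desired overlap.
    // An overlap longer than the sound is treated as the full length.
    public static ulong NextStartClock(ulong prevStartClock, ulong prevLengthOutput, ulong overlapOutput)
    {
        ulong effective = overlapOutput > prevLengthOutput ? 0 : prevLengthOutput - overlapOutput;
        return prevStartClock + effective;
    }
}
```

The overlap would be the padding at the end of the previous file, converted to output samples in the same way as the length.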

Sorry to be the bearer of bad news, but, as I said, it’s a "feature" of the MP3 format itself.

Hope that helps!
