0
0

Hi, this isn’t a specific fmod related question but I’ve seen that a few people on here are working on beat detection in some way. I’m just having some trouble getting what the general idea behind it is. I can understand why you would use the autocorrelation function which I have covered in maths which is helpful. My main problem is what the fourier transform is used for which seems to be the main part of the analysis. As far as I can understand it gets you the frequencies contained in the amount of data that you look at. So how would this be used when you need to preserve time sortof to try and see when it correlates with itself. If anyone can help explain anything related to this area I would really appreciate it, cheers.

Matt

  • You must to post comments
0
0

Cool, yeah I’d be up for keeping this thread alive. The more information the better as my project is really research based even tho I will have to get something practical working. I am actually thinking of doing some work on just analysing the waveform this week as I’m having trouble getting fftw into a form I could use on windows. I read a paper from 1993 on using the autocorrelation function which was under very specific conditions but still might give peaks that would suggest a tempo.

Thinking about actually queing the tracks, if you’re pretty sure of the tempo and you edit the track so that it starts on a bar you could probably quite simply place markers right through on the start of each bar. Soundforge has a function to do this and you could probably program it too. Obviously won’t work on any songs with changing tempo etc. but it would be an easy solution. Other than that I’ve heard that the cross correlation function is the way to go for syncronising two songs. Aaaah and this is why you’ll never kill real DJs 😉

  • You must to post comments
0
0

IMHO the place to get all this sort of info is in Computer Music Journal, published by MIT press. It can be found in most university/conservatorium libraries, or subscribed to in online form.
CMJ is where the really heavy research into digital audio get published.

  • You must to post comments
0
0

Sorry to try and drag this thread back to life but it’s just cos someone mentioned using BASS to detect the tempo of a song. I’ve downloaded it but there seems to be no mention of it whatsoever unless it’s an extension feature (in the pack which is currently unavailable). Does anyone know how I can get hold of it or does anyone know of any other free methods of tempo calculation? I know that’s very unlikely but you never know.

I’m just trying to find a fallback incase I can’t get an algorithm of my own working so I can still write an application of some kind for my hons proj. I’m hoping to implement a comb filter next semester which might be the answer to all my problems but if I have a backup it’ll give me a bit more security and also help hold up my planning document. Cheers guys (and gals if any)

EDIT: Oh yeah, if the mods are wondering I will stil use FMOD for most of the code cos the streaming is so wonderful. Just incase you thought I was being cheeky posting here 😉

  • You must to post comments
0
0

Hi Mate,

Basically you would do the FFT on the stream either real time or before-hand with the aim of searching for the specific frequencies relating to the particular instruments needed to ‘beat match’ the track.

For instance a drum would appear in the very low frequency bands whereas a hi-hat/symbol would appear further up the spectrum. The general idea of a track running at 120BPM is that it would have a beat every half a second so the FFT should produce peaks for the main instrument (usually a drum) twice a second.

Obviously it would be a boring track if it had a drum beat throughout without missing a few or adding in other instruments. So, beat matching code should be able to locate the peaks in the spectrum at regular intervals and anticipate where the missing ones should be for the whole track so that both the BPM and the position of the strongest part of the beat can be found.

You probably knew or worked this out already anyway 8)

Hope it helps a little though 😀

TS

  • You must to post comments
0
0

they have removed the bass-fx-dll that supplied an accurate bpm in a few seconds. it’s beeing re-released in a few weeks ??/, i’ll forward you the dll from your PM.

ps.. It work very well, i’ve matched it against other system. traktor and mixmeister and it’s seems spot on, however you seem to get more accuracy if the file has been adjusted with replay gain, i might wrap the dll to include the replay gain also.

t’ta

  • You must to post comments
0
0

OK, this is going to hurt, but below is an easy way to get BPM detection, maybe this will kick someone into developing the same for FMOD.

Download BASS, it’s comes with a free BPM detection unit, i use BASS to build an audio database with BPM energy, gain and stuff and then use FMOD for the rest. I hate BASS but it’s got a free BPM dll and it works a dream. You probably won’t be able to use bass at the same time as FMOD for real-time BPM stuff.

PS. BASS is horrible, but as everybody is keeping their BPM routines ( GOOD ONES ) to themselves ( am i’m too thick to do my own ) then i make do..

:)

  • You must to post comments
0
0

As I told in another thread, I tried to write a beat detection algorithm for myself. It is messy and doesn’t work very well (yet), but it might inspire you a bit to write a better one :).
My algorithm is based on the [url=http://lena.uia.ac.be/~stijn/var/BeatDetectionAlgorithms.pdf:skd00i39]Beat Detection Algorithms[/url:skd00i39] paper. (especially page 9 to 11)
It handles about the possibility of detecting the beats themselves, but not on how you can find a regular beat pattern, which is needed to calculate the BPM value. So I invented something for myself, as you can see in my code. (below the line “// try to find a regular beat on each subband” )
I have written some comments between my code to make my thoughts clear, I hope you understand them a bit.

As I told, it basically works and seems to calculate an accurate BPM value for some songs (especially those with a strong beat.), but there is still a lot of room for improvement. I was thinking of following things as possible improvements:

  • Use a derivation filter to detect the beats themselves (as stated on page 13 of the “Beat Detection Algorithms” paper)
  • Find a better way to find a regular beat pattern on each subband. (I guess this is the most tricky and most importand part)
  • Use another measure to determine which subband contains the most regular beat pattern.

I think it would result in a very accurate algorithm if those things get straight, but I do not have the time to work at it at this moment.

This is my code:

[code:skd00i39]fftw_complex fgFFTin; //input array for FFT
fftw_complex *fgFFTout; // output array for FFTW
fftw_plan fgFFTp; // the FFT plan :)
double history[32][43]; // history buffer
int historycount[32]; // how much values are stored in the history buffer?
int beattimer[32]; // used to count the number of blocks between the current and the previous beat.
vector<int> intervals[32]; // the values of the beattimer are stored in an interval history, so we can calculate the expected time as an average of those values.
int blockswithoutbeat[32]; // this is used to check if there is enough time between every 2 beats (a minimum of 0.3 sec seems a good value to me)
int mode[32];
/

the mode indicates the state of the beat detection in each subband.
if it is equal to 0, the "expected time" between 2 beats is not known,
so we try to detect 2 beats and calculate the time between them, and use that value as the new "expected time" between 2 beats.
(and store it in the intervals[i] vector). Then we move on to mode 1, which means that everything is going OK: the expected time between 2 beats is known,
and the newly arriving beats occur at the expected times.
When a newly arriving "beat" occurs too early or to late (i.e. not at the expected time), the mode number is increased.
Otherwised, it is decreased (with a minimum value of 1, because 0 is the initialization mode)
When the mode reaches 10, it means that there have been a lot of beats that were not at the expected time, so we have to reinitialize.

*/
double bpm[32];
int bpmAccuracy[32];
double fgBPM;
double fgBPMAccuracy;

void init() {

fgFFTin = (double(*)[2])fftw_malloc(sizeof(fftw_complex) * 1024);
    fgFFTout = (double(*)[2])fftw_malloc(sizeof(fftw_complex) * 1024);
    fgFFTp = fftw_plan_dft_1d(1024, fgFFTin, fgFFTout, FFTW_FORWARD, FFTW_MEASURE);

}

void* dspcallback(void *originalbuffer,void *newbuffer,int length,int param) {
signed short *stereo16bitbuffer = (signed short *)originalbuffer;
int N = 32; // number of subbands
for (int loop = 0; loop < 17; loop++) { // the original buffer consists of 17 * 1024 samples

    // fill the input array for the FFT

    for (int count = 0; count &lt; 1024; count++) {
        fgFFTin[count][0] = *stereo16bitbuffer++; 
        fgFFTin[count][1] = *stereo16bitbuffer++;
    }

    fftw_execute(fgFFTp); 

    // compute the square of the modulo of the complex numbers that are returned by FFT
    double B[512]; 
    for (int count = 0; count &lt; 512; count++) {
        B[count] = sqrt(fgFFTout[count][0] * fgFFTout[count][0] + fgFFTout[count][1] * fgFFTout[count][1]);
    }
    double Es[N];
    double Ei[N];

    // divide the spectrum in N subbands, on a logaritmic scale
    for (int i = 0; i &lt; N; i++) {
        Es[i] = 0;
        int fromK = 0;
        int toK = 0;
        for (int j = 0; j &lt; i; j++) {
            fromK += w(j);
        }
        for (int j = 0; j &lt; i+1; j++) {
            toK += w(j);
        }
        for (int k = fromK; k &lt; toK; k++) {
            Es[i] += B[k];
        }
        Es[i] *= (double)w(i);
        Es[i] /= 512.0; //  don't know why :)
        Ei[i] = 0;

        // compute average energy for subband i
        for (int k = 0; k &lt; 43; k++) {
            Ei[i] += history[i][k];
        }
        if (historycount[i] &gt; 43) {
            Ei[i] /= 43.0;
        } else {
            Ei[i] /= historycount[i];
        }
    }
    // shift history buffer one position to the right and insert new computed value at first position
    for (int i = 0; i &lt; N; i++) {
        for (int k = 42; k &gt; 0; k--) {
            history[i][k] = history[i][k-1];
        }
        history[i][0] = Es[i];
        if (historycount[i] &lt; 43) historycount[i]++;
    }

    bool beat = false;
    // try to find a regular beat on each subband
    for (int i = 0; i &lt; N; i++) {

        if (Es[i] &gt; 7500 &amp;&amp; Es[i] &gt; 4.0*Ei[i]) { // there is a beat, should we accept it?
            // there should be a certain amount of time between every 2 beats.
            // 13 blocks ~= 0.3 sec, which means that ~= 200bpm is the highest beat rate that is acceptable. 
            if (blockswithoutbeat[i] &gt;= 13) { 
                if (mode[i] == 0) { // we are in the initialisation mode
                    if (beattimer[i] == 0) { 
                        beattimer[i]++; 
                        blockswithoutbeat[i] = 0;
                    }
                    else {
                        intervals[i].clear(); // since this is the initialisation phase, we clear the interval history
                        intervals[i].push_back(beattimer[i]); 
                        mode[i] = 1; 
                        beattimer[i] = 0;
                        blockswithoutbeat[i] = 0;
                        // at this point, we have established the time between the first and the second beat, 
                        // and it is stored in intervals[i]. 
                    }
                } else if (mode[i] &lt; 10) {
                    double expectedtime = 0;
                    if (intervals[i].size() == 0) {
                        exit(1); // this shouldn't happen
                    }
                    // the expected time between the current and the previous beat is the average of the time intervals between all the previous beats.
                    for (vector&lt;int&gt;::iterator p = intervals[i].begin();p != intervals[i].end(); p++) {
                        expectedtime += *p;
                    }
                    expectedtime /= (double) intervals[i].size();
                    bpm[i] = 60.0 / ((1024.0 / 44100.0) * expectedtime);
                    double quotient = expectedtime / (double)beattimer[i]; // this value indicates how much the time between the current and the
                    // previous beat deviates from the expected time. If it is equal to 1, the current beat occured exactly at the expected moment. 

                    double margin = 0.5 / (double) (intervals[i].size()); // the larger the size of the interval history, 
                    // the less tolerant we will be when deciding if a beat is &quot;on time&quot; or not.
                    if (margin &gt; 0.1) margin = 0.1; 
                    if (quotient &lt; 1.0 + margin &amp;&amp; quotient &gt; 1.0 - margin ) { // the beat occured at the expected time...
                        bpmAccuracy[i]++; // ... which means that this subband may provide an accurate beat rate
                        intervals[i].push_back(beattimer[i]);
                        if (intervals[i].size() &gt; 25) { // we keep a maximum of 25 intervals in the interval history.
                            intervals[i].erase(intervals[i].begin());
                            if (intervals[i].size() != 25) {
                                cout &lt;&lt; &quot;Strange things happening!&quot; &lt;&lt; endl;
                                exit(1);
                            }
                        }
                        beattimer[i] = 0;
                        if (mode[i] &gt; 2) mode[i]-= 2; 
                        else if (mode[i] &gt; 1) mode[i]--;
                        blockswithoutbeat[i] = 0;
                    } else if (beattimer[i] &lt; expectedtime) { // the beat was too early, so we don't reset the beattimer. 
                            beattimer[i]++;
                            mode[i]++; 

                    } else {
                        if (quotient &lt; (1.0 + margin)/2.0 &amp;&amp; quotient &gt; (1.0 - margin)/2.0)  {
                            // we have probably &quot;missed&quot; a beat, since the actual time is +/- the double of the the excepted time
                            beattimer[i] = 0;
                        } else {
                            // don't know why I did this
                            beattimer[i] -= (int)expectedtime;
                        }
                        mode[i] += 2; 
                        blockswithoutbeat[i] = 0;
                    }

                } else {
                    // we lost track, so we go back to initialisation mode
                    mode[i] = 0;
                    beattimer[i] = 0;
                    intervals[i].clear();
                    bpm[i] = -1.0;
                    blockswithoutbeat[i] = 0;
                    bpmAccuracy[i] = 0;
                }
            } else {
                blockswithoutbeat[i]++;
                if (blockswithoutbeat[i] &gt; 100) {
                    bpmAccuracy[i] = -1;
                    bpm[i] = -1;
                    mode[i] = 0;
                }
                beattimer[i]++;

            }
        }
        else {
            blockswithoutbeat[i]++;
            if (mode[i] &gt; 0 || beattimer[i] &gt; 0) beattimer[i]++;
        }

    }
    // try to find the subband with the most accurate beat detection.
    int max = 0;
    for (int i = 0; i &lt; 32; i++) {
        if (bpmAccuracy[i] &gt;= bpmAccuracy[max] &amp;&amp; bpm[i] &gt; 40.0 &amp;&amp; bpm[i] &lt; 240.0) max = i;
    }
    if (bpmAccuracy[max] &gt; 0) {
        fgBPM = bpm[max];
        // we try to keep the displayed bpm value between 60 and 180. 
        if (fgBPM &lt; 60) {
            fgBPM *= 2.0;
        }
        if (fgBPM &gt; 180) {
            fgBPM /= 2.0;
        }
        fgBPMAccuracy = bpmAccuracy[max];
    } else {
        fgBPM = -1;
        fgBPMAccuracy = 0;
    }
}
return originalbuffer;

}
[/code:skd00i39]

It can also be found [url=http://lena.uia.ac.be/~stijn/var/beatdetection.cpp:skd00i39]here[/url:skd00i39].

I hope this helps anyone, and, if you get it improved, please let me know!
You can contact me at stijn (dot) lamens (at) ua (dot) ac (dot) be for further questions or comments if I seem dead on this forum.

Stijn

  • You must to post comments
0
0

Thanks guys, looks like a wealth of information there. I think I was getting confused about the the FFT stuff thinking that I would be using the whole spectrum that was calculated. I’d actually looked at another paper already that had talked about finding peaks in the spectrum to identify the kick and snare drum and then look for alternating patterns of these (probably good for more types of music than those with a kick drum on every beat).

I may look at the Bass thing, especially if they have any info on how it’s done. It’s probably not really something I can use though as this is the topic of my university honours project. Well maybe I could but it would require changing the focus of my investigation to look at what would make a good virtual DJ maybe in terms of how it plays tracks and stuff.

I must have missed that paper somehow, how did you find it? I’ve got lots of papers already, mostly kinda high level and not about the actual implementation of it. Scheirer gets referenced quite a lot and I can’t remember the name of the other one but thats where i got the kick and snare drum related one. I really appreciate you sharing your code as well, I’ve been wanting more code to try and understand what happens in a dsp callback so that really is spot on. Even if you say it is messy I’m sure it will push me forward and if I do make anything interesting I’ll get back.

Cheers all,

Matt

  • You must to post comments
0
0

I think I found this paper using google. Just enter the search query “beat detection algorithms” et voilà.

  • You must to post comments
0
0

You guys seem to be on the case with the beat detection stuff ( , i’ve got a working solution for finding good in/out cue points. Have any of you done anything like this….

As the moment i take a fixed length sample ( say 10 seconds ) and track the energy looking for steady rises or hits in volume to identify tracks that ‘ start with a bang’ or fade out with a wimper….

any idea?

  • You must to post comments
0
0

Hmm, I have not made any efforts to find a good cue position, since I just use the monitor (headphones connected to second soundcard) to find one.
I find it more important now to write a good beat detection algorithm: once you can detect the beats, you can also find the first beat, can’t you?
However, the finding of a good cue point also depends on how the piece of music is built up, most songs are composed of measures (not sure if this is the right word) of 4, 8, 16, 32, … beats, and only the first beat of such a measure is a good cue position.
I think it’s very difficult to detect which beat is the “first” beat of a measure, sometimes the first beat is the strongest, but it can also be the second or third one.

Furthermore, I forgot a little piece of code of my beat detection algorithm, here it comes:

[code:1omkbokb]
int w(int i) {
return (int)(2.0/3.0 * i + 6.0);
}
[/code:1omkbokb]

Good luck to both of you, unfortunately I do not have the time to develop the algorithm further at this moment (I have a master thesis to write), but when I have more time I’ll definitely look at all this stuff again.

Stijn

  • You must to post comments
0
0

Hi,

I spent months writing beat matching algorithms using both FFT and straight volume searching. I have had a lot of success with just scanning the volume thresholds of the waveform. I have pages and pages of commented out old algorithm code :/

There are many factors that mess up beat matching not least bent or wobbly tracks or tracks that change pitch in the middle. The only way to guarantee straight tracks is to encode them yourself ;D Other factors are the structure of the music. For instance, dance music is a 4/4 beat and generally has a break beat every 64 beats. Although a lot of dance music is constructed in this way there are many exceptions … Living Joy – Dreamer is a good example. It plays the the 1st break (64 beats) perfectly, then the 2nd break has 72 beats, then the next three breaks follow the conventional 64 beat spacing again. Although mixed with another track it can sound good, it can also sound awful if break beats are not lined up :/ Then, different genras of music obey different rules, reggae for instance does not follow a 4/4 beat structure, 4/3 i think it is. Classical music speeds up and slows down all over the place. My concusion is it would not be possible to make a beat matching algorithm that works on every piece of music no matter what!

The biggest problem I have found with using FFT, apart from the speed issues, is that once you have found a good beat how do u tie the position to exact position in the waveform? For instance running an FFT using lets say 2048 samples means that the cue point is accurate to within 2048 samples. Overlapping the FFT with previous one helps but is just too slow as more overlapping occurs. I read about how SETI do their FFT and they run a large scale FFT, then gradually decrease the the FFT size over multiple passes. This is not really fast enough for realtime processing.

Finding the BPM is less than half the battle even with variable BPM tracks, finding the exact position of a beat is the key. You only have to play the same track twice and shift one by one sample to hear a discernable difference. Sample accuracy is extremely important. There are a number of DJ’ing programs out currently and I have not seen any with less than millisecond accuracy and 1ms is massive in the realms of mixing two bass beats.

I was well chuffed to work out that the software I am working on is accurate to 1/50,000,000 of a second!!! Cant wait to release it soon :DDD

Just … so much to do! keep this beat matching thread alive ;D

TS

  • You must to post comments
Showing 12 results
Your Answer

Please first to submit.