The values returned from FSOUND_DSP_GetSpectrum look rather confusing to me.

1) The level of the low frequencies is much greater than that of the high ones. I guess that further processing is needed. I’ve searched the forum and found a VB algorithm that uses a frequency-based weighting function, but it is unclear to me. Is there a generally accepted method for displaying a spectrum analyzer graph?

2) I’ve read in the forum that both stereo channels are represented in the array of 512 floats provided by FSOUND_DSP_GetSpectrum: the odd values for the left channel and the even values for the right one. However, I could not find confirmation of this in the help file. Is it really so?

Thanks.

- Rya asked 13 years ago


We had this problem too.

Here’s some code that makes a lame attempt to do this the 8-octave human way:

(*(spectrum.data)) is the output of this function; for me it’s a simple vector.

[code:1o22jv25]
float B1 = pow(8.0,1.0/512.0); //1.004
float dd = 1*(512.0/8.0);
float* fft = FSOUND_DSP_GetSpectrum();
// a and b are the (fractional) start and end positions in the linear FFT
// array that get summed into output slot i
float a = (float)((pow(B1,(float)0)-1.0)*dd);
float b;
float diff = 0;
int aa;
for (int i = 1; i < 512; ++i) {
    (*(spectrum.data))[i] = 0;
    b = (float)((pow(B1,(float)i)-1.0)*dd);
    diff = b-a;
    aa = floor(a);
    // add every FFT value that is fully covered by [a, b)
    while (diff > 1) {
        (*(spectrum.data))[i] += fft[aa]*2;
        ++aa;
        ++a;
        --diff;
    }
    if (a+diff < (float)ceil(a)) {
        // the rest of the range lies inside a single FFT value
        (*(spectrum.data))[i] += diff*(float)(fft[aa])*2;
    } else {
        // the rest of the range straddles two neighbouring FFT values
        (*(spectrum.data))[i] += ((((float)ceil(a))-a))*(float)fft[aa]*2;
        (*(spectrum.data))[i] += (b-(float)floor(b))*fft[aa+1]*2;
    }
    a = b;
}
[/code:1o22jv25]

When rendered it looks like this:

- jaw answered 12 years ago


The spectrum in WinAmp looks much smoother than in FMod. They seem to display a limited frequency range or use logarithmic scaling for the axes.

Could you please tell me what the measurement units are for the values provided by FSOUND_DSP_GetSpectrum? And do you apply any smoothing window (Hanning, Hamming, etc.) when performing the FFT calculations?

Thanks.

- Rya answered 13 years ago


I think Nullsoft applied an EQ filter to the data; this is why it’s more accurate.

- KarLKoX answered 13 years ago


I understand what Rya said; I’ve already noticed the same thing. What he means is that if you plot the values on a graph, you get something like a y=(1/x) function: low frequencies are almost always higher than high frequencies. In Windows Media Player the levels are more even from one value to the next, depending on the song.

FMod: ¯_

WMP: ¯¯¯

It’s hard to explain.

I would like to know too why it’s like that.

[EDIT]

Which is more accurate, FMod or Nullsoft? Before or after the filter?

- Jetson answered 13 years ago


Furthermore, many visualizations stretch the ‘x-axis’ (the frequency axis) to a logarithmic scale.

The FFT always returns a linear ‘x-scale’ (e.g. a 43 Hz difference from one value to the next at 44100 Hz), but the human ear hears frequency logarithmically. So 43 to 86 Hz is one octave (frequency × 2), as are 86 to 172, 172 to 344, 344 to 688, and so on, up to 11025 to 22050.

So the range covered by each octave increases enormously. To the ear, the frequency range from 11025 to 22050 Hz seems to be ‘the same range’ as from 86 to 172 Hz.

So what I mean:

We have the FFT array; the left value is e.g. 1 Hz. From one value to the next the frequency increases by 43 Hz, so the rightmost value is 22050 Hz and, because the scale is linear, the middle value is 11025 Hz. So the whole right half of what we see is only one octave, and for our ear it is just as important as the octave between, e.g., values 1 and 2, which is one octave too (43-86 Hz).

So it looks like this (one ‘|’ is one octave, so e.g. from left to right 43,86,…11025,22050)

[code:e7oinzte]|| | | | | | |[/code:e7oinzte]

You have to stretch the data yourself so that it looks like:

[code:e7oinzte]| | | | | | | | | | |[/code:e7oinzte]

Or do it this way:

[code:e7oinzte]|| | | | | | | | | | | | ||[/code:e7oinzte]

Then you get more of the illusion that you SEE what you HEAR.

(Normally 1000 Hz ends up somewhere near the middle.)
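The stretching described above could be sketched in C++ roughly like this. It is only one possible approach: the band edges are spaced equally on a logarithmic frequency axis and each band shows the peak of the FFT values it covers (summing them is another option). The function name and the parameters numBands and minHz are assumptions made for this example, not FMod API.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Group a linear spectrum (0 .. sampleRate/2) into numBands bands whose
// edges are equally spaced on a logarithmic frequency axis starting at minHz.
// Each band reports the peak of the bins it covers; neighbouring bands may
// share an edge bin.
std::vector<float> toLogBands(const std::vector<float>& spectrum,
                              double sampleRate, int numBands, double minHz) {
    if (numBands <= 0 || spectrum.empty() || minHz <= 0.0) {
        return {};
    }
    const int numBins = static_cast<int>(spectrum.size());
    const double nyquist = sampleRate / 2.0;
    const double binWidth = nyquist / numBins;
    const double ratio = nyquist / minHz;  // total span as a frequency factor

    std::vector<float> bands(numBands, 0.0f);
    for (int b = 0; b < numBands; ++b) {
        const double lo = minHz * std::pow(ratio, double(b) / numBands);
        const double hi = minHz * std::pow(ratio, double(b + 1) / numBands);
        int first = static_cast<int>(lo / binWidth);
        int last = static_cast<int>(hi / binWidth);
        first = std::max(0, std::min(first, numBins - 1));
        last = std::max(first, std::min(last, numBins - 1));
        float peak = 0.0f;
        for (int i = first; i <= last; ++i) {
            peak = std::max(peak, spectrum[i]);
        }
        bands[b] = peak;
    }
    return bands;
}
```

For example, with 512 values at 44100 Hz, nine bands and minHz set to one bin width (about 43 Hz), each band is exactly one octave wide, so every octave gets the same width on screen.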

Furthermore, you could ‘stretch’ the y-axis. At the moment a value of 1/2 means -6 dB, a value of 1/4 means -12 dB, and so on.

You could stretch the values so that you get a linear dB scale, e.g. so that every 30 pixels of height always mean 6 dB.

The common formula for dB <-> amp-factor is

[code:e7oinzte]dB = 20*log10(linearFactor)[/code:e7oinzte]
So e.g.:
[code:e7oinzte]6.02dB = 20*log10(2)[/code:e7oinzte]
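As a rough C++ sketch of a linear dB scale built on this formula: the helper names and the floorDb and maxPixels parameters are assumptions for illustration. floorDb is the (negative) level drawn as an empty bar, and 0 dB is drawn as a full bar.

```cpp
#include <algorithm>
#include <cmath>

// dB = 20*log10(linearFactor). Factors at or below zero are clamped to
// floorDb, because log10(0) is minus infinity.
double toDecibels(double linearFactor, double floorDb) {
    if (linearFactor <= 0.0) {
        return floorDb;
    }
    return std::max(floorDb, 20.0 * std::log10(linearFactor));
}

// Map a linear factor to a bar height so that equal dB steps always cover
// the same number of pixels. Assumes floorDb < 0; levels above 0 dB are
// capped at maxPixels.
int barHeight(double linearFactor, double floorDb, int maxPixels) {
    const double db = std::min(0.0, toDecibels(linearFactor, floorDb));
    return static_cast<int>(std::lround((1.0 - db / floorDb) * maxPixels));
}
```

With floorDb = -60 and maxPixels = 300, every halving of the value (about 6 dB) lowers the bar by about 30 pixels: barHeight(1.0, -60.0, 300) gives 300, 0.5 gives 270 and 0.25 gives 240.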

Each +6 dB doubles the amplitude, so +18 dB has twice the amplitude of +12 dB (perceived loudness is usually said to double roughly every +10 dB). The amp-factor (the sound pressure) of e.g. +60 dB, however, is ×1000.

(At a rock concert you can easily get well over 120 dB, which is about 1000000 times the sound pressure at the threshold of hearing! Beware of Megadeth 😉)

- Froggerprogger answered 13 years ago
