Hi!

I want to start copying data from the microphone recording buffer to another buffer when some event occurs, and stop copying when another event occurs, in my C# application. How do I get started with FMOD?

I have some FMOD code that gets data from the microphone and saves it continuously to a WAV file (based on the FMOD examples), but I want to start recording when the sound level (including noise) reaches a preconfigured value, and stop recording after more than 3 seconds of silence.

Thanks for your help!


Why do you lose 20-30ms of voice data?


Because it is possible that only some noise caused the amplitude to rise, so I wait 20-30ms before recording. If I immediately sent the detected voice to the engine, it would lead to false recognition.


[quote="peter":19n3zqv4]
Analysing recorded sound involves latency; that is a fact of life. The speed at which the hardware records the data won’t be 100% constant, so we need to leave a little gap between the recording and playback to prevent the playback overtaking the writing. Fortunately that latency isn’t too large, generally less than 200ms.

Having said that, if you want access to the raw data with minimum delay you could use FMOD::Sound::lock. That is our low-level function to directly access the recording buffer, however it would be quite a bit more complicated. The data is not guaranteed to be 16-bit; you can check the format using Sound::getFormat.

-Pete[/quote:19n3zqv4]

Hi, sorry to open up an old thread, but I also need low-latency access to microphone input, and am running into similar problems. Pete, would you please describe in more detail how to use FMOD::Sound::lock to access microphone input as it is coming in?

Thank you!


[quote:385feh6e]If I immediately sent the detected voice to the engine, it would lead to false recognition.[/quote:385feh6e]
Just to make sure I understand, your trigger to start recognition is 20-30ms of sustained noise? If that is the case, then all you need to do is buffer the data in the listener DSP and pass that buffered data through to the recognition as well.
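
The buffering idea above could be sketched roughly as follows. This is only an illustration under assumptions: it handles mono float samples, and `PreRoll`, `preroll_push`, `preroll_drain` and `PREROLL_CAPACITY` are hypothetical names, not part of FMOD. The listener keeps the most recent samples in a ring buffer, so when the trigger fires the 20-30ms that came before it can still be handed to the recognition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacity in samples; 1024 covers 128ms at 8kHz. */
#define PREROLL_CAPACITY 1024

typedef struct {
    float  data[PREROLL_CAPACITY];
    size_t head;   /* index of the next write */
    size_t count;  /* valid samples stored, at most PREROLL_CAPACITY */
} PreRoll;

void preroll_init(PreRoll *p)
{
    assert(p != NULL);
    p->head  = 0;
    p->count = 0;
}

/* Append samples, overwriting the oldest ones once the buffer is full. */
void preroll_push(PreRoll *p, const float *samples, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        p->data[p->head] = samples[i];
        p->head = (p->head + 1) % PREROLL_CAPACITY;
        if (p->count < PREROLL_CAPACITY) {
            p->count++;
        }
    }
}

/* Copy the most recent stored samples (oldest first) into out,
   up to out_cap samples, then empty the buffer. Returns the number copied. */
size_t preroll_drain(PreRoll *p, float *out, size_t out_cap)
{
    size_t n     = p->count < out_cap ? p->count : out_cap;
    size_t start = (p->head + PREROLL_CAPACITY - n) % PREROLL_CAPACITY;
    for (size_t i = 0; i < n; ++i) {
        out[i] = p->data[(start + i) % PREROLL_CAPACITY];
    }
    p->count = 0;
    return n;
}
```

In use, every block seen by the listener would go through `preroll_push`, and `preroll_drain` would be called once at the moment the trigger decides the sound is real speech.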


Hi qwer90,

You can create a custom DSP to ‘listen’ to the sound coming through the microphone. When it is above a certain threshold, start recording. When the DSP detects that the sound level has been below the threshold for more than 3 seconds, stop the recording. Check out the customdsp example.
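
The start/stop logic described above might look something like this. It is a sketch under assumptions, not code from the customdsp example: it works on mono float samples, and `VoiceGate`, `gate_init` and `gate_process` are hypothetical names. The function would be called from the DSP read callback with each incoming block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical gate state; none of these names come from FMOD. */
typedef struct {
    float    threshold;     /* absolute level (0..1) that counts as sound */
    unsigned hold_samples;  /* length of silence that ends a recording */
    unsigned silent_run;    /* consecutive quiet samples seen so far */
    int      recording;     /* 1 while recording is active */
} VoiceGate;

void gate_init(VoiceGate *g, float threshold,
               unsigned sample_rate, float hold_seconds)
{
    assert(g != NULL);
    g->threshold    = threshold;
    g->hold_samples = (unsigned)(hold_seconds * (float)sample_rate);
    g->silent_run   = 0;
    g->recording    = 0;
}

/* Feed one block of samples. Returns 1 if recording is active
   after the block has been processed, otherwise 0. */
int gate_process(VoiceGate *g, const float *samples, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        float level = samples[i] < 0.0f ? -samples[i] : samples[i];
        if (level >= g->threshold) {
            /* Loud sample: start (or keep) recording, reset the silence count. */
            g->recording  = 1;
            g->silent_run = 0;
        } else if (g->recording && ++g->silent_run >= g->hold_samples) {
            /* Quiet for the whole hold period: stop recording. */
            g->recording  = 0;
            g->silent_run = 0;
        }
    }
    return g->recording;
}
```

With `gate_init(&g, 0.6f, 8000, 3.0f)` the gate would open on the first sample at or above 60% and close after 24000 consecutive quiet samples. Because every sample is visited exactly once, no stopwatch is needed; time is measured in samples.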


Until now I used the maximum from getWaveData, but a sudden noise could activate recording. I tried starting a stopwatch when the getWaveData maximum rose above a threshold (60%) and letting it count for 100ms; after that I treated the noise as continuous voice to record, but CPU usage was above 50-60%.

So in the customdsp example there is a callback function which has a pointer inbuf and an integer length. But I have a question: is inbuf a pointer to the memory address where the wave data starts? And what is the purpose of length and outchannels?

Thx


I have a similar question:

I would like to record sound constantly and also be able to analyze it in bit by bit mode.

What would be the best way to do this? The thing is, I can’t lose any bit, and I must search for a specific combination of bits to trigger another thread.


@qwer,

The problem with using getWaveData is that it gives you a snapshot of the data. That snapshot is quite large and successive snapshots overlap, so you’re doing more work than you need to because you’re checking the same samples multiple times. By creating a custom DSP you make sure you check every sample once only. You analyse the data in the DSP the same way as with getWaveData.

[code:rh8sx4m0]FMOD_RESULT F_CALLBACK FMOD_DSP_READCALLBACK(
FMOD_DSP_STATE * dsp_state,
float * inbuffer,
float * outbuffer,
unsigned int length,
int inchannels,
int outchannels
);[/code:rh8sx4m0]

Here ‘inbuffer’ is the input to the DSP and ‘outbuffer’ is the output. ‘length’ is the number of samples per channel, so when the channel counts match both buffers hold (inchannels*length) floats. Since the DSP is just listening and not changing the sound, you can copy in to out directly with a memcpy of that many floats (multiply by sizeof(float) to get the byte count). Once you’ve done that you can perform your analysis of the data.
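
As a rough illustration of the copy-then-analyse step, here is a plain C function with the same buffer parameters as the read callback. It is a sketch under assumptions: `passthrough_and_peak` is a hypothetical name, the `dsp_state` argument is left out so the snippet compiles without the FMOD headers, and input and output are assumed to have the same channel count. The analysis shown is just a peak search; any per-sample check could go in its place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the input block to the output unchanged, then returns the
   largest absolute sample value found in the block. */
float passthrough_and_peak(const float *inbuffer, float *outbuffer,
                           unsigned int length, int channels)
{
    assert(inbuffer != NULL && outbuffer != NULL && channels > 0);

    /* Total floats in the block: samples per channel times channels. */
    size_t total = (size_t)length * (size_t)channels;

    /* memcpy takes a byte count, hence the sizeof(float). */
    memcpy(outbuffer, inbuffer, total * sizeof(float));

    float peak = 0.0f;
    for (size_t i = 0; i < total; ++i) {
        float a = inbuffer[i] < 0.0f ? -inbuffer[i] : inbuffer[i];
        if (a > peak) {
            peak = a;
        }
    }
    return peak;
}
```

Inside a real callback the returned peak could be compared against the threshold, and the function would then return FMOD_OK to the mixer.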


@orticelo

I don’t understand what you mean by ‘bit by bit mode’. If you just mean analyze every sample then my answer above should help you too.


[quote:1itkesx1]I don’t understand what you mean by ‘bit by bit mode’. If you just mean analyze every sample then my answer above should help you too.[/quote:1itkesx1]

Yes, I meant that I want to analyze every sample.

But now I am a bit lost.

I was reading other posts concerning FMOD_DSP_READCALLBACK and noticed that my idea of how to use it differs from reality.

I read something like this:
[quote:1itkesx1][quote:1itkesx1]
How can i apply a DSP to the recording input (FMOD_System_RecordStart) ?
[/quote:1itkesx1]
That’s not easy. As you noted DSPs can only be applied to channels/channelgroups so you will have to call System::playSound to play that Sound in a Channel.[/quote:1itkesx1]

But there is one problem: from the moment I start recording, I need to be able to access the samples already recorded and analyze them in order to search for a specific string of data.

As I noticed before, when I call System_PlaySound it plays only the data that has already been recorded, so I presume that the DSP would only have access to those samples which were recorded before calling System_PlaySound.

I also wonder how to check whether any samples have already been stored in inbuffer.

I am also interested in whether inbuffer has a constant size, specified on the callback (as in the example below), or whether it changes as new data is captured.

I also don’t know when exactly the first samples are stored in inbuffer.

Based on the custom DSP example:

[code:1itkesx1]FMOD_DSP_DESCRIPTION dspdesc;

    memset(&dspdesc, 0, sizeof(FMOD_DSP_DESCRIPTION));

    strcpy(dspdesc.name, "My first DSP unit");
    dspdesc.channels     = 0;                   // 0 = whatever comes in, else specify.
    dspdesc.read         = myDSPCallback;   // is this the moment when the first data is stored in inbuffer?
    dspdesc.userdata     = (void *)0x12345678;

[/code:1itkesx1]

In my case I can’t wait too long for recorded samples because each millisecond is important. For example, within 2 seconds I must find a specific pattern in the recorded sound and, based on that pattern, start a new thread as fast as possible, and do a few other things. (The sound to analyze is coming from the microphone.)

I also wonder about one thing connected to

FMOD_RESULT F_CALLBACK FMOD_DSP_READCALLBACK

The documentation says that inbuffer and outbuffer are pointers to floating point data in the -1.0 to +1.0 range.

But I need to operate on 16-bit raw PCM data, so is there an easy solution for this?


[quote:17h0fmxr]But there is one problem: from the moment I start recording, I need to be able to access the samples already recorded and analyze them in order to search for a specific string of data.[/quote:17h0fmxr]
Analysing recorded sound involves latency; that is a fact of life. The speed at which the hardware records the data won’t be 100% constant, so we need to leave a little gap between the recording and playback to prevent the playback overtaking the writing. Fortunately that latency isn’t too large, generally less than 200ms.

[quote:17h0fmxr]But I need to operate on 16-bit raw PCM data, so is there an easy solution for this?[/quote:17h0fmxr]
You can easily convert float data to 16-bit inside your DSP.
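
One possible way to do that conversion is sketched below. The names `float_to_pcm16` and `convert_block` are hypothetical, and the scaling by 32767 with truncation toward zero is an assumption about what is wanted; other scalings or rounding modes are equally valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one sample from the -1.0..+1.0 float range to signed 16-bit.
   Values outside the range are clamped so the cast cannot overflow. */
int16_t float_to_pcm16(float s)
{
    if (s > 1.0f) {
        s = 1.0f;
    }
    if (s < -1.0f) {
        s = -1.0f;
    }
    return (int16_t)(s * 32767.0f);
}

/* Convert a block of count floats; out must have room for count values. */
void convert_block(const float *in, int16_t *out, size_t count)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; ++i) {
        out[i] = float_to_pcm16(in[i]);
    }
}
```

Calling `convert_block` on the input buffer with `length * inchannels` as the count would give interleaved 16-bit data for the whole block.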

Having said that, if you want access to the raw data with minimum delay you could use FMOD::Sound::lock. That is our low-level function to directly access the recording buffer, however it would be quite a bit more complicated. The data is not guaranteed to be 16-bit; you can check the format using Sound::getFormat.
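
Part of the extra complication is that the recording buffer is circular, so the newly recorded region can wrap past the end; Sound::lock accordingly hands back two pointer/length pairs. The bookkeeping might be sketched like this. It makes no FMOD calls, and `Span` and `pending_spans` are hypothetical names. Positions here are in samples, assuming the current position comes from System::getRecordPosition; they would need to be multiplied by channels and bytes per sample before being passed to lock, which works in bytes.

```c
#include <assert.h>

/* A contiguous run of samples inside the circular buffer. */
typedef struct {
    unsigned offset;  /* first sample of the run */
    unsigned length;  /* number of samples in the run, may be 0 */
} Span;

/* Work out which samples were recorded between last_pos (where the
   previous read stopped) and record_pos (the current record position).
   The result is split into a run up to the end of the buffer and a
   run from the start of the buffer. Returns the total sample count. */
unsigned pending_spans(unsigned buffer_len, unsigned last_pos,
                       unsigned record_pos, Span *first, Span *second)
{
    assert(buffer_len > 0 && last_pos < buffer_len && record_pos < buffer_len);

    unsigned pending = (record_pos + buffer_len - last_pos) % buffer_len;
    unsigned to_end  = buffer_len - last_pos;

    first->offset  = last_pos;
    first->length  = pending < to_end ? pending : to_end;
    second->offset = 0;
    second->length = pending - first->length;
    return pending;
}
```

A polling loop would call this each time around, process the two runs in order, and then set `last_pos` to `record_pos`.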

-Pete


Here I am again!

I’ve found another solution for voice activation, but now I’m a bit confused. Capture starts when the noise level rises above the threshold, but that is why I lose 20-30ms of voice data, and in speech recognition this is a big problem.
The recognition algorithm works almost fine, but I cannot use it. If I raise the sample rate from 8kHz to 22kHz the problem goes away, but it takes more time to compare samples. For now we have a demo workaround (before the keyword the speaker must say ‘in’, which makes the program start recording, although ‘in’ cannot be heard in the recorded data), but we cannot expect the user to say ‘in’ before every keyword, because nobody would use our program.
Any solution is welcome…

Thx
