
Hi,

I want to make a game where the user speaks for an avatar using synthetic speech. So I need to record and play back the user's voice with low latency. Unfortunately, this does not seem to be possible with FMOD. I tried the recording example, but I cannot set latency values lower than 60 ms :(

Is there any way to do low latency recording with FMOD, like I can with other media frameworks such as JUCE?

Best regards,
Maxim


Thank you, Brett, for your fast answer.

I was thinking the same thing when I searched the headers for "rate" and "frequency" :)

Unfortunately, the controlpaneloutputrate value seems to be correct only before I call init.
This might not be a big issue, since I can easily determine the output rate up front.
It now looks almost sufficient.
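
To illustrate what I mean, here is a minimal sketch of determining the rate up front (driver id 0 and the setSoftwareFormat parameters are just my assumptions here, ERRCHECK omitted):

[code]
// Sketch: read the control panel output rate *before* init and lock the
// software mixer to it, so no extra resampling is needed later.
FMOD::System    *system = 0;
FMOD::System_Create(&system);

int              controlpanelrate = 0;
FMOD_CAPS        caps             = 0;
FMOD_SPEAKERMODE speakermode      = FMOD_SPEAKERMODE_STEREO;
system->getDriverCaps(0, &caps, &controlpanelrate, &speakermode);

// Feed the reported rate into the mixer before System::init is called.
system->setSoftwareFormat(controlpanelrate, FMOD_SOUND_FORMAT_PCM16,
                          0, 0, FMOD_DSP_RESAMPLER_LINEAR);

system->init(32, FMOD_INIT_NORMAL, 0);
[/code]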

The workflow works with several devices, but not with a single mono USB microphone (set to 44100 Hz via Audio MIDI Setup).

I get "AUHAL::AUIOProc: mono buffer too small (1024 > 512)" on the Console output. Any idea what I do wrong? I already tried to tweak DSP Buffer sizes
and the amount of channels in the user sound.


[b]AUHAL::AUIOProc: mono buffer too small (1024 > 512)[/b]

Any ideas on this? I am really lost :) It looks like a configuration problem on my side, but I really don't understand what the error message means.


Ohh, when I first read this thread I wasn't sure if this was the same thing I experienced 6 months ago. I didn't know at that point whether I was doing something wrong or not.

But with Maxim's finding of 60 ms I could alter the demo program "recording.xcodeproj" to explain the issue a bit more. I just inserted 4 lines of code into the example. Those should make the example capture and play back immediately. The buffer size allocated for exinfo.length does not seem to influence the behaviour. The only thing that seems to make it work is the usleep(1000*60). Everything below 60 ms makes the sound crackle. This is really odd to me, since GarageBand can do the same with less than 6 ms latency.

The weird thing is that I can reproduce this issue on Windows as well, so it is not a Mac OS specific issue. Maybe we need to tune some other buffer sizes? I know that decreasing the latency might influence the DSP chain, but without even having one it should be possible to route from in to out in less than 10 ms with any sound card (currently this is the on-board MacBook Pro sound device).

[code]
result = system->createSound(0, FMOD_2D | FMOD_SOFTWARE | FMOD_OPENUSER, &exinfo, &sound);
ERRCHECK(result);

printf("===================================================================\n");
printf("Recording example.  Copyright (c) Firelight Technologies 2004-2011.\n");
printf("===================================================================\n");
printf("\n");
printf("Press 'r' to record a 5 second segment of audio and write it to a wav file.\n");
printf("Press 'p' to play the 5 second segment of audio.\n");
printf("Press 'l' to turn looping on/off.\n");
printf("Press 's' to stop recording and playback.\n");
printf("Press 'w' to save the 5 second segment to a wav file.\n");
printf("Press 'Esc' to quit\n");
printf("\n");

/** inserted */
sound->setMode(FMOD_LOOP_NORMAL);
result = system->recordStart(recorddriver, sound, true);
usleep(1000*60); // <--- this is the Point Of Interest
result = system->playSound(FMOD_CHANNEL_REUSE, sound, false, &channel);
/** inserted ends */

/*
    Main loop.
*/
do

[/code]


WOW, I just found a solution for the entire problem 😀 You won't believe it.

When I tried to find a solution for the "AUHAL::AUIOProc:" problem, I found that starting the recording twice seems to fix the error message. So I tried that without even capturing into a separate buffer and playing it back with playDSP. It sounds weird, but it works!

Instead of
[code]result = systemRecord->recordStart(recorddriver, sound, true);[/code]

I call:
[code]result = systemRecord->recordStart(recorddriver, sound, true);
result = systemRecord->recordStart(recorddriver, sound, true);[/code]

This fixes the AUHAL problem and the high latency on Mac OS X and on Windows.

Now I am happy to have a solution for this with very minimal changes to my original code :)


Hi,

where are the guys from Firelight? Brett, any suggestions?

Best regards,
Maxim


Does nobody use the recording feature?


I just searched the forum for the same topic and I found this thread: viewtopic.php?f=7&t=13817&hilit=record

Furthermore, I observed that the internal ring buffers are not working correctly, at least with recording. I will try to summarize this observation:

  1. Assume we take a looped recording sound of 1 second. This is far larger than any buffer size used by CoreAudio or DirectSound/ASIO. Am I right?
  2. We start recording and play the recorded sound immediately (almost no gap between recordStart and play, only the function calls in between).

[b]What do I expect?[/b]
I would expect that if the playback is too fast, it should stall until the ring buffer has enough data to be played back.

But that doesn't happen. The playback stutters over and over again instead of settling into smooth playback.

[b]Conclusion:[/b]
From my current perspective the recording API is not mature enough to be used in any use case other than a linear "record –> wait –> play" scenario. A sleep between recordStart and play is not really good API design. Instead I would expect a ring buffer that waits until enough data is available. Maybe it's an idea to let the API user decide what to do? A callback-based API for recording would also be great, so this behaviour would be user defined.
Right now a minimum of 60 ms of latency is really high for some applications. I tried PortAudio with a very simple sample that is equivalent to the one above. I could reach direct playback with no additional latency beyond the buffer sizes; the buffers were 6 ms and it recorded and played flawlessly.


Hi lemart,

thank you very much for your feedback! I think so too. Currently FMOD is not yet suitable for low latency recording :( But I don't understand this, because FMOD Ex is made for game development and many modern games use voice interaction. PortAudio and Bass.dll have excellent recording support – so please, Firelight, improve this part of your library.

How can we reach support directly? It seems that nobody from Firelight reads this forum :(

Best regards,
Maxim


One tip I would recommend if you are using WASAPI is to create your record buffer using the same sample rate as reported by getRecordDriverCaps. FMOD has to resample the buffer and introduce a double buffer if you use a sample rate that is not the same as the system rate.
The best way to get good latency from the OS with record to playback is to use FMOD_OUTPUTTYPE_ASIO, because that is a driver level implementation that actually synchronizes the record and playback streams into the same thread. You can easily get sub 20ms record to playback with that API.
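
For example, something along these lines (a rough sketch only; the mono/PCM16 format and the one second length are arbitrary example values):

[code]
// Sketch: create the user record sound at a rate the record driver actually
// reports, so FMOD does not have to resample or add a double buffer.
int       minfreq = 0, maxfreq = 0;
FMOD_CAPS reccaps = 0;
result = system->getRecordDriverCaps(recorddriver, &reccaps, &minfreq, &maxfreq);
ERRCHECK(result);

FMOD_CREATESOUNDEXINFO exinfo;
memset(&exinfo, 0, sizeof(FMOD_CREATESOUNDEXINFO));
exinfo.cbsize           = sizeof(FMOD_CREATESOUNDEXINFO);
exinfo.numchannels      = 1;
exinfo.format           = FMOD_SOUND_FORMAT_PCM16;
exinfo.defaultfrequency = maxfreq;                    // a rate the driver supports
exinfo.length           = exinfo.defaultfrequency * sizeof(short) * exinfo.numchannels; // 1 second

result = system->createSound(0, FMOD_2D | FMOD_SOFTWARE | FMOD_OPENUSER, &exinfo, &sound);
ERRCHECK(result);
[/code]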

Apart from this we are directly exposing the recording API of the OS. You're not going to suddenly get better latency if we provide some 'different' API through FMOD. When you call System::getRecordPosition in FMOD, you are directly calling the same function in DirectSound, for example. Next you call Sound::lock/unlock to get that data. You cannot get any more direct than this. The lowest latency way to do it is to watch the record cursor position, and then as soon as it changes, lock and copy the new data (usually this happens in blocks).
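
As a rough sketch of that copy loop, dropped into the recording example (the 'soundlength' variable, meaning the record sound's length in PCM samples, and the PCM16 mono byte maths are assumptions to keep it short):

[code]
// Sketch: watch the record cursor and copy newly recorded PCM as soon as it
// appears. 'system', 'sound' and 'recorddriver' are the same as in the
// recording example; 'soundlength' is the record sound length in PCM samples.
unsigned int lastpos = 0;
for (;;)
{
    unsigned int recordpos = 0;
    result = system->getRecordPosition(recorddriver, &recordpos);
    ERRCHECK(result);

    if (recordpos != lastpos)
    {
        void        *ptr1 = 0, *ptr2 = 0;
        unsigned int len1 = 0,  len2 = 0;

        // New samples since last time, handling the wrap of the looping sound.
        unsigned int blocklength = (recordpos >= lastpos)
                                 ? (recordpos - lastpos)
                                 : (recordpos + soundlength - lastpos);

        // Lock the region that was just written (offset/length in bytes, PCM16 mono).
        result = sound->lock(lastpos * 2, blocklength * 2, &ptr1, &ptr2, &len1, &len2);
        if (result == FMOD_OK)
        {
            if (ptr1 && len1) { /* copy len1 bytes from ptr1 */ }
            if (ptr2 && len2) { /* copy len2 bytes from ptr2 */ }
            sound->unlock(ptr1, ptr2, len1, len2);
        }
        lastpos = recordpos;
    }

    system->update();
    // sleep for ~1 ms here so the loop does not spin
}
[/code]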

You're talking about the record example's way of playing back recorded data. That is an extremely simple, one-line-of-code way to play some sound coming through the microphone. It is -not- an example of how to design a low level chat system. It is not meant to be. The examples are as basic and simple as they can possibly be, to show you how to use the API, not pages of code that confuse the user.
Rather than sleeping for an arbitrary time, the record cursor could be watched until it has moved by a small amount. You also risk the issue of different drivers stuttering because they don't have good internal buffer sizes or latency. That's why, to be compatible, you generally have to be generous with the start delay.
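
Applied to the record example above, that would be something like this (the ~20 ms threshold is only an example figure, not a recommendation):

[code]
// Sketch: instead of usleep(1000*60), wait until the record cursor has
// advanced by a small, driver-friendly amount before starting playback.
unsigned int startthreshold = 44100 / 50;   // ~20 ms at 44.1 kHz, in PCM samples
unsigned int recordpos      = 0;
do
{
    result = system->getRecordPosition(recorddriver, &recordpos);
    ERRCHECK(result);
    system->update();
    // sleep for ~1 ms here to avoid spinning
} while (recordpos < startthreshold);

result = system->playSound(FMOD_CHANNEL_REUSE, sound, false, &channel);
[/code]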

We are going to update the recording API soon to allow things like DSP chains on a record stream (so you can analyze and filter it without playing it). There are also going to be some other recording changes soon enough to match the release of a new product we are working on.
There are plenty of systems using FMOD's recording to provide voice support in games. World of Warcraft and other games' voice chat uses FMOD recording and playback, and there are a multitude of voice apps on the iOS App Store doing the same.
lemart – did you know that just about every single forum post, let's take the front page, has a reply from someone at FMOD? Did you also see the support page on our website that tells you exactly how to contact us? Why don't you use it?


Hi Brett,

thanks for your answer.

I'm working on Mac, so I'm using CoreAudio – no ASIO, no WASAPI :)

IMO the problem is not the recording API (getRecordPosition()…) but the playback. Before I can play back the recorded audio, I have to sleep for an unspecified amount of time. There is no callback for when enough recorded data is available to play back… This sleep time is the resulting delay.

Is there any code example of how to record and play back audio data from the mic without any delay (<15 ms)?

Best regards,
Maxim


I actually found a partial solution to the problem.

  1. record to a looped sound
  2. play it immediately on a channel
  3. attach a custom DSP #1 to channel #1
  4. save the float buffer to a circulating sound queue (a std::vector would probably also be sufficient in this case)
  5. create a custom DSP #2
  6. playDSP with the data from the sound queue on a new channel #2

As long as the sampling frequency and playback frequency are equal, everything is great. I get a latency comparable to every other audio layer; depending on the DSP buffer size it is less than 10 ms.
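
Roughly, steps 3-6 look like the sketch below. This is only an illustration of the idea, not my actual code: the unguarded std::deque, the mono handling and the two-argument addDSP call are simplifications/assumptions.

[code]
// Sketch: DSP #1 captures the float buffer of channel #1 into a queue,
// DSP #2 is played with playDSP on channel #2 and pulls it back out.
// A plain std::deque is used purely for illustration; a lock-free FIFO
// would be the proper choice for audio callbacks.
#include <deque>

static std::deque<float> g_queue;   // mono float samples captured by DSP #1

// DSP #1: attached to the channel playing the looping record sound.
FMOD_RESULT F_CALLBACK captureCallback(FMOD_DSP_STATE *dsp_state,
    float *inbuffer, float *outbuffer, unsigned int length,
    int inchannels, int outchannels)
{
    for (unsigned int s = 0; s < length; s++)
    {
        float in = inbuffer[s * inchannels];            // take channel 0 only
        g_queue.push_back(in);
        for (int c = 0; c < outchannels; c++)
            outbuffer[s * outchannels + c] = in;        // pass audio through
    }
    return FMOD_OK;
}

// DSP #2: played directly with playDSP, feeds the queued samples out again.
FMOD_RESULT F_CALLBACK playbackCallback(FMOD_DSP_STATE *dsp_state,
    float *inbuffer, float *outbuffer, unsigned int length,
    int inchannels, int outchannels)
{
    for (unsigned int s = 0; s < length; s++)
    {
        float v = 0.0f;
        if (!g_queue.empty()) { v = g_queue.front(); g_queue.pop_front(); }
        for (int c = 0; c < outchannels; c++)
            outbuffer[s * outchannels + c] = v;
    }
    return FMOD_OK;
}

// Hook-up, after recordStart and after playing the record sound on 'channel':
FMOD_DSP_DESCRIPTION desc;
memset(&desc, 0, sizeof(desc));
desc.read = captureCallback;
FMOD::DSP *capturedsp = 0;
system->createDSP(&desc, &capturedsp);
channel->addDSP(capturedsp, 0);                         // DSP #1 on channel #1

desc.read = playbackCallback;
FMOD::DSP *playbackdsp = 0;
system->createDSP(&desc, &playbackdsp);
FMOD::Channel *channel2 = 0;
system->playDSP(FMOD_CHANNEL_FREE, playbackdsp, false, &channel2);   // channel #2
[/code]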

I could build my software that way, and the user might also accept the sample rate restriction. I know the technical reason it works is that there is almost no resampler involved anymore. My question is:

[b]How do I detect the input sample rate and the output sample rate of a particular sound device?[/b]

I know I can initialize with setSoftwareFormat and I can query it with getRecordCaps. But setSoftwareFormat just takes everything and converts it internally. In the case above I need to know the real sampling frequency of a sound device. If I had this, I could intersect the output sample rate and the input sample rates to find the highest common sample rate. If they do not match, I would not allow this chain.

Any further ideas?


Use System::getDriverCaps:

FMOD_RESULT F_API getDriverCaps (int id, FMOD_CAPS *caps, int *controlpaneloutputrate, FMOD_SPEAKERMODE *controlpanelspeakermode);

controlpaneloutputrate is the rate you have set in your system.
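
For example (driver id 0 assumed):

[code]
// Sketch: print the output rate currently set in the OS control panel.
int              outputrate  = 0;
FMOD_CAPS        caps        = 0;
FMOD_SPEAKERMODE speakermode = FMOD_SPEAKERMODE_STEREO;

result = system->getDriverCaps(0, &caps, &outputrate, &speakermode);
ERRCHECK(result);
printf("control panel output rate: %d Hz\n", outputrate);
[/code]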
