We are using FMOD’s 3D audio features for a voice conferencing application. The problem we are having is choosing the correct parameter values so that FMOD can play back the sound.
Here are the details of the problem:
We create an FMOD stream using the following:
[code:bcx73gfh]void create_fmod_sound(struct null_audio_stream *stream)
{
    memset(&gExInfo, 0, sizeof(FMOD_CREATESOUNDEXINFO));
    gExInfo.cbsize = sizeof(FMOD_CREATESOUNDEXINFO); /* Required. */
    gExInfo.decodebuffersize = stream->param.samples_per_frame; /* Chunk size of stream update in samples. This will be the amount of data passed to the user callback. */
    gExInfo.length = gExInfo.decodebuffersize
        * (stream->param.bits_per_sample / 8) * 10; /* Length of PCM data in bytes of whole song (for Sound::getLength). */
    gExInfo.numchannels = stream->param.channel_count; /* Number of channels in the sound. */
    gExInfo.defaultfrequency = stream->param.clock_rate; /* Default playback rate of sound. */
    gExInfo.format = (stream->param.bits_per_sample == 16) ? FMOD_SOUND_FORMAT_PCM16 : FMOD_SOUND_FORMAT_PCM8; /* Data format of sound. */
    gExInfo.pcmreadcallback = fmodcallback; /* User callback for reading. */
    gExInfo.pcmsetposcallback = fmodsetposcallback; /* User callback for seeking. */
    FMOD_RESULT result = FMOD_System_CreateStream(gSystem, 0, FMOD_2D | FMOD_SOFTWARE | FMOD_OPENUSER | FMOD_LOOP_NORMAL, &gExInfo, &gSound);
    CHECK_RESULT(result);
}[/code:bcx73gfh]
Setting the parameter values of the gExInfo object is our main problem right now, but I’ll get to that below.
Once the stream is created, the callback function [b:bcx73gfh]fmodcallback[/b:bcx73gfh] is called repeatedly to fill the sound object with frames until the buffer is full.
After the streams are created and a call is requested, the stream is started using:
[code:bcx73gfh]FMOD_System_PlaySound(gSystem, FMOD_CHANNEL_REUSE, gSound, 0, &gChannel);[/code:bcx73gfh]
The sound then plays until the buffer runs empty, at which point FMOD calls fmodcallback again to refill it.
The fmodcallback function looks like this:
[code:bcx73gfh]FMOD_RESULT fmodcallback(FMOD_SOUND *sound, void *data, unsigned int datalen)
{
    struct null_audio_stream *stream = gStream;
    static unsigned int curr_frame = 0;
    pjmedia_frame frame;
    pj_status_t status;

    frame.type = PJMEDIA_FRAME_TYPE_AUDIO;
    frame.buf = malloc(stream->frame_size);
    frame.size = stream->frame_size;
    frame.timestamp.u64 = stream->p_tstamp.u64;
    frame.bit_info = 0;
    status = stream->play_cb(stream->user_data, &frame); /* Writes the audio data from the network socket into frame.buf. */
    memcpy(data, frame.buf, datalen);
    stream->p_tstamp.u64 += stream->param.samples_per_frame;
    free(frame.buf);
    return FMOD_OK;
}[/code:bcx73gfh]
All we do in the callback is copy the audio frames from the SIP sound port, obtained through the call to play_cb, into the data buffer passed to us. The datalen mentioned here is the number of bytes to be written into the data buffer in each callback, which we specify through the gExInfo object when creating the stream. The problem is an inconsistency between what we specify in gExInfo, namely the decodebuffersize (i.e. the number of samples to be updated in each iteration) and the length, and what FMOD actually registers and passes to the callback as datalen.
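The callback as posted copies datalen bytes out of a buffer that holds only frame_size bytes, so it is only safe when the two values match. A minimal sketch of a more defensive fill is shown here, assuming the source delivers fixed-size frames. The names `pull_frame_fn`, `fill_pcm` and `demo_pull` are hypothetical and not part of FMOD or PJMEDIA; in the posted code the role of the pull function would be played by the call to play_cb.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical source callback: writes exactly frame_size bytes into dst
 * and returns 0 on success, non-zero if no frame is available. */
typedef int (*pull_frame_fn)(void *user, void *dst, size_t frame_size);

/* Fill `datalen` bytes of `out` by pulling as many whole frames as fit.
 * Whatever cannot be filled with real audio is zeroed, which is silence
 * for signed 16-bit PCM. Returns the number of bytes of real audio. */
size_t fill_pcm(void *out, size_t datalen, size_t frame_size,
                pull_frame_fn pull, void *user)
{
    unsigned char *dst = out;
    size_t filled = 0;

    if (frame_size == 0 || pull == NULL) {
        memset(dst, 0, datalen);
        return 0;
    }

    /* Pull whole frames while there is room for another one. */
    while (datalen - filled >= frame_size) {
        if (pull(user, dst + filled, frame_size) != 0)
            break; /* source ran dry: pad the rest with silence */
        filled += frame_size;
    }

    memset(dst + filled, 0, datalen - filled);
    return filled;
}

/* Demo source for illustration only: fills each frame with 0x11 and
 * counts how many times it was called through the int pointed to by user. */
int demo_pull(void *user, void *dst, size_t frame_size)
{
    int *calls = user;
    memset(dst, 0x11, frame_size);
    ++*calls;
    return 0;
}
```

With this shape, a request for 6400 bytes and a 640-byte frame would pull ten frames in one callback, while a request for exactly 640 bytes would pull one.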
For instance, we specify the decodebuffersize as 320 (i.e. 320 samples per frame, supposed to be updated in every callback). Each sample is represented in 2 bytes and we have single-channel audio. We set the length of the whole audio to 10 such frames (i.e. total length in bytes = 320 * numchannels * 2 bytes/sample * 10). So in the callback, we should see a datalen of 640 bytes (320 samples * 2 bytes per sample). Instead, the datalen passed to us is 6400 bytes (which is the length of the entire audio).
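The arithmetic above can be written out as a small sketch. The helper names `expected_callback_bytes` and `total_length_bytes` are hypothetical and only mirror the calculation in the post; they say nothing about how FMOD computes datalen internally.

```c
#include <stddef.h>

/* Bytes one read callback would carry if decodebuffersize (given in
 * samples) were honoured exactly, following the post's own arithmetic. */
size_t expected_callback_bytes(unsigned samples_per_frame,
                               unsigned channels,
                               unsigned bits_per_sample)
{
    return (size_t)samples_per_frame * channels * (bits_per_sample / 8u);
}

/* Total length in bytes of a sound made of `frames` such chunks. */
size_t total_length_bytes(unsigned samples_per_frame,
                          unsigned channels,
                          unsigned bits_per_sample,
                          unsigned frames)
{
    return expected_callback_bytes(samples_per_frame, channels,
                                   bits_per_sample) * frames;
}
```

For 320 samples, 1 channel and 16 bits per sample, `expected_callback_bytes` gives 640 and `total_length_bytes` with 10 frames gives 6400, which are the two figures being compared.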
So our queries are:
* What do the various fields in gExInfo represent?
* What kind of value should we supply for each field (e.g. does decodebuffersize take a number of samples rather than bytes, does length take bytes, is defaultfrequency in kHz, MHz, or just Hz)?
* Specifically, what do decodebuffersize and length represent?
* What is the datalen that is passed to the callback? What do the data and sound pointers that are passed to it represent?
* We want to create a continuously playing FMOD audio stream, where we write into the stream whenever it’s empty using its callback. Is our approach of creating the stream and updating it correct as per the FMOD specs? We are following the usercreatedsound example available in the FMOD SDK for Android.
* Can we use FMOD’s 3D capabilities to work on continuous streams? Specifically, can we apply the 3D spatialization effect on the FMOD stream that we create by the above mentioned method, or is it only possible with static preloaded sounds?
* Can you point me to any FMOD examples on streaming audio players, similar to the above implementation?
- msafdari asked 7 years ago
Sorry for not getting back to you on this sooner.
Can you try the newest version of FMOD:
15/05/12 4.40.06 – Stable branch update
- Fix FMOD_OPENUSER streams not passing the user defined decodebuffersize to the
callback and instead passing a value that was rounded to the nearest 256.
There was an issue that meant the size passed in was being rounded and not representative of what the user passed in.
To answer some of your questions:
– defaultfrequency is in Hz. Sorry, we should have said that in the docs.
– decodebuffersize is the internal memory block size for one side of a double buffer. The datalen passed to your callback should reflect this value. length is how long the ‘sound’ is. If it is infinite, just use -1. You may have a sound that is fed 20 ms at a time, but the sound may be 1 minute long; ‘length’ is for the 1 minute.
– Your code basically looks correct.
– Yes, you can just add FMOD_3D to the createSound/Stream call and it will work as a normal sound.
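The difference between the chunk size and the overall length described in the reply can be illustrated with a little arithmetic. This is a rough sketch only: the 16000 Hz clock rate is an assumption (the thread never states it; it is chosen because 320 samples then correspond to the 20 ms mentioned), and the helper names `chunk_duration_ms` and `pcm_length_bytes` are hypothetical, not part of FMOD.

```c
/* Duration in milliseconds of one chunk of `samples_per_chunk` samples
 * at a sample rate given in Hz. */
unsigned int chunk_duration_ms(unsigned int samples_per_chunk,
                               unsigned int rate_hz)
{
    return (samples_per_chunk * 1000u) / rate_hz;
}

/* Size in bytes of `seconds` of PCM audio, i.e. the kind of value that
 * 'length' would describe for a sound of known, finite duration. */
unsigned int pcm_length_bytes(unsigned int seconds,
                              unsigned int rate_hz,
                              unsigned int channels,
                              unsigned int bits_per_sample)
{
    return seconds * rate_hz * channels * (bits_per_sample / 8u);
}
```

Under the assumed 16000 Hz mono 16-bit format, `chunk_duration_ms(320, 16000)` is 20, while `pcm_length_bytes(60, 16000, 1, 16)` is 1920000. The two values describe different things, which is the point of the reply; for a live voice stream with no fixed end, the reply's suggestion of -1 for length applies instead.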