
hi,

we are trying to implement voice over ip using fmod and speex. most of the stuff is implemented and works fine, i.e. encoding / decoding, network transmission etc.

the last remaining problem is the playback of received voice data. we have created a stream sound and feed it the decoded pcm data in the read callback function; the data is enqueued by the network instance as it arrives.

however, the played sound has a periodic wobbling; i don't know how to explain the effect :-( we have played with the decode buffer size a little: enlarging it gives longer stretches of playback without wobbling, but with bigger pauses in between. i suspect that it has something to do with the fact that the decodebuffersize is of fixed size!?

here are some code snippets, any help is highly appreciated 😉


[code:qsyrbwz0]

#define CODEC_FRAME_SIZE 320
#define VOICE_SAMPLE_RATE 16000
#define VOICE_SOUND_FORMAT FMOD_SOUND_FORMAT_PCM16
#define VOICE_DATA_FORMAT_TYPE short

FMOD_RESULT F_CALLBACK voiceReceiverReadPCM( FMOD_SOUND* p_sound, void* p_data, unsigned int datalen )
{
    FMOD::Sound* p_fmodsound = reinterpret_cast< FMOD::Sound* >( p_sound );
    void* p_userdata = NULL;
    p_fmodsound->getUserData( &p_userdata );
    SoundNode* p_soundnode = reinterpret_cast< SoundNode* >( p_userdata );

    if ( !p_soundnode )
        return FMOD_OK;

    VOICE_DATA_FORMAT_TYPE*   p_sndbuffer = reinterpret_cast< VOICE_DATA_FORMAT_TYPE* >( p_data );
    unsigned int    cnt = datalen / sizeof( VOICE_DATA_FORMAT_TYPE );

    // get the proper sample queue and put the samples into raw sound buffer
    SoundSampleQueue& samplequeue = *p_soundnode->_p_sampleQueue;
    if ( !samplequeue.empty() )
    {
        //log_verbose << "recv bytes: " << samplequeue.size() << std::endl;

        // handle buffer underrun: copy what is available, pad the rest with silence
        unsigned int avail = cnt;
        if ( samplequeue.size() < cnt )
        {
            log_verbose << "playback buffer underrun: " << samplequeue.size() << ", " << cnt << std::endl;
            avail = static_cast< unsigned int >( samplequeue.size() );
        }
        for ( unsigned int i = avail; i > 0; --i )
        {
            *p_sndbuffer++ = samplequeue.front();
            samplequeue.pop();
        }
        // otherwise stale samples may remain in the tail of the buffer
        memset( p_sndbuffer, 0, ( cnt - avail ) * sizeof( VOICE_DATA_FORMAT_TYPE ) );
    }
    //! TODO: better turn off the sound when buffer is empty instead of erasing the buffer
    else
    {
        memset( p_sndbuffer, 0, datalen );
    }

    return FMOD_OK;
}

SoundNode* VoiceReceiver::createSoundNode()
{
    // create a sound
    FMOD_RESULT result;

    FMOD_MODE               mode = FMOD_2D | FMOD_OPENUSER | FMOD_SOFTWARE | FMOD_CREATESTREAM;
    FMOD_CREATESOUNDEXINFO  createsoundexinfo;
    memset( &createsoundexinfo, 0, sizeof( FMOD_CREATESOUNDEXINFO ) );
    createsoundexinfo.cbsize            = sizeof( FMOD_CREATESOUNDEXINFO );
    createsoundexinfo.decodebuffersize  = CODEC_FRAME_SIZE; //! TODO: determine a good buffer size, too big sizes result in noise and echos
    createsoundexinfo.length            = -1;//10 * VOICE_SAMPLE_RATE * sizeof( short );
    createsoundexinfo.numchannels       = 1;
    createsoundexinfo.defaultfrequency  = VOICE_SAMPLE_RATE;
    createsoundexinfo.format            = VOICE_SOUND_FORMAT;
    createsoundexinfo.pcmreadcallback   = voiceReceiverReadPCM;

    // create a new node
    SoundNode* p_soundnode = new SoundNode();
    result = _p_soundSystem->createSound( 0, mode, &createsoundexinfo, &p_soundnode->_p_sound );
    _p_soundSystem->playSound( FMOD_CHANNEL_FREE, p_soundnode->_p_sound, false, &p_soundnode->_p_channel );
    p_soundnode->_p_sound->setUserData( static_cast< void* >( p_soundnode ) );

    return p_soundnode;
}

[/code:qsyrbwz0]
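
regarding the decodebuffersize TODO in the snippet: it may help to look at how much playback time a given buffer size covers. the following is only a rough sketch of that arithmetic, assuming decodebuffersize is measured in pcm samples (that is how i read the fmod docs, please double check for your version). the helper names bufferMillis and bufferBytes are made up for illustration and are not part of fmod.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of fmod.

// Playback time in milliseconds covered by 'samples' PCM samples
// (per channel) at 'sampleRate' Hz.
constexpr double bufferMillis( std::uint32_t samples, std::uint32_t sampleRate )
{
    return 1000.0 * samples / sampleRate;
}

// Size in bytes of 'samples' PCM samples for the given channel count
// and sample width (2 bytes for 16 bit PCM).
constexpr std::uint32_t bufferBytes( std::uint32_t samples,
                                     std::uint32_t channels,
                                     std::uint32_t bytesPerSample )
{
    return samples * channels * bytesPerSample;
}
```

with the values from the snippet, bufferMillis( 320, 16000 ) gives 20 ms and bufferBytes( 320, 1, 2 ) gives 640 bytes, so under that assumption one decode buffer holds exactly one codec frame and any network jitter above 20 ms would run the queue dry.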

cheers
boto


Hey boto, I am working on something similar (receiving audio data from the network and playing back the recorded data).

How have you structured the receiving part?
What is the VoiceRecorder class doing, and when are you creating a new SoundNode (every time a new user is talking)?

What I am really asking is: when you receive audio data, what do you do with it? How did you organize it?

Happy for some answers!

Best regards


hi,

brett:

>What happens if the data comes in faster than can be consumed?

the incoming data is queued in _p_sampleQueue, and the queue is then consumed in voiceReceiverReadPCM. i.e. if data comes in faster than it can be played, no problem occurs. in the normal case this should not happen anyway, as the sound data producer works with the same sampling frequency at which the receiver consumes.

are there other mechanisms for playing a stream in addition to what i am doing? concerning your advice to play and rip a sine wave: good idea, gonna do it.
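
to make the queueing idea concrete, here is a minimal sketch of a sample queue shared between the network side and the read callback. it is not the actual VRC code: the class PcmQueue and its methods push and readInto are hypothetical names, and it assumes C++11, 16 bit mono samples, and that the callback may run on a different thread than the network code (hence the mutex). it also caps the queue, so a producer that runs faster than playback cannot make latency grow without bound, and it pads short reads with silence.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical helper for illustration; not part of fmod or the VRC sources.
class PcmQueue
{
public:
    explicit PcmQueue( std::size_t maxSamples ) : _maxSamples( maxSamples ) {}

    // Network side: append decoded samples. If the cap is exceeded,
    // the oldest samples are dropped.
    void push( const short* p_samples, std::size_t count )
    {
        std::lock_guard< std::mutex > lock( _mutex );
        _samples.insert( _samples.end(), p_samples, p_samples + count );
        while ( _samples.size() > _maxSamples )
            _samples.pop_front();
    }

    // Playback side: copy up to 'count' samples into p_out and fill the
    // remainder with silence. Returns the number of real samples copied.
    std::size_t readInto( short* p_out, std::size_t count )
    {
        std::lock_guard< std::mutex > lock( _mutex );
        std::size_t copied = 0;
        while ( copied < count && !_samples.empty() )
        {
            p_out[ copied++ ] = _samples.front();
            _samples.pop_front();
        }
        std::fill( p_out + copied, p_out + count, short( 0 ) );
        return copied;
    }

private:
    std::mutex          _mutex;
    std::deque< short > _samples;
    std::size_t         _maxSamples;
};
```

in a read callback like the one in the first post, the body would then reduce to a single readInto( p_sndbuffer, cnt ) call. dropping the oldest samples is just one possible policy; whether it sounds acceptable depends on how bursty the network delivery is.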

Ymer:

you can get the full source code, visit our project site at http://yag2002.sf.net

just to quickly answer your questions: the sound nodes represent clients which are in a voice group. every client has a receiver and several senders which are connected to the other clients' receivers. the sound node is used for managing the voice groups. the voice groups are built dynamically, depending on the distances between players as they walk around in the 3d world.

if you need more details then come to the #vrc channel on freenode 😉
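
as a rough illustration of distance based voice groups (a sketch only, not how the VRC sources actually do it), one could collect all clients within some hearing range of a listener. the types Vec3 and Client, the function buildVoiceGroup and the hearingRange parameter are all hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not taken from the VRC sources.
struct Vec3
{
    float x, y, z;
};

struct Client
{
    int  id;
    Vec3 pos;
};

// Squared distance between two points; avoids a sqrt per pair.
inline float distanceSquared( const Vec3& a, const Vec3& b )
{
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Returns the ids of all other clients within 'hearingRange' of 'listener'.
std::vector< int > buildVoiceGroup( const Client& listener,
                                    const std::vector< Client >& clients,
                                    float hearingRange )
{
    std::vector< int > group;
    const float rangeSq = hearingRange * hearingRange;
    for ( std::size_t i = 0; i < clients.size(); ++i )
    {
        if ( clients[ i ].id == listener.id )
            continue; // a client does not hear itself
        if ( distanceSquared( listener.pos, clients[ i ].pos ) <= rangeSq )
            group.push_back( clients[ i ].id );
    }
    return group;
}
```

re-running this whenever positions change gives the current group for each listener; senders and receivers would then be connected or disconnected according to the difference from the previous group.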

cheers
boto


boto, thanks for your answer.
No one but chanserver was in #wrc, so I ask you here:
Where do I find the source for the audio stuff you are referring to in this thread? E.g. I can't find SoundNode anywhere in the VRC src.


hi Ymer,

sorry that you could not meet me in #vrc, i am not always there, just sometimes in the evenings for a couple of hours. so we should set up a date for a #vrc meeting 😉

the source code is very new; it was not included in the last official release of VRC, 1.1.0. so you have to check out the sources directly from the svn server. you can find more info about the svn repository on the project site in the section Support. all relevant sources you need are located in the folder

src/gamecode/voice

there is also a web access to the svn:

http://svn.berlios.de/viewcvs/yag2002/t … ode/voice/

be aware that the code is not stable, it is still under development. but it should suffice for getting the concepts 😉

cheers
boto
