
Hi, I have been struggling with this problem for almost a whole day,
and I’m not sure whether it is compiler-specific, a bug in FMOD, or something else.
I don’t think I’m doing anything wrong, but the results are odd whatever I do. If you can, please help.

I’m using Borland Turbo C++ (the newer one, not the old one) and the C interface in it (naturally :( )

so my sample code here is:

FMOD_TAG fm_taginfo;

I’m iterating through the tags. The normal FMOD_TAGDATATYPE_STRING type works correctly,
but the problems start when I encounter (at least) FMOD_TAGDATATYPE_STRING_UTF16.

I have looked into this using the debugger and tested with Winamp 5.5.

Anyway, when I save a tag in Winamp it is saved as UTF-16, I believe,
which is represented in Win32 as wchar_t. In my demo code I do the cast like this:

wchar_t* pszSrc = reinterpret_cast<wchar_t*>(fm_taginfo.data);
int wtf16len = lstrlenW(pszSrc);

As you can see from the attached picture, FMOD_TAG already contains the wrong .datalen before the cast, so I don’t believe the problem is with the cast itself.

Anyway, I managed to find out that the first 16-bit value in the string is
the UTF-16 byte order mark (BOM), 0xFEFF.

Then comes the actual tag information, in this case "Demo", followed by these extra values: 0xAB00, 0xABAB, 0xABAB, 0xABAB, 0xFEAB, 0xFEEE, 0xFEEE, and last but not least the usual ‘\0’ which terminates the string.

So my question is: what are the values above? I do know that some (Windows?) functions fill reserved blocks with 0xAB00 etc. (see http://www.samblackburn.com/wfc/technotes/WTN006.htm)

But if so, I believe those bytes should come after the ‘\0’, not before, unless the null termination is done manually somewhere lower down?

Also, I could not find anything suggesting that these bytes belong to UTF-16 or Unicode.

I have attached a picture, which can be found at:
[url:liucxwyk]http://img237.imageshack.us/img237/4008/wtf16lf1.jpg[/url:liucxwyk]


[quote="mathew":1zzjszfw]Hi guys, I just ran through the debugger to check out this tag issue you are having. I don’t see anything incorrect going on.

The file posted to the forum has 1 tag; it’s a UTF-16 text title tag, id = TIT2.

The length of the tag is 23 bytes: the first 2 bytes are the BOM (0xFFFE), the next 20 bytes are the string (which is "Silverfang"), and the last byte is the null terminator.

EDIT:

Okay, so I see what your issue probably is: you are casting to WCHAR but only have a single null in the data stream. I can probably change that so it puts 2 nulls, so your string won’t just wander off into memory.[/quote:1zzjszfw]

— Can be deleted. (removed the pic)


Because of little-endian byte order, the bytes at your position [11] are actually 00 AB, meaning the single null byte is in the right position; hopefully it will become 00 00 so the string is properly terminated for Unicode. Since DataLen seems to include the null termination, it should then be 24, including the Unicode null character…
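
To make the byte order point concrete, here is a small sketch. The helper name `readLittleEndian16` is made up for illustration and is not part of FMOD. It combines two consecutive bytes the way a little-endian CPU such as x86 does when the memory is read through a 16-bit pointer:

```cpp
#include <cstdint>

// Hypothetical helper (not an FMOD function): build one 16-bit value from
// two consecutive bytes, low byte first, as a little-endian CPU does.
inline std::uint16_t readLittleEndian16(const unsigned char* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}
```

With this, the bytes 00 AB read as 0xAB00 (the value the debugger shows at [11]), and the BOM bytes FF FE read as 0xFEFF.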


I also checked with Visual Studio and the C++ interface: same results. Am I doing something wrong?

I also tested with the latest stable FMOD and the version before it.


Thanks Controller, that is correct. Currently it reports datalen as 23 bytes: 2 (BOM) + 20 (string) + 1 (terminator). This is incorrect, though; datalen should be 24 bytes: 2 (BOM) + 20 (string) + 2 (terminator). I have made this change, and it will be available in the release that we are building at the moment.


Maybe Winamp does something wrong…?
Can you upload the file?


[quote="Controller":3bhtrrjs]Because of little-endian byte order, the bytes at your position [11] are actually 00 AB, meaning the single null byte is in the right position; hopefully it will become 00 00 so the string is properly terminated for Unicode. Since DataLen seems to include the null termination, it should then be 24, including the Unicode null character…[/quote:3bhtrrjs]
Hi Controller, so those extra values ([11] 0xAB00, [12] 0xABAB, [13] 0xABAB, [14] 0xABAB, [15] 0xFEAB) are just junk before the next null terminator in memory, because reading the data as wide characters ignores the single-byte terminator and uses the next one? (Just clarifying this for myself.)

[quote="mathew":3bhtrrjs]Thanks Controller, that is correct. Currently it reports datalen as 23 bytes: 2 (BOM) + 20 (string) + 1 (terminator). This is incorrect, though; datalen should be 24 bytes: 2 (BOM) + 20 (string) + 2 (terminator). I have made this change, and it will be available in the release that we are building at the moment.[/quote:3bhtrrjs]
Excellent. I’ll be removing the above pic and the example.mp3 shortly, so the case will be closed; another "bug/feature" found then :)

Thanks to all (even though I fixed the case a long time ago and we only discussed the reason for it) 8)


[quote="Controller":1i5kmxfh]Maybe Winamp does something wrong…?
Can you upload the file?[/quote:1i5kmxfh]

You can make such files yourself in Winamp: just edit the ID3v2 tag (select an item from the playlist and press ALT+3), and you should have tags in UTF-16. If not, remove the tag and then recreate it.

Also, I just verified with a hex editor that the extra data is not saved in the file (and Windows Explorer reports the same name correctly anyway). Only the BOM is saved, which should be saved anyway.


Yes, when you have wide chars you need 0x0000 to properly terminate the string, otherwise you will see whatever is next in memory, until eventually you come across a 0x0000.
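
As a rough illustration of that point, the sketch below counts 16-bit units up to the first 0x0000 but never reads past a caller-supplied limit, so a missing wide terminator cannot make it wander into neighbouring memory. The function name `boundedLength16` is an assumption for this example; it is not an FMOD or Win32 call.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: length of a 16-bit string in code units.
// Stops at the first 0x0000 unit or after maxUnits units, whichever
// comes first, so it never reads beyond the buffer it was told about.
std::size_t boundedLength16(const std::uint16_t* s, std::size_t maxUnits)
{
    std::size_t n = 0;
    while (n < maxUnits && s[n] != 0)
    {
        ++n;
    }
    return n;
}
```

A unit such as 0xAB00 contains a zero byte but is not a zero unit, which is why an unbounded scan like lstrlenW keeps going past it.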


Getting Unicode tags isn’t that easy on Windows 9x (when I added support to my player it was even tricky on Windows 2000, because Winamp didn’t yet support (!) input (!) of Unicode tags at that time, so I hex-edited).

If you can upload your file, I can check with my fmod / fmodex based player to see if it works with fmod / fmodex at all.

I can provide you with source code (PureBasic) that should work with all valid tags (if some idiots have tags with chr(1) etc. at the beginning, I can’t help them).


So 0x000000 would be better, in case the length is odd for some reason…


[quote="Controller":x2q4557l]Getting Unicode tags isn’t that easy on Windows 9x (when I added support to my player it was even tricky on Windows 2000, because Winamp didn’t yet support (!) input (!) of Unicode tags at that time, so I hex-edited).

If you can upload your file, I can check with my fmod / fmodex based player to see if it works with fmod / fmodex at all.

I can provide you with source code (PureBasic) that should work with all valid tags (if some idiots have tags with chr(1) etc. at the beginning, I can’t help them).[/quote:x2q4557l]

Here is a sample: <link removed, not required anymore>
It includes one tag in UTF-16 (or at least FMOD reports so).

Yeah, I can’t paste any code here, but the code does work with STRING type tags. (And the data length for UTF16 type strings is already wrong when getTag returns.) But please give it a try.

I could write example code that does getTag etc., but then again you could look at the example in the documentation/SDK (the SDK sample, however, supports only type STRING).


It should be safe with what I have changed, because I make the assumption that the tag data itself does not have a null terminator (sometimes it does, sometimes it doesn’t). So if it’s UTF-16 it will get 2 terminators plus any terminator already in the tag. I also update the length of the tag to ensure our additional terminator is included.
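
For anyone stuck on an older build, the same idea can be approximated on the caller's side. This is only a sketch of the approach described above, not FMOD's actual implementation, and `copyWithWideTerminator` is a made-up name: copy the raw tag bytes and always append two zero bytes, regardless of what the data already ends with.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical caller-side workaround (not FMOD code): copy len bytes of
// raw tag data and append two zero bytes, so the copy always ends in a
// full 16-bit terminator even if the source had one null byte or none.
std::vector<unsigned char> copyWithWideTerminator(const void* data, std::size_t len)
{
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::vector<unsigned char> out(bytes, bytes + len);
    out.push_back(0);
    out.push_back(0);
    return out;
}
```

The returned buffer is `len + 2` bytes long, which mirrors the length update mentioned above.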


(Tested with FModEx 4.15.01 (and quicktest with FMod 3.75))
Works fine here, the tag is unicode / wide (so there are no real characters)

Data length = 23 (=bytes)… correct, but it’s Unicode (widechar) and the BOM is 2 bytes, so shouldn’t the length be a multiple of 2?
This should work with unicode tag fields (no particular basic dialect):

If DataLen > 3
  If IsBOM(*ptrData)
    *ptrData + 2 ' ignore the BOM

    ' Subtract the BOM length, make DataLen a multiple of 2, and divide by 2 to get the length in characters
    If (DataLen % 2)
      ' odd length:
      DataLen = (DataLen - 3) / 2
    Else
      ' even length:
      DataLen = (DataLen - 2) / 2
    EndIf
    ' Tag should be ok, however the terminating null character may be invalid / missing etc.;
    ' if you don't have a function to copy strings with a max-length parameter, copy to temporary memory, terminate manually and read it;
    ' in PureBasic it would look like
    xChar1 = PeekS(xLong1, xLong2, #PB_Unicode)
  EndIf

EndIf

Rounding down and dividing by 2 should give the actual maximum length in characters, regardless of the terminating null character.
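
For C++ readers, the same length arithmetic might look like the sketch below. The function name `utf16UnitsAfterBom` is invented for this example, and it assumes the data starts with a 2-byte BOM:

```cpp
#include <cstddef>

// Hypothetical helper: maximum number of 16-bit units that can follow a
// 2-byte BOM in a buffer of dataLen bytes. Integer division rounds down,
// so a stray odd trailing byte is ignored.
std::size_t utf16UnitsAfterBom(std::size_t dataLen)
{
    if (dataLen < 2)
    {
        return 0; // too short to even hold the BOM
    }
    return (dataLen - 2) / 2;
}
```

For the 23-byte tag discussed here this gives 10 units ("Silverfang"). For a 24-byte tag it gives 11, which then includes the 16-bit terminator, so the result is an upper bound rather than the exact text length.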

I just re-used a few old debug functions here, which output DataLen and DataType as well as each character and its hex value(s); bytes beyond DataLen are not processed:
23, 4
_ _ S i l v e r f a n g
FFFE530069006C00760065007200660061006E00670000


And it works with the latest version, as it should 😀


[quote="Controller":3lsxjw8m](Tested with FModEx 4.15.01 (and quicktest with FMod 3.75))
Works fine here, the tag is unicode / wide (so there are no real characters)

Data length = 23 (=bytes)… correct, but it’s Unicode (widechar) and the BOM is 2 bytes, so shouldn’t the length be a multiple of 2?
This should work with unicode tag fields (no particular basic dialect):

If DataLen > 3
  If IsBOM(*ptrData)
    *ptrData + 2 ' ignore the BOM

clip
[/quote:3lsxjw8m]

Yeah, I know how to calculate it and how to strip the BOM, but still (in my opinion) the null termination should come before the "garbage", not after, which would eliminate the calculations. And this should be done in FMOD: it feels stupid to parse data that FMOD should parse automatically (so that it returns only the possible byte order AND the tag with null termination). Or am I seriously wrong?

But if this is normal when dealing with Unicode, I’ll write something like the above for the WTFStrings then..


In the Turbo C++ example I did it like this; naturally the function is only called when encountering STRING_UTF16:

[code:1q4o83c0]
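wchar_t * pszTAG = (wchar_t*)fm_taginfo.data;
AnsiString szAnsiTAG;

    // Check the length first, then look for the UTF-16 BOM (0xFEFF)
    if((fm_taginfo.datalen > 3) && (*pszTAG == 0xFEFF))
    {
        ++pszTAG; // skip the BOM

        // Remaining bytes / 2 = number of wide characters (rounds down)
        int tagStrLength = ((fm_taginfo.datalen-2) / 2);
        szAnsiTAG = WideCharLenToString (pszTAG, tagStrLength);
    }

    return szAnsiTAG;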

[/code:1q4o83c0]
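
For compilers without the VCL, a portable variant might look like the following. This is a sketch under assumptions: it needs a C++11 compiler for `std::u16string`, the name `decodeUtf16Tag` is made up, and treating BOM-less data as little-endian is my own choice rather than anything FMOD documents. It reads byte by byte, so it never depends on a terminator being present, and it also handles a big-endian BOM.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode raw UTF-16 tag bytes into a std::u16string.
// Skips a leading BOM if present, stops at the first 16-bit zero unit or
// at the end of the buffer, and ignores a stray odd trailing byte.
std::u16string decodeUtf16Tag(const void* data, std::size_t len)
{
    const unsigned char* p = static_cast<const unsigned char*>(data);
    bool bigEndian = false; // assumption: little-endian when there is no BOM
    std::size_t i = 0;

    if (len >= 2)
    {
        if (p[0] == 0xFF && p[1] == 0xFE)      { i = 2; }                   // little-endian BOM
        else if (p[0] == 0xFE && p[1] == 0xFF) { bigEndian = true; i = 2; } // big-endian BOM
    }

    std::u16string out;
    for (; i + 1 < len; i += 2)
    {
        const char16_t unit = bigEndian
            ? static_cast<char16_t>((p[i] << 8) | p[i + 1])
            : static_cast<char16_t>(p[i] | (p[i + 1] << 8));
        if (unit == 0)
        {
            break; // proper 16-bit terminator
        }
        out.push_back(unit);
    }
    return out;
}
```

Called with `fm_taginfo.data` and `fm_taginfo.datalen`, this should behave the same whether datalen is reported as 23 or 24 for the sample tag.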

But then again, almost none of the above would be needed if the function only returned what is expected :). Hopefully I will still get some FMOD developer insight, or some explanation for the extra data that is not supposed to be there.

And Controller, you were right about the length: the length is reported correctly, only the data is wrong (or the null termination is in the wrong place). Thanks :)


Hi guys, I just ran through the debugger to check out this tag issue you are having. I don’t see anything incorrect going on.

The file posted to the forum has 1 tag; it’s a UTF-16 text title tag, id = TIT2.

The length of the tag is 23 bytes: the first 2 bytes are the BOM (0xFFFE), the next 20 bytes are the string (which is "Silverfang"), and the last byte is the null terminator.

EDIT:

Okay, so I see what your issue probably is: you are casting to WCHAR but only have a single null in the data stream. I can probably change that so it puts 2 nulls, so your string won’t just wander off into memory.
