Internationalization woes

Post by andrewj » 10 May 2015, 03:20

I recently had a problem where a user could not run one of my C++ programs because their Windows username contained an accented character. fopen() would succeed for most things, but failed on paths coming from the windowing toolkit's file-chooser dialog, since the dialog returned a UTF-8 encoded string instead of whatever 8-bit codepage Windows was using.

I read a bit about the fopen() function in C and C++, and the standards do not specify the character encoding of the filename; it is up to the implementation.

Linux effectively uses UTF-8 for all its file access functions (the kernel just treats filenames as byte strings, so I guess whatever you hand to the standard C and C++ libraries passes straight through) -- and most of the time everything "just works".

For Windows, though, I don't know exactly what to do. I would prefer to keep all filename strings as UTF-8 encoded Unicode, and I could have an fopen() replacement in my own code, but then any library I use that calls fopen() itself is not going to work. That's the real sticky part -- third party libraries.

Anyone with experience with this? What was your solution?

Re: Internationalization woes

Post by Sauer2 » 10 May 2015, 10:29

A quick search turned up that filename strings are stored internally as UTF-16 on Windows, and that fopen() can't take wide-character strings.

A Stack Overflow page has some countermeasures, which should probably be #ifdef'ed for Windows:
http://stackoverflow.com/questions/2050 ... -stored-as
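
For illustration, here is a minimal sketch of what such an #ifdef'ed wrapper might look like. It is not taken from the linked page. The names fopen_utf8() and utf8_to_wide() are made up for this example, and it assumes filenames are kept as UTF-8 everywhere in the program. On Windows it converts to UTF-16 with MultiByteToWideChar() and calls _wfopen(); elsewhere it just forwards to fopen().

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <wchar.h>
#include <windows.h>

// Hypothetical helper: convert UTF-8 to UTF-16 via the Win32 API.
// Returns an empty string if the input is empty or not valid UTF-8.
static std::wstring utf8_to_wide(const std::string& s)
{
    if (s.empty())
        return std::wstring();

    // First call asks for the required length in wide characters.
    int len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                  s.data(), static_cast<int>(s.size()),
                                  nullptr, 0);
    if (len <= 0)
        return std::wstring();

    std::wstring out(static_cast<size_t>(len), L'\0');
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        s.data(), static_cast<int>(s.size()),
                        &out[0], len);
    return out;
}
#endif

// Hypothetical fopen() replacement taking a UTF-8 filename.
// Returns nullptr on failure, like fopen().
std::FILE* fopen_utf8(const char* filename, const char* mode)
{
#ifdef _WIN32
    // Windows: go through the wide-character variant of fopen().
    std::wstring wname = utf8_to_wide(filename);
    std::wstring wmode = utf8_to_wide(mode);
    if (wname.empty() || wmode.empty())
        return nullptr;
    return _wfopen(wname.c_str(), wmode.c_str());
#else
    // Elsewhere the filename bytes are passed through unchanged.
    return std::fopen(filename, mode);
#endif
}
```

Calling fopen_utf8(path, "rb") in place of fopen(path, "rb") would cover your own code only. It does nothing for third party libraries that call fopen() internally, so that part of the question stays open.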
