We use std::string and plain old C strings. Now we're trying to install it in Iceland, and we're running into problems where the Icelandic characters get garbled. We are working through our issues, but I was wondering: is there a good guide for writing C++ code that is designed for 8-bit characters and that will work properly when it is given UTF-8 data?

You may run into trouble because the C++ standard requires wide streams to convert wide characters to a byte-oriented encoding when writing to a file, and how this conversion is done is implementation-dependent.

Unfortunately, the console font has to be one that supports the codepage, and I can't see a way to set the font.

UTF-8 is a variable-width character encoding defined by the Unicode standard. It can encode all 1,112,064 valid code points in Unicode using one to four 8-bit bytes, so a single character may occupy more than one byte.
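To make the "one to four 8-bit bytes" point concrete, here is a minimal sketch of how a single code point maps to UTF-8 bytes. The function name encode_utf8 is made up for illustration and is not part of the standard library; treat it as a sketch rather than a vetted implementation.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not a standard function): encode one Unicode code
// point as UTF-8. Returns an empty string for values that cannot be encoded:
// surrogates (U+D800..U+DFFF) and anything above U+10FFFF.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp <= 0x7F) {
        // 0xxx xxxx: plain ASCII, one byte
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        // 110x xxxx 10xx xxxx: two bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return out;  // surrogates are not valid code points to encode
        }
        // 1110 xxxx 10xx xxxx 10xx xxxx: three bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // 1111 0xxx 10xx xxxx 10xx xxxx 10xx xxxx: four bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, the Icelandic capital thorn (U+00DE) comes out as the two bytes 0xC3 0x9E, while anything in the ASCII range stays a single byte.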

The standard bitmap fonts only support the system default OEM codepage.

Just be 8-bit clean, for the most part. The Icelandic alphabet is entirely contained in ISO 8859-1, and hence in Windows-1252.

Why did UTF-8 replace the ASCII character-encoding standard?
A) UTF-8 only uses 128 values
B) UTF-8 can store a character in more than one byte
C) ASCII can store a character in more than one byte
D) ASCII can represent emoji

In UTF-8, the high bits of the first byte of a character tell you how many bytes the character occupies. That is:

110x xxxx = two-byte character
1110 xxxx = three-byte character
1111 0xxx = four-byte character
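The lead-byte patterns above translate fairly directly into bit masks. The following is a rough sketch, not a validator: the names utf8_sequence_length and count_code_points are invented for this example, and the counting function assumes the input is already well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of bytes in the UTF-8 sequence that starts
// with `lead`. Returns 0 for a continuation byte (10xx xxxx) or any byte
// that matches none of the lead-byte patterns. It does not reject overlong
// or out-of-range encodings.
std::size_t utf8_sequence_length(unsigned char lead) {
    if ((lead & 0x80) == 0x00) return 1;  // 0xxx xxxx = single-byte (ASCII)
    if ((lead & 0xE0) == 0xC0) return 2;  // 110x xxxx = two-byte character
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110 xxxx = three-byte character
    if ((lead & 0xF8) == 0xF0) return 4;  // 1111 0xxx = four-byte character
    return 0;
}

// Hypothetical helper: count code points by counting every byte that is
// NOT a continuation byte. Assumes `s` holds well-formed UTF-8.
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

This is mostly useful as a reminder that std::string::size() reports bytes, not characters: a name like "Þór" is three code points but five bytes in UTF-8.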

If this is a console-mode application, be aware that the console uses IBM codepages, so (depending on the system locale) it might display in 437, 850, or 861.

For the most part, these applications just pass data around; they don't "process" the text in any way other than copying it from place to place.
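If the code really only copies text around, std::string already handles UTF-8 without changes, and even splitting on an ASCII delimiter is safe, because bytes in the range 0x00..0x7F never appear inside a multi-byte UTF-8 sequence. Here is a small sketch of that idea; split_ascii is a hypothetical helper written for this example, and it assumes the delimiter is an ASCII character.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split `s` on an ASCII delimiter. This works on UTF-8
// data unchanged because every byte of a multi-byte UTF-8 sequence has its
// high bit set, so it can never be mistaken for an ASCII delimiter.
// Assumption: `delim` is in the ASCII range (0x00..0x7F).
std::vector<std::string> split_ascii(const std::string& s, char delim) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = s.find(delim, start);
        if (pos == std::string::npos) {
            parts.push_back(s.substr(start));  // last (or only) field
            break;
        }
        parts.push_back(s.substr(start, pos - start));
        start = pos + 1;  // skip the delimiter itself
    }
    return parts;
}
```

Where this breaks down is any operation that treats one byte as one character, such as truncating to a fixed length, reversing, or changing case byte by byte. Those are the places worth auditing.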

