Monday, September 14, 2015

Curing Qt UTF-8 console pain on Windows

The situation

Sometimes I have to write console applications in C for Windows, and ASCII-only output is not always an option: I need to print text in Ukrainian, which works fine with UTF-8. Unlike other systems, Windows uses a legacy system of code pages to display natural-language text in the console under different localisations, and that feels weird to a Linux/Mac user.

I use Qt Creator when coding in C, and it brings another tricky thing: qt_process_stub.exe, a nasty utility that the IDE launches and that in turn runs your app. My task was to find a simple solution that lets me work seamlessly when debugging my C programs from Qt Creator.

Step 1: Changing the font

Most Windows developers probably know that the default font of the Windows console window (the one cmd.exe runs in) cannot display most non-ASCII characters (yeah...), so the first thing to do is to run cmd.exe (via Win+R), click the Command Prompt icon in the window's title bar and choose the Properties menu item (pic 1.1).

Step 1.1: Opening the properties.

There, choose the Lucida Console font, as it can display Unicode symbols (at least Cyrillic ones ;), and press the OK button at the bottom of the dialog.

Step 1.2: Choosing the needed font.

Now, we are capable of using UTF-8 text in console and even see something meaningful there.

Step 2: Changing the code page

By default, the code page used by cmd.exe on my system is 437, which is the DOS Latin US character set. You can easily check it by running the chcp command in the terminal:

chcp

The same command changes the current code page. All you need is to pass the code page number as an argument:

chcp 65001

Cool! Now we use the UTF-8 code page. But the problem is that when you close the terminal and start it again, the code page reverts to 437.

Here we don't actually need to play with code pages by hand. Instead, we can use the fact that Qt Creator runs console applications via qt_process_stub.exe, which takes the default console encoding from the Windows registry. This setting can be easily tweaked from the command line:

REG ADD HKCU\Console /v CodePage /t REG_DWORD /d 0xfde9

This command (0xfde9 is 65001 in hexadecimal) fixes the Qt Creator problem, but cmd.exe will still start with 437 by default. If you want to change this setting globally, start regedit (if you have administrator rights, of course) and add a new string value to the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Command Processor key. Name it Autorun and set its value to chcp 65001 >nul.

Step 3: Checking the Result

It is very easy to check the result now. Let's start a minimal project in Qt Creator (Non-Qt Projects section -> Plain C Application) whose main.c file looks like this:

#include <stdio.h>
#include <conio.h>

int main(void)
{
    printf("Привіт!\n"); // Means "Hello!" in Ukrainian
    getch();
    return 0;
}

Warning! The getch() call here is not meant to keep the console from disappearing, as is usually the case (Qt Creator prevents that by default); it helps to avoid errors in qt_process_stub.exe. Without some sleeping/waiting mechanism you will get errors, because the utility has some bugs in reading from a pipe :). The result of running this small program looks like this:


As you can see, no weird characters, only plain text.

Thanks for reading. Hope this post helps to fix your troubles.