
Post a Comment On: cbloom rants

"06-07-10 - Unicode CMD Code Page Checkup"

Comment deleted

This comment has been removed by the author.

June 8, 2010 at 10:18 AM

Blogger Thatcher Ulrich said...

UTF-8 FTW!

June 10, 2010 at 6:07 PM

Anonymous Anonymous said...

I imagine the _t came about because they wanted to avoid collisions in code that was already using the term 'wchar' (or 'size', etc.).

As far as I can imagine, wcslen() (rather than wstrlen()) must have come from not wanting the names to be one character longer. Possibly also it's "wcs" for "wide character string" and someone pedantically decided that "wide string" sounded lame and so "wstr" wouldn't do? I don't know, I can't get behind the wcs thing at all, because it just totally fucks up consistency. (E.g. wstrlen, wstrcpy, wsprintf, wfgets... that could have all been consistent.)

I use utf8 everywhere and have wrappers around fopen etc. that in windows convert to utf16 and call the appropriate functions. And I just blow off functioning correctly for filenames from the commandline.
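A minimal sketch of what such a wrapper might look like, for anyone following along. The name `utf8_fopen` and the helper `utf8_to_utf16` are made up for illustration (they are not the commenter's actual code); the Windows branch uses `MultiByteToWideChar` and `_wfopen`, and other platforms are assumed to take UTF-8 paths as plain bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to a freshly
   malloc'd UTF-16 string. Returns NULL on invalid input or allocation
   failure. The caller frees the result. */
static wchar_t *utf8_to_utf16(const char *s)
{
    /* First call asks for the required length, including the terminator. */
    int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, NULL, 0);
    if (n <= 0)
        return NULL;
    wchar_t *w = (wchar_t *)malloc((size_t)n * sizeof(wchar_t));
    if (w && MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, w, n) <= 0) {
        free(w);
        w = NULL;
    }
    return w;
}
#endif

/* Hypothetical wrapper: fopen that takes its path and mode as UTF-8. */
FILE *utf8_fopen(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
#ifdef _WIN32
    wchar_t *wpath = utf8_to_utf16(path);
    wchar_t *wmode = utf8_to_utf16(mode);
    FILE *f = (wpath && wmode) ? _wfopen(wpath, wmode) : NULL;
    free(wpath);
    free(wmode);
    return f;
#else
    /* Assumption: non-Windows systems accept UTF-8 paths as plain bytes. */
    return fopen(path, mode);
#endif
}
```

The rest of the program would then call `utf8_fopen` everywhere instead of `fopen`, and the same pattern would have to be repeated for every other path-taking call.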

June 15, 2010 at 8:12 PM

Blogger cbloom said...

"I use utf8 everywhere and have wrappers around fopen etc. that in windows convert to utf16 and call the appropriate functions. "

Yeah I thought about doing that when I originally went unicode. On the surface it has the appeal that your existing char * based string code all just works, so that's cool. The problem I have is that the "just works" part is a bit illusory; e.g. lots of those string routines will assume that a char is one letter. And you have to be really careful to always call the wrapped routines, which makes it hard to be sure you converted everything to unicode correctly, because you can easily be calling plain old A-page stuff from anywhere.
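To make the "a char is one letter" problem concrete, here is a small sketch. The function name `utf8_codepoints` is made up for illustration, and it assumes the input is valid UTF-8. It counts code points by skipping continuation bytes, and the result differs from what `strlen` reports as soon as the string has any non-ASCII text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count code points in a NUL-terminated UTF-8 string by skipping
   continuation bytes (those of the form 10xxxxxx).
   Assumes the input is valid UTF-8; no validation is done. */
size_t utf8_codepoints(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For the string "caf\xC3\xA9" (café with the é encoded as two bytes), `strlen` gives 5 while `utf8_codepoints` gives 4. Even the code point count is not quite "letters" once combining marks are involved, so any routine that indexes or truncates by char needs a second look.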

"And I just blow off functioning correctly for filenames from the commandline. "

CLI4LIFE !

June 15, 2010 at 8:54 PM

Blogger cbloom said...

"Yeah I thought about doing that when I originally went unicode. "

Anyway, the internal unicode part in UTF-whatever is not the hard part; that's just coding, and Coding is Easy. The hard part is interacting with the Windows CLI, which is thoroughly fucked.

(I have yet to determine if other Windows mechanisms like DDE or Clipboard or any of those ways of getting file paths sent to you are functional or not).

June 15, 2010 at 8:55 PM

Anonymous Anonymous said...

"chcp 65001" works in Win XP. Well, I haven't checked that it actually sends commandline args in utf8. I just mean that if I type the command it claims to work.
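One way to actually check would be to dump the raw bytes of each argument and see whether they look like UTF-8. A rough sketch follows; the helper name `hex_bytes` is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the bytes of arg into out as space-separated uppercase hex.
   Returns the number of bytes formatted; stops early if out is too small. */
size_t hex_bytes(const char *arg, char *out, size_t outsz)
{
    assert(arg != NULL && out != NULL && outsz > 0);
    size_t i = 0, pos = 0;
    out[0] = '\0';
    for (; arg[i] != '\0'; ++i) {
        /* Each byte needs at most 3 chars (separator + "XX") plus the NUL. */
        if (pos + 4 > outsz)
            break;
        pos += (size_t)snprintf(out + pos, outsz - pos, "%s%02X",
                                i ? " " : "",
                                (unsigned int)(unsigned char)arg[i]);
    }
    return i;
}
```

Calling it on each argv[i] from main and printing the result shows what the program really received. If an é comes through as C3 A9 the args are UTF-8, and a single byte there would point to a legacy code page instead.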

June 22, 2010 at 2:15 PM

Blogger cbloom said...

"'chcp 65001' works in Win XP. Well, I haven't checked that it actually sends commandline args in utf8. I just mean that if I type the command it claims to work."

Yeah, chcp 65001 works on any modern Windows, though whether you can see the result depends on what font you have set for your command line window. I haven't actually tested thoroughly how useful that is, because if you're trying to write command line apps that work on any system you can't rely on the user having set it.

June 22, 2010 at 4:10 PM
