
So, it was always wrong, but they didn't notice until the "exceptional" cases became common enough in their own use.

This, to me, is an argument for emoji being a good thing for string-handling code. The fact that they're common means that software creators are much more likely to notice when they've written bad string-handling code.
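A quick sketch of why emoji surface the breakage so visibly (Python here purely as an illustration): a family emoji that renders as a single glyph is several code points, and even more code units in the encodings that most naive length checks actually count.

    # Python 3 sketch: len() counts code points, not user-perceived characters.
    s = "👩‍👩‍👧‍👦"  # family emoji: one glyph on screen

    print(len(s))                           # 7 code points (4 people + 3 zero-width joiners)
    print(len(s.encode("utf-8")))           # 25 UTF-8 bytes
    print(len(s.encode("utf-16-le")) // 2)  # 11 UTF-16 code units

    # A naive "max 10 characters" validator rejects this single glyph, and
    # truncating at a code-unit boundary can split it into mojibake.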

Whatever platform you're using should have an API call for counting grapheme clusters. It may be more complex behind the scenes, but for an ordinary programmer it should be no more difficult to do it correctly than it is to do it wrong.
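Python's standard library doesn't expose one, but as one illustrative option the third-party `regex` module's \X pattern matches extended grapheme clusters per UAX #29 (how it segments the newest emoji ZWJ sequences depends on the Unicode data the installed version ships with):

    import regex  # third-party: pip install regex

    def grapheme_count(text: str) -> int:
        # \X matches one extended grapheme cluster (UAX #29)
        return len(regex.findall(r"\X", text))

    print(grapheme_count("👩‍👩‍👧‍👦"))  # 1
    print(grapheme_count("🇨🇦"))          # 1 (regional-indicator flag pair)
    print(grapheme_count("e\u0301"))       # 1 ("é" as e + combining acute)

Swift's String.count and ICU's BreakIterator are the same idea built into the platform, which is the kind of API meant here.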



Ironically, this cuts both ways. Existing folks who didn't care about this now care about it due to emoji. Yay. But this leads to a secondary effect where the idea that this is "just" an emoji issue gets spread around, and people who would have cared about it if they knew it affected languages as well may decide to ignore it because it's "just" emoji. It pays to be clear in these situations. I'm happy that emoji exist so that programmers are finally getting bonked on the head for not doing Unicode right. At the same time, I'm wary of the situation getting misinterpreted and just shifting into another local extremum of wrongness.
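To make the "not just emoji" point concrete, here's a small sketch (Python again, with the same third-party regex module) of ordinary Latin, Hangul, and Devanagari text where one user-perceived character is built from multiple code points:

    import regex  # third-party module, \X = extended grapheme cluster

    samples = {
        "e\u0301": "Latin e + combining acute accent",
        "\u1100\u1161\u11a8": "Hangul syllable built from three conjoining jamo",
        "\u0928\u093f": "Devanagari consonant + dependent vowel sign",
    }

    for text, description in samples.items():
        graphemes = len(regex.findall(r"\X", text))
        print(f"{description}: {len(text)} code points, {graphemes} grapheme cluster(s)")

Every one of these is several code points but a single grapheme cluster, with no emoji in sight.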

Another contributor to this is that programmers love to hate Unicode. It has its flaws, but in general folks attribute any trouble they have to "oh that's just unicode being broken" even though these are often fundamental issues with international text. This leads to folks ignoring things because it's "unicode's fault", even though it's not.



