Category: Unicode

  • String Length Differs Between Programming Languages

    Measuring string length seems easy, but it’s a subtle trap. Here’s a string. Can you guess its length? “My 👨‍👩‍👦‍👦‍ is fun” It’s a trick question: there isn’t a single answer. In PHP, it’s 38, because strlen($string) counts bytes in memory. In JavaScript, it’s 22, because string.length counts UTF-16 code units. In Swift, it’s 11. string.count…

    Read article →

  • Unicode Facts You Should Know

    Before Dennis Snell started talking to me about Unicode, I thought displaying text on a screen was the most boring thing. I had no clue. It’s fascinating! And I’m sharing my favorite bits below. If you’re wondering what code points, code units, etc. are, you may want to start with the short introduction to Unicode…

    Read article →

  • Short introduction to Unicode and UTF-8

    I’m fascinated with Unicode. This blog post is the one I wish I’d had 20 years ago, when I was starting to learn about text encoding. Around 2005, I struggled to use Polish words on my website. I would type koło into Windows Notepad, but in Firefox I would see ko³o. Why? Notepad used the…

    Read article →
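
The different lengths quoted in the first excerpt can be reproduced in a few lines. What follows is a minimal Python sketch, not code from the article. It assumes the sample string's emoji sequence ends with a trailing zero-width joiner (U+200D), which is what the quoted counts of 38 and 22 imply. The helper names `utf8_bytes`, `utf16_code_units`, and `code_points` are made up for illustration.

```python
# The sample string, written with escapes so the invisible zero-width
# joiners (U+200D) are visible in the source.
# Assumption: the emoji sequence ends with a trailing joiner.
ZWJ = "\u200d"
FAMILY = (
    "\U0001F468" + ZWJ    # man
    + "\U0001F469" + ZWJ  # woman
    + "\U0001F466" + ZWJ  # boy
    + "\U0001F466" + ZWJ  # boy, followed by the trailing joiner
)
SAMPLE = "My " + FAMILY + " is fun"


def utf8_bytes(s: str) -> int:
    """Bytes in the UTF-8 encoding (what a byte-counting length reports)."""
    return len(s.encode("utf-8"))


def utf16_code_units(s: str) -> int:
    """16-bit code units in the UTF-16 encoding (two bytes per unit)."""
    return len(s.encode("utf-16-le")) // 2


def code_points(s: str) -> int:
    """Unicode code points, which is what Python's len() counts."""
    return len(s)


if __name__ == "__main__":
    print(utf8_bytes(SAMPLE))        # 38
    print(utf16_code_units(SAMPLE))  # 22
    print(code_points(SAMPLE))       # 18
```

The Swift figure of 11 counts user-perceived characters (grapheme clusters). Python's standard library has no grapheme segmentation, so the sketch leaves that count out; it would need a library that implements Unicode text segmentation.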