Measuring string length seems easy, but it’s such a nuanced trap. Here’s a string. Can you guess its length? “My 👨👩👦👦 is fun” It’s a trick question. There isn’t a single answer. In PHP, it’s 38. strlen($string) counts bytes in memory. In JavaScript, it’s 22. string.length counts UTF-16 code units. In Swift, it’s 11. string.count…
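Here's a minimal sketch, in Python, of why one string can have several "lengths" depending on the unit you count. The `measure` helper and the `FAMILY` sample are made up for illustration: the sample is a single family emoji built from explicit code points, not the exact string quoted above, so its numbers are not meant to match the ones in the teaser. Python's standard library has no grapheme cluster segmentation, so the user-perceived character count is left out.

```python
def measure(text: str) -> dict[str, int]:
    """Return the length of `text` in three different units."""
    return {
        # What a byte-counting function sees for UTF-8 encoded text.
        "utf8_bytes": len(text.encode("utf-8")),
        # Each UTF-16 code unit is 2 bytes; "-le" avoids a byte order mark.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # Python's len() on a str counts Unicode code points.
        "code_points": len(text),
    }


# Hypothetical sample: man, woman, boy, boy joined by ZERO WIDTH JOINER (U+200D).
# It renders as one emoji on platforms that support the sequence.
FAMILY = "\U0001F468\u200D\U0001F469\u200D\U0001F466\u200D\U0001F466"

print(measure(FAMILY))
# {'utf8_bytes': 25, 'utf16_code_units': 11, 'code_points': 7}
```

One visible emoji, three different answers, and none of them is 1.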
Before Dennis Snell started talking to me about Unicode, I thought displaying text on a screen was the most boring thing. I had no clue. It’s fascinating! And I’m sharing my favorite bits below. Also, if you’re wondering what code points, code units, etc. are, you may want to start with the short introduction to Unicode…
I’m fascinated with Unicode. This blog post is the one I wish I had 20 years ago, when I was starting to learn about text encoding. Around 2005, I struggled to use Polish words on my website. I would type koło in Windows Notepad, but in Firefox I would see ko³o. Why? Notepad used the…
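The teaser cuts off before naming the encodings involved, so the following is only a sketch under an assumption: text saved as Windows-1250 (`cp1250`) and read back as Windows-1252 (`cp1252`). That pair happens to reproduce the exact symptom, because the byte that means ł in one table means ³ in the other. The `misread` helper is hypothetical.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode `text` with one encoding, then decode the same bytes with another."""
    return text.encode(written_as).decode(read_as)


# Assumption: Windows-1250 on the writing side, Windows-1252 on the reading side.
# In cp1250 the letter ł is the single byte 0xB3; in cp1252 that byte is ³.
print(misread("koło", "cp1250", "cp1252"))  # ko³o

# When both sides agree on the encoding, the text survives the round trip.
print(misread("koło", "cp1250", "cp1250"))  # koło
```

The bytes on disk never change. Only the table used to interpret them does.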