In part 1, I discussed the practical problems with online ratings: the scales are misleading, the scores can’t be compared with each other, the numbers don’t reflect the experience, and different sites often disagree on whether something is good or bad. This part explores why that happens.
Let’s start with a story.
I went to a burlesque show with Ania and some friends I’ll call J and N. It was in one of those elegant jazz clubs where smartly dressed guests enjoy their champagne by candlelight, surrounded by vinyl discs and saxophones hanging on red velvet walls. We got there a few minutes early, ordered ours, and started getting comfortable. Shortly after that, the announcer entered the stage.
“GOOOOOD EVEEEEENIIING!” he screamed through the sound system with the power of ten thousand roaring elephants, crushing not only our ears but also our chances of actually having one. So loud! We promptly complained to the staff, who then reduced the volume slightly. It wasn’t nearly enough, but I still wanted to see the show, so I spent most of the evening plugging my ears with my fingers.
The show opened with a few mediocre songs. Just as I began thinking “what a disaster”, a flirty chair dance started, followed by a few bolder and remarkably better-performed songs. I even started having fun! And that’s when they absolutely butchered “Hit the Road Jack”. It was so bad that I fled the crime scene, excusing myself to the bathroom.
Shortly after I returned, the star of that evening began the first of her two performances. It was fantastic! We finally got what we came there for: feathers, comedy, and stunning sensuality. I even unplugged my ears to clap. The following few songs now seemed like mere music on hold as we awaited her return in the grand finale. And boy, oh boy, did it make an impression! We gasped in awe and our jaws dropped and eyes protruded as the essence of burlesque unfolded in front of us, this time sparking not just clapping but a full-on standing ovation.
Alright, alright, but how does all this relate to online reviews? Well, let me ask you the same question I asked everyone at our table: how many stars did the show deserve?
Numbers cannot capture experiences
Imagine you wrote down fifty phrases ranging from downright awful to breathtakingly amazing and asked me to describe our evening using a single position from that list. I just couldn’t, not without lying to you. Part of it was terrible, another part was fantastic, and there were also bad, mediocre, and great moments.
But here’s the thing: fifty options is, if anything, more than the typical overall score between 1.0 and 5.0 stars can express. In steps of 0.1, that’s only 41 distinct values, which fits comfortably in a single, negligible byte.
Compressing memories into star-shaped byte-sized containers strips them of any meaning in the same way as reducing a university degree to a single tweet would. My short account above consists of 1661 bytes. A photo from that evening would contain about 10 million bytes, and one minute of video footage – about 10 billion. One byte is not even enough to communicate which of the 2000 Pantone colors the walls had.
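For the curious, the byte arithmetic can be checked with a minimal Python sketch. It assumes overall scores move in steps of 0.1 and takes the figure of 2000 Pantone colors at face value; `bits_needed` is a hypothetical helper name I made up for illustration, not part of any library.

```python
def bits_needed(options: int) -> int:
    """Return the fewest bits that can tell `options` distinct values apart."""
    if options < 1:
        raise ValueError("need at least one option")
    # n options can be numbered 0..n-1; the largest number sets the bit width.
    return (options - 1).bit_length()


# Assumption: overall scores run from 1.0 to 5.0 in steps of 0.1.
star_scores = [round(1.0 + 0.1 * i, 1) for i in range(41)]

BITS_PER_BYTE = 8

cases = [
    ("star scores (1.0 to 5.0)", len(star_scores)),
    ("fifty phrases", 50),
    ("Pantone colors (figure from the text)", 2000),
]

for label, count in cases:
    bits = bits_needed(count)
    verdict = "fits in" if bits <= BITS_PER_BYTE else "does not fit in"
    print(f"{label}: {count} options need {bits} bits, which {verdict} one byte")
```

Under these assumptions, both the 41 star scores and the fifty phrases need 6 bits, while 2000 colors need 11, which is more than the 8 bits a byte holds.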
Am I using an unusually nuanced example? Perhaps, but life is all about nuance. Take something as bland as ordering a waffle: Ania and I had a pricey one recently, and it tasted great. Unfortunately, they kept us waiting outside for about 15 minutes on a freezing day. I’d give them five for the food and one for the waiting time, but I can only choose one option, and three feels unfair.
Expressing experiences using stars is like sculpting a song, dancing about a painting, or cooking about a movie. It’s the wrong tool for the job.
There isn’t a correct way to assign stars
How do you rate enjoyable restaurants?
I give the full five stars, reasoning that taking one away would indicate something was wrong. My friend Mike, however, leaves at most four-star reviews. In his book, five stars are reserved for outstanding experiences. I also talked to someone who uses the same four stars to rate things he doesn’t like. Why? In his own words: “Objectively, it was good. It just wasn’t for me.”
People don’t have a shared understanding of what a 4-star place is. Some sites attempt to clarify it with labels such as “5 stars: very good” or “4 stars: good”, but that just moves the problem one step further: what does “good” mean? Is the restaurant still good if the service took longer than you expected? What if the bathroom wasn’t as clean as you had hoped?
Good is in the eye of the beholder.
As I learned on the burlesque night, five people can have the same experience and rate it in five different ways.
The couple at the next table seemed to have the time of their lives, and I’d bet some money their review would have five stars. Our friends were less excited. J rated the evening a four because everyone tried their best, and a few segments were stunningly good. N said three because it was too loud, the show started late, and the waiter forgot about our order. Ania shared the exact same concerns, but they made the show a two for her.
And me? I said one. The grand finale was perfect, but it was still the worst 90 minutes my ears have ever experienced, and I’ve been to multiple metal festivals. Here’s the thing, though: despite my own low rating, I’d go there again.
And that was part two of the Online Reviews series. Everything we just discussed has profound statistical implications that part three covers in depth. See you next time!
Special thanks to my wife Anna for her suggestions and feedback on this post.