Correlation is Not Causation 🤷


Photo by Nadir sYzYgY / Unsplash

This is one of those phrases that people throw around a lot without realizing how profound it is.

Causation

When you think of causation, think “A causes B”, no question about it. This is the realm of mathematics and some clichés, where you know that if you jump out of a 10th-story window, you will fall (and most likely die) because of gravity. Jumping causes falling.

For the nerds out there, I’m sure you can come up with loads of possible results from jumping, but I’m only trying to show what “pure” causation might look like, so I had to oversimplify.

Correlation

Correlation is where things get fuzzy. Human beings have evolved to see correlations between events. When you think of correlation, think “These two things tend to show up together, so they MIGHT be related, or they might not”. Emphasis on MIGHT.

Correlation can confuse people looking at large quantities of data if they don’t have the statistical tools to separate what is cause and effect from what isn’t. And in the real world, “cause and effect” vs “no relatedness” is a gradient rather than a yes-or-no answer.
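To make that a bit more concrete, here is a minimal Python sketch with completely made-up numbers. It assumes a hidden cause (temperature) that drives two other quantities (ice cream sales and sunburn cases, both hypothetical). The two never influence each other, yet they end up strongly correlated. The `pearson` helper is my own small implementation of the Pearson correlation coefficient, not a library call.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical hidden cause: one temperature reading per day for a year.
temperature = [rng.uniform(10, 35) for _ in range(365)]

# Two made-up quantities that each depend on temperature plus noise,
# but never on each other.
ice_cream_sales = [5 * t + rng.gauss(0, 10) for t in temperature]
sunburn_cases = [2 * t + rng.gauss(0, 5) for t in temperature]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation between ice cream sales and sunburns: {r:.2f}")
```

The correlation comes out high even though banning ice cream would not prevent a single sunburn. The number alone cannot tell you which way the cause runs, or whether there is a cause at all.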

Any physicist worth their salt might show how everything in the universe is connected and related. Some hippies and new-age folks might also be able to make a convincing case for the relatedness of all things, although from a completely different viewpoint.

The Key Differentiator

The key difference is how useful one piece of data (let’s call it “Green Dolphins killing sharks”) is for predicting a separate piece of data, which we will call “Humans saved by Green Dolphins from shark attacks”.

So these two pieces of data might be closely related, as in green dolphins might be saving humans, or they might NOT be related much at all if green dolphins only kill sharks when there aren’t any humans around.

Spurious Correlations

I was looking for funny correlations, where data A looks like it is a good predictor for data B, but luckily this awesome dude did most of the heavy lifting for me. Take a look at some correlations.

This next one got me laughing a LOT.

What about this one?

See, there are plenty more examples of this. I suggest you take a look at his website.
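If you wonder how such funny matches can exist at all, here is a rough Python sketch of one way they can arise (my own illustration, not how that website was built). It assumes you have a big pile of unrelated data series, simulated here as random walks, and you simply keep the one that best matches your target. All names and sizes are made up for the example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def random_walk(rng, length):
    """A series where each value is the previous one plus random noise."""
    value = 0.0
    series = []
    for _ in range(length):
        value += rng.gauss(0, 1)
        series.append(value)
    return series


rng = random.Random(0)  # fixed seed so the run is repeatable

# Stand-in for "data B": 15 yearly values of something we care about.
target = random_walk(rng, 15)

# Stand-ins for 2000 unrelated "data A" candidates.
candidates = [random_walk(rng, 15) for _ in range(2000)]

# Cherry-pick the candidate that lines up best with the target.
best = max(candidates, key=lambda series: abs(pearson(series, target)))
best_r = pearson(best, target)
print(f"best of {len(candidates)} unrelated series: r = {best_r:.2f}")
```

None of the candidates has anything to do with the target, but when you search through enough of them, one is bound to look like a great predictor by pure chance.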

My main point here is to enlighten, at least a little, both people who are into statistical nerd topics and people who aren’t that into them but don’t want to be fooled by statistics online.

As Daniel Kahneman wrote in Thinking, Fast and Slow, even trained statisticians are often fooled by statistics!

So I leave this post with a warning: don’t trust any statistics you see online. Either look into them more carefully or find someone you trust who can examine them for you. Don’t make decisions based on data you don’t understand.

Another topic I want to cover in a separate post: one way statisticians have found to guard against false correlations (like the spurious ones I’ve shown above) is to run the analysis with corrections.

One of those is the Bonferroni correction. But this post is getting long as it is, and I need to get a better grasp of the topic to explain it more succinctly. So I will see you in the next post.
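As a small teaser, here is the basic idea in a Python sketch. The Bonferroni correction says that if you run m tests, you should compare each p-value against alpha / m instead of alpha. The function name and the p-values below are made up for illustration, and this is only the bare rule, not a full statistical workflow.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which p-values stay significant after a Bonferroni correction.

    With m tests, each p-value is compared against alpha / m.
    """
    m = len(p_values)
    threshold = alpha / m
    return [p <= threshold for p in p_values]


# Made-up p-values from five hypothetical correlation tests.
p_values = [0.001, 0.02, 0.04, 0.30, 0.008]

naive = [p <= 0.05 for p in p_values]
corrected = bonferroni_significant(p_values)

print(naive)      # [True, True, True, False, True]
print(corrected)  # [True, False, False, False, True]
```

Without the correction, four of the five results look significant. With it, only two survive, because the bar gets higher the more comparisons you make.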


Originally published at https://lucas-schiavini.com on February 1, 2023.
