Correlation and causality | Statistical studies | Probability and Statistics | Khan Academy


I have this article
right here from WebMD. And the point of this isn’t
to poke holes at WebMD. I think they have
some great articles and they have some great
information on their site. But what I want to
do here is to think about what a lot of
articles you might read or a lot of research you
might read are implying and to think about
whether they really imply what they
claim to be implying. So this is an excerpt
of an article, and the title of the article
says “Eating breakfast may beat teen obesity.” So they’re already
trying to create this cause-and-effect
relationship. The title itself says if
you eat breakfast then you’re less likely–
or you won’t be obese. You’re not going to be obese. So the title right there
already sets this up: that eating breakfast
may beat teen obesity. And then they tell
us about the study. “In the study,
published in Pediatrics, researchers analyzed the
dietary and weight patterns of a group of 2,216 adolescents
over a five-year period from public schools in
Minneapolis-Saint Paul, Minnesota.” And I won’t talk
too much about this. It looks like a
good sample size. It was over a large
period of time. I’ll just give the researchers
the benefit of the doubt, assume that it was
over a broad audience, that they were able to control
for a lot of variables. But then they go on to
say, “The researchers write that teens who ate
breakfast regularly had a lower percentage of total
calories from saturated fat and ate more fiber
and carbohydrates.” And to some degree that
first– “than those who skipped breakfast.” And to some degree this
first sentence is obvious. Breakfast tends to be
things like cereals, grains. You eat syrup, you
eat waffles– that all tends to fall in the category
of carbohydrates and sugars. And frankly, that’s not even
necessarily a good thing. Not obvious to me
whether bacon is more or less healthy than
downing a bunch of syrup or Fruit Loops or whatever else. But we’ll let that
be right here. “In addition, regular
breakfast eaters seemed more
physically active than the breakfast skippers.” So over here they’re
once again trying to create this other
cause-and-effect relationship. Regular breakfast eaters
seemed more physically active than the breakfast skippers. So the implication
here is that breakfast makes you more active. And then this last
sentence right over here, they say “Over time,
researchers found teens who regularly
ate breakfast tended to gain less weight and
had a lower body mass index than breakfast skippers.” So you could–
they’re telling us that breakfast skipping–
this is the implication here– is more likely, or it can be a
cause of making you overweight or maybe even making you obese. So the entire narrative here,
from the title all the way through every
paragraph, is look, breakfast prevents obesity. Breakfast makes you active. Breakfast skipping
will make you obese. So you just say then, boy,
I have to eat breakfast. And you should always
think about the motivations and the industries around
things like breakfast. But the more
interesting question is does this research
really tell us that eating breakfast
can prevent obesity? Does it really tell us
that eating breakfast will cause someone to
become more active? Does it really tell us
that breakfast skipping can make you overweight
or make you obese? Or, as is more likely,
are they showing that these two things
tend to go together? And this is a really
important difference. And let me kind of state
slightly technical words here. And they sound fancy, but
they really aren’t that fancy. Are they pointing
out causality, which is what it seems like
they’re implying. Eating breakfast causes
you to not be obese. Breakfast causes
you to be active. Breakfast skipping
causes you to be obese. So it looks like they are
kind of implying causality. They’re implying
cause and effect, but really what the study
looked at is correlation. The whole point of this is
to understand the difference between causality
and correlation because they’re saying
very different things. Causality versus correlation. And, as I said, causality
says A causes B. Well, correlation just
says A and B tend to be observed at the same time. Whenever I see B
happening, it looks like A is happening
at the same time. Whenever A is happening,
it looks like it also tends to happen with
B. And the reason why it’s super
important to notice the distinction
between these is you can come to very,
very, very, very, very different conclusions. So the one thing that
this research does do, assuming that it
was performed well, is it does show a correlation. So the study does
show a correlation. It does show, if we
believe all of their data, that breakfast
skipping correlates with obesity and
obesity correlates with breakfast skipping. We’re seeing it
at the same time. Activity correlates with
breakfast and breakfast correlates with activity–
that all of these correlate. What they don’t say–
and there’s no data here that lets me know one way or
the other– what is causing what or maybe you have
some underlying cause that is causing both. So for example, they’re saying
breakfast causes activity, or they’re implying
breakfast causes activity. They’re not saying
it explicitly. But maybe activity
causes breakfast. Maybe. They didn’t write in the study that people who are active
are maybe more likely
to be hungry in the morning. Activity causes breakfast. And then you start having
a different takeaway. Then you might say, wait,
maybe if you’re active and you skip
breakfast– and I’m not telling you that you should. I have no data one way
or the other– maybe you’ll lose even more weight. Maybe it’s even a
healthier thing to do. We’re not sure. So they’re trying to say,
look, if you have breakfast it’s going to make
you active, which is a very positive outcome. But maybe you can have
the positive outcome without breakfast. Who knows? Likewise they say
breakfast skipping, or they’re implying breakfast
skipping, can cause obesity. But maybe it’s the
other way around. Maybe people who have
high body fat– maybe, for whatever reason, they’re
less likely to get hungry in the morning. So maybe it goes this way. Maybe there’s a causality there. Or even more likely,
maybe there’s some underlying cause that
causes both of these things to happen. And you could think of a bunch
of different examples of that. One could be the
physical activity. And these are all just theories. I have no proof for it. But I just want to
give you different ways of thinking about the same
data and maybe not just coming to the same conclusion that
this article seems like it’s trying to lead us to conclude. That we should eat breakfast if
we don’t want to become obese. So maybe if you’re
physically active, that leads to you being
hungry in the morning, so you’re more likely
to eat breakfast. And obviously being
physically active also makes it so that
you burn calories. You have more muscle. So that you’re not obese. So notice if you
view things this way, if you say physical activity
is causing both of these, then all of a sudden
you lose this connection between breakfast and obesity. Now you can’t make the
claim that somehow breakfast is the magic formula for
someone to not be obese. So let’s say that there
is an obese person– let’s say this is the reality, that
physical activity is causing both of these things. And let’s say that there
is an obese person. What will you tell them to do? Will you tell them, eat
breakfast and you won’t become obese anymore? Well, that might
not work, especially if they’re not
physically active. I mean, what’s going
to happen if you have an obese person who’s
not physically active? And then you tell
them to eat breakfast? Maybe that’ll make things worse. And based on that,
the advice or the implication from the
article is the wrong thing. Physical activity
maybe is the thing that should be focused on. Maybe something other
than physical activity. Maybe it’s sleep:
maybe people who sleep late aren’t
getting enough sleep, and maybe that leads to obesity. And obviously, because they’re
not getting enough sleep, they wake up as late
as possible and they have to run to the
next appointment– or they have to run to school
in the case of students– and maybe that’s why
they skip breakfast. So once again, if you
find someone that’s obese, maybe the rule here isn’t
to force a breakfast down your throat. Maybe it will become even
worse because maybe it is the lack of sleep that’s
causing your metabolism to slow down or whatever. So it’s very, very
important when you’re looking at any of
these studies to try to say, is this a correlation
or is this causality? If it’s correlation, you cannot
make the judgment that, hey, eating breakfast is necessarily
going to make someone less obese. All that tells you is that
these things move together. A better study would be one
that is able to prove causality. And then we could think of
other underlying causes that would kind of break down the
narrative that this piece is trying to say. I’m not saying it’s wrong. Maybe it’s absolutely
true that eating breakfast will fight obesity. But I think it’s equally or more
important to think about what the other causes
are, not to just make a blanket statement like that. So for example,
maybe poverty causes you to skip breakfast
for multiple reasons. Maybe both of your
parents are working. There’s no one there
to give you breakfast. Maybe there’s more
stress in the– who knows what it might be? And so when you
have poverty maybe you’re more likely to
skip breakfast and maybe when there’s poverty,
and maybe you have two– both your
parents are working and the kids have to make their
own dinner and whatever else– maybe they also eat less
healthy at all times of day and then that leads to obesity. So once again in this
situation, if this is the reality of things,
just telling someone to also eat breakfast regardless
of what that breakfast is, even if it’s Fruit Loops
or syrup, that’s probably not going to
help the situation. Maybe it’s just eating
unhealthy dinners is the underlying cause. And if you eat an unhealthy
dinner maybe by breakfast time you’re still not hungry
because you’ve binged so much at dinner. So you skip breakfast. And this also leads to obesity. But once again, if this
is the actual reality, following the advice that
that article’s giving might actually be a bad thing. If you eat an unhealthy
dinner and then force yourself to eat a
breakfast when you’re not hungry, that might make
the obesity even worse. So the whole point
of this video isn’t to say that the implications
from that article are necessarily wrong. The important thing is to just
realize that they might be wrong. And that just because you saw
this correlation with the data, it doesn’t mean that
eating breakfast is going to somehow magically
fight obesity.
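
The physical-activity scenario above can be sketched as a small simulation. This is only an illustration under made-up assumptions, not anything taken from the study: the model, the numbers (a baseline BMI of 30, a breakfast probability of 0.2 + 0.6 × activity), and the function names `pearson` and `simulate` are all hypothetical. In the assumed model breakfast has no effect on BMI at all; activity drives both.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n, randomize_breakfast, seed=0):
    """Return (breakfast, bmi) lists for n hypothetical teens.

    Assumed model: activity raises the chance of eating breakfast
    and lowers BMI. Breakfast itself never enters the BMI formula.
    """
    rng = random.Random(seed)
    breakfast, bmi = [], []
    for _ in range(n):
        activity = rng.random()  # 0 = sedentary, 1 = very active
        if randomize_breakfast:
            # Coin flip: breakfast no longer depends on activity.
            eats = rng.random() < 0.5
        else:
            # Observational case: active teens eat breakfast more often.
            eats = rng.random() < 0.2 + 0.6 * activity
        breakfast.append(1 if eats else 0)
        bmi.append(30 - 8 * activity + rng.gauss(0, 1.5))
    return breakfast, bmi


observed = pearson(*simulate(20000, randomize_breakfast=False))
randomized = pearson(*simulate(20000, randomize_breakfast=True))
print(f"observational correlation: {observed:.2f}")
print(f"randomized correlation:    {randomized:.2f}")
```

Under these assumptions, the observational run should show a clearly negative correlation between breakfast and BMI even though breakfast does nothing, while assigning breakfast by coin flip should push the correlation toward zero. That is the sense in which a randomized study can speak to causality where an observational one can only show that two things move together.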

100 comments

  1. I think eating breakfast will not help if your breakfast is made of a Big Mac, three cheeseburgers, fried potatoes and a big Coke portion…

  2. @ZergAteu
    If what the study implied was a correlation rather than a causation, they shouldn't have said "eating breakfast may beat teen obesity" because it singles out one correlated factor. Why not name it "getting up early may beat teen obesity" or "being more active may beat teen obesity"? Don't get me wrong, I agree with you that the article does not imply causality, but that's only because it's filled with weasel words.

  3. @ZergAteu
    @qtutoringhelps
    I believe a big part of the problem here lies in the target audience. If a paper was published in a scientific journal that used a weasel word here or there the readers, other scientists, would understand the difference between causation and correlation (hopefully). However, the referenced WebMD article is meant to be an easy to digest summary of the study fit for consumption by the masses (non scientists).

  4. In this case, the readers do not likely know the difference between causation and correlation so the weasel words result in the reader believing something is true and backed up by science when really it isn't true (not that it's false, just that it's not proven to be true or false yet).

  5. Weasel words in the paragraph referenced: "…Breakfast **may** Beat…", "…eaters **seemed** more…", "…breakfast **tended** to…". In a scientific journal article these statements would be fine except the title. However, this isn't a scientific journal and the readers are going to take off the surface, they are not going to delve deeply into the statistics.

  6. Ideally, the authors would make the notes that Khan has made and provide some alternatives to causation as Khan did. This way, when the reader reads the article they don't come away thinking, "if I want to lose weight I just need to eat breakfast" but instead, "people are researching ways to help me lose weight, but they still don't know much besides "eat healthy and exercise".

  7. @alique087
    We cannot judge the intentions of the article but we can attempt to draw conclusions about the effects the article has on its target demographic (non-scientists). A non-scientist reading this article is likely to come away with the impression that eating breakfast will help them lose weight. The article does not technically show causation, but it does imply causation to the average reader who is unfamiliar with causation vs correlation.

  8. Hi Khan. I'm a big fan of your work and like following you.

    I hope you don't mind if I make a few suggestions. I think you did a great job at differentiating between causality and correlation.

    A few things I would recommend mentioning in addition to this are: timing (which came first?), co-linearity (overcoming it to analyze causality), and randomization or multivariate models (how it takes care of the confounding issue you mention). Good luck and great job!

  9. I think giving a more obvious example (e.g. the correlation between the number of known stars and the population of earth) would've been a better way to start, like c0nc0rdance did in his video on the subject.

  10. Maybe breakfast eaters are more active because they are actually awake all day instead of sleeping half the day then waking up and not doing shit.

  11. everyone that i know who smokes marijuana has tried milk.

    milk leads to marijuana smoking.

    what a slippery slope.

  12. @ZergAteu
    I didn't vote it down 🙂
    You might be underestimating the influence the title has over the way the rest of the article is read. If it starts out with "X may beat Y", then the reader will interpret the content as being in line with that summary. It's a tactic frequently used by tabloid journalists.

  13. Edward Bernays (I think it was) engineered propaganda like this around the 50's promoting the Idea that eating egg and bacon every morning was 'a healthy thing to do'.
    He was hired by respective industries in America to do so, due to a glut in those rapidly expanding markets.
    Great upload and perspective pointers here.

  14. As I was listening to the first couple of seconds I thought that maybe american titan General Mills is probably the one that funded that study cited in webmd. Anyway, great video, very khana – educational. People on jury duty should probably see this video before making any decision.

  15. Wow! This video, apart from explaining the difference between correlation and causality, tells us why it is important to know the difference too. Causality is a correlation. However, not all correlations are causalities. What we need is the objectivity to see a causality as a causality and a correlation as a correlation. Above all, we must have the ability to picture the interactions between a multitude of correlations and causalities while evaluating an event.

  16. Causality starts with a C, so does CANCER!

    Those "science articles" in the mainstream always make you cringe. They try to make it sound a lot bigger than it really is, often contain factual errors and the journalist sometimes just clearly doesn't understand what he's writing about.

    Then you wonder why so many people don't trust science and prefer to believe in all kinds of superstition.

  17. I really like these videos from Khan Academy. I encourage my students to use them to do independent study. There is a definite correlation between using the videos and success in the examinations, but not proof of causality yet 🙂

  18. So smoking correlates with cancer and does not cause it? Smokers tend to engage in other risky activity which promotes cancer?

  19. Correlation sounds boring. Causality and narrative are what our minds crave. Journalists will always imply causality as long as the public is scientifically illiterate.

  20. Thought you were making a pun on my being conspircy minded @blonkm > : P
    Bertrand Russell's another good one to look into for media perspective-setting.
    Your puzzle vids are great btw.

  21. @SEThatered
    My Grammar SS sense is hurting from your comment. It's causing me pain. I don't even get what there is to thumb up about this, it's so obvious and poorly phrased.

  22. @hedonism13
    So, you are an old geezer from SS? You sir are in the "interwebz", and you expect a flawless grammar?
    Good luck in your search! BTW i have suspicion you do have such a junk breakfast each morning, so you felt insulted… And now you pretend to be a megasmartypants-knowitall, and make a "smart" comment…
    You sir, are a troll! Have a nice day!

  23. Great video Sal!
    More about reading research papers please!
    Even the same topic with a different example would be good.
    Cheers!

  24. @SEThatered Hmmm, well I am sure as sure can be that the SS grammar Nazi (with his ironically imperfect grammar) is a troll. I feel it should be said that the defense 'I'm using the internet as my medium of communication' is a little…absurd.

    Questions are provoked from such a stance, I certainly do not perceive it to be a valid or even semi-defensible position to take. Is there a good reason not to strive for good grammar? Why is it stronger when using the internet? Not to cause an argument.

  25. After all these years of hating… I finally start liking statistics 🙂
    I am really sad because of the fact that this is the final lesson.

    Thank You Sal!

  26. I thought this video identifies the difference between correlation and causality, this video should have been named Eating Breakfast Causes Kids to Lose Weight.

  27. I wish more people thought like this, not taking things at face value, making up their own mind. would certainly help with the media and their inflexions towards news and current affairs.

  28. I know lots of fat people who eat breakfast… they eat a lot of breakfast too. So according to this article, they should be anorexic.

  29. Causality always creates correlation, except in a few pathological cases. We generally say that X causes Y when doing X changes the probability of Y occurring. As a result, X and Y will be correlated. The point of this video was to say that correlation does not imply causation. The converse, however, is generally true. Causation implies correlation.

  30. Correlation is saying that two things can go on simultaneously and cause and effect is saying that one thing happens which affects the next thing in the sequence. Directing this to the video, correlation is saying that sleeping late could impact your ability to eat breakfast because maybe they don't have enough time and vice versa and cause and effect is saying that because they are obese, they could not eat breakfast. That is just an example.

  31. Thank u sir, you have expanded my way of thinking or looking at things to a different level, your videos are very helpful and I hope you make more of these useful videos. If you need our help through donations so you can make more videos, then add a link and I will gladly support.

  32. By virtue of its intentional humorous content, this comment compelled me into a state of continuous guffawing.

  33. causality and correlation are synonymous. It doesn't matter if you eat cinnamon buns or bran muffins in the morning you will become overweight at some point in your life if you keep this type of diet. This study is typical research news the public is fed via the media. If the public is generally ignorant(or skeptical) of research methods how are they supposed to make informed decisions about diet.

  34. Causality and correlation are synonymous?

    So the following is true?

    As the number of people who breathe air increases, the number of people that die increases. Therefore, breathing air causes death.

  35. Sorry, I think I wrote this in hurry and forgot to correct myself afterwards. Correlation is not necessarily causation but that isn't holy writ either. My point is that if the public doesn't know how research methods are conducted they can't make informed decisions.

  36. Great video! Makes me rethink a lot of articles I've read in the past.

    I'm not entirely knowledgeable on diets, how the body processes foods, and which time of the day the body processes them fastest; but I had always been led to believe that eating a healthy breakfast in the morning gives people the energy boost to be more active throughout the day, or makes non-active people less likely to splurge on snacks and big lunch meals for lunch or dinner.

    However, I do agree on the causality effect. I work a desk job and have not been a very active person until recently when I started doing at-home workouts. Since I've started, I've definitely been a lot more hungry in the morning.

  37. Great videos. Loved all of it. I feel the regression part could be expanded to talk of linear and multiple regression analysis, and also explaining how to interpret excel outputs of regression. But overall, really good!

  38. Unexpected sudden ending of 68 videos. Thank you very much for clarifying all this material for me, your explanations are very clear and easy to watch!

  39. carbs are essential, I hate that there is this stigma behind them. Every cell in your body runs on glucose, aka simple/complex sugars… sorry just had to get that out there. great video!

  40. A fluctuation of A and B tend to be observed at the same time, Instead of A causing B; That's a very good example to illustrate Correlation and Causation, in my opinion.

  41. Dude, you can hypothesize all day but without performing an experiment to test your hypothesis, you don't have a leg to stand on. The study 'suggested' that there 'may' be a connection.

  42. I now finished the whole playlist to get some introductory ideas about statistics. I think I got some now. It really provides a good value for learners, great videos, he explains well to make understand intuitively. He's great.

  43. So… since life is filled with variables, how does one conclude actual cause and effect other than in a controlled lab or controlled environment where even those are often misinterpreted based on unforeseen conditions?

  44. I understand the message behind the video, but I disagree with Sal in that WebMD was indicating causality. MD used the words "may" and "seem," which are simply indicators of possibility. If MD would have used "will" or "does," then I would have called them on bad write-ups.

  45. i dont still get it. where's the correlation in there?
    I hope you reply me. I really need the answers for our reporting

  46. Hello sir, thank you for explaining. Have one question – So the way we can calculate and determine if the correlation exists or not, is there a way we can confidently say that this is not just a correlation but causation?

  47. Done with whole statistics playlist and I can say it was a great help. Now I believe I have the working knowledge of the topics covered in this series of lectures. Thank You!

  48. I Appreciate for all the statistics videos You are the one of the best teachers I ever meet!Your teaching is clear and kinda funny sometimes make me laugh lol anyway thank you again!!

  49. All examples are wrong. you gain weight by eating more. you cant gain weight if you eat reasonably (Unless you have some kind of disease). So obesity is due to eating excessively. Thus, eating less will lead you to not be obese.
