
Data correlation

Correlation

A correlation is a link between two things. Evidence is required to establish a correlation between a factor and an outcome. If an outcome happens when a factor is present, and does not happen when the factor is absent, there is a correlation. Other factors that could affect the outcome also need to be considered.

Figure: Graph showing the number of cigarettes smoked against the number of deaths from cancer. There is a strong link between the two.

The graph shows that as the number of cigarettes smoked increases, the number of deaths from cancer also increases. This is a correlation, and it suggests that the more a person smokes, the more likely they are to get lung cancer.

Individual cases do not provide convincing evidence for or against a correlation. For example, there may be a 90-year-old who has smoked 40 cigarettes a day and not developed lung cancer. There may also be a 40-year-old who has never smoked a cigarette but who has developed lung cancer. These are exceptions - they do not disprove the correlation. The 90-year-old has been lucky, because across the population as a whole smoking is still correlated with lung cancer.

Causal link

A correlation between a factor and an outcome does not necessarily mean that the factor causes the outcome. Further data must be collected to provide evidence that the factor has caused the outcome.

There is a correlation between the pollen count in the air and the incidence of hay fever, for instance. The pollen count increases from spring onwards, reaching a peak in mid-summer. It is therefore possible that pollen causes hay fever.

There is also a correlation between the amount of ice cream sold during the summer and the number of hay fever cases. But nobody would suggest that eating ice cream causes hay fever.
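The strength of a correlation can be expressed as a number. The sketch below is one possible illustration, using the Pearson correlation coefficient, which is close to 1 when two quantities rise together. The monthly figures and the function name pearson are made up for this example and are not real data. With these figures, both pollen count and ice cream sales correlate strongly with hay fever cases, yet only pollen has a scientific explanation linking it to hay fever.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How far each pair of values moves together away from the two means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up monthly figures (spring to late summer), for illustration only.
pollen_count = [10, 30, 60, 90, 100, 70]
ice_cream_sales = [200, 350, 500, 700, 800, 650]
hay_fever_cases = [5, 20, 45, 70, 80, 55]

r_pollen = pearson(pollen_count, hay_fever_cases)
r_ice_cream = pearson(ice_cream_sales, hay_fever_cases)

# Both values come out close to 1: a strong correlation in each case.
print("pollen vs hay fever:", round(r_pollen, 2))
print("ice cream vs hay fever:", round(r_ice_cream, 2))
```

A high coefficient only measures how closely two sets of numbers move together. It says nothing about whether one causes the other.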

To show a causal link, scientists must find evidence that scientifically explains the connection.

If there is no scientific explanation then there is only a correlation. It cannot be shown that the factor causes the outcome.

Question

There is a correlation between carbon dioxide levels and global average temperatures. What scientific explanation could provide evidence for a causal link between these factors?

Correlation and risk

Sometimes a change in a factor leads to an outcome, but not in all cases. Scientists say that a change in the factor increases the risk of the outcome. This is very common when discussing causes of ill-health.

For example, when nitrogen dioxide levels stay high for several days, more people have asthma attacks. However, not everyone has an asthma attack. This type of correlation describes how a factor is connected with an increase in the risk of a particular outcome.
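An increase in risk can be put into numbers by comparing the proportion of people affected in two groups. The sketch below is a minimal illustration of one common measure, often called relative risk. The counts and the function name risk are hypothetical, chosen only to show the arithmetic, and are not real asthma data.

```python
def risk(cases, population):
    """Return the proportion of a group that experienced the outcome."""
    return cases / population


# Hypothetical counts, for illustration only.
attacks_high_no2, people_high_no2 = 40, 1000   # days with high nitrogen dioxide
attacks_low_no2, people_low_no2 = 20, 1000     # days with low nitrogen dioxide

risk_high = risk(attacks_high_no2, people_high_no2)
risk_low = risk(attacks_low_no2, people_low_no2)

# Relative risk: how many times more likely the outcome is in the exposed group.
relative_risk = risk_high / risk_low

print("risk when levels are high:", risk_high)
print("risk when levels are low:", risk_low)
print("relative risk:", relative_risk)
```

In this made-up example most people in both groups have no asthma attack, but the risk is twice as high when nitrogen dioxide levels are high. That is a correlation with increased risk, not yet evidence of a causal link.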

Scientists then try to find a causal link to explain the connection. If no causal link is known, this remains a correlation. It cannot be scientifically shown that the factor causes the outcome.

Question

Data shows that children who live near major roads are more likely to suffer from asthma.

Why is this finding a correlation and not a causal link?

Question

What research might scientists carry out to try to find a causal link between living near a major road and asthma?