Last week, I got into a bit of a heated discussion with an admin on the Vegonews Facebook page. They had shared a graph, attributed to the website www.diseaseproof.com (though I’ve been unable to find it there), which I think is clearly designed to suggest a causal relationship where the data simply does not show one.
Here is the graph:
Apart from the scaremongering “KILLER DISEASES” title, the first thing that struck me upon looking at this graph was that the countries on the left generally have a much higher living standard than those on the right, so people in those countries probably live longer, and thus are more likely to develop diseases such as heart disease and cancer, which tend to affect more people later in life. But that was just a hunch, and if I’m critiquing someone else’s use of data, I should probably have my own to counter with. So I headed over to http://esds.ac.uk/international/ and opened up the World Bank macro dataset “World Development Indicators”. After about five minutes of selecting and downloading data, I had the following information:
| Country | Life expectancy at birth (years) | GDP per capita, PPP (2005 international $) |
| Korea, Dem. Rep. | 69 | .. |
As you can see, I also included GDP for comparison. GDP as an indicator of development is massively abused, and is something I think we should be moving away from as much as possible, but for a quick exercise such as this, I think it is an acceptable shorthand for “can the average person afford to get enough food?”
I’ve also included both Koreas, as the original graph-designer somehow, astonishingly, neglected to specify which one they meant. Is it the famously secretive, dictatorial North Korea, with a life expectancy of 69 and not enough data for the World Bank to estimate its GDP (though Wikipedia handily estimates it at $2.4k per capita)? Or is it the democratic, high-standard-of-living South Korea, where you can expect to live to the ripe old age of 81?
Here’s my graph:
Not the most conclusive graph in the world, but then I would say the same for the original, and sadly I’m sure there are many people who took it at face value. I took a couple of quick averages, splitting the countries into left-of-Greece (where we eat too few unrefined plant foods and die of heart disease and cancer) and right-of-Greece (where we eat nothing but vegetables and nobody gets cancer!). I excluded both Koreas from these averages.
Left average life expectancy = 78, GDP = $27k
Right average life expectancy = 73, GDP = $7k
So the question becomes – would you rather die of heart disease at 78, or of something else (starvation, diarrhoea, pneumonia) at 73?
But the thing that grates most about this graph is the apparently random selection of countries. If you have enough data points (e.g. countries), you can pick out just the ones that make it look as though a relationship exists where it doesn’t. So I undertook a similar exercise, and downloaded data for all 220 countries available from the World Bank on forested area (as a percentage of total land area) and lifetime risk of maternal death (%). And behold, I have found a terrible relationship! We must plant trees in order to save the poor mothers!
(I’d like to say that I didn’t spend a lot of time on this graph, but that would be a lie. It’s actually quite engrossing seeing what you can do once you decide your intention is to abuse the data.)
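For anyone who wants to try the trick themselves, here is a minimal sketch of how cherry-picking can manufacture a relationship. It does not use the World Bank data or reproduce my graph: the two variables are made-up random numbers with no link between them, and the names `pearson` and `cherry_pick` are just helpers invented for this illustration. The selection rule (keep the 12 points that sit closest to a straight line) is one assumed way of doing it, not the only one.

```python
import random
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def cherry_pick(xs, ys, k):
    """Indices of the k points closest to the line y = x once both
    variables have been standardised (mean 0, standard deviation 1)."""
    mx, sx = mean(xs), pstdev(xs)
    my, sy = mean(ys), pstdev(ys)

    def distance_from_line(i):
        return abs((xs[i] - mx) / sx - (ys[i] - my) / sy)

    return sorted(range(len(xs)), key=distance_from_line)[:k]


# Synthetic data only: 220 pretend "countries", two unrelated variables.
rng = random.Random(0)  # fixed seed so the run is repeatable
n_countries = 220
x = [rng.gauss(0, 1) for _ in range(n_countries)]
y = [rng.gauss(0, 1) for _ in range(n_countries)]

r_all = pearson(x, y)

picked = cherry_pick(x, y, 12)
r_picked = pearson([x[i] for i in picked], [y[i] for i in picked])

print(f"all {n_countries} countries: r = {r_all:.2f}")
print(f"hand-picked {len(picked)}:   r = {r_picked:.2f}")
```

With all 220 points the correlation should come out close to zero, while the hand-picked dozen should look almost perfectly correlated. Negating one of the variables before picking would produce an equally convincing negative relationship.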
Edited to add:
BadgerBrian points out in the comments that a scatterplot can be a much better visual tool for identifying whether there is a relationship between two variables. His graph here of the GDP and life expectancy of the 12 countries originally mentioned demonstrates this well: there is a strong positive correlation between GDP and life expectancy up to about $25k, after which it flattens out (and the USA does a stellar job of having very high income and quite underwhelming life expectancy!). I’ve not been able to get the graph to display in the comments, so here it is:
In my original comment on Facebook, I used GDP per capita in constant 2000 US$. I’ve changed this to PPP (purchasing power parity), where the dollar amount of GDP is adjusted to reflect how much it actually costs to buy certain products in that particular country.
For example, I argued at an Oxfam meeting last year that income should only be used as a measurement within a broader dimension, “livelihoods”, rather than being a dimension itself, and I will hopefully be using that in the framework for my PhD.