Thursday, January 21, 2010

Dramatized Data

When I was in the sixth form, rather a long time ago, my English teacher asked the class if anyone had read Rachel Carson's Silent Spring. I turned out to be the only person who had, so the teacher asked me what I thought of it. I said that I thought the data were convincing. Her response was that she had been impressed by the power of the writing. I wonder if this reflects a widespread difference in how people react to data. Are there those who ask ‘what do these data tell me?’ and those who ask ‘how do I react to these data?’ One group is puzzled by being presented with data that seem incomplete, irrelevant or lacking sufficient context to draw conclusions; the other can’t understand why people are pedantically pointing out the limitations of the presented data when, to them, the statistics are obviously appalling and clearly demonstrate that something should be done.

An example of data presented to elicit an emotional response rather than to diagnose a problem is the statement that 47% of chemistry graduates are women but only 6% of chemistry professors are. What does this tell us? We would expect the percentage of women among professors to match the percentage of women among new graduates only if the system were in a steady state and had been for around forty years. This is clearly not the case. Current professors are largely drawn from a pool of people who were undergraduates 25-45 years ago, i.e. from about 1965 to 1985. With no further information we cannot tell whether these figures show that few women studied chemistry 25-45 years ago or that those who did were much less likely to become professors of chemistry than their male counterparts. In fact this statement is not even very successful at eliciting an emotional response, since it is unclear whether we should rejoice that the proportion of women among those studying chemistry is much higher than it used to be or bemoan that the low proportion of women among chemistry professors indicates that many women in chemistry are not fulfilling their potential.
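To make the ambiguity concrete, here is a minimal sketch in Python. Only the 6% figure comes from the statement above; the 30% cohort share and the function names are hypothetical, chosen purely for illustration. It shows that two quite different histories produce exactly the same headline number.

```python
def women_share_of_professors(cohort_share_women, relative_rate):
    """Share of women among professors, given the share of women in the
    historical undergraduate cohort and the rate at which women reached a
    chair relative to men (1.0 means equal rates)."""
    women = cohort_share_women * relative_rate
    men = 1.0 - cohort_share_women
    return women / (women + men)


def required_relative_rate(cohort_share_women, professor_share_women):
    """Relative promotion rate that would turn a given cohort share into a
    given share of professors (the inverse of the function above)."""
    w, s = cohort_share_women, professor_share_women
    return s * (1.0 - w) / (w * (1.0 - s))


# Scenario A (hypothetical): few women studied chemistry, equal promotion rates.
share_a = women_share_of_professors(0.06, 1.0)

# Scenario B (hypothetical): 30% of the cohort were women, but they were far
# less likely than men to become professors.
rate_b = required_relative_rate(0.30, 0.06)
share_b = women_share_of_professors(0.30, rate_b)

print(f"Scenario A: {share_a:.0%} of professors are women")
print(f"Scenario B: {share_b:.0%} of professors are women "
      f"(relative promotion rate {rate_b:.2f})")
```

Both scenarios yield 6%, so the figure alone cannot tell us which story, or which mixture of the two, is closer to the truth.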

Another example is the institutional gender pay gap. In February 2009 the University of Cambridge published an Equal Pay Review which noted that the ‘% Difference’ between average pay for women and average pay for men was 32%. Sounds terrible, doesn’t it? Especially when the national gender pay gap (calculated using mean hourly pay, excluding overtime) is 16%. The first thing to notice is that these percentages are not calculated on the same basis. The Cambridge figure tells us that the mean pay for men is 32% greater than the mean pay for women, while the Office for National Statistics figure tells us that the mean pay for women is 16% less than the mean pay for men. Comparing on the same basis gives either 24% (Cambridge) and 16% (national) using the mean pay for men as the reference, or 32% (Cambridge) and 20% (national) using the mean pay for women as the reference. If it were the case that women were paid 24% less than men for doing the same, or equivalent, work, this would, indeed, be appalling. But this is not the case. The analysis by grade shows minimal differences between men and women. What the Cambridge gender pay gap actually tells us is that the University pays its 1500 or so, predominantly male, academic staff rather more than it pays its 1500 or so, predominantly female, clerical and secretarial staff. Even if the academic staff were 50:50 male and female, this still would not remove the pay gap, because of the very high percentage of women among lower paid support staff. Redressing this imbalance will require an intelligent analysis of the issues and a long-term, sustainable strategy, not knee-jerk reactions to statistics.
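The arithmetic behind this can be sketched in a few lines of Python. The 32% and 16% figures and the staff totals of roughly 1500 come from the text above; the salaries, the split of men and women within each group, and all function names are hypothetical and chosen only to illustrate the effect. Note that converting the rounded 16% figure gives about 19% rather than 20%; the 20% quoted above presumably reflects the unrounded source figures, which is an assumption on my part.

```python
def to_men_basis(gap_vs_women):
    """'Men earn x more than women' -> 'women earn y less than men'."""
    return gap_vs_women / (1.0 + gap_vs_women)


def to_women_basis(gap_vs_men):
    """'Women earn y less than men' -> 'men earn x more than women'."""
    return gap_vs_men / (1.0 - gap_vs_men)


def mean_pay(headcounts, pay):
    """Mean pay over staff groups, weighted by headcount."""
    total = sum(headcounts[g] * pay[g] for g in headcounts)
    return total / sum(headcounts.values())


def pay_gap(men, women, pay):
    """Pay gap with mean pay for men as the reference."""
    return 1.0 - mean_pay(women, pay) / mean_pay(men, pay)


print(f"Cambridge, men as reference: {to_men_basis(0.32):.1%}")
print(f"National, women as reference: {to_women_basis(0.16):.1%}")

# Hypothetical salaries: everyone in a group is paid the same, so the
# gap within each group is exactly zero.
pay = {"academic": 50_000, "support": 25_000}

# Hypothetical headcounts: 1500 staff per group, as in the text, with an
# assumed 2:1 split in each direction.
men = {"academic": 1000, "support": 500}
women = {"academic": 500, "support": 1000}
print(f"Overall gap as is: {pay_gap(men, women, pay):.1%}")

# Same support staff, but academic staff now 50:50.
men_5050 = {"academic": 750, "support": 500}
women_5050 = {"academic": 750, "support": 1000}
print(f"Overall gap with 50:50 academics: {pay_gap(men_5050, women_5050, pay):.1%}")
```

With these made-up numbers the overall gap is substantial even though pay within each group is identical, and it shrinks but does not vanish when the academic staff are split evenly, because women still make up most of the lower paid group.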

What these two examples have in common is that each attempts to summarize a complex, multi-factor situation in one or two numbers. Perhaps this is necessary in some contexts, but these ‘sound-bite statistics’ should be backed up by proper data and analysis.

Does it matter? After all, people do not generally take action based on a rational analysis of data. If quoting ‘dramatized data’ inspires action then surely that is a good thing. This approach is not without its dangers. One is that the response to ‘something must be done’ is ‘we must be seen to be doing something’ whether it is relevant and effective or not. The other is that those whose response to data is to ask ‘what does this tell me’ will conclude that the answer is nothing and that there is, therefore, no need to take any action. They may also conclude that if people are quoting unconvincing data it is because there are no convincing data. Which would be a pity because when you look at the percentage of women among academic staff in a school and discover that it has barely changed in a decade you can’t help thinking that there genuinely is a problem.
