Thursday, 2 November 2017

GRIM

Stop the Gardai
The Gardai are in the news again this week, albeit with a re-airing of a shabby same-old same-old story. In April I mentioned a million breathalyser tests which boosted the activity data-sheet of Gardai across the country but which had not actually been carried out. Since then the Garda Commissioner has been induced to fall on her sword and another 400,000 bits of fake data have been brought up out of the murk. Yesterday saw the publication of an auditor's report by Crowe Horwath commissioned by the Policing Authority which ladled out blame at all levels of the force for chicanery, cover-up, disobedience, lying, false accounting, contempt-of-taxpayer . . . and probably the sin against the holy ghost (I haven't read the whole litany yet). As a small example of the hubris endemic in the police force, the Garda Commissioner [the Boss] ordered her regional managers to carry out an analysis of the extent of the problem in their district. Half of these underlings flipped their boss the bird because they couldn't be bothered to obey [her] orders any more and the rest dragged their feet or submitted mere superficialities.

You want to be careful if you lie. You have to be consistent when people ask you what happened, and consistency is so much easier if you just have to recollect events as well as you can. If your first account is a fabrication you have to get the same details wrong in the same way or people around you will smell a rat. You may be able to carry it off if you can keep the story straight in your mind. What about science? Suppose you aren't getting the result you [and your boss, maybe] are expecting and decide to fudge the data. Real data have certain consistencies, which are hard to imitate in fake data.

We've been here before, Blobbin' 'bout Benford and the distribution of first digits. Science, and policing, are often about recording numbers, and any random scientific paper or sergeant's desk report should have a sample of these in tables or embedded in the text. If these numbers are genuine there will be about 6 times as many starting with the digit 1 as starting with 8, and the same again for 9. If they have been invented out of whole cloth - perhaps to support a cherished hypothesis or to satisfy the Garda bean-counters - then the expected Benford ratio may be distorted as the perp inserts numbers 'at random' and includes too many leading 7s, 8s and 9s and not enough initial 1s and 2s. Of course, an accomplished and numerate fraudster may be aware of Benford and fake his data in a Benford-compliant way. If s/he is prepared to create such an elaborate and internally consistent story you have to ask whether just doing some real experiments / breath-tests and reporting the real results might not be easier.
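For the curious, the "6 times" figure falls out of Benford's law, which puts the expected share of leading digit d at log10(1 + 1/d). Here is a minimal Python sketch of how you might tally leading digits in a pile of reported numbers and set them beside that expectation. It is an illustration only, not anything the auditors are known to have run, and the function names (benford_expected, leading_digit, first_digit_counts) are made up for the purpose.

```python
import math
from collections import Counter


def benford_expected(digit: int) -> float:
    """Expected proportion of numbers whose first digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(value: float) -> int:
    """First significant digit of a non-zero number."""
    # Scientific notation puts the first significant digit at position 0,
    # e.g. 0.0045 -> '4.500000000000000e-03'.
    return int(f"{abs(value):.15e}"[0])


def first_digit_counts(values) -> dict:
    """Count how often each digit 1-9 leads; zeros are skipped."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    return {d: counts.get(d, 0) for d in range(1, 10)}


if __name__ == "__main__":
    for d in range(1, 10):
        print(d, f"{benford_expected(d):.3f}")
```

On these expectations, benford_expected(1) / benford_expected(8) comes out a shade under 6, and the same ratio for 9 a little over 6.5. To decide whether an observed tally is suspiciously far from these proportions you would still need a proper statistical test and a decent sample size; eyeballing a handful of numbers proves nothing.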

But there are other internal consistencies, the violation of which should lead the reader to suspect the findings, the analysis and/or the honesty and competence of the authors. One of these is the GRIM - Granularity-Related Inconsistency of Means - test, which is so simple a child of six could carry it out (with the help of a calculator, maybe). Science is not progressed with a single datum; we all agree that several replications should be carried out so that we have data. Often enough, the average of these replications is reported in the paper. The county-by-county average of breath-tests was certainly sent to headquarters.
The average = arithmetic mean = (sum of the measurements) / (number of measurements).
If the measurements are whole numbers (a count of breath-tests, say) and the number of measurements is stated to be, or can be deduced to be, 6, there are only a few possible endings for the average. IF an N=6 average, rounded to two decimal places, ends in anything other than
x.00 [0/6]     x.50 [3/6]
x.17 [1/6]     x.67 [4/6]
x.33 [2/6]     x.83 [5/6]
THEN something is wrong. It may be a typo, or a mistake, or a straw in the wind that the whole dataset is a fantasy. Of course if (N .ne. 6) but some other total, then the acceptable mean-value endings will be different.
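The N=6 table generalises to any N, and the check is easy to automate. Below is a minimal Python sketch under some stated assumptions: every measurement is a whole number, the mean is handed over as the string that was printed (so "3.50" keeps its trailing zero), and we don't know whether the author rounded halves up or to even, so either is accepted. The names grim_consistent and allowed_endings are my own inventions, not taken from any published GRIM software.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def grim_consistent(reported_mean: str, n: int) -> bool:
    """True if some whole-number total, divided by n, rounds to the reported mean."""
    reported = Decimal(reported_mean)
    # Rounding unit implied by the printed decimals, e.g. '3.17' -> 0.01
    quantum = Decimal(1).scaleb(reported.as_tuple().exponent)
    approx_total = int(reported * n)
    # Only whole-number totals next door to mean * n can possibly match.
    for total in range(approx_total - 1, approx_total + 3):
        exact_mean = Decimal(total) / Decimal(n)
        for rounding in (ROUND_HALF_UP, ROUND_HALF_EVEN):
            if exact_mean.quantize(quantum, rounding=rounding) == reported:
                return True
    return False


def allowed_endings(n: int, decimals: int = 2) -> list:
    """Possible fractional endings of a mean of n whole numbers."""
    quantum = Decimal(1).scaleb(-decimals)
    endings = {
        (Decimal(k) / Decimal(n)).quantize(quantum, rounding=ROUND_HALF_UP)
        for k in range(n)
    }
    return [str(e) for e in sorted(endings)]


if __name__ == "__main__":
    print(allowed_endings(6))
    print(grim_consistent("3.17", 6), grim_consistent("3.20", 6))
```

With N=6 the allowed endings reproduce the little table above, and a reported mean of 3.20 fails because no whole-number total divided by 6 rounds to it. The test only has teeth while N is small compared with the precision reported: at two decimal places, once N reaches 100 or so every ending becomes possible and GRIM can tell you nothing.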

I wish I could get access to all that fake data dreamed up by Gardai across the country. Maybe the boys in blue in Donegal tend to use lucky 7s in their madey-uppy statistics, while the lads from Meath are biased towards deep 6s. It must have been like a magician's convention: "Pick a number, any number, between 1 and 100; don't tell me what it is; but write it down on that breath-test report form and sign at the bottom".  
