Tuesday 17 March 2015

Cata data

I'm sure you're all dying to hear the results of the Irish Times Cat Genetics Survey that was launched at the end of January. 30 years ago, I did that sort of thing for a living and wrote a PhD thesis on the cats of New England and the Canadian Maritime Provinces with a special focus on polydactyl cats.  From more than two years of field-work, I had the relevant information on about 10,000 cats from New York to Newfoundland.  The plain people of Ireland, mobilised and empowered by the Irish Times, have delivered another 10,000 cats into the mill for analysis.  That's an important proof of principle - if the topic is engaging then people will engage.

But is the data any good?  How would we know?  A couple of weeks ago, when Dick Ahlstrom, the IT's science correspondent, sent me a fat Excel data-sheet with 10,000+ cats from all over Ireland, I gave a little moan of pleasure and did a first-pass analysis as a quality control (QC) check. Last Thursday, he reported the results in a piece that finishes up with an unintelligible statement by a so-called expert called Lloyd.  What he was trying to say was that the dataset had failed a key QC test.  I'll see if I can explain.

It hinges on the fact that, in mammals, sex is determined by the presence of XY chromosomes [males] or XX chromosomes [females] in the fertilised egg. Not to outrageously simplify reality, if you have a Y chromosome then SRY, one of the very few genes on the diminutive Y, will program the cells of the "primordial gonadal ridge" to develop into testes which over the next several months migrate down and almost out.  If you have XX, these cells migrate less far and become ovaries. The X chromosome is about 5% of the whole genome and has many genes including those that cause haemophilia [previously], one form of muscular dystrophy [previously] and red-green colour-blindness  . . . and in cats the gene variant that causes orange fur to develop.  Having two copies of all these (and hundreds more) genes in females would have an impact on the delicate business of development which requires balancing the effects of many genes. That the 'dosage' of genes is important is indicated by the suite of symptoms shown in Down Syndrome where there is an extra Chr21.  Early in pregnancy, in each and every cell of the female fetus, one of the X chromosomes is switched off: it could be either the X inherited from the mother or the paternal X and this 'decision' is made at random; girls thus have one functional copy of all those genes just like boys. If a kitten has inherited 'orange' from her mother and 'non-orange' from her father she will grow up tortoiseshell, with her fur having random blobs of the two colours [Above R: cutiness factor 8.5].  Clearly male kittens can be either orange or not, because they only have one X chromosome . . . unless they have Klinefelter's syndrome, XXY, in which case you'd expect them to have rather small testes and a suite of other abnormalities  . . . but who's looking? Lesson on sex-linkage aka inheritance of genes carried on the X.

Klinefelter's occurs in about 1:1000 live births but not all Klinefelter cats are going to be torties because that requires the additional condition of having parents of opposite colours. Let's say that we expect 2/10,000 tortoiseshell males in our dataset.  How many are reported?  244!!  The data:
It's clear that John and Mary O'Phobail needed better instruction on what 'tortoiseshell' looks like, possibly by showing them pictures [above R].  A standard way of dealing with that anomaly is to exclude 'equivocal data' and assume the rest of the observations are sound.  On that basis we conclude that the frequency of Orange in male cats in Ireland in 2015 is 950/(950+3050) = 0.24.  This is much higher than any of the previous values calculated by trained geneticists 40 years ago, which range from 0.03 in Waterford to 0.17 in Donegal with Dublin, Dundalk, Limerick and Galway intermediate.  We can probably conclude that The Crowd doesn't really know what geneticists mean by 'orange' and has included a bunch of sandy tabbies in its N=950. At this stage I should give up but I did persist in seeing if the female data was consistent with that from males.  It sort of is, but here there is a disconcerting and statistically significant under-reporting of tortoiseshell in females.
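
For anyone who wants to poke at the arithmetic, here is a minimal Python sketch of the male-side sums. The counts are the ones quoted in this post. The function names are made up for illustration, and the step from males to females assumes random mating (Hardy-Weinberg proportions), which is a textbook assumption and not necessarily the exact check I ran on the spreadsheet. The female colour counts aren't reproduced here, so the sketch stops at the expected proportions.

```python
# Male counts as quoted in the post.
MALE_ORANGE = 950
MALE_NON_ORANGE = 3050
MALE_TORTIE = 244  # the 'equivocal' reports, set aside for the frequency estimate


def orange_frequency(orange, non_orange):
    """Males carry one X, so the orange allele frequency is a simple head-count."""
    return orange / (orange + non_orange)


def reported_per_10000(count, total):
    """Rate of a reported phenotype, scaled to 'per 10,000 cats'."""
    return 10000 * count / total


def expected_female_proportions(q):
    """ASSUMPTION: random mating (Hardy-Weinberg), q = orange allele frequency.

    Females carry two X chromosomes, so they can be orange (two orange
    alleles), tortoiseshell (one of each) or non-orange (none).
    """
    return {
        "orange": q * q,
        "tortoiseshell": 2 * q * (1 - q),
        "non_orange": (1 - q) ** 2,
    }


q = orange_frequency(MALE_ORANGE, MALE_NON_ORANGE)
all_males = MALE_ORANGE + MALE_NON_ORANGE + MALE_TORTIE
tortie_rate = reported_per_10000(MALE_TORTIE, all_males)

print(f"orange frequency in males: {q:.2f}")
print(f"male torties reported per 10,000 males: {tortie_rate:.0f}")
for colour, share in expected_female_proportions(q).items():
    print(f"expected share of {colour} females: {share:.3f}")
```

The male tortie rate comes out in the hundreds per 10,000, against the 2 per 10,000 guessed at above, which is the QC failure in a nutshell.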

Even the sex-ratio [F: 4786; M: 4244] is a bit squiffy with an excess of females far outside the statistical expectation of 1:1 ratio unless you're dealing with human centenarians where ancient grannies are far more common than old grandfathers.  You could hypothesise that the cats reported, with their excess of females, include a disproportionate number of [grossly obese, coddled, indoor-only] aged female cats like the blob that shares my outlaws' home. But I don't believe it; it's more likely that folks just don't know how to sex a cat [lift the tail, folks].  My one escapade in writing a computer program in COBOL was a contract from The Cat Fancy Association of America to see if they could detect fraudulent pedigrees from the genetic data declared for sire-dam-offspring trios.  A pilot study revealed a 7% error rate!  That showed me that professional cat-breeders hadn't a clue about genetics and couldn't look with sufficient care at their own cats to see what colour they were. In summary, we cannot, with 20/20 hindsight, expect to generate reliable data from owners' reports without a bit more training in the difficult cases.
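
To put a number on "far outside the statistical expectation", here is a rough sketch of a chi-square goodness-of-fit test of the reported sex-ratio against 1:1. The counts are the ones above. The function name is invented for illustration, and the choice of a one-degree-of-freedom chi-square test is my assumption about a reasonable way to do it, not a record of the original analysis.

```python
import math

# Sex counts as quoted in the post.
FEMALES = 4786
MALES = 4244


def sex_ratio_chi_square(females, males):
    """Chi-square goodness-of-fit against a 1:1 ratio (1 degree of freedom)."""
    expected = (females + males) / 2
    chi2 = ((females - expected) ** 2 + (males - expected) ** 2) / expected
    # With 1 degree of freedom the chi-square statistic is a squared
    # standard normal, so the tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = sex_ratio_chi_square(FEMALES, MALES)
print(f"chi-square = {chi2:.2f}, p = {p_value:.1e}")
```

The statistic works out at about 32.5, against a 5% critical value of 3.84 for one degree of freedom, so the excess of females is very unlikely to be a sampling accident.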

BUT, I think we can assume that the errors are systematic.  It would be a stretch to imagine that punters in Carlow were better at cat genetic diagnosis than their city cousins in Cork, so we can probably go forward to establish relative frequencies of the genetic variants, county by county, across the country. There's a BT Young Scientist's project in there, and I have comparative data on hundreds of sites across the World. Go to!
