Thursday 6 February 2014

Ascertainment Bias

I mentioned ascertainment bias (AB) a couple of days ago as an explanation of why people persist in believing that St Blaise will cure esophageal cancer with prayers and a couple of candles.  They believe because, for the many, many times the cure fails, nobody notices; but if someone goes into remission on the 3rd of Feb then it makes the tabloids and the ticker on Fox News, and shares in International Candles Inc go UP on the Kiev stock exchange.  While making sure of the definition, I came across a great classroom exercise from Steven Carr at Memorial U. in Newfoundland to show the effect of AB.  This is what you do:

Ask a sample of women to record the sexes of everyone in their sib-group (brothers and sisters) including themselves and then total the number of brothers and the number of sisters recorded by all participants.  You should finish up with a biased sex-ratio - there should be more women.  If you ask the reciprocal question of a group of men, there should be an excess of blokes.  You may now dispute that premise - you may claim that there is no such effect.  You may also try to figure out why there should be an effect such as I've described.  But I should first ask you: what is the sex-ratio in normal human populations?  RA Fisher did the math 80 years ago to show that in any population under the action of natural selection (that's ALL populations, Middle America, even Southern Baptists) the sex-ratio must be close to 1:1.  The fact that the sex-ratio is 1:1 is not, from many surveys of students over the years, obvious to all thinking people, whatever about any explanation.

With me, the thought of a "good classroom exercise" is the deed. I've explained the concept to my Mon and Tue Yr1 Biology classes this week and got them to write up the data from their families onto the lab white-board: red for girls and blue for boys of course.  The alternative was to make drawings of a number of preserved specimens of insects, so there was a queue for the pens and I got my data.

          Women's families    Men's families
          Boys    Girls       Boys    Girls
Mon          6       11         16        4
Tue          4       12         12       10
Total       10       23         28       14

This departure from equal ratios is very much statistically significant (ChiSq = 9.7, p < 0.01). I didn't think that the effect would be so visible in such tiny samples.  There are only 10 students in the Tuesday class and 15 on Monday.  But for the total, there is a 2:1 ratio in the expected direction for women's families and for men's.  One of the students pointed out (before we gathered the data) that if you don't bin your respondents by sex, you should get the expected 1:1 ratio and (data incoming):
boys (10+28) = 38 versus girls (23+14) = 37
it is embarrassingly close.
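
For anyone who wants to check the arithmetic, here is a minimal sketch of a chi-square test of independence on the totals. It assumes the four totals are boys and girls in women's families (10, 23) and in men's families (28, 14), and that the test is a plain 2x2 test with no continuity correction; the post does not say exactly how the statistic was computed, so treat both as assumptions. The function name `chi_square_2x2` is made up for illustration.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table, no continuity correction (an assumption)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if respondent sex and sib sex were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: women's families, men's families. Columns: boys, girls (assumed order).
counts = [[10, 23], [28, 14]]
chi2, p = chi_square_2x2(counts)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

The p-value uses the identity that, for one degree of freedom, the upper tail of the chi-square distribution equals `erfc(sqrt(x / 2))`, which avoids needing anything beyond the standard library.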

So why does it work?  It works much better now than 100 years ago in Ireland when families were a lot bigger.  It worked perfectly black and white until very recently in China where couples were limited to one child.  If you ask women in such a society to record the sex of all the children of their parents you'll get a sex ratio of  0:1; and if you ask men it will be 1:0.  It works for larger families too but less grossly.
Q. What possibilities are there for sex-distribution in families of two?
A. Three: all boys, all girls and one-of-each.
If you ask women about their families they'll never reply that their family is all boys.  If you ask enough of them, you get a sex-ratio of 2F:1M - because the three possibilities for families which include at least one woman are FF, FM and MF.  One-of-each is twice as common as either all girls or all boys, because girl-then-boy and boy-then-girl are equally likely events.  This last fact is also not obvious to all thinking people.
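
The counting argument can be sketched by brute force: list every equally likely birth order for a family of a given size, throw away the families that contain nobody of the respondent's sex, and total up the girls and boys in what is left. The function name `reported_counts` and its parameters are invented for this illustration. By default each qualifying family is counted once, as in the argument above; the `per_respondent` switch is an extra assumption that counts a family once for each member who could have been asked (so FF is reported twice), which makes the excess larger still.

```python
from itertools import product

def reported_counts(family_size, respondent_sex="F", per_respondent=False):
    """Return (girls, boys) totalled over families containing the respondent's sex."""
    girls = boys = 0
    for family in product("FM", repeat=family_size):
        k = family.count(respondent_sex)
        if k == 0:
            continue  # nobody of the respondent's sex here, so this family is never reported
        weight = k if per_respondent else 1
        girls += weight * family.count("F")
        boys += weight * family.count("M")
    return girls, boys

print(reported_counts(1))                       # one-child families, women asked: (1, 0)
print(reported_counts(2))                       # FF, FM, MF: (4, 2), i.e. 2F:1M
print(reported_counts(2, respondent_sex="M"))   # MM, FM, MF: (2, 4)
print(reported_counts(2, per_respondent=True))  # FF counted twice: (6, 2)
```

Running it for larger `family_size` values shows the bias shrinking as families get bigger, which is the "less grossly" part.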

That is indeed a pretty cool classroom exercise, a bit counter-intuitive, a bit of real data and a bit of hypothesis to test - science in fact!


  1. Nice data set. Classroom mechanics intrude.

    (1) The sex ratio in most intro biology classrooms will be biased towards women, often strongly. This biases the expected end result, that total numbers of male and female sibs should be 1:1. [You didn't mention the sex ratio in your 25 students].

    (2) In a large lecture hall, toting up sibships for 100+ students is tedious. So, ask all women to stand, take a count, then ask all women with brothers to sit. Repeat head count, ask all women with sisters to sit, then count all remaining standees, who are female only-children. The AB point is made swiftly and intuitively, semi-quantitatively, and participatively.

    (3) Fisher's math applies to the primary sex ratio (at conception, or birth), and bright lads and lasses will observe that there are more old ladies than old men. But it always corrects itself in the next generation.

    (4) Newfoundland has a history of Irish Catholic and English Protestant settlement, the former with large families and the latter with small. This persists (you will be quizzed by parents of your date on this point). Protestants *tend* to stop after a first-born boy, less so after a girl, Catholics just keep on going. The advanced student may model sex ratio with a 1:1 birth expectation, and full stop after the first boy.

    Steve C. (Memorial U: three-generation ex William from Co Fermanagh. See if you can guess)
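
The model suggested in point (4) can be sketched as a small simulation: every birth is a fair coin toss, and each family stops at its first boy. The function name, the cap of 20 children per family and the fixed seed are all assumptions made for illustration, not anything from the comment.

```python
import random

def simulate_stop_after_first_boy(n_families, max_children=20, seed=0):
    """Simulate families that keep having children until the first boy arrives."""
    rng = random.Random(seed)
    boys = girls = 0
    for _ in range(n_families):
        for _ in range(max_children):  # cap is an assumption to keep families finite
            if rng.random() < 0.5:     # 1:1 birth expectation
                boys += 1
                break                  # full stop after the first boy
            girls += 1
    return boys, girls

boys, girls = simulate_stop_after_first_boy(100_000)
print(boys, girls, round(boys / girls, 3))
```

Because every birth is still an independent 50:50 event, the stopping rule changes the shape of individual families (never more than one boy, sometimes a run of girls) but should leave the ratio over the whole population close to 1:1.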

  2. Reviewing this while going through old email: the other means to overcome the classroom mechanics is simply to come prepared with a set of "data" that demonstrates the point (if this makes you squeamish, use last year's data) and appear to do the calculations cold, while having the answers available. NB: after years of students apparently drinking the reagents for the PCR lab, I have shifted to having each component of the rxn mix be colored H2O. Use indicators if you like. Put the tubes in a PCR machine adjusted to do 10 turbo cycles in 30 min, so that everyone can observe the temperature shifts, then whisk the results to another room and replace with pre-dispensed size markers. Students run these out in gels. Announce one of the bands as the PCR product, and have students estimate size from the remainder. Change the band chosen from lab to lab, or year to year. [You can of course pre-prepare an actual PCR amplification product, and run it in combination with or parallel to the size markers].

    Steve C

  3. ... and while I'm at it. The first PhD student of the late great population geneticist Theodosius Dobzhansky was Chinese. He returned to China at the beginning of the Sino-Japanese War, and was assigned to do medical checkups of army recruits. Thinking to make the checkups do double duty, he also asked about 'tongue rolling', the archetypal ability to roll the tongue into a 'U' shape which at the time was considered a simple Mendelian trait. Oddly, recruits were 'rollers' at a much lower rate than in other populations examined, and he thought he had something. The Ascertainment Bias was realized only years later: recruits, being asked if they could roll their tongues, thought that inability to do so would make them unfit for military service (like flat feet), so they lied. More recent counts show that Han Chinese show typical incidence; the trait is now known not to be a simple Mendelian character.

    1. Thanks for all that Steve. It's a bit like all the data about "all humanity" which is just the responses of WEIRD undergraduates who happen to take psych and economics classes when the profs are getting the chains jangled for more research outputs.