I was up an deatach mór - in the Big Smoke last week. No, not an drúcht cheo - the bog smoke - that's something completely different. The Big Smoke is Dublin, what passes for a city (a couple of cathedrals and three universities) in The Republic. I was asked to come up and consult on yet another MSc, which is being launched in TCD next academic year, but I came on an early bus so that I could do Weekly Lab Meeting with my old immunology research group. The Effective in the mill that day was the post-graduate student who is processing the samples of the super-innate women who were exposed to Hepatitis C Virus (HCV) in the anti-D blood scandal of a generation ago. These Rhesus-negative women were, according to the imperfect records of the BTSB, defo given a shot of antibodies against the Rhesus D antigen, which was defo contaminated with HCV . . . yet they are now approaching the pensionable years with not a bother on them. Not a hepatitis-bother anyway.
IF the TCD Comparative Immunology team can find a molecular marker that distinguishes the super-innate women's blood from age-matched controls THEN they have a really strong lead for developing therapies for other people - about 200 million of them across the world - who are carrying HCV and just waiting for jaundice, cirrhosis and an early death. It's a really lovely project for a young chap to get his teeth into: dusty archives, cutting-edge technology, loadsa lovely data. Certainly a change from making chocolates in Kerry.
An interesting principle came up in the discussion: although a lot of scientific experiments are designed to give a Y/N 1/0 +/- on/off red/green answer for each of the cases, some of the samples are more equal than others. Some are yes but others are YES. And whatever your assay, there will be some cases which are incertae sedis: a bit grey and hard to place. There are a couple of ways to deal with such problem cases. The easiest is <Case 1> to call them missing data and dump them from the analysis. The other is <Case 2> to step back from thinking as a standard reductionist scientist and design a more realistic experiment.
<Case 1> That can be a hard call in biomedical research where it is often rather difficult to recruit and retain human contributors as guinea-pigs. At the lab meeting, some data was presented as 'contaminated' or 'not contaminated' [with HCV]. In the middle of the list was a number that looked to me as if the con vs non-con call was wrong. Obviously, I was a blow-in; they had seen these data many times before and decided that was the best decision. Wearing my 'external peer-reviewer' hat, I maintained that it looked bad and said the data could be non-binary: i.e. yes/maybe/no - or - yes/[maybe]/no - or, after trimming, plain yes/no, having put aside the niggling number even at the cost of reducing the sample size. But you want to be really careful of p-hacking your data: parsing it in a couple of different permutations until you get a result with which you feel comfortable.
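For anyone who likes to see that sort of thing written down, here is a minimal sketch of a yes/maybe/no call in Python. Everything in it is invented for illustration: the readings, the two cut-offs and the function name `call_sample` have nothing to do with the lab's actual assay. The one point it tries to make is that the grey zone is fixed before anyone looks at the outcome, which is what keeps the trimming from sliding into p-hacking.

```python
from collections import Counter


def call_sample(reading, lower, upper):
    """Three-way call: 'no' below lower, 'yes' above upper, 'maybe' in between."""
    if reading < lower:
        return "no"
    if reading > upper:
        return "yes"
    return "maybe"


# Made-up assay readings and cut-offs, for illustration only.
readings = [0.05, 0.08, 0.11, 0.48, 0.86, 0.91, 0.95]
LOWER, UPPER = 0.3, 0.7  # assumed to be fixed before looking at the outcome

calls = [call_sample(r, LOWER, UPPER) for r in readings]
tally = Counter(calls)

# Trimming: set the grey cases aside and accept the smaller sample.
trimmed = [c for c in calls if c != "maybe"]

print(tally)
print(f"kept {len(trimmed)} of {len(calls)} samples after setting the grey ones aside")
```

Reporting both the full tally and the trimmed one is more honest than quietly choosing whichever gives the nicer answer.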
<Case 2> Traditional scientific experiments are set up so that all the possible and imaginable variables except one are kept constant between two samples: treatment vs control. After the experimental procedure, the results are tallied up. If all the individuals under treatment go blue while all the controls stay pink then you have a paper. If it's 90:10 vs 10:90 then you might have a paper - it depends on the sample size and you will need to do a statistical test. But that sort of paper might mean nothing at all in the real world because nothing exists in a vacuum and no effect happens isolated from the maelstrom of physical and biochemical interactions in a cell. It's better to set up an experiment in a way that more closely mimics the booming buzzing confusion of the real world; measure a bunch of variables and apply an appropriate form of multivariate statistical analysis [whc prev]. Because you are making a lot more measurements, each one contributing a certain amount of statistical noise, you need to recruit a bigger sample in the first place.
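To put a number on "it depends on the sample size", here is a rough sketch of one statistical test you might reach for with a blue/pink tally: Fisher's exact test on a 2x2 table. That choice is mine, not something settled at the lab meeting, and the counts below are hypothetical. The function `fisher_exact_two_sided` is written out with the standard library only, so nothing needs installing.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are treatment and control; columns are blue and pink.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x blues in the treatment row.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the one observed.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical tallies: (blue, pink) under treatment, then (blue, pink) in controls.
print(fisher_exact_two_sided(9, 1, 1, 9))      # 90:10 vs 10:90, ten per arm
print(fisher_exact_two_sided(7, 3, 3, 7))      # 70:30 vs 30:70, ten per arm
print(fisher_exact_two_sided(70, 30, 30, 70))  # 70:30 vs 30:70, a hundred per arm
```

With ten per arm, the 90:10 split comes out at about p = 0.001 but the 70:30 split only manages about p = 0.18. The same 70:30 proportions with a hundred per arm give a vanishingly small p-value. Same effect, different sample, different paper.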
And there we are again back to the key point about modern science: to mean anything, the results need to be robust and reproducible [see Brian Nosek] and that probably means the biggest sample you can afford. If you can't afford a big enough sample, you should return your grant money before you spend it so it can be consolidated with other flitters and straws of cash and somebody can do some decent science.
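How big is big enough? A back-of-the-envelope answer, assuming the outcome is a simple proportion in each of two groups, is the textbook normal-approximation formula for comparing two proportions. The sketch below uses a two-sided alpha of 0.05 and 80% power, which are conventional defaults rather than anything from the HCV project, and the function name `n_per_arm` is my own. Treat it as a first estimate, not a substitute for a proper power analysis.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group to tell proportion p1 from p2.

    Normal approximation for two independent proportions, two-sided test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical effect sizes: the subtler the difference, the bigger the bill.
print(n_per_arm(0.60, 0.40))
print(n_per_arm(0.55, 0.45))
```

Halving the difference you hope to detect roughly quadruples the number of people you need to recruit, which is where the grant money goes.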