Thursday 29 June 2017

Metameasuring

At The Institute there are several exit strategies. We offer one- and two-year Diplomas and Certificates, a three-year Degree and a four-year Honours Degree. The difference between the last two options is a research project. Here each student is given a pick and shovel and a section of the Frontier of Knowledge and is invited to hew at the coal-face of science for a few weeks. It has to be original science - no point in re-inventing the wheel - so the first task is to do a literature survey. Our students' attempts at this task range from poor to appalling. Back in the day, this meant going to the library, finding a recent review of the field and reading that; then reading the references there cited and going down a recursive rabbit-hole until you're reading manuscripts by Isaac Newton [portrait] or Antoine-Laurent de Lavoisier. In other words, it was all retrospective.

In February I wrote about the process of reading the literature rather than just photocopying papers: "Have you tried neuroxing those papers?". There I cited Eugene Garfield for inventing Current Contents and the Science Citation Index (SCI) in the 1960s, which made The Literature much more easily accessible. I regret to report that Dr Garfield [see L, being enthusiastic] died on 26 Feb 2017, two weeks after I wrote that piece; he was 91. Obits: The Scientist - Nature. The Blob is not the place to get The News!

Current Contents was an indexing/abstracting service which allowed you to scan through this week's reports from the Cutting Edge looking for keywords related to your project. It saved you having to scan the Table of Contents (ToC) of 20 key journals each week. More importantly, it allowed you to scan the contents of journals which your institute couldn't afford and send out a reprint request card [prev] to get the one paper in that week's Journal of Molecular Biology that was of interest. The great thing about SCI was that a ramble through the literature could, after its invention, go in both directions: by recording who cited a key earlier paper, you could work forward along another branch of research that was developing parallel to, or diverging from, your own interests. It changed the metaphor of a paper from a leaf on a static twig on a branchlet on a branch on a limb on the trunk of science to a more bushy, interconnected and recursive way of viewing the process.
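For the computationally-minded: what SCI built is, in effect, a directed graph of papers, and its trick was indexing both directions of the arrows. A toy sketch in Python (the paper IDs are invented for illustration; this is my cartoon of the idea, not anything of SCI's actual machinery):

```python
from collections import defaultdict

# Each edge (a, b) reads "paper a cites paper b".
# Hypothetical paper IDs, for illustration only.
edges = [("Smith2015", "Garfield1964"),
         ("Jones2016", "Garfield1964"),
         ("Jones2016", "Smith2015")]

cites = defaultdict(set)     # backward: a paper's reference list
cited_by = defaultdict(set)  # forward: everyone who has cited it since

for citing, cited in edges:
    cites[citing].add(cited)
    cited_by[cited].add(citing)

# The pre-SCI rabbit-hole: follow reference lists back in time.
print(cites["Jones2016"])        # {'Garfield1964', 'Smith2015'}
# The SCI trick: start from an old key paper and walk forwards.
print(cited_by["Garfield1964"])  # {'Smith2015', 'Jones2016'}
```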

SCI had unintended consequences: as well as making it easier to read the literature, it also allowed data-wonks to treat the citations as meta-data and analyse them. You could count the number of times your own papers were cited [my track record is fair to middling] and often be bemused at which had gone viral and which jewels were languishing unpolished. And prospective employers could tally up your citations to see whether your contributions had made a bigger dent in science-space than Dr Aziz's or Dr Zabriski's. Garfield's people also invented the Impact Factor for a journal, initially as an internal metric to help them decide which publications to index. The Impact Factor is essentially the average citation count for a paper in that journal. That persuaded scientists to try for a pub in a journal with a high impact factor in the hope that more clued-in scientists would read it . . . and cite it . . . and boost their employment prospects.
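For the record, the actual calculation uses a two-year window rather than an average over the journal's whole history: the Impact Factor for year Y is citations received in Y by items the journal published in Y-1 and Y-2, divided by the number of citable items published in those two years. A back-of-envelope sketch (the numbers are invented):

```python
def impact_factor(cites_this_year, citable_items_prev_two_years):
    """Impact Factor for year Y: citations received in Y by items
    published in Y-1 and Y-2, divided by the number of citable
    items the journal published in Y-1 and Y-2."""
    return cites_this_year / citable_items_prev_two_years

# Invented numbers: 420 citations in 2016 to the 150 citable
# papers the journal published in 2014 and 2015.
print(impact_factor(420, 150))  # 2.8
```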

These objective metrics are fine as aids to the decision-making process or the progress of science but too often become the tail wagging the dog. If you hire by algorithm you mustn't be surprised if you acquire colleagues who are selfish, sexist, specialist and sententious shit-heads. Impact Factor is a generic metric which generates winner-takes-all harrypotterism; if you follow IF you may edge away from the subset of scientists who work in your [minority] field. If you really want a lot of cites, you should develop a new method rather than discover new primary data: methods papers get widely cited. Garfield himself despaired at the misuse of his tools, castigating the “bibliographic negligence” and “citation amnesia” of scientists too lazy to read the literature they were citing. Have you ever cited a paper which was cited in a paper you scanned for the figures, without reading the original . . . because you were under pressure to get your own stuff out? If you haven't, then the chap in the office next door has. It is partly because life itself is not long enough to read all the relevant papers: every working day in 2016, six new papers on TLR4 were published. If you're a TLR4 groupie you wouldn't have time both to read this leaf-storm (ignoring TLR3 and TLR5 papers) and to do your own work.

I will here give tribs to Ken Wolfe, a walking genius and my one-time boss at TCD, who was overwhelmed with the task of separating the literature wheat from the chaff. He invented Pubcrawler to scan the daily updates to PubMed for keywords in the areas of science in which he was interested, and send him the abstracts. This super-useful tool was developed by Karsten Hokamp as a free public subscription service. Try it? "It goes to the library - you go to the pub"™
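I have no sight of Pubcrawler's innards, but the core of the idea - poll PubMed daily for your keywords, mail yourself the abstracts - is simple enough to sketch with Biopython's Entrez module (the search term and e-mail address are placeholders; this is a DIY approximation, not Hokamp's code):

```python
from Bio import Entrez  # pip install biopython

Entrez.email = "you@example.com"  # placeholder: NCBI wants a contact address

def daily_crawl(term, days=1, retmax=20):
    """Fetch abstracts for papers matching `term` that PubMed
    indexed as published in the last `days` days."""
    handle = Entrez.esearch(db="pubmed", term=term, reldate=days,
                            datetype="pdat", retmax=retmax)
    ids = Entrez.read(handle)["IdList"]
    if not ids:
        return "Nothing new today - off to the pub."
    handle = Entrez.efetch(db="pubmed", id=",".join(ids),
                           rettype="abstract", retmode="text")
    return handle.read()

print(daily_crawl("TLR4"))  # yesterday's TLR4 leaf-storm, as plain text
```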

Our plagiarism policy document at The Institute is called Credit Where Credit is Due, which is what Garfield wanted us all to take on board. I've reflected on that issue as well. PageRank, the Google algorithm, also acknowledges a debt of gratitude to Eugene 'Serial Pioneer' Garfield. There's a really interesting and informative interview with Garfield from 2000.
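The family resemblance is easy to see: where SCI counts incoming citations, PageRank weights each citation by the standing of the citer. A toy power-iteration over the little citation graph sketched earlier (a cartoon of the idea, not Google's implementation):

```python
from collections import defaultdict

# Reusing the invented citation edges from the graph sketch above.
edges = [("Smith2015", "Garfield1964"),
         ("Jones2016", "Garfield1964"),
         ("Jones2016", "Smith2015")]
cites, cited_by = defaultdict(set), defaultdict(set)
for citing, cited in edges:
    cites[citing].add(cited)
    cited_by[cited].add(citing)

def pagerank(damping=0.85, iters=50):
    nodes = set(cites) | set(cited_by)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # A cite from a highly-ranked paper is worth more than one
        # from an obscurity. Dangling papers (no references) leak a
        # little rank here; a real implementation redistributes it.
        rank = {n: (1 - damping) / len(nodes)
                   + damping * sum(rank[c] / len(cites[c])
                                   for c in cited_by[n])
                for n in nodes}
    return rank

print(pagerank())  # Garfield1964 comes out on top, as it should
```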
