Science matters: Sexiest Protein Competition

Wednesday, 16 December 2015

Sexiest Protein Competition

PubMed is a database of the scientific literature. A key element of science is endeavouring not to unwittingly reinvent the wheel by doing an experiment that someone has already carried out. Replicating another group's experiments by design is another matter, and is carried out less frequently than might be desirable. So before you launch your research project you should read the scientific papers that have appeared on the subject. This will stop you getting a red face from being seen to copy someone else's work as if your contribution were totally novel. Reading will also fill in the gaps in your knowledge, give you inspiration and food for thought, and help you see places where you and your students can usefully make a contribution. But b'gob you cannot read every paper ever written: there are 26.7 million papers indexed in PubMed, of which 1.15 million came out this year.
You had better learn how to use PubMed effectively so that a) you get to read, or at least scan, all the papers of interest and b) you don't have to trudge through lots of irrelevant off-topic material to locate the jewels. Years ago, I wrote a manual called Better PubMed, and I've updated it periodically. At the end it points out some of the hilarious blunders that lurk in this all-encompassing database, such as the number of papers which include both psuedogene AND pseudogene in the same Abstract. You can't do anything about that but you can find:

papers published out of Institutions in Waterford: waterford [AD]
the couple of papers published by Dr Mouse ignoring the couple of million papers published about The Mouse Mus musculus: mouse [AU]
papers published by Dr S Bob: Bob S [AU]
papers published in the noughties: 2000:2009 [PDAT]
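
For anyone who would rather script these searches than type them into the search box, here is a minimal sketch in Python. It only builds the tagged queries listed above and the matching URLs for NCBI's E-utilities ESearch service; it does not send anything over the network. The helper names `tagged` and `count_url` are made up for illustration, and asking for `rettype=count` with `retmode=json` is just one reasonable way to get a bare hit count back.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def tagged(term: str, field: str) -> str:
    """Attach a PubMed field tag to a search term (hypothetical helper)."""
    return f"{term} [{field}]"


def count_url(query: str) -> str:
    """Build, but do not send, an ESearch URL that asks only for the hit count."""
    params = {"db": "pubmed", "term": query, "rettype": "count", "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


# The four example searches from the list above.
queries = [
    tagged("waterford", "AD"),    # affiliation
    tagged("mouse", "AU"),        # author called Mouse, not the animal
    tagged("Bob S", "AU"),        # author Dr S Bob
    tagged("2000:2009", "PDAT"),  # publication date range
]

for q in queries:
    print(q, "->", count_url(q))
```

Pasting one of the printed URLs into a browser should return the number of matching records, but treat that as something to check for yourself rather than a promise.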

I've been on about PubMed before insofar as it exposes a pernicious HarryPotterism at the heart of science. Aled Edwards in Canada has made a devastating analysis of this funding-fondling problem. Scientists don't study what's important so much as they study what other scientists are working on. Some areas, some genes, some proteins 'get legs' and sweep all before them, leaving a lot of orphan genes weeping for lack of attention in the corners. How to encourage students on, say, a Masters of Imm course to find out how to use PubMed effectively? Why, run a competition, of course, with a small bag of Werther's Original butter candies on offer. I asked them all to bring to class the name of an immune Protein-of-Interest on which they would be carrying out their molecular evolutionary analyses.

The first step in any research project is to discover what the competition is doing . . . by reading the literature . . . using PubMed to open the door to these data. I suggested that we could look into the hypothesis that some proteins/genes were more "sexy" [as in hot, current, trendy] than others.
Q. How to measure that?
A. Count the number of publications about Protein "P"; then count the number that have appeared since, say, Jan 2014. Divide the latter by the former et voilà! you have a Sexy Quotient.
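
The recipe is simple enough to write down as code. The sketch below is one way to do it in Python: `sexy_quotient` does the division, and `count_queries` spells out the two PubMed searches whose hit counts feed into it. Both names are invented here, and the `2014:2015` date window is an assumption standing in for "since Jan 2014" at the time of writing.

```python
def sexy_quotient(total: int, recent: int) -> float:
    """Fraction of all papers on a protein that appeared in the recent window."""
    if total <= 0:
        raise ValueError("total must be a positive number of papers")
    if not 0 <= recent <= total:
        raise ValueError("recent must lie between 0 and total")
    return recent / total


def count_queries(protein: str, since_year: int = 2014, until_year: int = 2015):
    """Return the (all-time, recent) PubMed queries for one protein.

    The year window is an assumption; adjust it to taste.
    """
    all_time = protein
    recent = f"{protein} AND {since_year}:{until_year} [PDAT]"
    return all_time, recent


# Worked example using the NLRP3 counts from the table in this post:
# 2173 papers in total, 1024 of them recent.
print(count_queries("NLRP3"))
print(f"{sexy_quotient(2173, 1024):.2f}")  # prints 0.47
```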

Protein	PubMed	Recent	Sexy Qt	Protein	PubMed	Recent	Sexy Qt
NLRP3	2173	1024	0.47	CD47	870	157	0.18
IL28B	1249	462	0.37	CTLA4	5647	882	0.16
CD3g	39	12	0.31	ERBB2	22375	2691	0.12
NFKBIA	212	64	0.30	CD56	7874	941	0.12
IL8	2586	712	0.28	p53	78696	9351	0.12
NFKB1	781	195	0.25	iKba	69	8	0.12
MyD88	5031	1239	0.25	CCR5	8208	871	0.11
STAT3	14356	3518	0.25	CD154	7209	584	0.08
TLR4	13752	3345	0.24	EBAG9	164	8	0.05
RAG1	1417	273	0.19	recA	6351	260	0.04

I've sorted the chosen proteins by hotness, and there turns out to be an order of magnitude of difference between Princess and Cinderella. Why might this make a difference? You can see that some widely cited proteins, like TLR4 and STAT3, are really going off the boil, while NLRP3 and IL28B are on the up-and-up. If you have a choice, I suggest you are going to pull down more grant money and find it easier to publish in Nature if you devote your time to Sexy Proteins rather than to tired old dowager proteins.
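
The sorting and the "order of magnitude" claim can be checked with a few lines of Python. This is only a sketch: the counts are copied from the table above, and the variable names are invented for the purpose.

```python
# (protein, total PubMed papers, papers since Jan 2014), copied from the table.
counts = [
    ("NLRP3", 2173, 1024), ("IL28B", 1249, 462), ("CD3g", 39, 12),
    ("NFKBIA", 212, 64), ("IL8", 2586, 712), ("NFKB1", 781, 195),
    ("MyD88", 5031, 1239), ("STAT3", 14356, 3518), ("TLR4", 13752, 3345),
    ("RAG1", 1417, 273), ("CD47", 870, 157), ("CTLA4", 5647, 882),
    ("ERBB2", 22375, 2691), ("CD56", 7874, 941), ("p53", 78696, 9351),
    ("iKba", 69, 8), ("CCR5", 8208, 871), ("CD154", 7209, 584),
    ("EBAG9", 164, 8), ("recA", 6351, 260),
]

# Sexy Quotient = recent / total, sorted hottest first.
ranked = sorted(
    ((name, recent / total) for name, total, recent in counts),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, quotient in ranked:
    print(f"{name:8s}{quotient:.2f}")

# How far apart are the top and bottom of the league table?
spread = ranked[0][1] / ranked[-1][1]
print(f"{ranked[0][0]} is {spread:.1f} times hotter than {ranked[-1][0]}")
```

With these numbers NLRP3 comes out on top and recA at the bottom, with a ratio a little over ten between them.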
