I've riffed before on Pointless - the TV quiz game where success is when you can give a correct answer which nobody else has picked. If the question is "Name a female scientist who contributed to biomedical science in the late 20thC" then Margaret Dayhoff will be a winning Pointless answer. The answer to "Which pair of scientists made the first contribution to cracking the genetic code?" is not "Crick and Watson" - they 'just' gave us the physical structure of DNA. It is rather Nirenberg and Matthaei who in 1961 determined that UUU codes for phenylalanine. That was the first codon assignment. The rest tumbled into place over the next 4 years, revealing that 20 amino acids are the basic inventory from which all proteins - all the enzymes, all the receptors, actin & myosin, haemoglobin, oxytocin, insulin - are constructed. The trouble is that the 20 amino acids were known and named years before the genetic code was AThing. The smallest, glycine, is from γλυκός glycos because it tastes sweet. I'm not sure about the connexion with soya Glycine max. Serine was first isolated from silk, for which the Latin is sericum, and so on.
Dayhoff's first qualification was in mathematics, which she subsequently started to apply to physical chemistry including the nature of chemical bonds. From there she moved into the structure of proteins and applied her mathematical and computing toolkit to the storage, retrieval and analysis of protein sequences - of which an increasing number were coming on stream. In 1960, she was appointed associate director of the National Biomedical Research Foundation in Maryland. Back then, protein sequencing was running in parallel with, and quite a way ahead of, DNA/RNA sequencing. The first substantive piece of RNA sequencing saw RW Holley take a whole year 1965 to work out the 80ish bases of Alanine tRNA. That would now be knocked off in a μ-second. aNNyway, Dayhoff saw that the inventory of protein sequences was growing exponentially and, albeit from a small baseline, was going to get massive. Writing down each sequence on paper wasn't going to be the answer. Accordingly, she started to record sequences on punched cards [prev] and quickly grew dissatisfied with the convention that each amino acid was represented by a three-letter abbreviation, usually the first three letters of its English name: Phe, Gly and Ser for the phenylalanine, glycine and serine mentioned above. Dayhoff realised that with only 20 AAs in the inventory, each could be uniquely identified with one of the 26 letters in the Latin alphabet.
But whoops, here are those 20 amino acids: alanine - arginine - asparagine - aspartic acid - cysteine - glutamine - glutamic acid - glycine - histidine - isoleucine - leucine - lysine - methionine - phenylalanine - proline - serine - threonine - tryptophan - tyrosine - valine - and the first thing you note is that 20% of them begin with A! So her first pass was to assign the easy [unique initial] ones:
- C H I M S V
- it was also easy to assign F to phenylalanine at this stage which freed up
- P for proline
- 8/20 done
- A = alanine; [G = glutamine]; L=leucine; T = threonine
- that allowed K for lysine, as the unassigned letter right next to L in the alphabet.
- 13/20 done
- let's reverse a bit to give G = glycine then
- D = aspartate, then E = glutamate to fill in the early hole between C = Cys and F = Phe
- N = asparagiNe and Q = glutamine [G looks a bit like Q] fills a similar later hole.
- note that D precedes E because Aspartate precedes Glutamate
- (18-1)/20 done
- R = aRginine; Y=tYrosine
- and W, the biggest letter, is given to the largest amino acid, tryptophan
- and that's it!
- 20/20 for Margaret Dayhoff
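The bookkeeping in that list is easy to check by machine. Here's a minimal sketch in Python: the table holds the standard three-letter and one-letter codes, while the variable names (`AMINO_ACIDS`, `unique_initials`, `off_initial`) are just my own labels for illustration, not anything Dayhoff used.

```python
from collections import Counter

# The 20 standard amino acids: (name, three-letter code, one-letter code).
AMINO_ACIDS = [
    ("alanine", "Ala", "A"), ("arginine", "Arg", "R"),
    ("asparagine", "Asn", "N"), ("aspartic acid", "Asp", "D"),
    ("cysteine", "Cys", "C"), ("glutamine", "Gln", "Q"),
    ("glutamic acid", "Glu", "E"), ("glycine", "Gly", "G"),
    ("histidine", "His", "H"), ("isoleucine", "Ile", "I"),
    ("leucine", "Leu", "L"), ("lysine", "Lys", "K"),
    ("methionine", "Met", "M"), ("phenylalanine", "Phe", "F"),
    ("proline", "Pro", "P"), ("serine", "Ser", "S"),
    ("threonine", "Thr", "T"), ("tryptophan", "Trp", "W"),
    ("tyrosine", "Tyr", "Y"), ("valine", "Val", "V"),
]

# How many amino acid names share each initial letter?
initial_counts = Counter(name[0].upper() for name, _, _ in AMINO_ACIDS)

# Initials claimed by exactly one amino acid: the "easy" assignments.
unique_initials = sorted(
    letter for letter, count in initial_counts.items() if count == 1
)

# One-letter codes that are NOT the initial of the amino acid's own name.
off_initial = {
    one: name for name, _, one in AMINO_ACIDS if one != name[0].upper()
}

print(unique_initials)       # ['C', 'H', 'I', 'M', 'S', 'V']
print(initial_counts["A"])   # 4, which is 20% of 20
print(sorted(off_initial))   # ['D', 'E', 'F', 'K', 'N', 'Q', 'R', 'W', 'Y']
```

So 11 amino acids keep their own initial and the other 9 had to be shoe-horned in elsewhere.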
Life has gotten more complex since those idyllic simple early days: we've discovered selenocysteine Sec U and pyrrolysine Pyl O. We finally give B to aspar* and Z to glutam* as ambiguity codes because a lot of the chemical protein sequencing protocols render the acids indistinguishable from their amides. Phew! with U and O we have a full set of vowels to play with.
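To see what an ambiguity code means in practice, here's a rough sketch that lists every unambiguous sequence a peptide containing B or Z could stand for. The `AMBIGUITY` table and the `expand` function are hypothetical names of mine, assuming only that B covers D or N and Z covers E or Q.

```python
from itertools import product

# Ambiguity codes and the residues each may stand for.
AMBIGUITY = {
    "B": ("D", "N"),  # Asx: aspartic acid or asparagine
    "Z": ("E", "Q"),  # Glx: glutamic acid or glutamine
}

def expand(peptide):
    """Return every unambiguous sequence that a peptide with B/Z could mean."""
    # Each position offers either its two alternatives or just itself.
    choices = [AMBIGUITY.get(residue, (residue,)) for residue in peptide.upper()]
    return ["".join(combo) for combo in product(*choices)]

print(expand("ABZ"))  # ['ADE', 'ADQ', 'ANE', 'ANQ']
```

Each B or Z doubles the number of candidates, so a sequence with n ambiguous positions expands to 2^n possibilities.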
Now the alphabet is almost full [J and X only unassigned] and we can use protein sequences to write names as a kind of geek-code. If you want to out-geek the geeks you can write your name as a peptide using Peptify, a toy developed by Nuritas to stop their employees playing solitaire on their lunch-breaks. Nuritas is the spin-off of Nora Khaldi [bloboprev], an entrepreneurial woman in science. Here's PeptoBob me:
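I don't know how Peptify works under the bonnet, but the basic trick can be sketched in a few lines of Python. This is my own guess at the idea, not Nuritas's code: `THREE_LETTER` and `peptify_name` are made-up names, and I'm assuming the 24 letters tallied above (the 20 standard codes plus U, O, B and Z) with J and X treated as unspellable.

```python
# One-letter code -> three-letter code, for the 24 assigned letters.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
    "U": "Sec", "O": "Pyl",  # selenocysteine and pyrrolysine
    "B": "Asx", "Z": "Glx",  # ambiguity codes
}

def peptify_name(name):
    """Spell a name as a peptide, one residue per letter (hypothetical helper)."""
    # Ignore spaces, hyphens and other non-letters.
    letters = [ch for ch in name.upper() if ch.isalpha()]
    missing = sorted({ch for ch in letters if ch not in THREE_LETTER})
    if missing:
        raise ValueError("no amino acid for: " + ", ".join(missing))
    return "-".join(THREE_LETTER[ch] for ch in letters)

print(peptify_name("Bob"))       # Asx-Pyl-Asx
print(peptify_name("Margaret"))  # Met-Ala-Arg-Gly-Ala-Arg-Glu-Thr
```

Anyone called Jax is out of luck: the function raises a `ValueError` for the two letters without an assignment in this table.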
more women in science
I am Margaret Dayhoff's son-in-law. Perhaps you would like to know that her daughter (my wife), Ruth Dayhoff M.D., is a widely recognized pioneer in medical computing, and her granddaughter, Margaret Dayhoff Brannigan PhD, is a molecular biologist with the FDA.
We can be reached at Firelaw@firelaw.us. Vincent Brannigan