It is the Waterloo of biochemistry and premedical students everywhere: memorizing the 20 standard amino acids along with their respective one- and three-letter codes. Unlike most classes of biomolecules, this family does not have a systematic naming system, forcing students to come up with their own creative ways of matching structure to name. So how did the amino acids get their names in the first place? Was there any rhyme or reason behind these choices? And why do some of their codes seem completely arbitrary? In his paper “Amino acid names and parlor games,” Murray Saffran, a biochemist at the Medical College of Ohio, discusses the history of these biomolecules.

A few of the amino acids were named with a rational system in mind, based on their chemical groups and resemblance to existing structures. For example, methionine contains a methyl group (meth-) and a sulfur atom (thio-) [1]. Proline’s cyclic structure resembles the chemical pyrrole, while phenylalanine contains a benzene ring (also called a phenyl group) [2]. Some amino acids were named for structures they turned out not to have, but the names stuck: believing that it contained an aldehyde group, scientists named a newly discovered amino acid alanine, only to find out later that its side chain consists of a single methyl group [2].


Figure A: the structures of pyrrole (left) and the amino acid proline (right) are very similar [3,4].

Figure B: the structure of amino acid phenylalanine with its phenyl group highlighted [5].

Many of the other amino acid names originate from Greek or Latin words that describe either their physical properties or source of isolation. “Arginine” comes from the Greek word for “silver,” alluding to the shiny, metallic appearance of its crystals [2]. Glycine was named for its sweet taste; the prefix “Gly” comes from the Greek word for “sweet” [1,2]. “Serine” is derived from the Latin word for silk (“sericus”), as it was isolated from a protein involved in silk production [1]. Some amino acid names echo familiar English words, making them even more memorable: asparagine and aspartic acid were isolated from asparagus, while glutamine and glutamic acid come from gluten [2]. Histidine was found in tissues (the study of which is called histology), while cysteine takes its name from the Greek word for bladder (“kystis,” the root of “cyst”), because the related compound cystine was first found in urinary stones [1,2].

Arginine (structure on right) was named for its metallic, silvery appearance [6,7].


Serine (structure on left) was first isolated from proteins involved in the production of silk [8,9].


Barring a few exceptions to eliminate redundancy, most of the three-letter codes for the amino acids are simply the first three letters of the amino acid’s name. However, the one-letter codes posed more problems. Only amino acids with unique first letters (such as cysteine, histidine and valine) could be represented without ambiguity; alanine, arginine, aspartic acid and asparagine cannot all be “A” in shorthand [10]. To solve this problem, scientists assigned some of the new codes phonetically. Alanine remains A; arginine sounds like “R”-ginine, so its one-letter code is R [10]. Aspar-“D”-ic acid is represented with a D [2]. Glutamine becomes “Q”-lutamine (Q), and phenylalanine is “F”-enylalanine (F) [2]. Some amino acids, while not abbreviated phonetically themselves, are coded by their “proximity” to other amino acids. For example, because glutamic acid’s side chain has one more carbon than that of aspartic acid (D), it is abbreviated as E, the next letter in the alphabet [2]. Because leucine is abbreviated as L, lysine is assigned K, the letter just before it [2]. Still other choices are even more arbitrary: tryptophan is abbreviated as W simply because its bulky double-ring structure is reminiscent of the letter itself [10].
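For readers who like to see the code laid out as data, the mapping described above can be sketched as a small lookup table. This is only an illustrative sketch: the names `THREE_TO_ONE`, `to_one_letter` and `first_letter_mismatches` are made up for this example, and the hyphen-separated input format is an assumption, not part of any standard.

```python
# Three-letter -> one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert a hyphen-separated three-letter sequence (assumed format)
    such as "Met-Lys-Trp" into its one-letter form."""
    return "".join(THREE_TO_ONE[res.capitalize()] for res in peptide.split("-"))


def first_letter_mismatches() -> dict:
    """Return the amino acids whose one-letter code is NOT the first
    letter of the name, i.e. the ones that need a mnemonic."""
    return {three: one for three, one in THREE_TO_ONE.items() if three[0] != one}


if __name__ == "__main__":
    print(to_one_letter("Met-Lys-Trp-Asp"))
    print(first_letter_mismatches())
```

Listing the mismatches this way isolates the nine codes (R, N, D, Q, E, K, F, W, Y) that cannot be guessed from the first letter of the name, which are exactly the ones the stories above are meant to help with.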

At first, it may seem counterintuitive to use such a complex and unstructured naming system, yet knowing these anecdotes may aid students in memorization. As Saffran points out in his paper, it is much easier to remember “the names and characteristics of 20 relatives and friends” than a list of 20 random names [2]. By learning the history behind the amino acid names, you may remember a unique detail that can help you recall the one-letter code or structure when test time comes.

References

  1. Vickery, H. B. and Schmidt, C. L. A. The History of the Discovery of the Amino Acids. Chemical Reviews 1931, 9 (2), 169–318. https://pubs-acs-org.ezproxy.rice.edu/doi/pdf/10.1021/cr60033a001.

  2. Saffran, M. Amino Acid Names and Parlor Games: From Trivial Names to a One-Letter Code, Amino Acid Names Have Strained Students’ Memories. Is a More Rational Nomenclature Possible? Biochemical Education 1998, 26 (2), 116–118. https://doi.org/10.1016/S0307-4412(97)00167-2.

  3. luketoboot. (2012, October 8). Pyrrole [Image]. Retrieved from https://www.flickr.com/photos.

  4. Stevens, C. (2013, November 7). Proline [Image]. Retrieved from https://www.flickr.com/photos.

  5. Keith&KatieBond. (2011, September 21). Phenylalanine [Image]. Retrieved from https://www.flickr.com/photos/keithkatiebond.

  6. Stewart, P. (2007, July 9). Silver coin of Otacilia [Photograph]. Retrieved from https://www.flickr.com/photos/peterstewart.

  7. Kirchoff, B. (2015, October 6). 118 Arginine 9 STR [Image]. Retrieved from https://www.flickr.com/photos/brucekirchoff.

  8. Stevens, C. (2013, November 7). Serine [Image]. Retrieved from https://www.flickr.com/photos.

  9. 1sock. (2013, November 10). Silk [Photograph]. Retrieved from https://www.flickr.com/photos/1sock.

  10. IUPAC-IUB Commission on Biochemical Nomenclature. A One-Letter Notation for Amino Acid Sequences. Tentative Rules. Biochemistry 1968, 7 (8), 2703–2705. https://pubs-acs-org.ezproxy.rice.edu/doi/pdf/10.1021/bi00848a001.
