Have you ever seen the word “peptidein”?
Chances are, the answer is no, because it was just coined by a large international team of genomics biologists. “Peptidein” is a neologism combining the front half of “peptide” (a short chain of amino acids, typically 2 to 50 of them) and the back half of “protein” (normally a much longer chain, from 100 to a few thousand amino acids in length).
The DNA sequences that code for peptideins are puzzling. Because of their short length, these sequences have long been overlooked by automated sequence annotation programs, as they fall outside what was customarily considered to be a bona fide, attention-worthy open reading frame (ORF), coding for a protein.
But it wasn’t just their diminutive length that left peptideins as short, lonely bachelors in rented formal wear, leaning on the wall, nervously checking their watches at the annotation dance. These sequences are frequently not conserved across species, or, as the open access article at Nature (linked below) puts it, they exhibit “low evolutionary constraint.” If you don’t look like the other eligibles in the darkened gymnasium, no one is going to give you a second glance.
Someday, and may I live to see it, this use of evolutionary conservation as a necessary criterion for functional status will be seen as another pernicious, imagination-killing consequence of the theory of common descent.
The Australian geneticist and functional RNA expert John Mattick, bless him, nailed this point a few years ago with an unforgettable quip. Conservation (i.e., shared inter-taxon similarity), he said, is not the be-all and end-all of function. Think about your phone number.
Open access at Nature: “Expanding the human proteome with microproteins and peptideins.”