De Novo Gene Origination: Process and Evolutionary Force
Manyuan Long 1
Department of Ecology & Evolution and the College, The University of Chicago, Chicago, IL, United States
Do new proteins originate from scratch? Until recently this was conventionally believed to be impossible, and identifying de novo genes remains a challenge because they are conflated with orphan genes, which can arise through multiple non-de novo mechanisms. We found that plant genomes, particularly those of grasses such as Oryza and bamboos, harbor a large number of young de novo genes that were generated from intergenic noncoding sequences one to a few million years ago, revealing an unexpectedly high origination rate of de novo proteins. Taking advantage of the young ages (0.5-3 million years) of the de novo genes in Oryza, we investigated their origination processes from the protogene stage onward, the underlying evolutionary forces, and the evolution of their protein structures. We identified strong positive selection on the protogenes at the time the genes were initially created by acquiring an open reading frame and an expression pattern. While few lines of evidence suggest that the detected selection leads to adaptation to ecological conditions, more evidence indicates that sexual selection and sexual conflict, as sources of the detected positive selection, may more frequently drive the creation of de novo genes. We also detected rapid structural evolution of de novo proteins over a short timescale, in terms of degree of disorder, secondary structure enrichment, and tertiary interactions in protein complexes.