Volume 66, Issue 6, pp 539-554
Open Access: This content is freely available online to anyone, anywhere at any time.
Date: 03 Jun 2008

Collagen’s Triglycine Repeat Number and Phylogeny Suggest an Interdomain Transfer Event from a Devonian or Silurian Organism into Trichodesmium erythraeum


Two competing effects at two vastly different scales may explain collagen’s current translation length. The necessity to have long molecules for maintaining mechanical integrity at the organism and supraorganism scales may be limited by the need to have small molecules capable of robust self-assembly at the nanoscale. The triglycine repeat regions of all 556 currently cataloged organisms with collagen-like genes were ranked by length. This revealed a sharp boundary in the GXY transcript number at 1032 amino acids (344 GXY repeats). An anomalous exception, however, is the intron-free Trichodesmium erythraeum collagen gene. Immunogold atomic force microscopy reveals, for the first time, the presence of a collagen-like protein in T. erythraeum. A phylogenetic protein sequence analysis that includes vertebrates, nonvertebrates, shrimp white spot syndrome virus, Streptococcus equi, and Bacillus cereus predicts that the collagen-like sequence may have emerged shortly after the divergence of fibrillar and nonfibrillar collagens. The presence of this anomalously long collagen gene within a prokaryote may represent an interdomain transfer from eukaryotes into prokaryotes that gives T. erythraeum the ability to form blooms that cover hundreds of square kilometers of ocean. We propose that the collagen gene entered the prokaryote intron-free only after it had been molded by years of mechanical selective pressure in larger organisms and only after large, dense food sources such as marine vertebrates became available. This anomalously long collagen-like sequence may explain T. erythraeum’s ability to aggregate and thus concentrate its toxin for food-source procurement.
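The ranking of triglycine repeat regions by length described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not describe. It assumes an uninterrupted run of Gly-X-Y triplets in which X and Y may be any residue, and the function names `longest_gxy_run` and `rank_by_repeat_length` and the toy sequences are hypothetical. A real analysis would likely need to tolerate interruptions in the repeat region.

```python
from typing import Dict, List, Tuple


def longest_gxy_run(seq: str) -> int:
    """Return the largest number of consecutive G-X-Y triplets in seq.

    Assumption (not from the source): a triplet is any three residues
    whose first residue is glycine (G); X and Y are unconstrained.
    """
    seq = seq.upper()
    n = len(seq)
    # run[i] = number of back-to-back triplets starting at position i
    run = [0] * (n + 3)
    best = 0
    # Walk backwards so run[i + 3] is already known when we reach i.
    for i in range(n - 3, -1, -1):
        if seq[i] == "G":
            run[i] = 1 + run[i + 3]
            best = max(best, run[i])
    return best


def rank_by_repeat_length(seqs: Dict[str, str]) -> List[Tuple[str, int, int]]:
    """Rank named sequences by longest GXY run, longest first.

    Each entry is (name, repeat_count, residues), residues = 3 * repeats.
    """
    rows = [(name, longest_gxy_run(s), 3 * longest_gxy_run(s))
            for name, s in seqs.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Toy, made-up sequences for illustration only.
toy = {
    "short": "MKTGPPGAPAAA",
    "long": "MGPPGAPGEPGKPGPPQ",
}
ranking = rank_by_repeat_length(toy)
```

With this convention, the boundary quoted in the abstract corresponds to 344 repeats × 3 residues per repeat = 1032 amino acids.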