Crop Genome Annotation: A Case Study for the Brassica rapa Genome
Genome annotation is crucial for bridging the gap between sequence and biology. It is also a dynamic process of continuous improvement that deepens our understanding of the molecular biology of the genome. With deep RNA sequencing of eight Brassica rapa tissues now available, it should be possible to predict protein-coding genes more accurately by incorporating this RNA evidence into the analysis. To this end, we used our in-house annotation pipeline to re-annotate the B. rapa genome at the levels of repetitive elements, protein-coding genes and non-coding RNA genes. In total, we identified 139.9 MB of repetitive elements, 6,088 non-coding RNA genes and 45,149 protein-coding genes. These results, together with those published previously, provide a valuable resource for further understanding of B. rapa.
Keywords: Gene Ontology, Long Terminal Repeat, Genome Annotation, Repetitive Element, Gene Predictor
This work was supported by the National Natural Science Foundation of China (Grant: 31171235). We thank the people who have contributed to building and maintaining the genome annotation pipeline in the Laboratory of Computational Molecular Biology of the Beijing Normal University.