Minimum Factorization Agreement of Spliced ESTs

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5724)


Producing spliced EST sequences is a fundamental task in the computational reconstruction of splice and transcript variants, a crucial step in the investigation of alternative splicing. A single EST sequence can have several spliced EST sequences associated with it, since the original EST may align in different ways against wide genomic regions.

In this paper we address a crucial issue arising from the above step: given a collection C of different spliced EST sequences that are associated with an initial set S of EST sequences, how can we extract a subset C′ of C such that each EST sequence in S has a putative spliced EST in C′ and C′ agrees on a common alignment region to the genome or gene structure?

We introduce a new computational problem, called Minimum Factorization Agreement (MFA), that models the above issue and is also relevant in some more general settings. We investigate some algorithmic solutions of the MFA problem and their applicability to real data sets. We show that algorithms solving the MFA problem can efficiently find the correct spliced EST associated with an EST even when the splicing of sequences is obtained by a rough alignment process. We then show that the MFA method could be used in producing or analyzing spliced EST libraries under various biological criteria.
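The selection step described above can be illustrated with a small brute-force sketch. It rests on an assumed reading of the problem, not on the paper's formal definition: each EST comes with a list of candidate spliced alignments, each candidate is a tuple of genomic intervals (its "factors"), and the goal is to pick one candidate per EST so that the total number of distinct factors is as small as possible. The function name `min_factorization_agreement`, the interval encoding, and the toy data are all hypothetical, and the exhaustive search is exponential in the number of ESTs, so it only serves to make the objective concrete.

```python
from itertools import product


def min_factorization_agreement(candidates):
    """Exhaustive search under the assumed reading of MFA.

    candidates: dict mapping an EST id to a list of candidate
    factorizations; each factorization is a tuple of (start, end)
    genomic intervals (hypothetical encoding).

    Returns (choice, factors): the chosen factorization per EST and
    the set of distinct factors used, or (None, None) if some EST
    has no candidate at all.
    """
    est_ids = sorted(candidates)
    best_choice, best_factors = None, None
    # Try every way of picking one candidate per EST.
    for choice in product(*(candidates[e] for e in est_ids)):
        factors = set().union(*map(set, choice))
        if best_factors is None or len(factors) < len(best_factors):
            best_choice = dict(zip(est_ids, choice))
            best_factors = factors
    return best_choice, best_factors


# Toy input (made up for illustration): est1 and est2 each have two
# slightly different alignments, est3 has a single one.
toy = {
    "est1": [((100, 200), (300, 400)), ((100, 200), (310, 400))],
    "est2": [((300, 400), (500, 600)), ((305, 400), (500, 600))],
    "est3": [((100, 200), (300, 400), (500, 600))],
}
choice, factors = min_factorization_agreement(toy)
print(choice)
print(sorted(factors))
```

In this toy input the alignments that share the interval (300, 400) are the ones selected, since any other pick introduces a fourth distinct factor. A practical method would need something better than enumeration; the sketch does not reflect the algorithms investigated in the paper.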


Alternative Splice; Intron Retention; UNIGENE Cluster; Transcript Isoforms; Factorization Agreement
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  1. DISCo, Univ. Milano-Bicocca, Italy
  2. Dip. Statistica, Univ. Milano-Bicocca, Italy
  3. Dip. Scienze dei Linguaggi, della Comunicazione e degli Studi Culturali, Univ. Bergamo, Italy
