Evolutionary Intelligence, Volume 4, Issue 2, pp 69–80

Collective classification of textual documents by guided self-organization in T-Cell cross-regulation dynamics

  • Alaa Abi-Haidar
  • Luis M. Rocha
Special Issue


We present and study an agent-based model of T-cell cross-regulation in the adaptive immune system, which we apply to binary classification. Our method expands an existing analytical model of T-cell cross-regulation (Carneiro et al. in Immunol Rev 216(1):48–68, 2007) that was used to study the self-organizing dynamics of a single population of T-cells in interaction with an idealized antigen-presenting cell capable of presenting a single antigen. With agent-based modeling we are able to study the self-organizing dynamics of multiple populations of distinct T-cells, which interact via antigen-presenting cells that present hundreds of distinct antigens. Moreover, we show that such self-organizing dynamics can be guided to produce an effective binary classification of antigens, which is competitive with existing machine learning methods when applied to biomedical text classification. Specifically, we test our model on a dataset of publicly available full-text biomedical articles provided by the BioCreative challenge (Krallinger in The BioCreative II.5 challenge overview, p 19, 2009). We study the robustness of our model’s parameter configurations and show that it leads to encouraging results comparable to state-of-the-art classifiers. Our results help us understand both T-cell cross-regulation as a general principle of guided self-organization and its applicability to document classification. We therefore conclude that our bio-inspired algorithm is a promising novel method for biomedical article classification and for binary document classification in general.


Artificial immune systems; Biomedical document classification; Data mining; Machine learning; Bio-inspired computing; Complex adaptive systems; Guided self-organization



This work was partially supported by a grant from the FLAD Computational Biology Collaboratorium at the Instituto Gulbenkian de Ciência in Portugal. We also thank the ICARIS 2010 committee for encouraging this work. We acknowledge the computational resources provided by Indiana University, which we used to conduct the simulations reported here.


  1. Carneiro J, Leon K, Caramalho I, van den Dool C, Gardner R, Oliveira V, Bergman ML, Sepúlveda N, Paixão T, Faro J, Demengeot J (2007) When three is not a crowd: a crossregulation model of the dynamics and repertoire selection of regulatory CD4 T cells. Immunol Rev 216(1):48–68
  2. Krallinger M (2009) The BioCreative II.5 challenge overview, p 19
  3. Hunter L, Cohen KB (2006) Biomedical language processing: what’s beyond PubMed? Mol Cell 21(5):589–594
  4. Jensen L, Saric J, Bork P (2006) Literature mining for the biologist: from information retrieval to biological discovery. Nat Rev Genet 7(2):119–129. doi: 10.1038/nrg1768
  5. Shatkay H, Feldman R (2003) Mining the biomedical literature in the genomic era: an overview. J Comput Biol 10(6):821–856
  6. Hersh W, Bhupatiraju RT, Corley S (2004) Enhancing access to the bibliome: the TREC genomics track. Medinfo 11(Pt 2):773–777
  7. Hirschman L, Yeh A, Blaschke C, Valencia A (2005) Overview of BioCreative: critical assessment of information extraction for biology. BMC Bioinform 6(Suppl 1):S1
  8. Krallinger M, Valencia A (2007) Evaluating the detection and ranking of protein interaction relevant articles: the BioCreative challenge interaction article sub-task (IAS). In: Proceedings of the 2nd BioCreative challenge evaluation workshop
  9. Feldman R, Sanger J (2006) The text mining handbook: advanced approaches in analyzing unstructured data. Cambridge University Press, Cambridge
  10. Abi-Haidar A, Kaur J, Maguitman A, Radivojac P, Retchsteiner A, Verspoor K, Wang Z, Rocha LM (2008) Uncovering protein interaction in abstracts and text using a novel linear model and word proximity networks. p 9(Suppl 2):S11
  11. Kolchinsky A, Abi-Haidar A, Kaur J, Hamed AA, Rocha LM (2010) Classification of protein-protein interaction full-text documents using text and citation network features. IEEE/ACM Trans Comput Biol Bioinform 7(3):400–411. doi: 10.1109/TCBB.2010.55
  12. Hofmeyr SA (2001) An interpretative introduction to the immune system. In: Design principles for the immune system and other distributed autonomous systems
  13. Segel LA, Cohen I (2001) Design principles for the immune system and other distributed autonomous systems. Oxford University Press, Oxford
  14. Mitchell M (2006) Complex systems: network thinking. Artif Intell 170(18):1194–1212
  15. Peak D, West JD, Messinger SM, Mott KA (2004) Evidence for complex, collective dynamics and distributed emergent computation in plants. PNAS 101(4):918–922
  16. Helikar T, Konvalina J, Heidel J, Rogers JA (2008) Emergent decision-making in biological signal transduction networks. Proc Natl Acad Sci USA 105(6):1913–1918. doi: 10.1073/pnas.0705088105
  17. Walters M, Sperandio V (2006) Quorum sensing in Escherichia coli and Salmonella. Int J Med Microbiol 296(2–3):125–131. doi: 10.1016/j.ijmm.2006.01.041
  18. Pratt SC (2005) Quorum sensing by encounter rates in the ant Temnothorax albipennis. Behav Ecol 16(2):488–496. doi: 10.1093/beheco/ari020
  19. Crutchfield J, Mitchell M (1995) The evolution of emergent computation. PNAS 92(23)
  20. Rocha LM, Hordijk W (2005) Material representations: from the genetic code to the evolution of cellular automata. Artif Life 11(1–2):189–214
  21. Shalizi C, Haslinger R, Rouquier J-B, Klinkner K, Moore C (2006) Automatic filters for the detection of coherent structure in spatiotemporal systems. Phys Rev E 73
  22. Timmis J (2007) Artificial immune systems today and tomorrow. Nat Comput 6(1):1–18
  23. Twycross J, Cayzer S (2002) An immune system approach to document classification. Master’s thesis, COGS, University of Sussex, UK
  24. Dasgupta D, Nino F (2008) Immunological computation: theory and applications. Auerbach
  25. Garrett SM (2003) A paratope is not an epitope: implications for immune networks and clonal selection. pp 217–228
  26. Abi-Haidar A, Rocha LM (2008) Artificial immune systems (Proc. ICARIS), pp 36–47
  27. Abi-Haidar A, Rocha LM (2008) Artificial life XI: 11th international conference on the simulation and synthesis of living systems. MIT Press, Cambridge, pp 1–9
  28. Tsymbal A (2004) The problem of concept drift: definitions and related work. Comput Sci Dep Trinity Coll Dublin 4(C):200415
  29. Paul WE, Technologies IO (1993) Fundamental immunology. Raven Press, New York
  30. Burnet SFM (1959) The clonal selection theory of acquired immunity. Vanderbilt University Press, Nashville
  31. De Castro LN, Timmis J (2002) Artificial immune systems: a new computational intelligence approach. Springer, Berlin
  32. Sepulveda NH (2009) How is the T-cell repertoire shaped. Ph.D. thesis, Instituto Gulbenkian de Ciência
  33. Abi-Haidar A, Rocha LM (2010) ICARIS 2010: Proceedings of the 9th international conference on artificial immune systems, pp 237–249
  34. Abi-Haidar A, Rocha LM (2010) Artificial life XII: twelfth international conference on the simulation and synthesis of living systems, pp 706–713
  35. Metsis V, Androutsopoulos I, Paliouras G (2006) Spam filtering with Naive Bayes–Which Naive Bayes? In: Third Conference on Email and Anti-Spam (CEAS)
  36. Joachims T (2002) Learning to classify text using support vector machines: methods, theory, and algorithms. Kluwer, Dordrecht
  37. Porter MF (1980) An algorithm for suffix stripping. Program 13(3):130–137
  38. Sokolova M, Japkowicz N, Szpakowicz S (2006) Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation, pp 1015–1021

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. School of Informatics and Computing, Indiana University, Bloomington, USA
  2. FLAD Computational Biology Collaboratorium, Instituto Gulbenkian de Ciência, Oeiras, Portugal
