Exploring the Power of Outliers for Cross-Domain Literature Mining

  • Borut Sluban
  • Matjaž Juršič
  • Bojan Cestnik
  • Nada Lavrač
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7250)


In bisociative cross-domain literature mining, the goal is to identify interesting terms or concepts that relate different domains. This chapter shows that a majority of these domain-bridging concepts can be found in outlier documents, that is, documents that lie outside the mainstream literature of their domain. We detected outlier documents by combining three classification-based outlier detection methods and explored the potential of these documents for supporting the bridging concept discovery process. The experimental evaluation was performed on the classical migraine-magnesium domain pair and the more recently explored autism-calcineurin domain pair.
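
The idea of classification-based outlier detection can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the abstract does not name the three methods, so the three classifiers below (naive Bayes, nearest centroid, nearest neighbour), the leave-one-out protocol, the majority threshold `min_votes`, and the toy corpus are all assumptions made for illustration. A document counts as an outlier when most classifiers, trained on the remaining documents, assign it to the other domain.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bags of words given as token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def naive_bayes(train, doc):
    """Multinomial naive Bayes with add-one smoothing (stand-in classifier)."""
    vocab = {w for words, _ in train for w in words}
    best_label, best_score = None, -math.inf
    for label in sorted({lab for _, lab in train}):
        docs = [words for words, lab in train if lab == label]
        counts = Counter(w for words in docs for w in words)
        total = sum(counts.values())
        score = math.log(len(docs) / len(train))
        for w in doc:
            score += math.log((counts[w] + 1) / (total + len(vocab) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def nearest_centroid(train, doc):
    """Assign the domain whose pooled word counts are most similar to doc."""
    labels = sorted({lab for _, lab in train})
    return max(labels, key=lambda lab: cosine(
        doc, [w for words, l in train if l == lab for w in words]))


def nearest_neighbour(train, doc):
    """Assign the domain of the single most similar training document."""
    return max(train, key=lambda item: cosine(doc, item[0]))[1]


def detect_outliers(corpus, classifiers, min_votes=2):
    """Return indices of documents misclassified by at least min_votes
    classifiers under leave-one-out evaluation (assumed protocol)."""
    docs = [(text.lower().split(), label) for text, label in corpus]
    outliers = []
    for i, (words, label) in enumerate(docs):
        train = docs[:i] + docs[i + 1:]
        wrong = sum(clf(train, words) != label for clf in classifiers)
        if wrong >= min_votes:
            outliers.append(i)
    return outliers


# Hypothetical toy corpus: (text, domain label). Document 3 is labelled
# "migraine" but uses the vocabulary of the "magnesium" literature.
CORPUS = [
    ("migraine headache pain aura attack", "migraine"),
    ("migraine headache aura nausea pain", "migraine"),
    ("migraine attack pain headache nausea", "migraine"),
    ("migraine calcium channel blocker magnesium deficiency", "migraine"),
    ("magnesium deficiency calcium channel blocker", "magnesium"),
    ("magnesium calcium channel serum deficiency", "magnesium"),
    ("magnesium serum deficiency blocker calcium", "magnesium"),
]
CLASSIFIERS = [naive_bayes, nearest_centroid, nearest_neighbour]

outliers = detect_outliers(CORPUS, CLASSIFIERS)
print(outliers)
```

On this toy corpus only document 3 is flagged: all three classifiers place it in the magnesium domain. Document 4 is misclassified by the nearest-neighbour classifier alone, so the majority vote keeps it, which is one reason to combine several detectors rather than rely on a single one. Terms of a flagged document that also occur in the other domain (here, for example, "calcium" and "channel") would then be natural candidates for bridging concepts.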


Outlier Detection, Query Term, Domain Pair, Literature Mining, Domain Outlier
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© The Author(s) 2012

Authors and Affiliations

  • Borut Sluban (1)
  • Matjaž Juršič (1)
  • Bojan Cestnik (1, 2)
  • Nada Lavrač (1, 3)

  1. Jožef Stefan Institute, Ljubljana, Slovenia
  2. Temida d.o.o., Ljubljana, Slovenia
  3. University of Nova Gorica, Nova Gorica, Slovenia
