MathWebSearch 0.5: Scaling an Open Formula Search Engine

  • Michael Kohlhase
  • Bogdan A. Matican
  • Corneliu-Claudiu Prodescu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7362)

Abstract

MathWebSearch is an open-source, open-format, content-oriented search engine for mathematical formulae. It is a complete system capable of crawling, indexing, and querying expressions based on their functional structure (operator tree) rather than their presentation.

In version 0.5, we concentrate on scalability issues in MathWebSearch to take advantage of corpora in the giga-formula range. We re-implemented the index to make it distributable and made all APIs conform to web standards. Our experiments show that this architecture results in a scalable application.
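To illustrate what querying by functional structure means, here is a minimal Python sketch of matching a query against a formula's operator tree. It is not the MathWebSearch implementation and does not reflect how its index works; the names `App`, `QVar`, and `match` are hypothetical, and the sketch assumes that indexed formulae contain no query variables.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class App:
    """An operator applied to arguments; a symbol is an App with no args."""
    op: str
    args: Tuple["Term", ...] = ()


@dataclass(frozen=True)
class QVar:
    """A query variable that may stand for any subterm (hypothetical name)."""
    name: str


Term = Union[App, QVar]


def match(query: Term, formula: App,
          binding: Optional[Dict[str, App]] = None) -> Optional[Dict[str, App]]:
    """Return a binding of query variables that makes `query` equal to
    `formula`, or None if the two operator trees do not match."""
    binding = {} if binding is None else binding
    if isinstance(query, QVar):
        bound = binding.get(query.name)
        if bound is None:
            # First occurrence: bind the variable to this subterm.
            return {**binding, query.name: formula}
        # Repeated occurrence: must be the same subterm as before.
        return binding if bound == formula else None
    if query.op != formula.op or len(query.args) != len(formula.args):
        return None
    for q_arg, f_arg in zip(query.args, formula.args):
        binding = match(q_arg, f_arg, binding)
        if binding is None:
            return None
    return binding


# The query plus(?a, ?a) asks for a sum of two identical terms.
query = App("plus", (QVar("a"), QVar("a")))
x_plus_x = App("plus", (App("x"), App("x")))
x_plus_y = App("plus", (App("x"), App("y")))

hit = match(query, x_plus_x)   # a binding of ?a to the symbol x
miss = match(query, x_plus_y)  # None, because ?a cannot be both x and y
```

Because the comparison runs on operators and arguments, two formulae that are rendered differently but share the same operator tree are treated as the same search target, which is the point of a content-oriented search.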

Keywords

Search Engine; Query Term; Slave Node; Index Node; Query Response Time
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Michael Kohlhase (1)
  • Bogdan A. Matican (1)
  • Corneliu-Claudiu Prodescu (1)
  1. Computer Science, Jacobs University Bremen, Germany
