MathSeer: A Math-Aware Search Interface with Intuitive Formula Editing, Reuse, and Lookup
There has been growing interest in math-aware search engines that support retrieval using both formulas and keywords. An important unresolved issue is the design of search interfaces: for wide adoption, they must be engaging and easy to use, particularly for non-experts. The MathSeer interface addresses this with straightforward formula creation, editing, and lookup. Formulas are stored in ‘chips’ created using handwriting, LaTeX, and images. MathSeer sessions are also stored at automatically generated URLs that save all chips and their editing history. To avoid re-entering formulas, chips can be reused, edited, or used in creating other formulas. As users enter formulas, our novel autocompletion facility returns entity cards searchable by formula or entity name, making formulas easy to (re)locate, and making descriptions of symbols and notation available before queries are issued.
Keywords: Mathematical information retrieval, User interface design, Multimodal input
Math-aware search engines supporting keyword and formula search have been around since at least 2003, when the Digital Library of Mathematical Functions supported LaTeX in queries. The new information that sophisticated math retrieval would provide, such as more easily locating definitions of symbols and other notations, finding usage, proofs, and mathematical properties across disciplines, and compiling information on applications (e.g., variations of the log loss for machine learning), has stimulated work in math-aware search, alongside parallel developments in math question answering within the Natural Language Processing community. To realize their full potential, math-aware search interfaces must be engaging and easy to use for different levels of expertise, and particularly for non-experts (e.g., students in middle school).
2 Interface Design Elements
Two challenges for text-based input are (1) most users are unfamiliar with LaTeX (even fewer know MathML), and (2) rendered formulas are shown separately from input, making it difficult for users to locate entry errors. Appealing solutions to these issues are handwritten formula input, formula image upload, and supporting the analogy of physically moving symbols around on a page. These are key design elements in the MathSeer interface. In one study, a majority of the undergraduate participants reported preferring drawing over typing formulas given a choice between the two. They also used handwriting to express formulas that they could not enter using a keyboard (e.g., \(\binom{4}{2}\)).
To address these issues, our MathSeer search interface (see Fig. 1) allows formula input using a combination of typing LaTeX, uploading formula images, and drawing formulas by hand. In MathSeer, handwritten symbols are recognized each time a user stops drawing for a short time. Pressing a button recognizes formula structure, and copies the LaTeX result into the panel at the bottom-left of the interface. The LaTeX can then be edited, with a rendering of the formula updated in real time (e.g., to quickly change ‘p’ to ‘P’). At bottom-right, palettes containing symbols and structures may be used to insert corresponding LaTeX at the cursor position in the LaTeX panel.
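The pause-triggered recognition described above can be illustrated with a minimal sketch. It is not MathSeer's implementation: the `Stroke` record, the `batches_at_pauses` function, and the 0.8 second threshold are all assumptions chosen for illustration. The sketch only shows how pen strokes might be grouped so that each batch ends where the user stopped drawing for longer than a threshold.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stroke:
    """Hypothetical pen stroke record (not MathSeer's data model)."""
    stroke_id: int
    start: float  # time in seconds when the pen touched the canvas
    end: float    # time in seconds when the pen lifted


def batches_at_pauses(strokes: List[Stroke], pause: float = 0.8) -> List[List[int]]:
    """Group stroke ids into batches separated by pauses longer than `pause`.

    Each batch stands for the strokes that would be sent to a symbol
    recognizer once the user stops drawing. The 0.8 s default is an
    assumed value, not one reported for MathSeer.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    last_end: Optional[float] = None
    for stroke in sorted(strokes, key=lambda s: s.start):
        # A long enough gap since the previous pen lift closes the batch.
        if last_end is not None and stroke.start - last_end > pause:
            batches.append(current)
            current = []
        current.append(stroke.stroke_id)
        last_end = stroke.end
    if current:
        batches.append(current)
    return batches


strokes = [Stroke(1, 0.0, 0.3), Stroke(2, 0.5, 0.9), Stroke(3, 2.5, 2.8)]
batches = batches_at_pauses(strokes)
```

In an interactive interface the same idea would more likely be implemented with a timer that is reset on every pen-down event; the offline grouping here is only meant to make the triggering rule concrete.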
Images may be dragged and dropped on the canvas or uploaded using a button that presents a file navigation pop-up window. This produces a formula ‘chip’ on the canvas, which can be used directly in a query, edited, or used in constructing other formulas. A line-of-sight graph-based parsing technique is used to recognize formula images and handwritten formulas.
Formula Containers and Reuse (‘Chips’). Handwritten formula entry is convenient for small expressions, but for large expressions such as Eq. 1 handwriting is slow, and accurate recognition is challenging. Users may also want to avoid re-entering formulas, and to share formulas with others.
MathSeer introduces a new model for formula reuse: flexible containers that we call formula ‘chips’. Figure 1 shows a chip in the query bar, and two ‘favorite’ chips in a list at bottom-right. Chips can be created and used in a number of ways. In addition to the formula creation operations described above, chips can be created by selecting symbols on the canvas and ‘popping’ them up into a chip. All formula chips have their creation history automatically recorded, and are stored in a ‘history’ menu in the symbol palette panel. On the canvas, chips may be easily moved, resized, and ‘pushed’ onto the canvas (i.e., the symbols on the chip are added to the canvas, and the chip disappears).
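The ‘pop’ and ‘push’ operations can be sketched as follows. This is an illustrative model only: the `Canvas` and `Chip` classes and their method names are hypothetical, and the sketch assumes that popped symbols leave the canvas, which the description above does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Chip:
    """Hypothetical chip holding the symbols it was created from."""
    symbols: List[str]


@dataclass
class Canvas:
    """Hypothetical canvas with loose symbols, chips, and a history list."""
    symbols: List[str] = field(default_factory=list)
    chips: List[Chip] = field(default_factory=list)
    history: List[Chip] = field(default_factory=list)

    def pop_up(self, selected: Set[int]) -> Chip:
        """Move the selected symbols (by index) off the canvas into a new chip."""
        chip = Chip([s for i, s in enumerate(self.symbols) if i in selected])
        self.symbols = [s for i, s in enumerate(self.symbols) if i not in selected]
        self.chips.append(chip)
        self.history.append(chip)  # every created chip is recorded in the history
        return chip

    def push(self, chip: Chip) -> None:
        """Add the chip's symbols back to the canvas; the chip then disappears."""
        self.symbols.extend(chip.symbols)
        self.chips.remove(chip)


canvas = Canvas(symbols=["x", "^", "2", "+", "1"])
chip = canvas.pop_up({0, 1, 2})
symbols_after_pop = list(canvas.symbols)
canvas.push(chip)
```

Note that the pushed chip remains in `history` even after it disappears from the canvas, which mirrors the idea that the history menu keeps chips available for later reuse.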
Chips have two possible states: ‘recognized’ chips containing a LaTeX string, and ‘template’ chips representing only symbols on a canvas. Chips that are ‘favorites’ are shown using an orange border, and are either a recognized or template chip. As an example use for template (grey) chips, ‘\(\int dx\)’ with a large space in the middle can be used as a template to quickly create other formulas with an integral, by dragging and dropping the chip from the favorites or history tab in the palette panel to the canvas. Recognized (blue) chips in the history and favorites tabs in the palette panel can also be used like palette buttons: clicking on them inserts their interpretation in the LaTeX panel, making it easy to reuse and insert large formulas. Chips may also be exported as images with metadata containing all chip data, allowing chip images to be later reused in MathSeer (e.g., using drag-and-drop) or shared with others (e.g., over email). Using chips for formula containers was inspired by the Approach0 interface.
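One way to picture the two chip states is the sketch below. The field names, the `color` property, and the `click_in_palette` helper are assumptions made for illustration; only the state names, the colors, and the click-to-insert behavior come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chip:
    """Hypothetical chip record; `latex` is None for template chips."""
    symbols: List[str]
    latex: Optional[str] = None
    favorite: bool = False

    @property
    def state(self) -> str:
        # A chip is 'recognized' once it carries a LaTeX string.
        return "recognized" if self.latex is not None else "template"

    @property
    def color(self) -> str:
        # Favorites are orange; otherwise blue (recognized) or grey (template).
        if self.favorite:
            return "orange"
        return "blue" if self.state == "recognized" else "grey"


def click_in_palette(chip: Chip, panel: str, cursor: int) -> str:
    """Insert a recognized chip's LaTeX at the cursor position in the panel text.

    Template chips have no LaTeX string, so the panel is returned unchanged.
    """
    if chip.latex is None:
        return panel
    return panel[:cursor] + chip.latex + panel[cursor:]


template = Chip(symbols=["\\int", "dx"])
recognized = Chip(symbols=["x", "2"], latex="x^2")
favorite = Chip(symbols=["x", "2"], latex="x^2", favorite=True)
panel_after_click = click_in_palette(recognized, "a+b", 2)
```

Deriving the state from the presence of a LaTeX string, rather than storing it separately, keeps the two from drifting apart in this sketch; a real implementation may track state differently.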
MathSeer records the entire editing session, including all formula chips, using an automatically generated URL that users can revisit later. The idea to use a URL to record editing state came from discussions with the creators of 2dsearch.
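A minimal sketch of URL-addressed session storage is given below. How MathSeer actually generates and resolves session URLs is not described here, so everything in the sketch is an assumption: the base URL is a placeholder, the in-memory dictionary stands in for server-side storage, and the chip records are plain dictionaries.

```python
import json
import uuid
from typing import Dict, List

# Placeholder base URL, not MathSeer's real address.
BASE_URL = "https://example.org/mathseer/session/"

# In-memory stand-in for whatever storage backs the session URLs.
_store: Dict[str, str] = {}


def save_session(chips: List[dict]) -> str:
    """Store the chips (with their histories) and return a generated URL."""
    session_id = uuid.uuid4().hex  # random identifier for this session
    _store[session_id] = json.dumps({"chips": chips})
    return BASE_URL + session_id


def load_session(url: str) -> List[dict]:
    """Recover the chips saved under a previously generated session URL."""
    session_id = url.rsplit("/", 1)[-1]
    return json.loads(_store[session_id])["chips"]


chips = [{"latex": "x^2", "history": ["drawn", "recognized"]}]
url = save_session(chips)
restored = load_session(url)
```

Keeping only an opaque identifier in the URL lets the stored state grow without making the URL longer; encoding the whole state in the URL itself would be an alternative design with different trade-offs.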
3 Conclusion and Future Work
The MathSeer interface addresses limitations of the standard text box + symbol palette formula entry technique common in math-aware search interfaces. MathSeer’s interface supports multimodal formula editing through handwritten, LaTeX, and image input. We have introduced formula chips, a new container to support storage, reuse, editing, and sharing of formulas. The chip creation history and favorites list support quick query reformulation and reuse. In future work, we are considering manual editing operations to define spatial relationships between symbols and/or sub-expressions to avoid recognizing complex formulas.
- 1.Dmello, A.: Representing mathematical concepts associated with formulas using math entity cards. Master’s thesis, Rochester Institute of Technology (2019). https://scholarworks.rit.edu/theses/10238
- 2.Hopkins, M., Bras, R.L., Petrescu-Prahova, C., Stanovsky, G., Hajishirzi, H., Koncel-Kedziorski, R.: SemEval-2019 Task 10: Math question answering. In: Proceedings of SemEval@NAACL-HLT 2019, Minneapolis, MN, USA, pp. 893–899 (2019)
- 3.Kohlhase, M., Matican, B., Prodescu, C.: MathWebSearch 0.5: Scaling an open formula search engine. In: Proceedings of CICM, Bremen, Germany, pp. 342–357 (2012)
- 4.Mahdavi, M., Condon, M., Davila, K., Zanibbi, R.: LPGA: Line-of-sight parsing with graph-based attention for math formula recognition. In: Proceedings of ICDAR, Sydney, Australia, pp. 647–654 (2019)
- 5.Mahdavi, M., Zanibbi, R., Mouchère, H., Viard-Gaudin, C., Garain, U.: ICDAR 2019 CROHME + TFD: Competition on Recognition of Handwritten Mathematical Expressions and Typeset Formula Detection. In: Proceedings of ICDAR, Sydney, Australia, pp. 1533–1538 (2019)
- 6.Mansouri, B., Zanibbi, R., Oard, D.W.: Characterizing searches for mathematical concepts. In: 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL), pp. 57–66 (2019)
- 7.Mansouri, B., Rohatgi, S., Oard, D.W., Wu, J., Giles, C.L., Zanibbi, R.: Tangent-CFT: An embedding model for mathematical formulas. In: Proceedings of ICTIR 2019, pp. 11–18 (2019)
- 9.Russell-Rose, T., Chamberlain, J., Kruschwitz, U.: Rethinking ‘advanced search’: A new approach to complex query formulation. In: Azzopardi, L., Stein, B., Fuhr, N., Mayr, P., Hauff, C., Hiemstra, D. (eds.) ECIR 2019. LNCS, vol. 11438, pp. 236–240. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-15719-7_31
- 10.Sojka, P., Ruzicka, M., Novotný, V.: MiaS: math-aware retrieval in digital mathematical libraries. In: Proceedings of CIKM, Torino, Italy, pp. 1923–1926 (2018)
- 11.Wangari, K.: Discovering real-world usage scenarios for a multimodal math search interface. Master’s thesis, Rochester Institute of Technology, December (2013)
- 12.Wangari, K., Zanibbi, R., Agarwal, A.: Discovering real-world use cases for a multimodal math search interface. In: Proceedings of SIGIR, Gold Coast, Australia, pp. 947–950 (2014)
- 14.Zanibbi, R., Novins, K., Arvo, J., Zanibbi, K.: Aiding manipulation of handwritten mathematical expressions through style-preserving morphs. In: Proceedings of Graphics Interface, Ottawa, Canada, pp. 127–134 (2001)
- 15.Zanibbi, R., Orakwue, A.: Math search for the masses: multimodal search interfaces and appearance-based retrieval. In: Kerber, M., Carette, J., Kaliszyk, C., Rabe, F., Sorge, V. (eds.) CICM 2015. LNCS (LNAI), vol. 9150, pp. 18–36. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-20615-8_2