A Relation Aware Search Engine for Materials Science

  • Sapan Shah
  • Dhwani Vora
  • B. P. Gautham
  • Sreedhar Reddy
Technical Article


Knowledge of material properties, microstructure, underlying composition, and the manufacturing process parameters a material has undergone is of significant interest to materials scientists and engineers. A large amount of such information is available in publications in the form of experimental measurements, simulation results, etc. However, finding the information relevant to a given problem is a non-trivial task. First, an engineer has to go through a large collection of documents to select the right ones. Then, the engineer has to scan the selected documents to extract the relevant pieces of information. Our goal is to help automate some of these steps. Traditional search engines are of little help here, as they are keyword-centric and weak at relation processing. In this paper, we present a domain-specific search engine that processes relations to significantly improve search accuracy. The engine preprocesses materials publication repositories to extract entities such as material compositions, material properties, manufacturing processes, and process parameters, together with their values, and builds an index from these entities and values. It then uses this index to process user queries and retrieve relevant publication fragments. It provides a domain-specific query language with relational and logical operators for composing complex queries. We conducted an experiment on a small library of publications on steel, running searches such as “get the list of publications which have carbon composition between 0.2 and 0.3 and on which tempering is carried out for about 30 to 40 min”. We compare the results of our search engine with those of a keyword-based search engine.
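
To make the idea of a relation-aware query concrete, the example search above can be sketched as range predicates over extracted entity values, combined with a logical AND. This is only an illustrative sketch, not the engine described in the paper: the `Fact` record, the `build_index` and `between` helpers, the entity names, the units, and the sample data are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One extracted (entity, value, unit) triple tied to a publication fragment."""
    doc_id: str
    fragment: str
    entity: str
    value: float
    unit: str


def build_index(facts):
    """Group facts by entity name so a predicate only scans the relevant entries."""
    index = {}
    for fact in facts:
        index.setdefault(fact.entity, []).append(fact)
    return index


def between(index, entity, low, high, unit):
    """Return ids of documents with at least one value of `entity` in [low, high]."""
    return {
        f.doc_id
        for f in index.get(entity, [])
        if f.unit == unit and low <= f.value <= high
    }


# Hypothetical extraction output for three made-up publications.
FACTS = [
    Fact("doc-A", "0.25 wt% C steel", "carbon", 0.25, "wt%"),
    Fact("doc-A", "tempered for 35 min", "tempering_time", 35.0, "min"),
    Fact("doc-B", "0.25 wt% C steel", "carbon", 0.25, "wt%"),
    Fact("doc-B", "tempered for 120 min", "tempering_time", 120.0, "min"),
    Fact("doc-C", "0.45 wt% C steel", "carbon", 0.45, "wt%"),
    Fact("doc-C", "tempered for 30 min", "tempering_time", 30.0, "min"),
]

index = build_index(FACTS)

# Logical AND of two relational predicates is a set intersection of document ids.
hits = (
    between(index, "carbon", 0.2, 0.3, "wt%")
    & between(index, "tempering_time", 30, 40, "min")
)
print(sorted(hits))  # ['doc-A']
```

In this toy data all three documents carry both a carbon value and a tempering time, so matching on the presence of those entities alone could not tell them apart; only the value ranges single out `doc-A`.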


Keywords: Materials science; Domain-specific search engine; Information retrieval system; Information extraction



Copyright information

© The Minerals, Metals & Materials Society 2018

Authors and Affiliations

  1. TRDDC, TCS Research, Tata Consultancy Services, Pune, India
