Automatic Extraction of IDM-Related Information in Scientific Articles and Online Science News Websites

  • Oriane Nédey
  • Achille Souili
  • Denis Cavallucci
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 541)

Abstract

Previous studies have made it possible to extract information related to IDM (Inventive Design Method) from patents. IDM is an ontology-defined method derived from TRIZ. Like its parent theory, IDM is primarily based on the observation of patents and aims at finding inventive solutions on the basis of contradictions. In this paper, we present a new approach for extracting knowledge, this time from other types of science-related documents: scientific papers and science news articles. This approach is based on sets of linguistic features that have been selected and evaluated semi-automatically using Natural Language Processing and Machine Learning techniques.
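
To make the idea of a feature-based extraction concrete, the following is a minimal sketch of how sentences might be turned into simple linguistic features and then labeled. It is an illustration only, not the method evaluated in the paper: the cue lists, the feature names, the labels ("problem", "solution", "other") and the decision rule are all assumptions introduced here, and a hand-written rule stands in for a trained classifier.

```python
import re

# Hypothetical cue lists, chosen for illustration only.
# They are NOT the feature sets selected in the paper.
PROBLEM_CUES = ("however", "but", "although", "limitation", "drawback", "problem")
SOLUTION_CUES = ("we propose", "new approach", "improve", "solution", "enables")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def count_cues(sentence, cues):
    """Count how many cues occur in the sentence as whole words or phrases."""
    lowered = sentence.lower()
    return sum(
        1 for cue in cues
        if re.search(r"\b" + re.escape(cue) + r"\b", lowered)
    )


def extract_features(sentence):
    """Map one sentence to a small dictionary of surface features."""
    return {
        "problem_cues": count_cues(sentence, PROBLEM_CUES),
        "solution_cues": count_cues(sentence, SOLUTION_CUES),
        "n_tokens": len(sentence.split()),
    }


def label_sentence(sentence):
    """Assumed decision rule: the cue family with more hits wins."""
    features = extract_features(sentence)
    if features["problem_cues"] > features["solution_cues"]:
        return "problem"
    if features["solution_cues"] > features["problem_cues"]:
        return "solution"
    return "other"


if __name__ == "__main__":
    sample = (
        "The coating is strong. "
        "However, it has a drawback at high temperature. "
        "We propose a new approach to improve adhesion."
    )
    for sentence in split_sentences(sample):
        print(label_sentence(sentence), "|", sentence)
```

In a learned setting, the dictionaries returned by `extract_features` would be the input to a classifier trained on annotated sentences, and the hand-written rule in `label_sentence` would be replaced by that model.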

Keywords

TRIZ, IDM, Inventive Design, Machine learning, Knowledge extraction, Text mining, NLP

Copyright information

© IFIP International Federation for Information Processing 2018

Authors and Affiliations

  • Oriane Nédey (1)
  • Achille Souili (1)
  • Denis Cavallucci (1)
  1. Laboratoire CSIP: Conception, Système d’Information et Processus Inventifs, Strasbourg, France
