What You Use, Not What You Do: Automatic Classification of Recipes

  • Hanna Kicherer
  • Marcel Dittrich
  • Lukas Grebe
  • Christian Scheible
  • Roman Klinger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10260)


Social media data is notoriously noisy and unclean. Recipe collections built by users are no exception, particularly when it comes to cataloging them. However, consistent and transparent categorization is vital to users who search for a specific entry. Curators face the same challenge when given a large collection of existing recipes: they first need to understand the data before they can build a clean system of categories. This paper presents an empirical study on the automatic classification of recipes on the German cooking website Chefkoch. The central question we aim to answer is: Which information is necessary to perform well at this task? In particular, we compare features extracted from the free-text instructions of the recipe to those taken from the list of ingredients. On a sample of 5,000 recipes with 87 classes, our feature analysis shows that a combination of nouns from the textual description of the recipe with ingredient features performs best (48% \(\text{F}_1\)). Nouns alone achieve 45% \(\text{F}_1\) and ingredients alone 46% \(\text{F}_1\). Other word classes, however, do not complement the information from nouns. On a larger training set of 50,000 instances, the best configuration improves to 57%, highlighting the importance of a sizeable data set.
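
To make the comparison of feature groups concrete, the following is a minimal sketch of how ingredient features and noun features might be combined for one recipe, and how an \(\text{F}_1\) score could be computed over multi-label predictions. The abstract does not state how features are encoded or which averaging is used for \(\text{F}_1\), so the binary bag-of-words encoding, the micro-averaging, the function names `combine_features` and `micro_f1`, and the toy recipe and category labels are all assumptions made for illustration. The nouns are assumed to have been extracted beforehand by a part-of-speech tagger.

```python
from typing import Dict, Iterable, List, Set


def combine_features(ingredients: Iterable[str], nouns: Iterable[str]) -> Dict[str, int]:
    """Merge two binary feature groups, prefixed so the groups stay distinct."""
    features: Dict[str, int] = {}
    for item in ingredients:
        features[f"ingr={item.lower()}"] = 1
    for noun in nouns:
        features[f"noun={noun.lower()}"] = 1
    return features


def micro_f1(gold: List[Set[str]], predicted: List[Set[str]]) -> float:
    """Micro-averaged F1 over multi-label output (one label set per recipe)."""
    tp = sum(len(g & p) for g, p in zip(gold, predicted))  # correct labels
    fp = sum(len(p - g) for g, p in zip(gold, predicted))  # spurious labels
    fn = sum(len(g - p) for g, p in zip(gold, predicted))  # missed labels
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: one recipe's features and two recipes' category sets.
features = combine_features(["Mehl", "Zucker"], ["Teig", "Zucker"])
gold = [{"Kuchen", "Backen"}, {"Suppe"}]
predicted = [{"Kuchen"}, {"Suppe", "Vegetarisch"}]
score = micro_f1(gold, predicted)
```

Prefixing each feature with its group keeps a word such as "Zucker" separate when it occurs both as an ingredient and as a noun in the instructions, which is what allows the two groups to be evaluated alone or in combination.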


Keywords: Recipe, Cooking, Food, Classification, Multi-label, Text mining


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Hanna Kicherer (1)
  • Marcel Dittrich (2)
  • Lukas Grebe (2)
  • Christian Scheible (1)
  • Roman Klinger (1)

  1. Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart, Stuttgart, Germany
  2. Chefkoch GmbH, Rheinwerk 3, Bonn, Germany
