Abstract
We present preliminary results from ongoing work on using morpho-syntactic patterns to extract information from process descriptions in a semi-supervised manner. The experiments were designed for generic information extraction tasks and evaluated on detecting ingredients in French cooking recipes, using a large gold-standard corpus. The proposed method applies a bi-lexical, dependency-oriented syntactic analysis to the text and extracts relevant morpho-syntactic patterns. These patterns are then used as features for several machine learning methods to produce the final ingredient list. The approach can readily be adapted to similar tasks, since it relies on automatically mining generic morpho-syntactic patterns from the documents. The method itself is language independent, provided that a parser for the target language is available. Although the results are preliminary, the performance of our method on the DEFT 2013 data set is satisfactory: it significantly outperforms the best system from the original challenge (0.75 vs. 0.66 MAP).
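To make the idea of dependency-based verb-noun patterns concrete, the following is a minimal sketch and not the system described in the paper. It assumes a sentence has already been parsed into bi-lexical dependencies; the `Token` structure, the hand-written parse, the tag and label names, and the `verb:relation:N` feature format are all hypothetical choices made for illustration. The actual patterns, features, and learners used by the authors are not specified in this abstract.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    idx: int    # 1-based position in the sentence
    lemma: str
    pos: str    # coarse part-of-speech tag (illustrative tag set)
    head: int   # index of the governor, 0 for the root
    rel: str    # dependency label (illustrative label set)


# Hand-written parse of "Couper les oignons et ajouter le beurre".
# Assumption: these labels are made up, not the output of a real parser.
SENTENCE = [
    Token(1, "couper", "V", 0, "root"),
    Token(2, "le", "D", 3, "det"),
    Token(3, "oignon", "N", 1, "obj"),
    Token(4, "et", "C", 1, "coord"),
    Token(5, "ajouter", "V", 4, "dep_coord"),
    Token(6, "le", "D", 7, "det"),
    Token(7, "beurre", "N", 5, "obj"),
]


def verb_noun_patterns(sentence):
    """Return (verb lemma, relation, noun lemma) for each noun governed by a verb."""
    by_idx = {t.idx: t for t in sentence}
    patterns = []
    for tok in sentence:
        head = by_idx.get(tok.head)  # None for the root, whose head is 0
        if head is not None and tok.pos == "N" and head.pos == "V":
            patterns.append((head.lemma, tok.rel, tok.lemma))
    return patterns


def pattern_features(sentences):
    """Count generic patterns, with the noun abstracted away, as classifier features."""
    counts = Counter()
    for sentence in sentences:
        for verb, rel, _noun in verb_noun_patterns(sentence):
            counts[f"{verb}:{rel}:N"] += 1
    return counts


if __name__ == "__main__":
    print(verb_noun_patterns(SENTENCE))
    print(pattern_features([SENTENCE]))
```

In this sketch the nouns found in such patterns (here `oignon` and `beurre`) would be candidate ingredients, and the abstracted pattern counts could be passed to any standard classifier or ranker. Replacing the noun with its part-of-speech tag is one way to keep the patterns generic across recipes.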
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Asadullah, M., Nouvel, D., Paroubek, P. (2014). Using Verb-Noun Patterns to Detect Process Inputs. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science, vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_23
DOI: https://doi.org/10.1007/978-3-319-10816-2_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10815-5
Online ISBN: 978-3-319-10816-2
eBook Packages: Computer Science (R0)