SMILLE for Portuguese: Annotation and Analysis of Grammatical Structures in a Pedagogical Context

Part of the Lecture Notes in Computer Science book series (LNAI, volume 11122)


In Second Language Acquisition (SLA), exposing learners to authentic material is an important learning step, but raw text can be problematic because learners may overlook the information they should be focusing on. In this paper, we present SMILLE for Portuguese, a system for detecting pedagogically relevant grammatical structures in raw texts. SMILLE’s rules for recognizing grammatical structures were evaluated on randomly selected sentences from three different genres, achieving an overall precision of 84%. The automatic recognition of pedagogically relevant grammatical structures can help teachers and course coordinators make better-informed choices of texts for language courses, while also allowing for the analysis of grammar profiles for SLA. As a case study, we used SMILLE to analyze pedagogical material used in a Portuguese as a foreign language course and to observe how the predominance of grammatical content in the texts relates to the described grammatical focus of the language levels.
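
The precision figure above can be read as the share of rule matches that a human judge accepted as correct. The following is a minimal sketch of that calculation, not the authors' evaluation code: the function name `precision_by_genre`, the genre labels, and the toy judgments are all hypothetical and do not reproduce the paper's data.

```python
from collections import defaultdict


def precision_by_genre(judgments):
    """Compute precision per genre and overall.

    `judgments` is an iterable of (genre, is_correct) pairs, one pair per
    evaluated rule match. This input format is an assumption made for
    illustration only.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for genre, is_correct in judgments:
        total[genre] += 1
        correct[genre] += int(bool(is_correct))
    if not total:
        raise ValueError("no judgments supplied")
    per_genre = {genre: correct[genre] / total[genre] for genre in total}
    overall = sum(correct.values()) / sum(total.values())
    return per_genre, overall


# Made-up judgments for three placeholder genres (NOT the paper's data).
toy_judgments = [
    ("genre_a", True), ("genre_a", True), ("genre_a", False),
    ("genre_b", True),
    ("genre_c", True), ("genre_c", False),
]

per_genre, overall = precision_by_genre(toy_judgments)
print(per_genre)
print(round(overall, 2))
```

Note that the overall value pools all matches, so genres with more evaluated sentences weigh more than in a plain average of the per-genre figures.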


  • Second Language Acquisition
  • Grammatical structures
  • Natural Language Processing
  • Grammatical parsing for Portuguese

Supported by the Walloon Region (Projects BEWARE 1510637 and 1610378) and Altissia International.

  • DOI: 10.1007/978-3-319-99722-3_2
  • Chapter length: 11 pages

  1. The system is available for testing at

  2. Our grammatical structures were based on the course developed by Altissia International (

  3. SMILLE for Portuguese makes use of the PassPort system [21].

  4. We do not present the 71 rules here because many of the grammatical structures are split across the CEFR levels, with basic content introduced at lower levels and reinforced at higher levels, and others are subdivided into categories such as verb tenses, comparative forms, and types of adverbs.

  5. Selected romances from

  6. This corpus was compiled in the scope of the project PorPopular (

  7. Sentences with more than one instance of the selected structure were evaluated based only on the first instance.

  8.


  1. Azab, M., Salama, A., Oflazer, K., Shima, H., Araki, J., Mitamura, T.: An English reading tool as an NLP showcase. In: The Companion Volume of the Proceedings of IJCNLP 2013: System Demonstrations, pp. 5–8. Asian Federation of Natural Language Processing, Nagoya, Japan, October 2013.

  2. Azab, M., Salama, A., Oflazer, K., Shima, H., Araki, J., Mitamura, T.: An NLP-based reading tool for aiding non-native English readers. Recent Advances in Natural Language Processing, p. 41 (2013)

  3. Brown, J., Eskenazi, M.: Retrieval of authentic documents for reader-specific lexical practice. In: InSTIL/ICALL Symposium 2004 (2004)

  4. Chinkina, M., Kannan, M., Meurers, D.: Online information retrieval for language learning. In: ACL 2016, p. 7 (2016)

  5. Cross, J.: ‘Noticing’ in SLA: Is it a valid concept? TESL-EJ 6(3), 1–9 (2002)

  6. Doughty, C.: Second language instruction does make a difference. Stud. Second Lang. Acquisition 13(04), 431–469 (1991)

  7. Manning, C.D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S.J., McClosky, D.: The Stanford CoreNLP natural language processing toolkit. In: Association for Computational Linguistics (ACL) System Demonstrations, pp. 55–60 (2014).

  8. Marujo, L., et al.: Porting REAP to European Portuguese. In: SLaTE, pp. 69–72 (2009)

  9. Meurers, D., et al.: Enhancing authentic web pages for language learners. In: Proceedings of the NAACL HLT 2010 Fifth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 10–18. Association for Computational Linguistics (2010)

  10. Plonsky, L., Ziegler, N.: The CALL-SLA interface: Insights from a second-order synthesis (2016)

  11. Reinders, H.: Towards a definition of intake in second language acquisition (2012)

  12. Schmidt, R.: The role of consciousness in second language learning. Appl. Linguistics 11(2), 129–158 (1990)

  13. Schmidt, R.: Attention, awareness, and individual differences in language learning. Perspect. Indiv. Characteristics Foreign Lang. Educ. 6, 27 (2012)

  14. Simard, D.: Differential effects of textual enhancement formats on intake. System 37(1), 124–135 (2009)

  15. Smith, M.S.: Input enhancement in instructed SLA. Stud. Second Lang. Acquisition 15(02), 165–179 (1993)

  16. Smith, M.S., Truscott, J.: Explaining input enhancement: a MOGUL perspective. Int. Rev. Appl. Linguistics Lang. Teach. 52(3), 253–281 (2014)

  17. Tiedemann, J.: Finding alternative translations in a large corpus of movie subtitle. In: International Conference on Language Resources and Evaluation (2016)

  18. Truscott, J.: Noticing in second language acquisition: a critical review. Second Lang. Res. 14(2), 103–135 (1998)

  19. Verhelst, N., Van Avermaet, P., Takala, S., Figueras, N., North, B.: Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Cambridge University Press, Cambridge (2009)

  20. Zilio, L., Fairon, C.: Adaptive system for language learning. In: 2017 IEEE 17th International Conference on Advanced Learning Technologies (ICALT), pp. 47–49. IEEE (2017)

  21. Zilio, L., Wilkens, R., Fairon, C.: PassPort: a dependency parsing model for Portuguese

  22. Zilio, L., Wilkens, R., Fairon, C.: Enhancing grammatical structures in web-based texts. In: Proceedings of the 25th EUROCALL, pp. 839–846, Accepted, 2017

  23. Zilio, L., Wilkens, R., Fairon, C.: Using NLP for enhancing second language acquisition. In: Proceedings of Recent Advances in Natural Language Processing, pp. 839–846 (2017)

Author information

Corresponding author

Correspondence to Leonardo Zilio.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Zilio, L., Wilkens, R., Fairon, C. (2018). SMILLE for Portuguese: Annotation and Analysis of Grammatical Structures in a Pedagogical Context. In: , et al. Computational Processing of the Portuguese Language. PROPOR 2018. Lecture Notes in Computer Science, vol 11122. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99721-6

  • Online ISBN: 978-3-319-99722-3

  • eBook Packages: Computer Science, Computer Science (R0)