A Platform Development for Multilingual Law Collection and Comparative-Law Support Services: ASEAN Laws as a Case Study

  • Vee Satayamas (corresponding author)
  • Asanee Kawtrakul
  • Takahiro Yamakoshi
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1197)


Lawmakers in the ASEAN countries need to examine the statutes of neighboring countries in order to draft consistent, uniform, and reasonable statutes. Moreover, non-lawyers who would like to invest or work overseas should understand the statutes of the countries under consideration and compare their regulatory requirements before deciding which country is suitable for investment or work. This work proposes a platform for collecting and comparing laws. It consists of three modules: the first is a Web crawler that gathers statutes from the ASEAN countries’ law archives; the second is a document preprocessing module that extracts the regulations from each statute of each country and aligns them across the texts; and the third is a service with a tool for highlighting the relevant parts of the text. This paper proposes to use existing text processing tools, such as word/word-group segmentation and document section parsing, to annotate the extracted entities with Wikidata’s ontological concepts, and then to align those entities across the texts. However, concept selection raises two problems: concept ambiguity and concept granularity. To select a proper concept for entity alignment, a threshold on the maximum distance to the least common ancestor is computed. An experiment was conducted on the labor laws of Malaysia and Thailand to compare minimum wages. Of the several thresholds tested, a threshold value of two gives the most suitable concepts, with a precision of 48% and a recall of 67% for the alignment of related entities.
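The abstract does not spell out how the distance to the least common ancestor is computed, so the following is only a minimal sketch of one plausible reading: two annotated entities are considered alignable when their closest shared ancestor lies at most a threshold number of edges from both of them. The hierarchy `PARENTS`, its labels, and the function names are hypothetical illustrations and are not taken from Wikidata or from the paper; only the threshold value of two comes from the abstract.

```python
from collections import deque

# Hypothetical toy hierarchy (child -> direct parents).
# The labels are illustrative only; they are not real Wikidata identifiers.
PARENTS = {
    "minimum_wage": ["wage"],
    "daily_wage": ["wage"],
    "wage": ["remuneration"],
    "remuneration": ["economic_concept"],
    "working_hours": ["labour_condition"],
    "labour_condition": ["economic_concept"],
    "economic_concept": [],
}


def ancestor_distances(concept, parents):
    """Map each ancestor of `concept` (and the concept itself, at 0)
    to its shortest distance in edges, using breadth-first search."""
    dist = {concept: 0}
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def lca_with_distance(a, b, parents):
    """Return (ancestor, distance) for the common ancestor that minimises
    the larger of its two distances to `a` and `b`, or None if none exists."""
    da = ancestor_distances(a, parents)
    db = ancestor_distances(b, parents)
    common = set(da) & set(db)
    if not common:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    best = min(common, key=lambda c: (max(da[c], db[c]), c))
    return best, max(da[best], db[best])


def can_align(a, b, parents, threshold=2):
    """Assumed alignment rule: the least common ancestor must be within
    `threshold` edges of both concepts."""
    result = lca_with_distance(a, b, parents)
    return result is not None and result[1] <= threshold


if __name__ == "__main__":
    print(lca_with_distance("minimum_wage", "daily_wage", PARENTS))
    print(can_align("minimum_wage", "daily_wage", PARENTS))
    print(can_align("minimum_wage", "working_hours", PARENTS))
```

In this toy hierarchy a small threshold keeps alignment restricted to closely related concepts, while a larger one lets very general ancestors link loosely related entities, which is one way to view the granularity trade-off the abstract mentions.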


Keywords: Multilingual legal documents collection; Automatic translation; Concept annotation; Platform for law comparison; Ontology-based entity alignment



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Vee Satayamas (1)
  • Asanee Kawtrakul (1)
  • Takahiro Yamakoshi (2)
  1. Kasetsart University, Bangkok, Thailand
  2. Nagoya University, Nagoya, Japan
