Sentence-Based Plagiarism Detection for Japanese Document Based on Common Nouns and Part-of-Speech Structure

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 513)

Abstract

Plagiarism by copying and pasting documents written by other authors has recently become a serious problem as the number of electronic documents has increased. In higher educational institutions, it is also of great concern in student reports. In this paper, we propose a novel approach to automatically detect plagiarism, especially in student experimental reports written in Japanese, focusing on the common nouns and the part-of-speech structure of each sentence. We also perform experiments to evaluate our approach on actual Japanese experimental reports written by our students, using measures such as precision, recall, and F-value. In the experiments, the proposed approach succeeded in detecting plagiarized pairs of sentences with high accuracy. In addition, we discuss the cases that the proposed approach falsely detected and those it failed to detect.
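The abstract does not state the decision rule, so the following is only a minimal sketch of how a sentence-pair check on common nouns and part-of-speech structure, and the precision, recall, and F-value evaluation, might look. Everything here is an assumption for illustration: sentences are taken to be already tokenized into (surface, tag) pairs, the tag name `common_noun`, the Jaccard overlap measure, the `noun_threshold` value, and all function names are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

# A token is (surface form, part-of-speech tag). The tag names are hypothetical;
# a real morphological analyzer would supply its own tag set.
Token = Tuple[str, str]


def common_nouns(sentence: List[Token]) -> Set[str]:
    """Collect the surface forms tagged as common nouns."""
    return {surface for surface, pos in sentence if pos == "common_noun"}


def pos_structure(sentence: List[Token]) -> Tuple[str, ...]:
    """Return the sequence of part-of-speech tags of the sentence."""
    return tuple(pos for _, pos in sentence)


def is_suspicious(a: List[Token], b: List[Token], noun_threshold: float = 0.8) -> bool:
    """Assumed rule: same tag sequence and high Jaccard overlap of common nouns."""
    nouns_a, nouns_b = common_nouns(a), common_nouns(b)
    union = nouns_a | nouns_b
    if not union:
        return False  # no common nouns to compare
    overlap = len(nouns_a & nouns_b) / len(union)
    return pos_structure(a) == pos_structure(b) and overlap >= noun_threshold


def precision_recall_f(detected: Set, actual: Set) -> Tuple[float, float, float]:
    """Standard precision, recall, and F-value over sets of sentence pairs."""
    true_positives = len(detected & actual)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    total = precision + recall
    f_value = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_value


# Toy, hand-tagged sentences (not data from the paper).
s1 = [("電圧", "common_noun"), ("を", "particle"), ("測定", "common_noun"), ("する", "verb")]
s2 = [("電圧", "common_noun"), ("も", "particle"), ("測定", "common_noun"), ("した", "verb")]
s3 = [("結果", "common_noun"), ("を", "particle"), ("考察", "common_noun"), ("する", "verb")]

print(is_suspicious(s1, s2))
print(is_suspicious(s1, s3))
print(precision_recall_f({(1, 2), (3, 4)}, {(1, 2), (5, 6)}))
```

In this sketch a pair is flagged even when particles or verb endings differ, as long as the tag sequence and the common nouns agree, which is one plausible way to catch lightly edited copies. The threshold would need tuning against labeled report pairs.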

Author information

Corresponding author

Correspondence to Takeru Yokoi.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Yokoi, T. (2015). Sentence-Based Plagiarism Detection for Japanese Document Based on Common Nouns and Part-of-Speech Structure. In: Fujita, H., Selamat, A. (eds) Intelligent Software Methodologies, Tools and Techniques. SoMeT 2014. Communications in Computer and Information Science, vol 513. Springer, Cham. https://doi.org/10.1007/978-3-319-17530-0_21

  • DOI: https://doi.org/10.1007/978-3-319-17530-0_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-17529-4

  • Online ISBN: 978-3-319-17530-0

  • eBook Packages: Computer Science, Computer Science (R0)
