A Holistic Approach to Duplicate Publication and Plagiarism Detection Using Probabilistic Ontologies

  • Conference paper
Advanced Machine Learning Technologies and Applications (AMLTA 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 322)

Abstract

Duplicate publication and plagiarism are two major problems in the scholarly world and have even been called the cancer of academia. Plagiarism detection systems try to find publications similar to a given article, yet there has been little progress toward holistic plagiarism detection systems. Text similarity services cannot confirm duplication without manual confirmation by a human; nonetheless, it is feasible to develop a system that determines the probability of an infringement. In this paper we introduce a technique for developing such systems using probabilistic ontologies and reasoning. The output of such a system can be used for statistical surveys of the prevalence of plagiarism. It can also flag the most probable cases of plagiarism for further investigation by a human.
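To make the idea of "determining the probability of an infringement" concrete, the sketch below combines pieces of evidence with Bayes' rule in odds form. It is only an illustration and not the method of the paper, which relies on probabilistic ontologies and reasoning: the function name `posterior_infringement`, the assumption that evidence items are conditionally independent, and all numeric values are hypothetical.

```python
def posterior_infringement(prior, likelihoods):
    """Update P(infringement) with conditionally independent evidence.

    prior       -- P(infringement) before any evidence is seen
    likelihoods -- list of (P(e | infringement), P(e | no infringement))
    """
    # Work in odds form: posterior odds = prior odds * product of likelihood ratios.
    odds = prior / (1.0 - prior)
    for p_given_inf, p_given_not in likelihoods:
        odds *= p_given_inf / p_given_not
    # Convert odds back to a probability.
    return odds / (1.0 + odds)


# Hypothetical numbers, chosen only for illustration.
evidence = [
    (0.90, 0.05),  # a text similarity service reports a high score
    (0.60, 0.30),  # a second, independent indicator also fires
]
p = posterior_infringement(prior=0.01, likelihoods=evidence)
print(f"P(infringement | evidence) = {p:.3f}")
```

A score of this kind does not confirm anything on its own; in line with the abstract, it would only be used to rank cases for further investigation by a human or to feed aggregate statistics.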

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Foudeh, P., Salim, N. (2012). A Holistic Approach to Duplicate Publication and Plagiarism Detection Using Probabilistic Ontologies. In: Hassanien, A.E., Salem, AB.M., Ramadan, R., Kim, Th. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2012. Communications in Computer and Information Science, vol 322. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35326-0_56

  • DOI: https://doi.org/10.1007/978-3-642-35326-0_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35325-3

  • Online ISBN: 978-3-642-35326-0

  • eBook Packages: Computer Science, Computer Science (R0)
