A Study of String Matching System Based on Database Set Operation

Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 308)

Abstract

In recent years, people have been able to obtain a wide range of data and information easily through the Internet. Downloaded data and digitized information can be copied wholesale into one's own written work, which leads to plagiarism problems. Previous studies have used statistics, vectors, and matrices to compare strings among documents. When someone changes the order of words or inserts superfluous words or sentences between strings, the accuracy of such matching systems drops sharply; moreover, students may continue to plagiarize if the matching system cannot find the alignments correctly. This study uses Chinese Word Segmentation and Database Set Operation as the basis for a string matching system that addresses the problems of excessive superfluous words and changed word order. Database Set Operation may be more efficient than a program that holds a large number of words in memory. This study builds a prototype system, and the prototype's results show that it achieves good accuracy.
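
As a rough illustration of the idea, and not the authors' implementation, the following sketch stores the words of two documents in database tables and uses a SQL `INTERSECT` to measure how many of the source words reappear in the suspect text, regardless of word order or inserted filler. The table names, the `tokenize` and `overlap_ratio` helpers, and the whitespace-based tokenizer are all assumptions made for this example; a real system would replace `tokenize` with a Chinese word segmenter.

```python
import sqlite3


def tokenize(text):
    # Hypothetical stand-in for Chinese word segmentation:
    # splits on whitespace and lowercases each token.
    return [word.lower() for word in text.split()]


def overlap_ratio(source_text, suspect_text):
    """Share of distinct source words that also occur in the suspect text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source_words (word TEXT)")
    conn.execute("CREATE TABLE suspect_words (word TEXT)")
    conn.executemany(
        "INSERT INTO source_words VALUES (?)",
        [(w,) for w in tokenize(source_text)],
    )
    conn.executemany(
        "INSERT INTO suspect_words VALUES (?)",
        [(w,) for w in tokenize(suspect_text)],
    )
    # INTERSECT yields the distinct words present in both tables,
    # so word order and extra words in the suspect text do not matter.
    shared = conn.execute(
        "SELECT COUNT(*) FROM ("
        "SELECT word FROM source_words "
        "INTERSECT "
        "SELECT word FROM suspect_words)"
    ).fetchone()[0]
    total = conn.execute(
        "SELECT COUNT(DISTINCT word) FROM source_words"
    ).fetchone()[0]
    conn.close()
    return shared / total if total else 0.0
```

In this sketch, a suspect text that reorders the source words and pads them with extra words still scores a full overlap, which is the behaviour the abstract attributes to the set-based approach.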

Keywords

String Matching, Database Set Operation, Chinese Word Segmentation, Matching System

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Department of Information Management, Chung Hua University, HsinChu, Taiwan