Data compression in machine learning applied to natural language

Grzymala-Busse, Jerzy W.; Than, Soe

doi:10.3758/BF03204518

13. Symposium On Natural Language Computing: Recent Developments
Published: June 1993

Volume 25, pages 318–321, (1993)

Behavior Research Methods, Instruments, & Computers

Jerzy W. Grzymala-Busse¹ & Soe Than¹

Abstract

In this paper, we investigate the possibility of applying machine learning methods to data derived from the area of natural language and show how rules, induced by machine learning, are changed after the original data are compressed by grouping together entries, attributes, and attribute values. Also shown is how excessive compression of input data may affect the accuracy of induced rules.
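
As a purely illustrative sketch of the idea in the abstract, and not the authors' system or data, the following Python code compresses a small decision table by grouping attribute values, dropping an attribute, and merging identical entries. It then checks whether the compressed table is still consistent, that is, whether any two entries now agree on every condition attribute but disagree on the decision. The attribute names (`length`, `suffix`), the decision `pos`, the example rows, and all helper functions are hypothetical and were chosen only for illustration.

```python
from collections import defaultdict


def group_values(rows, attribute, mapping):
    """Return a copy of rows with values of one attribute merged per mapping."""
    return [
        {**row, attribute: mapping.get(row[attribute], row[attribute])}
        for row in rows
    ]


def drop_attribute(rows, attribute):
    """Return a copy of rows with one attribute removed entirely."""
    return [{k: v for k, v in row.items() if k != attribute} for row in rows]


def merge_duplicates(rows):
    """Collapse identical entries into one, keeping first-seen order."""
    seen = set()
    merged = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            merged.append(row)
    return merged


def is_consistent(rows, decision):
    """True if no two rows share all condition values but differ in decision."""
    decisions = defaultdict(set)
    for row in rows:
        cond = tuple(sorted((k, v) for k, v in row.items() if k != decision))
        decisions[cond].add(row[decision])
    return all(len(values) == 1 for values in decisions.values())


# Hypothetical toy table: each row describes a word; "pos" is the decision.
rows = [
    {"length": "short", "suffix": "none", "pos": "noun"},
    {"length": "medium", "suffix": "none", "pos": "noun"},
    {"length": "medium", "suffix": "ed", "pos": "verb"},
    {"length": "long", "suffix": "ing", "pos": "verb"},
]

# Mild compression: group two suffix values and two length values together,
# then merge entries that have become identical.
mild = group_values(rows, "suffix", {"ing": "inflected", "ed": "inflected"})
mild = group_values(mild, "length", {"short": "not_long", "medium": "not_long"})
mild = merge_duplicates(mild)

# Excessive compression: remove the attribute that separated nouns from verbs.
excessive = merge_duplicates(drop_attribute(rows, "suffix"))

print(len(rows), len(mild), is_consistent(mild, "pos"))
print(len(excessive), is_consistent(excessive, "pos"))
```

In this toy table the mild grouping shrinks the data while leaving it consistent, so rules that separate the decisions can still be induced. Dropping `suffix` makes two `medium` entries conflict, which is one simple way that excessive compression can reduce the accuracy of any rules induced afterwards.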

Author information

Authors and Affiliations

Department of Computer Science, University of Kansas, Lawrence, KS 66045
Jerzy W. Grzymala-Busse & Soe Than

Corresponding author

Correspondence to Jerzy W. Grzymala-Busse.

About this article

Cite this article

Grzymala-Busse, J.W., Than, S. Data compression in machine learning applied to natural language. Behavior Research Methods, Instruments, & Computers 25, 318–321 (1993). https://doi.org/10.3758/BF03204518

Issue Date: June 1993
DOI: https://doi.org/10.3758/BF03204518