Use of Elliptic Curves in Term Discrimination

Vilariño, Darnes; Pinto, David; Balderas, Carlos; Tovar, Mireya; Beltrán, Beatriz; Paniagua, Sofia

doi:10.1007/978-3-642-21587-2_37

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6718)

Included in the following conference series:

Mexican Conference on Pattern Recognition

Abstract

Detecting discriminant terms allows us to improve the performance of natural language processing systems. The goal is to estimate the contribution of each term in a given corpus and then to use the high-contribution terms to represent the corpus. In this paper we present various experiments that use elliptic curves to discover the discriminant terms of a given textual corpus. These experiments led us to use the mean and variance of the corpus terms to determine the parameters of a reduced Weierstrass equation (elliptic curve). We use the elliptic curves to visualize graphically the behavior of the corpus vocabulary. We then use the elliptic curve parameters to cluster terms that share characteristics. These clusters are then used as discriminant terms to represent the original document collection. Finally, we evaluate all these corpus representations to determine the terms that best discriminate each document.
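
The idea of deriving curve parameters from term statistics can be illustrated with a minimal sketch. The abstract does not state how the mean and variance map onto the curve coefficients, so the code below assumes, purely for illustration, that a term's mean per-document frequency is used as a and its variance as b in the reduced Weierstrass form y² = x³ + ax + b. The function names (`term_curve_params`, `discriminant`, `group_by_params`), the whitespace tokenization, and the rounding-based grouping are hypothetical stand-ins and not the authors' procedure.

```python
from collections import Counter, defaultdict
from statistics import mean, pvariance


def term_curve_params(docs):
    """Return {term: (a, b)} with a = mean and b = population variance
    of the term's frequency across documents (assumed mapping)."""
    counts = [Counter(doc.split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    params = {}
    for term in vocab:
        freqs = [c[term] for c in counts]  # 0 when the term is absent
        params[term] = (mean(freqs), pvariance(freqs))
    return params


def discriminant(a, b):
    """Discriminant of y^2 = x^3 + a*x + b; zero means the curve is singular."""
    return -16 * (4 * a**3 + 27 * b**2)


def group_by_params(params, ndigits=2):
    """Group terms whose rounded (a, b) pairs coincide.
    A simple stand-in for the clustering step, not the paper's method."""
    groups = defaultdict(list)
    for term, (a, b) in params.items():
        groups[(round(a, ndigits), round(b, ndigits))].append(term)
    return dict(groups)


docs = ["a b a", "a c", "b c c"]
params = term_curve_params(docs)
groups = group_by_params(params)
```

In this toy corpus the terms `a` and `c` have the same mean and variance, so they fall into the same group, while `b` stands alone. A real experiment would need a proper tokenizer and a clustering criterion taken from the full paper.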

This work has been partially supported by the projects: CONACYT #106625, VIEP #VIAD-ING11-I, #PIAD-ING11-I, #BEMB-ING11-I, as well as by the PROMEP/103.5/09/4213 grant.

Author information

Authors and Affiliations

Faculty of Computer Science, Benemérita Universidad Autónoma de Puebla, Mexico
Darnes Vilariño, David Pinto, Carlos Balderas, Mireya Tovar, Beatriz Beltrán & Sofia Paniagua

Editor information

Editors and Affiliations

Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840 Sta. Maria Tonantzintla, Puebla, Mexico
José Francisco Martínez-Trinidad
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840 Sta. Maria Tonantzintla, Puebla, Mexico
Jesús Ariel Carrasco-Ochoa
Cancun Technological Institute (ITC), Av. Kabah, Km. 3, 77515 Cancun, Quintana Roo, Mexico
Cherif Ben-Youssef Brants
Department of Computer Science, University of York, UK
Edwin Robert Hancock

About this paper

Cite this paper

Vilariño, D., Pinto, D., Balderas, C., Tovar, M., Beltrán, B., Paniagua, S. (2011). Use of Elliptic Curves in Term Discrimination. In: Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A., Ben-Youssef Brants, C., Hancock, E.R. (eds) Pattern Recognition. MCPR 2011. Lecture Notes in Computer Science, vol 6718. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21587-2_37

DOI: https://doi.org/10.1007/978-3-642-21587-2_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21586-5
Online ISBN: 978-3-642-21587-2
eBook Packages: Computer Science, Computer Science (R0)
