Merging Validity and Coverage for Measuring Quality of Data Summaries

  • Conference paper
Information Technology and Computational Physics (CITCEP 2016)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 462)

Abstract

Data summarization by quantified sentences of natural language simulates how humans summarize data. Linguistic summaries focus either on a whole data set or on a part of it delimited by flexible restrictions expressed as fuzzy sets. First, the paper examines how the choice of t-norm in compound predicates merged by the "and" connective, and the way the fuzzy sets are constructed, influence the validity (truth value) of summaries. Further, linguistic summaries with a restriction may express knowledge mined from outliers and therefore be of low quality, even though the validity of the summary could be high. The main aim of this paper is to build a quality measure based on validity and coverage. Finally, additional possibilities related to the suggested measure and promising topics for future research are outlined.
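The two effects named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used relative-quantifier form for a summary "Q records that are R are S", where the validity is the quantifier's membership degree of the ratio sum(t(R, S)) / sum(R). The quantifier shape `most`, the helper `support_share`, and all sample membership degrees are hypothetical and chosen only for illustration; `support_share` is a plain proportion, not the quality measure proposed in the paper.

```python
from typing import Callable, Sequence

TNorm = Callable[[float, float], float]


def ramp_up(a: float, b: float) -> Callable[[float], float]:
    """Non-decreasing linear membership function: 0 up to a, 1 from b on."""
    def mu(x: float) -> float:
        if x <= a:
            return 0.0
        if x >= b:
            return 1.0
        return (x - a) / (b - a)
    return mu


def t_min(x: float, y: float) -> float:
    """Minimum t-norm."""
    return min(x, y)


def t_prod(x: float, y: float) -> float:
    """Product t-norm."""
    return x * y


def validity(r_deg: Sequence[float], s_deg: Sequence[float],
             t_norm: TNorm, quantifier: Callable[[float], float]) -> float:
    """Truth value of 'Q records that are R are S' (assumed standard form)."""
    denominator = sum(r_deg)
    if denominator == 0:
        return 0.0  # no record belongs to the restriction at all
    numerator = sum(t_norm(r, s) for r, s in zip(r_deg, s_deg))
    return quantifier(numerator / denominator)


def support_share(r_deg: Sequence[float]) -> float:
    """Hypothetical indicator: share of records with non-zero restriction degree."""
    return sum(1 for r in r_deg if r > 0) / len(r_deg)


# Assumed shape of the quantifier "most"; not taken from the paper.
most = ramp_up(0.5, 1.0)

# Effect of the t-norm: same membership degrees, different validity.
r_deg = [0.5, 1.0, 0.0, 0.5]
s_deg = [0.5, 1.0, 1.0, 0.5]
print(validity(r_deg, s_deg, t_min, most))   # 1.0
print(validity(r_deg, s_deg, t_prod, most))  # 0.5

# Effect of outliers: one record out of ten satisfies the restriction.
r_out = [0.0] * 9 + [1.0]
s_out = [0.2] * 9 + [1.0]
print(validity(r_out, s_out, t_min, most))   # 1.0
print(support_share(r_out))                  # 0.1
```

In the second data set the summary is fully valid although it rests on a single record, which is the situation the abstract describes as high validity with low quality.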

Author information

Correspondence to Miroslav Hudec.

Copyright information

© 2017 Springer International Publishing Switzerland

About this paper

Cite this paper

Hudec, M. (2017). Merging Validity and Coverage for Measuring Quality of Data Summaries. In: Kulczycki, P., Kóczy, L., Mesiar, R., Kacprzyk, J. (eds) Information Technology and Computational Physics. CITCEP 2016. Advances in Intelligent Systems and Computing, vol 462. Springer, Cham. https://doi.org/10.1007/978-3-319-44260-0_5

  • DOI: https://doi.org/10.1007/978-3-319-44260-0_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44259-4

  • Online ISBN: 978-3-319-44260-0

  • eBook Packages: Engineering, Engineering (R0)
