Data samples with interval uncertainty are analyzed. The Jaccard measure (index), widely used for comparing sets in many problem areas, is proposed as a measure (functional) of the consistency of interval values and of samples of such values. Background on interval analysis and on classical and complete (Kaucher) interval arithmetic is presented. The necessary concepts and operations on interval quantities are introduced, in particular generalizations of the intersection and union of sets. The Jaccard measure is generalized to data with interval uncertainty and to samples of interval data. The possible relations between intervals are described in detail, from coincidence to incompatibility. Several definitions of the Jaccard measure are given, both symmetric and nonsymmetric with respect to the operands. The connections of the proposed measure with the interval mode and with the results of calculations with tweens are considered. A practical example of finding the information set of an interval problem using the new measure is given. Two areas of application of both symmetric and asymmetric measures are presented: computational processes (characterizing iterative computational processes) and data analysis (characterizing measurement workspaces and classifying data by a set of features).
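As an illustration of the symmetric measure, the following is a minimal sketch and not the authors' implementation. It assumes that the interval Jaccard measure is the width of the generalized intersection (the meet in the inclusion order, which in Kaucher arithmetic may be an improper interval of negative width) divided by the width of the generalized union (the join). The names `jaccard_interval` and `jaccard_sample` are hypothetical, and intervals are represented as plain `(lower, upper)` tuples.

```python
from typing import Sequence, Tuple

# Assumption: a proper interval is a (lower, upper) pair with lower <= upper.
Interval = Tuple[float, float]


def jaccard_sample(sample: Sequence[Interval]) -> float:
    """Assumed interval Jaccard measure of a sample: wid(meet) / wid(join).

    The meet is [max of lower bounds, min of upper bounds]; for incompatible
    intervals it is improper, so its width and the measure are negative.
    """
    if not sample:
        raise ValueError("sample must contain at least one interval")
    meet_width = min(hi for _, hi in sample) - max(lo for lo, _ in sample)
    join_width = max(hi for _, hi in sample) - min(lo for lo, _ in sample)
    if join_width == 0:
        # All intervals are the same single point; the ratio is undefined here.
        raise ValueError("measure is undefined when the join has zero width")
    return meet_width / join_width


def jaccard_interval(x: Interval, y: Interval) -> float:
    """Two-interval case of the same assumed definition."""
    return jaccard_sample([x, y])


if __name__ == "__main__":
    print(jaccard_interval((1.0, 3.0), (1.0, 3.0)))  # coinciding intervals
    print(jaccard_interval((0.0, 2.0), (1.0, 3.0)))  # overlapping intervals
    print(jaccard_interval((0.0, 1.0), (2.0, 3.0)))  # incompatible intervals
    print(jaccard_sample([(0.0, 2.0), (1.0, 3.0), (1.5, 2.5)]))
```

Under this assumed definition the measure lies in [-1, 1]: it equals 1 for coinciding intervals, is positive for overlapping intervals, and becomes negative once the intervals are incompatible, which matches the range of cases from coincidence to incompatibility described in the abstract.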
Change history
04 August 2023
A Correction to this paper has been published: https://doi.org/10.1007/s11018-023-02223-8
Notes
M. Z. Schwartz, working materials. URL: https://github.com/AlexanderBazhenov/Solar-Data (date of access: 11/18/2022).
Additional information
Translated from Izmeritel'naya Tekhnika, No. 12, pp. 15–22, December, 2022.
About this article
Cite this article
Bazhenov, A.N., Telnova, A.Y. Generalization of Jaccard Index for Interval Data Analysis. Meas Tech 65, 882–890 (2023). https://doi.org/10.1007/s11018-023-02180-2
DOI: https://doi.org/10.1007/s11018-023-02180-2