Abstract
Analysis of criteria for the solvability/regularity of problems and of the correctness of algorithms is applied here to the problem of predicting the values of numerical variables. It is shown that partial regularity is a necessary and sufficient condition for the solvability of the corresponding system of classification problems. Cross-validation experiments on several datasets from biomedicine (non-invasive diagnostics of magnesium concentration in blood plasma), bioinformatics (prediction of protein secondary structure), and solid-state physics (prediction of the properties of high-temperature superconductors) demonstrate the effectiveness of the developed methods for generating “synthetic” informative numerical features and for increasing the accuracy of predicting numerical target variables.
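As a rough illustration of how the prediction of a numerical variable can be restated as a system of classification problems, the sketch below uses a simple threshold reduction and a plain k-fold split. This is an assumption made for illustration only, not the construction used in the paper: the function names (`to_classification_system`, `decode_interval`, `k_fold_indices`) and the choice of thresholds are hypothetical.

```python
def to_classification_system(y, thresholds):
    """Return one binary label vector per threshold.

    labels[k][i] is 1 when y[i] exceeds the k-th smallest threshold, else 0.
    (Hypothetical helper; the thresholds are chosen by the user.)
    """
    return [[int(v > t) for v in y] for t in sorted(thresholds)]


def decode_interval(answers, thresholds):
    """Map one object's answers to the threshold questions back to an interval.

    Answers must be monotone: ones for the lower thresholds, then zeros.
    Returns (low, high), where None marks an open end.
    Raises ValueError when the answers contradict each other.
    """
    ts = sorted(thresholds)
    k = sum(answers)
    if list(answers) != [1] * k + [0] * (len(ts) - k):
        raise ValueError("inconsistent answers to the threshold questions")
    low = ts[k - 1] if k > 0 else None
    high = ts[k] if k < len(ts) else None
    return (low, high)


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n objects."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size
```

In this sketch, each binary problem would be trained and scored on the folds produced by `k_fold_indices`, and the per-threshold answers for an object would be combined with `decode_interval` to recover a range for the numerical target.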
ACKNOWLEDGMENTS
We are grateful to Prof. O.A. Gromova for useful discussions on the expert analysis of biomedical data.
Funding
This work was supported by the Russian Foundation for Basic Research, project nos. 19-07-00356, 18-07-01022, 17-07-01419, 16-07-01129, and 18-07-00944.
Ethics declarations
We declare that we have no conflicts of interest related to the preparation and publication of this article.
Additional information
Ivan Yur’evich Torshin. Born 1972. Graduated from the Department of Chemistry, Moscow State University, in 1995. Received candidate’s degrees in chemistry in 1997 and in physics and mathematics in 2011. Currently a senior researcher at Dorodnicyn Computing Centre, an associate professor at Moscow Institute of Physics and Technology, a lecturer at the Faculty of Computational Mathematics and Cybernetics, Moscow State University, a leading scientist at the Russian Branch of the Trace Elements Institute for UNESCO, and a member of the Center of Forecasting and Recognition. Author of 450 publications in peer-reviewed journals in biology, chemistry, medicine, and informatics and of 9 monographs: 5 in Russian and 4 in English (the series “Bioinformatics in the Post-Genomic Era,” Nova Biomedical Publishers, NY, 2006–2009).
Konstantin Vladimirovich Rudakov. Born 1954. Russian mathematician, corresponding member of the Russian Academy of Sciences, Head of the Department of Computational Methods of Forecasting at the Dorodnicyn Computing Centre, Informatics and Control Federal Research Center, Russian Academy of Sciences, and Head of the Intelligent Systems Chair at the Moscow Institute of Physics and Technology.
Translated by I. Nikitin
About this article
Cite this article
Torshin, I.Y., Rudakov, K.V. On the Procedures of Generation of Numerical Features over Partitions of Sets of Objects in the Problem of Predicting Numerical Target Variables. Pattern Recognit. Image Anal. 29, 654–667 (2019). https://doi.org/10.1134/S1054661819040175
DOI: https://doi.org/10.1134/S1054661819040175