A snapshot on nonstandard supervised learning problems: taxonomy, relationships, problem transformations and algorithm adaptations

  • David Charte
  • Francisco Charte
  • Salvador García
  • Francisco Herrera
Review

Abstract

Machine learning is a field that studies how machines can alter and adapt their behavior, improving their actions according to the information they are given. The field is subdivided into multiple areas, the best known of which are supervised learning (e.g., classification and regression) and unsupervised learning (e.g., clustering and association rules). Within supervised learning, most research focuses on well-known standard tasks, such as binary classification, multi-class classification and regression with one dependent variable. However, there are many other, less well-known problems, which we generically call nonstandard supervised learning problems. The literature on them is much sparser, and each study addresses a specific task. As a result, the definitions, relations and applications of these problems are hard to find. The goal of this paper is to provide the reader with a broad view of the distinct variations of nonstandard supervised problems. A comprehensive taxonomy summarizing their traits is proposed. A review of the common approaches followed to tackle them, and of their main applications, is provided as well.

Keywords

Machine learning, Supervised learning, Nonstandard learning

Mathematics Subject Classification

68T05, 68T10

Notes

Acknowledgements

D. Charte is supported by the Spanish Ministry of Science, Innovation and Universities under the FPU National Program (Ref. FPU17/04069). This work has been partially supported by projects TIN2017-89517-P (FEDER Funds) of the Spanish Ministry of Economy and Competitiveness and TIN2015-68454-R of the Spanish Ministry of Science, Innovation and Universities.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
  2. Department of Computer Science, University of Jaén, Jaén, Spain
