Abstract
The rise of cloud computing and cloud storage, supported by newly standardized technologies, has made the analysis of emerging big data both feasible and necessary. For multi-label classification, the classifier chain method is one of the most widely known extensions of the binary relevance method. It achieves higher predictive performance, but it remains a complex process and requires considerable computation time. In this paper, we present an enhanced classifier chain algorithm that uses K-means clustering to determine the order of the binary classifiers. Its distinguishing strategy is to run the K-means algorithm several times to obtain the correlations between labels and then to fix the order of the binary classifiers accordingly. The algorithm ensures that accurate label correlations are passed consistently along the chain, which improves the accuracy of the earlier predictions. Experiments on a sample data set from Reuters-21578 show that the approach is effective in common cases and accurate enough for a preliminary classification that provides a basis for further refined classification.
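The ordering idea can be illustrated with a small sketch. The abstract does not give the paper's exact procedure, so everything below is an assumption made for illustration: the helper names (`sqdist`, `kmeans`, `label_order`, `chain_training_sets`), the choice to cluster the columns of the label matrix, the co-clustering count used as a correlation score, and the greedy ordering rule. Only the last function follows the standard classifier chain construction, in which each binary classifier sees the original features plus the labels that precede it in the chain.

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (assignments, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [min(range(k), key=lambda c: sqdist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids

def label_order(Y, k=2, runs=5):
    """Hypothetical ordering rule: run k-means several times on the label
    columns, count how often two labels share a cluster, then chain labels
    greedily so that each one follows the label it co-clusters with most."""
    n_labels = len(Y[0])
    columns = [[float(row[j]) for row in Y] for j in range(n_labels)]
    co = [[0] * n_labels for _ in range(n_labels)]
    for seed in range(runs):
        assign, _ = kmeans(columns, k, seed=seed)
        for a in range(n_labels):
            for b in range(n_labels):
                if a != b and assign[a] == assign[b]:
                    co[a][b] += 1
    order = [max(range(n_labels), key=lambda j: sum(co[j]))]
    while len(order) < n_labels:
        last = order[-1]
        rest = [j for j in range(n_labels) if j not in order]
        order.append(max(rest, key=lambda j: co[last][j]))
    return order

def chain_training_sets(X, Y, order):
    """Standard classifier chain inputs: the classifier at each chain
    position gets the features plus the true labels of earlier positions."""
    sets = []
    for pos, j in enumerate(order):
        prev = order[:pos]
        augmented = [list(x) + [row[p] for p in prev] for x, row in zip(X, Y)]
        target = [row[j] for row in Y]
        sets.append((j, augmented, target))
    return sets

# Toy data: labels 0 and 2 always co-occur, as do labels 1 and 3.
X = [[0.1], [0.2], [0.8], [0.9], [0.3]]
Y = [[1, 0, 1, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 1, 0, 1], [1, 0, 1, 0]]
order = label_order(Y)
print(order)
```

On this toy data the correlated labels end up adjacent in the chain, so the second classifier receives the label it depends on most as an extra input. Any binary learner could then be fitted on each `(augmented, target)` pair; that choice is left open here.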
Cite this article
Yu, Z., Hao, H., Zhang, W. et al. A Classifier Chain Algorithm with K-means for Multi-label Classification on Clouds. J Sign Process Syst 86, 337–346 (2017). https://doi.org/10.1007/s11265-016-1137-2
DOI: https://doi.org/10.1007/s11265-016-1137-2