Abstract
Prediction methods that combine clustering and classification techniques can produce more accurate results than either technique alone, particularly for large datasets. In this paper, we propose a hybrid prediction method that combines weighted k-means clustering with linear regression. Weighted k-means is first used to cluster the dataset; linear regression is then performed on each cluster to build the final predictors. The proposed method has been applied to the problem of municipal waste prediction and evaluated on a dataset of 63,000 records. The results show that it outperforms linear regression and k-means clustering applied on their own in terms of prediction accuracy and robustness. The prediction model is integrated into a decision support system for strategic and operational planning of waste and recycling services at the City of Calgary, Canada. A potential use of the prediction model is to improve the utilization of resources such as personnel and vehicles.
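The two-stage idea described in the abstract (cluster first, then fit one linear regression per cluster) can be illustrated with a minimal sketch. The abstract does not specify how the k-means weights are defined, so this sketch assumes per-feature weights applied inside the squared Euclidean distance; the function names (`weighted_kmeans`, `fit_hybrid`, `predict_hybrid`), the synthetic data, and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np


def assign(X, centers, w):
    """Label each row of X with the index of its nearest center (weighted distance)."""
    d = (((X[:, None, :] - centers[None, :, :]) ** 2) * w).sum(axis=2)
    return d.argmin(axis=1)


def weighted_kmeans(X, k, w, n_iter=100, seed=0):
    """Lloyd-style k-means; w holds assumed per-feature weights for the distance."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = assign(X, centers, w)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bd]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def fit_hybrid(X, y, k, feature_weights, seed=0):
    """Cluster the data, then fit one linear regression per cluster."""
    w = np.asarray(feature_weights, dtype=float)
    centers = weighted_kmeans(X, k, w, seed=seed)
    labels = assign(X, centers, w)
    global_beta = fit_linear(X, y)  # fallback for empty clusters
    coefs = np.array([
        fit_linear(X[labels == j], y[labels == j]) if np.any(labels == j)
        else global_beta
        for j in range(k)
    ])
    return centers, w, coefs


def predict_hybrid(model, X):
    """Route each row to its nearest cluster and apply that cluster's regression."""
    centers, w, coefs = model
    labels = assign(X, centers, w)
    A = np.column_stack([np.ones(len(X)), X])
    return (A * coefs[labels]).sum(axis=1)


# Synthetic demo: two groups whose target depends on x1 in different ways.
rng = np.random.default_rng(42)
n = 200
x0 = np.concatenate([rng.normal(0, 1, n), rng.normal(10, 1, n)])
x1 = rng.uniform(0, 1, 2 * n)
X = np.column_stack([x0, x1])
group = np.repeat([0, 1], n)
y = np.where(group == 0, 1 + 2 * x1, 5 - 3 * x1) + rng.normal(0, 0.05, 2 * n)

model = fit_hybrid(X, y, k=2, feature_weights=[1.0, 1.0])
pred_hybrid = predict_hybrid(model, X)
pred_global = np.column_stack([np.ones(len(X)), X]) @ fit_linear(X, y)

mse_hybrid = float(np.mean((y - pred_hybrid) ** 2))
mse_global = float(np.mean((y - pred_global) ** 2))
print("hybrid MSE:", mse_hybrid, "global MSE:", mse_global)
```

On the training data, the per-cluster fit cannot have a larger squared error than a single global regression, because the global coefficients are always an admissible choice within every cluster. Whether the gain carries over to unseen data depends on the dataset and would need to be checked with held-out data or cross-validation.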
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Livani, E., Nguyen, R., Denzinger, J., Ruhe, G., Banack, S. (2013). A Hybrid Machine Learning Method and Its Application in Municipal Waste Prediction. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2013. Lecture Notes in Computer Science, vol 7987. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39736-3_13
DOI: https://doi.org/10.1007/978-3-642-39736-3_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39735-6
Online ISBN: 978-3-642-39736-3
eBook Packages: Computer Science (R0)