Abstract
The future Internet is expected to connect billions of people, things, and services, with the potential to deliver a new set of applications by deriving insights from the data generated by these diverse sources. This highly interconnected global network brings new challenges in analysing and making sense of data. Machine learning is therefore expected to be a crucial technology for making sense of data and for improving business and decision making, and in doing so it has the potential to solve a wide range of problems in health care, telecommunications, urban computing, and other domains. Machine learning algorithms learn how to perform a task by generalizing from a sample of examples. This paradigm differs fundamentally from traditional programming, in which a developer writes a program that processes data to produce an output. However, choosing a suitable machine learning algorithm for a particular application requires a substantial amount of time and effort, which is hard to undertake even with excellent research papers and textbooks. To reduce this time and effort, this paper introduces the TCDC (train, compare, decide, and change) approach, which can be thought of as a ‘Machine Learning as a Service’ approach, to help machine learning researchers and practitioners choose the model that achieves the best trade-off between accuracy and interpretability, computational complexity, and ease of implementation. The paper includes the results of testing and evaluating recommenders based on the TCDC approach (in comparison with the traditional default approach) on 12 open-source datasets drawn from diverse domains including health care, agriculture, aerodynamics, and others.
Our results indicate that the proposed approach selects the best model in terms of predictive accuracy in 62.5% of the regression tests performed and 75% of the classification tests.
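The abstract names the TCDC steps but does not describe how they are carried out, so the following is only a minimal sketch of the general idea of training several candidate models, comparing them on held-out data, and deciding on the most accurate one. It is not the authors' implementation. The toy dataset, the three candidate classifiers, the fold scheme, and all names (`make_data`, `cv_accuracy`, `candidates`) are assumptions made for illustration, and the "change" step is left out because the abstract gives no detail on it.

```python
import random


def make_data(n=200, seed=0):
    """Hypothetical toy data: class 1 is shifted by +2 on both features."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = (rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0))
        data.append((x, label))
    return data


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


class MajorityClass:
    """Baseline: always predict the most common training label."""

    def fit(self, rows):
        labels = [y for _, y in rows]
        self.label = max(set(labels), key=labels.count)
        return self

    def predict(self, x):
        return self.label


class NearestCentroid:
    """Predict the class whose mean training point is closest."""

    def fit(self, rows):
        self.centroids = {}
        for label in {y for _, y in rows}:
            pts = [x for x, y in rows if y == label]
            self.centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))


class OneNearestNeighbour:
    """Predict the label of the single closest training point."""

    def fit(self, rows):
        self.rows = list(rows)
        return self

    def predict(self, x):
        return min(self.rows, key=lambda row: sq_dist(x, row[0]))[1]


def cv_accuracy(model_cls, data, k=5):
    """Train: fit on k-1 folds, score on the remaining fold, average over folds."""
    scores = []
    for i in range(k):
        test = data[i::k]  # fold i holds rows i, i+k, i+2k, ...
        train = [row for j, row in enumerate(data) if j % k != i]
        model = model_cls().fit(train)
        hits = sum(model.predict(x) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / k


candidates = {
    "majority_class": MajorityClass,
    "nearest_centroid": NearestCentroid,
    "one_nearest_neighbour": OneNearestNeighbour,
}

# Compare: one cross-validated accuracy per candidate model.
scores = {name: cv_accuracy(cls, make_data()) for name, cls in candidates.items()}

# Decide: keep the candidate with the highest mean accuracy.
best = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
print("selected:", best)
```

This sketch ranks candidates on accuracy alone. The trade-off the abstract describes also weighs interpretability, computational complexity, and ease of implementation, which would need additional criteria in the decision step.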
Acknowledgments
This work was partially supported by the EC project CogNet, 671625 (H2020-ICT-2014-2, Research and Innovation action) and in part supported by the Science Foundation Ireland ADAPT centre (Grant 13/RC/2106).
Cite this article
Assem, H., Xu, L., Buda, T.S. et al. Machine learning as a service for enabling Internet of Things and People. Pers Ubiquit Comput 20, 899–914 (2016). https://doi.org/10.1007/s00779-016-0963-3
DOI: https://doi.org/10.1007/s00779-016-0963-3