Configurable Parallel Induction Machines

Ionkina, Karina; Hancock, Monte; Kannan, Raman

doi:10.1007/978-3-030-78114-9_28

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12776)

Included in the following conference series:

International Conference on Human-Computer Interaction

Abstract

Machine learning practice offers significant opportunities for parallel computing and for sound software engineering. More often than not, practitioners write dataset-specific scripts, and learners focus on building and refining a particular model. Focusing on particular models is inconsistent with the No Free Lunch (NFL) theorem, a fundamental theorem in machine learning, and neglecting time-honored software engineering principles is inefficient. In this paper, we present our implementation of a multiple-instruction, single-data (MISD) machine consistent with the NFL theorem, the problems we encountered, and our approach to solving them.
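
As a rough illustration of the MISD idea described in the abstract, the sketch below applies several different learners (multiple instruction streams) to one shared dataset (single data) in parallel and compares their accuracy, rather than committing to a single model up front. It is not the authors' implementation: the toy data, the three simple learners, and names such as `run_misd` and `fit_and_score` are assumptions made for illustration only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: (features, label) pairs shared by every learner.
TRAIN = [((1.0, 1.2), 0), ((0.8, 1.0), 0), ((1.1, 0.9), 0), ((0.9, 0.8), 0),
         ((3.0, 3.1), 1), ((3.2, 2.9), 1), ((2.9, 3.3), 1)]
TEST = [((1.0, 1.0), 0), ((3.1, 3.0), 1), ((0.9, 1.1), 0), ((3.0, 2.8), 1)]


def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def majority_class(train):
    """Baseline: always predict the most frequent training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def nearest_centroid(train):
    """Predict the label whose mean feature vector is closest."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append(x)
    centroids = {y: tuple(sum(col) / len(xs) for col in zip(*xs))
                 for y, xs in groups.items()}
    return lambda x: min(centroids, key=lambda y: dist2(x, centroids[y]))


def one_nn(train):
    """Predict the label of the single closest training point."""
    return lambda x: min(train, key=lambda pair: dist2(x, pair[0]))[1]


# The "multiple instructions": each entry is a different induction algorithm.
LEARNERS = {
    "majority_class": majority_class,
    "nearest_centroid": nearest_centroid,
    "one_nn": one_nn,
}


def fit_and_score(name, fit, train, test):
    """Train one learner on the shared data and return its test accuracy."""
    model = fit(train)
    hits = sum(model(x) == y for x, y in test)
    return name, hits / len(test)


def run_misd(learners, train, test):
    """Run every learner concurrently against the same train/test split."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fit_and_score, name, fit, train, test)
                   for name, fit in learners.items()]
        return dict(f.result() for f in futures)


if __name__ == "__main__":
    scores = run_misd(LEARNERS, TRAIN, TEST)
    for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {acc:.2f}")
```

Threads are used here only to keep the sketch short. For CPU-bound learners, a process pool or a cluster backend would likely be the more natural choice, and the learners would then need to be picklable.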

Acknowledgements

Thanks to the IBM Power Systems Academic Initiative (IBM PSAI) for their generous support for all my courses.

Author information

Authors and Affiliations

Hunter College, 695 Park Avenue, New York, NY, 10065, USA
Karina Ionkina & Monte Hancock
Tandon School of Engineering, NYU, Brooklyn, NY, 11201, USA
Monte Hancock & Raman Kannan

Corresponding author

Correspondence to Raman Kannan.

Editor information

Editors and Affiliations

Soar Technology Inc., Orlando, FL, USA
Dylan D. Schmorrow
Design Interactive, Inc., Orlando, FL, USA
Cali M. Fidopiastis

Cite this paper

Ionkina, K., Hancock, M., Kannan, R. (2021). Configurable Parallel Induction Machines. In: Schmorrow, D.D., Fidopiastis, C.M. (eds) Augmented Cognition. HCII 2021. Lecture Notes in Computer Science, vol 12776. Springer, Cham. https://doi.org/10.1007/978-3-030-78114-9_28

DOI: https://doi.org/10.1007/978-3-030-78114-9_28
Published: 03 July 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78113-2
Online ISBN: 978-3-030-78114-9
eBook Packages: Computer Science, Computer Science (R0)
