Abstract
Data science draws on all of mathematics and computer science. In this chapter, we briefly review the most fundamental concepts in data science: graph search algorithms; statistical methods, especially principal component analysis (PCA); algorithms and data structures; and data mining and pattern recognition. The chapter provides an overview of machine learning in relation to other mathematical tools. We first introduce graphs and graph algorithms, which serve as the foundation of a branch of artificial intelligence called search. The other three branches of artificial intelligence are learning, planning, and knowledge representation. Classification, which is also related to machine learning, is at the center of pattern recognition, which we discuss in Chap. 4. Statistical methods, especially PCA and regression, are also discussed. Finally, we introduce concepts of data structures and algorithm design, including online search and matching.
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Chen, L.M. (2015). Overview of Basic Methods for Data Science. In: Mathematical Problems in Data Science. Springer, Cham. https://doi.org/10.1007/978-3-319-25127-1_2
DOI: https://doi.org/10.1007/978-3-319-25127-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25125-7
Online ISBN: 978-3-319-25127-1
eBook Packages: Computer Science (R0)