Overview of Basic Methods for Data Science

Mathematical Problems in Data Science

Abstract

Data science draws on many areas of mathematics and computer science. In this chapter, we give a brief review of the most fundamental concepts in data science: graph search algorithms; statistical methods, especially principal component analysis (PCA); algorithms and data structures; and data mining and pattern recognition. The chapter also provides an overview of machine learning in relation to other mathematical tools. We first introduce graphs and graph algorithms, which serve as the foundation of a branch of artificial intelligence called search. The other three branches of artificial intelligence are learning, planning, and knowledge representation. Classification, also related to machine learning, is at the center of pattern recognition, which we will discuss in Chap. 4. Statistical methods, especially PCA and regression, will also be discussed. Finally, we introduce concepts of data structures and algorithm design, including online search and matching.

Author information

Corresponding author

Correspondence to Li M. Chen.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Chen, L.M. (2015). Overview of Basic Methods for Data Science. In: Mathematical Problems in Data Science. Springer, Cham. https://doi.org/10.1007/978-3-319-25127-1_2

  • DOI: https://doi.org/10.1007/978-3-319-25127-1_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25125-7

  • Online ISBN: 978-3-319-25127-1

  • eBook Packages: Computer Science, Computer Science (R0)
