
Data Science Applications

Communication Principles for Data Science

Part of the book series: Signals and Communication Technology (SCT)


Abstract

In Parts I and II, we considered a communication problem in which the goal is to deliver a sequence of information bits from a transmitter to a receiver over a channel. Two prominent channels were examined: (i) the additive white Gaussian noise (AWGN) channel (in Part I); and (ii) the wireline ISI channel (in Part II). Throughout, we repeated the same exercise: given an encoding strategy (e.g., sequential coding, repetition coding), we (a) derived the optimal receiver for decoding the bit string, and (b) analyzed the decoding error probability attained by that optimal receiver.

Communication principles are prevalent in data science.
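To make the recurring exercise above concrete, here is a minimal, self-contained simulation sketch (not taken from the book): repetition coding of BPSK symbols over an AWGN channel, decoded with the optimal maximum-likelihood receiver, together with an empirical estimate of the decoding error probability. The function name, the BPSK mapping, and the SNR convention are illustrative assumptions of this sketch.

```python
# Minimal sketch (assumptions noted above): repetition coding over AWGN
# with maximum-likelihood decoding and an empirical error-probability estimate.
import numpy as np

rng = np.random.default_rng(0)

def simulate_repetition(num_bits=10_000, reps=5, snr_db=0.0):
    """Send each bit `reps` times as +/-1 symbols over an AWGN channel and
    decode with the ML rule (sign of the summed received samples)."""
    snr = 10 ** (snr_db / 10)            # per-symbol SNR (assumed convention)
    sigma = np.sqrt(1 / snr)             # noise std for unit-energy symbols
    bits = rng.integers(0, 2, num_bits)
    symbols = 2 * bits - 1               # BPSK mapping: 0 -> -1, 1 -> +1
    tx = np.repeat(symbols, reps)        # repetition encoding
    rx = tx + sigma * rng.standard_normal(tx.size)  # AWGN channel
    # ML decoding for equiprobable bits: sum the noisy copies and take the sign.
    decisions = (rx.reshape(num_bits, reps).sum(axis=1) > 0).astype(int)
    return np.mean(decisions != bits)    # empirical decoding error probability

for r in (1, 3, 5):
    print(f"reps={r}: P(error) ~ {simulate_repetition(reps=r):.4f}")
```

As expected, increasing the number of repetitions trades rate for reliability: the empirical error probability drops as `reps` grows.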


Notes

  1. The upcoming sections are adapted from the author’s previous publication (Suh, 2022), customized to suit the specific focus and logical flow of this book.

  2. In machine learning, such a quantity is called a feature: a key component that captures the characteristics of the data well.

  3. The word logistic comes from a Greek word meaning slow growth, like logarithmic growth.

  4. Sigmoid means resembling the lower-case Greek letter sigma, i.e., S-shaped.

  5. We say that a symmetric matrix \(Q = Q^T \in {\mathbb {R}}^{d \times d}\) is positive semi-definite if \(v^T Q v \ge 0, \; \forall v \in {\mathbb {R}}^d\); equivalently, all the eigenvalues of \(Q\) are non-negative. This is denoted simply by \(Q \succeq 0\). (A small numerical check is sketched after these notes.)
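The positive semi-definiteness condition in Note 5 can be verified numerically through its eigenvalue characterization. Below is a small sketch (an illustration added here, not from the book); the helper name `is_psd` and the tolerance are assumptions.

```python
# Small numerical check of positive semi-definiteness via eigenvalues.
import numpy as np

def is_psd(Q, tol=1e-10):
    """Return True if the symmetric matrix Q satisfies v^T Q v >= 0 for all v,
    checked equivalently via non-negativity of its eigenvalues."""
    Q = np.asarray(Q, dtype=float)
    if not np.allclose(Q, Q.T):
        raise ValueError("Q must be symmetric")
    # eigvalsh exploits symmetry and returns real eigenvalues.
    return bool(np.all(np.linalg.eigvalsh(Q) >= -tol))

print(is_psd([[2, -1], [-1, 2]]))  # True: eigenvalues are 1 and 3
print(is_psd([[1, 2], [2, 1]]))    # False: eigenvalues are -1 and 3
```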

References

  • Abbe, E. (2017). Community detection and stochastic block models: Recent developments. The Journal of Machine Learning Research, 18(1), 6446–6531.

  • Bansal, N., Blum, A., & Chawla, S. (2004). Correlation clustering. Machine Learning, 56(1), 89–113.

  • Bertsekas, D., & Tsitsiklis, J. N. (2008). Introduction to probability (Vol. 1). Athena Scientific.

  • Boyd, S. P., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.

  • Browning, S. R., & Browning, B. L. (2011). Haplotype phasing: Existing methods and new developments. Nature Reviews Genetics, 12(10), 703–714.

  • Calafiore, G. C., & El Ghaoui, L. (2014). Optimization models. Cambridge: Cambridge University Press.

  • Chartrand, G. (1977). Introductory graph theory. Courier Corporation.

  • Chen, Y., Kamath, G., Suh, C., & Tse, D. (2016). Community recovery in graphs with locality. In International Conference on Machine Learning (pp. 689–698).

  • Chen, J., & Yuan, B. (2006). Detecting functional modules in the yeast protein-protein interaction network. Bioinformatics, 22(18), 2283–2290.

  • Cover, T. M., & Thomas, J. A. (2006). Elements of information theory. Wiley-Interscience.

  • Das, S., & Vikalo, H. (2015). SDhaP: Haplotype assembly for diploids and polyploids via semi-definite programming. BMC Genomics, 16(1), 1–16.

  • Erdős, P., Rényi, A., et al. (1960). On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Sciences, 5(1), 17–60.

  • Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3–5), 75–174.

  • Freedman, D., Pisani, R., & Purves, R. (2007). Statistics. W.W. Norton & Co.

  • Gallager, R. G. (2013). Stochastic processes: Theory for applications. Cambridge: Cambridge University Press.

  • Garnier, J.-G., & Quetelet, A. (1838). Correspondance mathématique et physique (Vol. 10). Impr. d’H. Vandekerckhove.

  • Girvan, M., & Newman, M. E. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826.

  • Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (pp. 315–323).

  • Golub, G. H., & Van Loan, C. F. (2013). Matrix computations. JHU Press.

  • Hamming, R. W. (1950). Error detecting and error correcting codes. The Bell System Technical Journal, 29(2), 147–160.

  • Hinton, G., Srivastava, N., & Swersky, K. (2012). Neural networks for machine learning, Lecture 6a: Overview of mini-batch gradient descent.

  • Ivakhnenko, A. G. (1971). Polynomial theory of complex systems. IEEE Transactions on Systems, Man, and Cybernetics, (4), 364–378.

  • Jalali, A., Chen, Y., Sanghavi, S., & Xu, H. (2011). Clustering partially observed graphs via convex optimization. In ICML.

  • Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980.

  • Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. MIT Press.

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.

  • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.

  • Lemaréchal, C. (2012). Cauchy and the gradient method. Documenta Mathematica Extra, 251–254.

  • Marsden, J. E., & Tromba, A. (2003). Vector calculus. Macmillan.

  • Meta. (2022). Investor earnings report for 1Q 2022.

  • News, B. (2016). Artificial intelligence: Google’s AlphaGo beats Go master Lee Se-dol.

  • Nielsen, R., Paul, J. S., Albrechtsen, A., & Song, Y. S. (2011). Genotype and SNP calling from next-generation sequencing data. Nature Reviews Genetics, 12(6), 443–451.

  • Polyak, B. T. (1964). Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4(5), 1–17.

  • Rabiner, L., & Juang, B. (1986). An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1), 4–16.

  • Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386.

  • Samuel, A. L. (1967). Some studies in machine learning using the game of checkers. II: Recent progress. IBM Journal of Research and Development, 11(6), 601–617.

  • Shannon, C. E. (2001). A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communications Review, 5(1), 3–55.

  • Shen, J., Tang, T., & Wang, L.-L. (2011). Spectral methods: Algorithms, analysis and applications (Vol. 41). Springer Science & Business Media.

  • Si, H., Vikalo, H., & Vishwanath, S. (2014). Haplotype assembly: An information theoretic view. In 2014 IEEE Information Theory Workshop (ITW 2014) (pp. 182–186).

  • Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.

  • Suh, C. (2022). Convex optimization for machine learning. Now Publishers.


Author information

Correspondence to Changho Suh.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Suh, C. (2023). Data Science Applications. In: Communication Principles for Data Science. Signals and Communication Technology. Springer, Singapore. https://doi.org/10.1007/978-981-19-8008-4_3


  • DOI: https://doi.org/10.1007/978-981-19-8008-4_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-8007-7

  • Online ISBN: 978-981-19-8008-4

  • eBook Packages: Computer Science, Computer Science (R0)
