
Data Science Applications

Communication Principles for Data Science

Part of the book series: Signals and Communication Technology (SCT)


Abstract

In Parts I and II, we considered a communication problem in which the goal is to deliver a sequence of information bits from a transmitter to a receiver over a channel. Two prominent channels were examined: (i) the additive white Gaussian noise (AWGN) channel (in Part I); and (ii) the wireline ISI channel (in Part II). Throughout, we repeated the same exercise: given an encoding strategy (e.g., sequential coding, repetition coding), we (a) derived the optimal receiver for decoding the bit string, and (b) analyzed the decoding error probability attained by that optimal receiver.

Communication principles are prevalent in data science.
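To make the recurring exercise above concrete, here is a minimal, self-contained simulation sketch (not taken from the book): repetition coding of BPSK symbols over an AWGN channel, decoded with the optimal maximum-likelihood receiver, together with an empirical estimate of the decoding error probability. The function name, the BPSK mapping, and the SNR convention are illustrative assumptions of this sketch.

```python
# Minimal sketch (assumptions noted above): repetition coding over AWGN
# with maximum-likelihood decoding and an empirical error-probability estimate.
import numpy as np

rng = np.random.default_rng(0)

def simulate_repetition(num_bits=10_000, reps=5, snr_db=0.0):
    """Send each bit `reps` times as +/-1 symbols over an AWGN channel and
    decode with the ML rule (sign of the summed received samples)."""
    snr = 10 ** (snr_db / 10)            # per-symbol SNR (assumed convention)
    sigma = np.sqrt(1 / snr)             # noise std for unit-energy symbols
    bits = rng.integers(0, 2, num_bits)
    symbols = 2 * bits - 1               # BPSK mapping: 0 -> -1, 1 -> +1
    tx = np.repeat(symbols, reps)        # repetition encoding
    rx = tx + sigma * rng.standard_normal(tx.size)  # AWGN channel
    # ML decoding for equiprobable bits: sum the noisy copies and take the sign.
    decisions = (rx.reshape(num_bits, reps).sum(axis=1) > 0).astype(int)
    return np.mean(decisions != bits)    # empirical decoding error probability

for r in (1, 3, 5):
    print(f"reps={r}: P(error) ~ {simulate_repetition(reps=r):.4f}")
```

As expected, increasing the number of repetitions trades rate for reliability: the empirical error probability drops as `reps` grows.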


Notes

  1. The upcoming sections are adapted from the author’s previous publication (Suh, 2022), customized to suit the specific focus and logical flow of this book.

  2. In machine learning, such a quantity is called a feature: a key component that captures the characteristics of the data well.

  3. The word logistic comes from a Greek word meaning slow growth, like logarithmic growth.

  4. Sigmoid means resembling the lower-case Greek letter sigma, i.e., S-shaped.

  5. We say that a symmetric matrix \(Q = Q^T \in {\mathbb {R}}^{d \times d}\) is positive semi-definite if \(v^T Q v \ge 0, \; \forall v \in {\mathbb {R}}^d\); equivalently, all the eigenvalues of \(Q\) are non-negative. This is denoted simply by \(Q \succeq 0\). (A small numerical check is sketched after these notes.)
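The positive semi-definiteness condition in Note 5 can be verified numerically through its eigenvalue characterization. Below is a small sketch (an illustration added here, not from the book); the helper name `is_psd` and the tolerance are assumptions.

```python
# Small numerical check of positive semi-definiteness via eigenvalues.
import numpy as np

def is_psd(Q, tol=1e-10):
    """Return True if the symmetric matrix Q satisfies v^T Q v >= 0 for all v,
    checked equivalently via non-negativity of its eigenvalues."""
    Q = np.asarray(Q, dtype=float)
    if not np.allclose(Q, Q.T):
        raise ValueError("Q must be symmetric")
    # eigvalsh exploits symmetry and returns real eigenvalues.
    return bool(np.all(np.linalg.eigvalsh(Q) >= -tol))

print(is_psd([[2, -1], [-1, 2]]))  # True: eigenvalues are 1 and 3
print(is_psd([[1, 2], [2, 1]]))    # False: eigenvalues are -1 and 3
```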

References

  • Abbe, E. (2017). Community detection and stochastic block models: Recent developments. The Journal of Machine Learning Research, 18(1), 6446–6531.

  • Bansal, N., Blum, A., & Chawla, S. (2004). Correlation clustering. Machine Learning, 56(1), 89–113.

  • Bertsekas, D., & Tsitsiklis, J. N. (2008). Introduction to probability (Vol. 1). Athena Scientific.

  • Boyd, S. P., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.

  • Browning, S. R., & Browning, B. L. (2011). Haplotype phasing: Existing methods and new developments. Nature Reviews Genetics, 12(10), 703–714.

  • Calafiore, G. C., & El Ghaoui, L. (2014). Optimization models. Cambridge: Cambridge University Press.

  • Chartrand, G. (1977). Introductory graph theory. Courier Corporation.

  • Chen, Y., Kamath, G., Suh, C., & Tse, D. (2016). Community recovery in graphs with locality. In International Conference on Machine Learning (pp. 689–698).

  • Chen, J., & Yuan, B. (2006). Detecting functional modules in the yeast protein-protein interaction network. Bioinformatics, 22(18), 2283–2290.

  • Cover, T. M., & Thomas, J. A. (2006). Elements of information theory. Wiley-Interscience.

  • Das, S., & Vikalo, H. (2015). SDhaP: Haplotype assembly for diploids and polyploids via semi-definite programming. BMC Genomics, 16(1), 1–16.

  • Erdős, P., Rényi, A., et al. (1960). On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Sciences, 5(1), 17–60.

  • Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3–5), 75–174.

  • Freedman, D., Pisani, R., & Purves, R. (2007). Statistics. W.W. Norton & Co.

  • Gallager, R. G. (2013). Stochastic processes: Theory for applications. Cambridge: Cambridge University Press.

  • Garnier, J.-G., & Quetelet, A. (1838). Correspondance mathématique et physique (Vol. 10). Impr. d’H. Vandekerckhove.

  • Girvan, M., & Newman, M. E. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826.

  • Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (pp. 315–323).

  • Golub, G. H., & Van Loan, C. F. (2013). Matrix computations. JHU Press.

  • Hamming, R. W. (1950). Error detecting and error correcting codes. The Bell System Technical Journal, 29(2), 147–160.

  • Hinton, G., Srivastava, N., & Swersky, K. (2012). Neural networks for machine learning, Lecture 6a: Overview of mini-batch gradient descent.

  • Ivakhnenko, A. G. (1971). Polynomial theory of complex systems. IEEE Transactions on Systems, Man, and Cybernetics, (4), 364–378.

  • Jalali, A., Chen, Y., Sanghavi, S., & Xu, H. (2011). Clustering partially observed graphs via convex optimization. In ICML.

  • Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980.

  • Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. MIT Press.

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.

  • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.

  • Lemaréchal, C. (2012). Cauchy and the gradient method. Documenta Mathematica Extra, 251–254.

  • Marsden, J. E., & Tromba, A. (2003). Vector calculus. Macmillan.

  • Meta. (2022). Investor earnings report for 1Q 2022.

  • News, B. (2016). Artificial intelligence: Google’s AlphaGo beats Go master Lee Se-dol.

  • Nielsen, R., Paul, J. S., Albrechtsen, A., & Song, Y. S. (2011). Genotype and SNP calling from next-generation sequencing data. Nature Reviews Genetics, 12(6), 443–451.

  • Polyak, B. T. (1964). Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4(5), 1–17.

  • Rabiner, L., & Juang, B. (1986). An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1), 4–16.

  • Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386.

  • Samuel, A. L. (1967). Some studies in machine learning using the game of checkers. II: Recent progress. IBM Journal of Research and Development, 11(6), 601–617.

  • Shannon, C. E. (2001). A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communications Review, 5(1), 3–55.

  • Shen, J., Tang, T., & Wang, L.-L. (2011). Spectral methods: Algorithms, analysis and applications (Vol. 41). Springer Science & Business Media.

  • Si, H., Vikalo, H., & Vishwanath, S. (2014). Haplotype assembly: An information theoretic view. In 2014 IEEE Information Theory Workshop (ITW 2014) (pp. 182–186).

  • Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.

  • Suh, C. (2022). Convex optimization for machine learning. Now Publishers.


Author information

Correspondence to Changho Suh.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Suh, C. (2023). Data Science Applications. In: Communication Principles for Data Science. Signals and Communication Technology. Springer, Singapore. https://doi.org/10.1007/978-981-19-8008-4_3


  • DOI: https://doi.org/10.1007/978-981-19-8008-4_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-8007-7

  • Online ISBN: 978-981-19-8008-4

  • eBook Packages: Computer Science, Computer Science (R0)
