Abstract
In Parts I and II, we considered a communication problem whose goal is to deliver a sequence of information bits from a transmitter to a receiver over a channel. Two prominent channels were considered: (i) the additive white Gaussian noise (AWGN) channel (in Part I); and (ii) the wireline ISI channel (in Part II). Throughout, we repeated the same procedure: given an encoding strategy (e.g., sequential coding, repetition coding), we (a) derived the optimal receiver for decoding the bit string, and (b) analyzed the decoding error probability achieved by that receiver.
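The two-step procedure above can be sketched numerically. The following is a minimal simulation, assuming BPSK-style repetition coding over an AWGN channel (the repetition length, noise level, and trial count are illustrative choices, not values from the text); for equal-energy repetitions, the optimal maximum-likelihood receiver sums the received samples and decides by the sign.

```python
# Repetition coding over AWGN: simulate the optimal receiver and compare
# its empirical error rate to the Gaussian tail bound Q(sqrt(N) * A / sigma).
import math
import random

def transmit_and_decode(bit, n_reps, sigma, rng):
    """Send one bit as n_reps copies of +/-1 through AWGN; decode by sign of sum."""
    symbol = 1.0 if bit == 1 else -1.0
    received = [symbol + rng.gauss(0.0, sigma) for _ in range(n_reps)]
    return 1 if sum(received) >= 0 else 0

def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

rng = random.Random(0)
n_reps, sigma, trials = 5, 1.0, 20000
errors = sum(transmit_and_decode(1, n_reps, sigma, rng) != 1 for _ in range(trials))
empirical = errors / trials
theoretical = q_function(math.sqrt(n_reps) / sigma)  # amplitude A = 1
print(f"empirical error rate: {empirical:.4f}")
print(f"theoretical Q-value:  {theoretical:.4f}")
```

By symmetry of the channel, simulating only the bit `1` suffices; the empirical rate should track the theoretical value closely for a large number of trials.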
Communication principles are prevalent in data science.
Notes
1. The upcoming sections of this book are updated from the author's previous publication (Suh, 2022), but have been customized to suit the specific focus and logical flow of this book.
2. In machine learning, such a quantity is called a feature: a key component that describes the characteristics of the data well.
3. The word logistic comes from a Greek word meaning slow growth, like logarithmic growth.
4. Sigmoid means resembling the lowercase Greek letter sigma: S-shaped.
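As a quick illustration of this S-shape, the following sketch evaluates the standard logistic function \(\sigma(x) = 1/(1 + e^{-x})\) at a few points (the sample points are arbitrary): the output is monotone, bounded in (0, 1), and centered at 0.5.

```python
# The logistic sigmoid: a smooth, monotone, S-shaped curve bounded in (0, 1).
import math

def sigmoid(x):
    """Logistic sigmoid sigma(x) = 1 / (1 + e^{-x}); sigma(0) = 0.5."""
    return 1.0 / (1.0 + math.exp(-x))

for x in (-6, -2, 0, 2, 6):
    print(f"sigma({x:+d}) = {sigmoid(x):.4f}")
```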
5. We say that a symmetric matrix \(Q = Q^T \in {\mathbb {R}}^{d \times d}\) is positive semi-definite if \(v^T Q v \ge 0, \; \forall v \in {\mathbb {R}}^d\), i.e., all the eigenvalues of Q are non-negative. This is denoted \(Q \succeq 0\).
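A quick numerical check of this definition: any product of the form \(Q = A^T A\) is symmetric and positive semi-definite, since \(v^T Q v = \|Av\|^2 \ge 0\). The matrix and test vectors below are arbitrary illustrative choices.

```python
# Verify v^T Q v >= 0 for Q = A^T A over many random vectors v.
import random

def quadratic_form(Q, v):
    """Compute v^T Q v for a square matrix Q and vector v."""
    Qv = [sum(Q[i][j] * v[j] for j in range(len(v))) for i in range(len(Q))]
    return sum(v[i] * Qv[i] for i in range(len(v)))

# Q = A^T A for an arbitrary 2x2 matrix A; such a product is always PSD.
A = [[1.0, 2.0], [3.0, -1.0]]
Q = [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

rng = random.Random(0)
vals = [quadratic_form(Q, [rng.uniform(-5, 5), rng.uniform(-5, 5)]) for _ in range(1000)]
print("min v^T Q v over samples:", min(vals))  # non-negative for a PSD matrix
```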
References
Abbe, E. (2017). Community detection and stochastic block models: Recent developments. The Journal of Machine Learning Research, 18(1), 6446–6531.
Bansal, N., Blum, A., & Chawla, S. (2004). Correlation clustering. Machine Learning, 56(1), 89–113.
Bertsekas, D., & Tsitsiklis, J. N. (2008). Introduction to probability (Vol. 1). Athena Scientific.
Boyd, S. P., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.
Browning, S. R., & Browning, B. L. (2011). Haplotype phasing: Existing methods and new developments. Nature Reviews Genetics, 12(10), 703–714.
Calafiore, G. C., & El Ghaoui, L. (2014). Optimization models. Cambridge: Cambridge University Press.
Chartrand, G. (1977). Introductory graph theory. Courier Corporation.
Chen, Y., Kamath, G., Suh, C., & Tse, D. (2016). Community recovery in graphs with locality. In International Conference on Machine Learning (pp. 689–698).
Chen, J., & Yuan, B. (2006). Detecting functional modules in the yeast protein-protein interaction network. Bioinformatics, 22(18), 2283–2290.
Cover, T. M., & Thomas, J. A. (2006). Elements of information theory. Wiley-Interscience.
Das, S., & Vikalo, H. (2015). SDhaP: Haplotype assembly for diploids and polyploids via semi-definite programming. BMC Genomics, 16(1), 1–16.
Erdős, P., & Rényi, A. (1960). On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences, 5(1), 17–60.
Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3–5), 75–174.
Freedman, D., Pisani, R., & Purves, R. (2007). Statistics. W.W. Norton & Co.
Gallager, R. G. (2013). Stochastic processes: Theory for applications. Cambridge: Cambridge University Press.
Garnier, J.- G., & Quetelet, A. (1838). Correspondance mathématique et physique (Vol. 10). Impr. d’H. Vandekerckhove.
Girvan, M., & Newman, M. E. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826.
Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (pp. 315–323).
Golub, G. H., & Van Loan, C. F. (2013). Matrix computations. JHU Press.
Hamming, R. W. (1950). Error detecting and error correcting codes. The Bell System Technical Journal, 29(2), 147–160.
Hinton, G., Srivastava, N., & Swersky, K. (2012). Neural networks for machine learning, Lecture 6a: Overview of mini-batch gradient descent.
Ivakhnenko, A. G. (1971). Polynomial theory of complex systems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-1(4), 364–378.
Jalali, A., Chen, Y., Sanghavi, S., & Xu, H. (2011). Clustering partially observed graphs via convex optimization. In ICML.
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980.
Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. MIT Press.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.
Lemaréchal, C. (2012). Cauchy and the gradient method. Documenta Mathematica Extra, 251(254), 10.
Marsden, J. E., & Tromba, A. (2003). Vector calculus. Macmillan.
Meta. (2022). Investor earnings report for Q1 2022.
BBC News. (2016). Artificial intelligence: Google's AlphaGo beats Go master Lee Se-dol.
Nielsen, R., Paul, J. S., Albrechtsen, A., & Song, Y. S. (2011). Genotype and SNP calling from next-generation sequencing data. Nature Reviews Genetics, 12(6), 443–451.
Polyak, B. T. (1964). Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4(5), 1–17.
Rabiner, L., & Juang, B. (1986). An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1), 4–16.
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386.
Samuel, A. L. (1967). Some studies in machine learning using the game of checkers. II: Recent progress. IBM Journal of Research and Development, 11(6), 601–617.
Shannon, C. E. (2001). A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communications Review, 5(1), 3–55.
Shen, J., Tang, T., & Wang, L.-L. (2011). Spectral methods: Algorithms, analysis and applications (Vol. 41). Springer Science & Business Media.
Si, H., Vikalo, H., & Vishwanath, S. (2014). Haplotype assembly: An information theoretic view. In 2014 IEEE Information Theory Workshop (ITW 2014) (pp. 182–186).
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., et al. (2016). Mastering the game of go with deep neural networks and tree search. Nature, 529(7587), 484–489.
Suh, C. (2022). Convex optimization for machine learning. Now Publishers.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Suh, C. (2023). Data Science Applications. In: Communication Principles for Data Science. Signals and Communication Technology. Springer, Singapore. https://doi.org/10.1007/978-981-19-8008-4_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8007-7
Online ISBN: 978-981-19-8008-4
eBook Packages: Computer Science, Computer Science (R0)