Mechanisms for Profiling

Singh, Rita

doi:10.1007/978-981-13-8403-5_8

Abstract

So how is profiling actually done? Most of this book has been dedicated to developing the basic understanding needed for it. We have seen that the knowledge of how a parameter affects the vocal production mechanism can help us identify the most relevant representations from which we may extract the information needed for profiling. We have also seen how such knowledge can help us reason out why certain parameters may exert confusable influences on the voice signal. All of this knowledge can then help us design more targeted methods to discover features that are highly effective for profiling.

Notes

1. The variables are more appropriately called explanatory variables, since they may not be independent of one another.

Author information

Authors and Affiliations

Carnegie Mellon University, Pittsburgh, PA, USA
Rita Singh

Corresponding author

Correspondence to Rita Singh.

About this chapter

Cite this chapter

Singh, R. (2019). Mechanisms for Profiling. In: Profiling Humans from their Voice. Springer, Singapore. https://doi.org/10.1007/978-981-13-8403-5_8

DOI: https://doi.org/10.1007/978-981-13-8403-5_8
Published: 19 June 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8402-8
Online ISBN: 978-981-13-8403-5
eBook Packages: Engineering, Engineering (R0)
