The nonverbal vocal aspects of communication precede language acquisition in humans, are crucial in nonhuman animals, and go beyond the semantic aspects conveyed through language; they are a critical component of social interactions. Not only do they accentuate or complete the meaning of words, but they can also modify, modulate, or change the interpretation of words (Friedman 1982). These nonverbal components of vocalizations are vital to conveying emotions and other affective states and can reinforce emotional bonding, especially in early development. Preverbal infants are continuously exposed to their caregivers’ vocal productions, which are finely tuned to their age and states (Stern et al. 1982), and to the functions and contexts of communication. In recent decades, infant research has shed new light on the importance of coordinated bidirectional proto-conversation between adults and preverbal infants for the infants’ optimal cognitive, social, and emotional development (Jaffe et al. 2001).

The development of the vocal aspects of infants’ preverbal communication is tightly linked to other nonverbal aspects, such as gestures (Ejiri and Masataka 2001) or gaze behaviors (Crown et al. 2002). This link reflects their common participation and cognitive interdependency in the communicative act, which can be produced voluntarily or involuntarily. Vocal expression of emotions has been thoroughly investigated in recent decades, both from a theoretical point of view and through empirical studies (Scherer et al. 2003), by adopting a composite perspective that includes the encoding aspects of vocal signals, their acoustic characteristics, and the decoding phase, in which the partners interpret the communicative signals (Scherer 1982; Grandjean et al. 2006).

This composite perspective and other approaches have supported the emergence of a growing body of literature on vocal communication of emotion, including comparative perspectives. Thanks to new techniques and methods, such as neuroimaging (e.g. Grandjean 2020; Kotz et al. 2013) and machine learning or deep neural network approaches, combined with behavioral data (e.g. Vayrynen et al. 2013), a systematic scientific approach to these phenomena has emerged.

This Special Issue was conceived as a complement to the International Workshop on Voice and Early Development, held in Geneva in December 2018 and organized by the Swiss Center of Affective Sciences at the University of Geneva.

The first two contributions address theoretical and methodological questions on the development of vocal nonverbal communication, providing a valuable general overview of the topic through both ontogenetic and phylogenetic perspectives. Filippi’s (2020) contribution reviews the role that vocal emotional intonation plays in the origins of language communication. Her hypothesis is that emotional voice intonation facilitates the identification and the production of phonemes, linguistic rule processing, and the association between vocal utterances and meanings. She adopts a comparative approach, supporting her hypothesis with nonhuman animal and infant empirical studies. Through this evolutionary perspective, she suggests that the expression of emotions through voice (and music) might have paved the way for the emergence of language in the first hominids. Although models for a joint prosodic precursor of language and music have been proposed previously (Brown 2017) and remain controversial, the author provides a solid overview across animal species, assuming that the mechanisms of production of emotional vocalizations might be evolutionarily conserved across species and could play a key role in language evolution.

Next, the paper by Pokorny et al. (2020) presents a precise review of recent methodological approaches in preverbal behavior analysis, including automatic vocalization-based differentiation between typical and atypical infant development. The authors discuss intelligent audio analysis paradigms, both in their technical aspects and in their clinical challenges for the early detection of atypical preverbal development. The review also includes useful suggestions for retrospective analysis, pointing out the possibilities, potential, and limitations of retroactive investigations. Although the application of audio processing and machine learning methods to preverbal behavior analysis is in its nascent stages, researchers studying nonverbal aspects of human communication can find valuable tools for vocal analysis in this paper, in addition to practical suggestions for using data collected ‘in the past’ for retrospective analysis.

In the last two papers, by Saliba et al. (2020) and Filippa et al. (2020), we shift towards the analysis of social interactional processes in specific at-risk conditions. More generally, the two papers explore how acoustic and behavioral methods can be combined to investigate early dyadic interaction, with particular attention to emotional vocalizations. They share a common research setting, investigating at-risk dyads of preterm infants and their parents during hospitalization in the Neonatal Intensive Care Unit (NICU).

Saliba et al. (2020) investigated the effects of Early Vocal Contact between mothers, fathers, and their preterm infants in the NICU. The novelty of this paper is the inclusion of the fathers in the analysis during early interactional processes within the preterm dyad. The neurobiology of fathering, especially in at-risk conditions, is still underinvestigated, which calls for research examining the specificity of paternal behaviors and the routes and mechanisms of paternal effects on infant development (Braun and Champagne 2014). In addition, in this paper the authors updated and expanded the discussion around the origins of the reciprocal co-regulation processes in early periods of development, when premature newborns are still in early neurodevelopment outside their intra-uterine environment.

In the final research paper, Filippa et al. (2020) analyze social interactional processes using the System for Coding Perinatal Behavior, a research coding system originally developed for recognizing the behavioral repertoire observable in the perinatal period. The novelty of this approach allowed the authors to suggest that preterm newborns from 32 weeks’ postmenstrual age respond to social and contingent vocal stimuli with a general activation of self-touch and eye-opening behaviors. Interestingly, the authors also found different oral responses to live maternal speech versus songs. The results call for an exploration of the existence of early mirroring processes, which is a promising area of research.

Taken together, these theoretical, methodological, and research advances, viewed through ontogenetic and phylogenetic perspectives in the domain of vocal nonverbal communication, constitute a valuable body of research, especially for future studies on typical and atypical human development in the context of early vocal communication.