The modeling and implementation of sophisticated multimodal software/hardware interfaces is a current scientific challenge of high societal relevance. Such interfaces are expected to interact with people; infer social, organizational and physical contexts from sensed data; assist people with special needs; and enhance elderly health care, learning and rehabilitation in daily functional activities. Implementing such Human-Computer Interaction (HCI) systems is of public utility and is profitable for a living science that should simplify users’ access to a wide range of social services, either remotely or in a person-to-person setting.

The current and future applications foreseen in this highly interdisciplinary field are countless. They include context-aware avatars and robotic devices replacing and/or acting on behalf of humans in high-responsibility tasks or time-critical dangerous tasks such as urban emergencies. Other emerging applications concern robot companions for elderly and vulnerable people, and intelligent agents for services where suitable skills are in short supply or where training qualified personnel requires significant investment, as in therapist-based interventions. Given the complexity of these automated tasks, the development of such devices has to adopt a holistic investigative perspective. New cognitive architectures must be devised and new cognitive integrations must be exploited in order to take advantage of the knowledge derived from the analysis of human behaviors across different contexts.

At stake is the need to develop a deep understanding of the emotional and intentional cognitive processes underpinning human interactions. Inherently new insights are needed for designing complex autonomous systems, which must be able to sense human emotional and intentional states; adapt to them cooperatively through socially ethical and sensible conduct; and exhibit coherent vocal, visual and gestural affordances.

The present Special Issue investigates these topics, by gathering new experimental data and theories across a spectrum of disciplines, in order to identify the meta-structures underlying these phenomena. This effort hopefully will stimulate, on the one hand, the conception of new mathematical models for representing data, reasoning and learning. On the other hand, it will produce new psychological and computational approaches with respect to the existing cognitive frameworks and algorithmic solutions.

Enabling consistent progress toward the implementation of a human automaton level of intelligence is crucial for developing such HCI systems and for enhancing the quality of life of people by addressing their current and future societal needs.

The topics proposed by the present special issue are interdisciplinary and cover issues related to several areas of research. They include: behavioral analysis of interactions; mathematical models for representing data, reasoning and learning; social signal and context effects; algorithmic solutions for socially believable robots and ICT interfaces; human and/or machine encoding/decoding of affective behavioral patterns; psychological and computational approaches to behavioral analyses; case studies for the analysis and identification of personality traits, affective wellbeing and emotional states; social robotics; and ICT interfaces for supporting education, wellbeing and empathy.

The idea to dedicate a special issue of Cognitive Computation to cover the interdisciplinary aspects of human–human and human–machine interactions was prompted by our desire to elicit new guidance in the quest for the implementation of emotionally and socially believable robot and ICT interfaces. First, we aimed to initiate a discussion on what has been achieved to date and has currently been made available to end users. Second, we intended to focus on which needs have been fostered by the use of prototypical applications and what has been missed so far.

The special issue is an outcome of the COST Strategic Workshop “The future concept and reality of social robotics: challenges, perception and applications. Role of social robotics in current and future society,” held in Brussels (Belgium) from 10 to 13 June 2013.

COST is one of the longest-running European frameworks supporting cooperation among scientists and researchers across Europe and overseas. Built on the key principles of supporting excellence, and being open and inclusive, COST allows the coordination of nationally funded research at a European level, strengthening Europe’s research and innovation capacities. By fostering new ideas and knowledge sharing, it aims to enable scientific breakthroughs leading to new theoretical concepts and products that will promote European scientific excellence.

The workshop’s main objectives were to develop an advanced comprehension of how everyday ICT uses and practices influence people’s interactional and intentional behaviors and affective displays, and entail forms of dynamic learning processes between people and devices “where both entities are reciprocally affected and mobilized, where [technology] uses are the result of negotiations and clashes between technical affordances, commercial conditions and people’s intentions, aims, habits and obligations, and where non-intentional, as well as non-conscious aspects are involved” [10].

On these premises, it is clear that new technological developments must adopt a user-centered approach. Only by taking user expectations and requirements into account can the socially believable ICT interfaces that provide social, physical, psychological and assistive services qualify as “user-friendly.”

In particular, the term “social robotics” envisions a “natural” interaction of such devices with humans, where “natural” is interpreted as the ability of such agents to enter the social and communicative space ordinarily occupied by living creatures. In this sense, a social information communication device should be able, as already mentioned, to combine and build up knowledge through verbal and nonverbal signals contextually enacted by exploiting an intuitive data processing that uncovers the wealth of information conveyed by humans during interactions.

The fundamental questions that continue to remain open are those posed by Esposito et al. [6]:

  • Which human behavioral patterns are entitled to provide features that, appropriately modeled, will raise machine intelligence to a level close to human expectations?

  • Which of the current computational paradigms (machine learning, artificial intelligence, probabilistic approaches) is most appropriate? Alternatively, is there a need for new computational approaches that can better exploit interactional signal features?

  • Which level of trustfulness, credibility and satisfaction is expected from such human–device interaction?

  • Which are the software/hardware parameters that make such devices trustable and socially believable?

  • Are there standards to be formulated for autonomous systems satisfying users’ expectations and requirements in a structured manner?

The contributions in this special issue cover different topics that emerged during the presentations and discussions at the COST Strategic Workshop. These topics are all of great importance in the context of “modeling emotion, behavior and context in socially believable robots and ICT interfaces.” They reflect the highly interdisciplinary approach envisioned in the conception of the workshop, covering and combining three scientific themes: “Perception,” “Challenges” and “Applications.”

The contributions to this special issue have each been assigned to one of these three scientific themes, according to a rough thematic classification into sections. Nevertheless, it must be said that the themes overlap, providing fundamental insights for cross-fertilization among the different disciplines they exemplify.

The Perception of intelligent devices deals with linguistic, behavioral, psychological and perceptual issues that lead to the definition of mathematical models, algorithms, and heuristic strategies for data analysis, coordination of the data flow and optimal encoding of multi-channel verbal and nonverbal features. It starts with the article of Carl Vogel [26, p. 1], which discusses the perception of linguistic impoliteness and argues that impolite behaviors “arise from offence management associated with disgust.” The contributions of Olimpia Matarazzo et al. [13], Marina Cosenza et al. [4] and Francesca D’Errico and Isabella Poggi [5] attempt to explain human emotional conduct under psychological states produced by positive and/or negative contextual instances, psychological disorders (such as alexithymia) and “acid” conflictual interactions, respectively. The next two articles report data showing that communicative behaviors are affected by the social instance. Alessandro Vinciarelli et al. [25] show that, over mobile phones, negotiation approaches initiated by callers are more successful than those initiated by receivers. Costanza Navarretta [16] reports classification experiments showing that training a machine learning algorithm with data describing body movements co-occurring with speech improves the identification of the speakers producing such communicative behaviors. The last two papers on the Perception theme deal with the human ability to decode emotional states from voices and faces, respectively. The first contribution shows that this ability is affected by differences in individual traits such as a “secure/unsecure attachment style” [7]. The contribution of Meeri Mäkäräinen et al. [12] shows that, through exaggeration, artificial faces can be perceived to express the same range of emotional intensity (from a neutral to a fully intense emotional state) as human faces. The amount of exaggeration needed depends on the degree of realism of the artificial face: less realism requires more exaggeration.

The Challenge theme was inspired by the need to ascertain the state of the art in social robotics and emotionally aware interfaces, and by the attempt to stimulate new approaches and proposals to advance the research. This section presents original studies that discuss theoretical and practical solutions adopted to model human–machine interactions. Four contributions in this section suggest solutions for the design and prototyping of friendlier human–machine interaction systems. The first is that of Sofiane Boucenna et al. [2], which emphasizes the needs that interactive technologies must satisfy when dealing with vulnerable people such as children with autism. The second contribution, by Dag Sverre Syrdal et al. [23], highlights the requirements and features that must be exhibited by robots acting in domestic environments. The third is the article of Hidenobu Sumioka et al. [22], describing minimal constraints to enhance the feeling of a human presence in a human-like robotic medium. Finally, the article of Milan Gnjatović [9] reports on the features a robot’s dialogue management system must exhibit for therapist-centered use.

The next three papers in this section face a different challenge, which concerns the behavior humans may adopt toward a system that, however complex and autonomous it may be, can offer only a sub-optimal interaction process compared with human beings. In this line of discussion, the paper of Leopoldina Fortunati et al. [8, p. 1] maintains that “the direct experience of building a robot enables [children] to obtain a more effective and complex learning of what a robot is,” and that it fosters creativity and new social potentialities because of shared meanings captured and understood through children’s interactions. The contribution of Beňuš [1] advocates that intelligent devices must exhibit forms of social cognition and in particular of entrainment, intended here as “the tendency of interlocutors to become similar to each other in various aspects of their verbal and nonverbal behaviors.” Finally, the contribution of Paul Wilson and Barbara Lewandowska-Tomaszczyk [27] shows that emotional features are influenced by cultural and linguistic differences, and therefore, mindful devices must display sensitivity to languages and cultures.

Further interdisciplinary challenges are offered by three more contributions allocated to this section. We begin with the paper by Jingjing Zhao et al. [28], who propose a biologically inspired model of visual saliency. This contribution is followed by the article of Gil Luria et al. [11] that, through a noninvasive computerized system, analyzes deceptive handwriting in clinician–patient interactions. The section ends with the contribution of Giovanni Vecchiato et al. [24], reporting data that correlate the pleasant emotional feeling aroused by commercial advertisements with an increase in brain activity.

The last section of this volume, covering the theme Applications, deals with technological issues related to the implementation of intelligent avatars and interactive dialog systems that exploit verbal and nonverbal communication features. In this section, four contributions are dedicated to Voice User Interfaces (VUI) and Dialogue Management Systems (DMS), underlining the importance of speech in reliable and acceptable human–machine interactions. When it comes to speech, users’ expectations are much higher than those placed on graphical and text interfaces. In this context, the appropriate design of both a VUI and a DMS, and their ability to process and understand free forms of conversation, allow users to experience a naturalistic interaction and ensure their satisfaction [14, 15]. To our knowledge, there are no standards for the development of such satisfying DMSs and VUIs, and the contributions of Marcin Skowron et al. [21], Ingo Siegert et al. [20], Andreas Persson et al. [17] and Jiří Přibil and Anna Přibilová [18] all provide original solutions for improving the quality standards of these devices.

The remaining two papers in this section report on prototypes of human–robot interaction systems. The article of Juha Röning et al. [19] reports on the implementation of an intelligent mobile robot, Minotaurus, for socially affective human–robot interaction in different environments. Minotaurus is able to move in an unknown setting, detecting obstacles and planning paths, and in addition to recognize users’ faces and facial emotional expressions, as well as to interact through speech and gestures. Considerations on how to improve such applications and build socially believable robotic systems are provided by the contribution of Filippo Cavallo et al. [3], which assesses, from a user-centered perspective, the use of the Robot-Era system, a multi-robotic system acting in environments inhabited daily by humans, such as buildings and homes.

It must be said that the research works discussed in this special issue address only fragments of the needs and challenges that must be faced in this emerging field. Expanding the research requires broadening our inquiry and adopting a holistic perspective that accounts for theoretical and technological outcomes from interdisciplinary scientific domains.

The investigations must be driven by different, well-motivated societal needs. On the one hand, because of space and security constraints, there is a tendency to deconstruct the “classical” robot into intelligent agents, automated personal assistants, future smart environments, ambient assistive living technologies, computational intelligent games/storytelling devices, embodied conversational avatars, and automatic health-care and education services. On the other hand, the re-structuring of robotic agents as drones, humanoids, swarms, and cleaning, manufacturing, emergency and space exploration robots is motivated by unpleasant, tedious or dangerous tasks, as well as by tasks requiring a certain amount of strength and precision. To advance the research, it is important to weigh equally the dynamics raised by these different societal needs. The development of such sophisticated multimodal software/hardware interfaces must be conceived as a research effort to outline appropriate solutions for alleviating obstacles and ameliorating the quality of life of the widest possible end-user population.

We hope that this special issue will inspire and stimulate many additional researchers to join us in exploring the implications that socially believable robots and ICT interfaces will have in future societies.

Finally, we are deeply indebted to the contributors and reviewers: the former for making this special issue a scientifically stimulating compilation of new and original ideas, and the latter for their rigorous and invaluable scientific support. A special acknowledgement goes to COST for providing the practical means and invaluable professional support for the implementation of the trans-disciplinary strategic workshop.