This project will design and develop JOKER, a generic intelligent user interface providing a multimodal dialogue system with social communication skills, including humor, empathy, compassion, charm, and other informal socially-oriented behavior.
Talk during social interactions naturally involves the exchange of propositional content, but also, and perhaps more importantly, the expression of interpersonal relationships, as well as displays of emotion, affect, and interest. This project will facilitate advanced dialogues employing complex social behaviors in order to provide a companion machine (robot or ECA) with the skills to create and maintain a long-term social relationship through verbal and non-verbal interaction. Such social interaction requires that the robot be able to represent and understand complex human social behavior, and designing a robot with such abilities is far from straightforward. Social interactions require social intelligence and ‘understanding’ (for planning ahead and dealing with new circumstances) and employ theory of mind to infer the cognitive states of another person.
JOKER will emphasize the fusion of verbal and non-verbal channels for perceiving emotional and social behavior and for interaction and generation capabilities. Our paradigm invokes two types of decision: intuitive (based mainly on non-verbal multimodal cues) and cognitive (based on fusing semantic and contextual information with non-verbal multimodal cues). Intuitive decisions will be used dynamically during the interaction at the non-verbal level (empathic behavior: synchrony of mimicry such as smiles and nods) but also at the verbal level for reflex small talk (politeness behavior: verbal synchrony with "hello", "how are you", "thanks", etc.). Cognitive decisions will be used to reason about the dialogue strategy and to decide on more complex social behaviors (humor, compassion, white lies, etc.), taking into account the user profile and contextual information.
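The two decision paths described above can be illustrated with a minimal sketch. All class, function, and rule names here are hypothetical placeholders, not the project's actual architecture: a fast intuitive path reacts to non-verbal cues and reflex small talk, and a slower cognitive path falls back on the user profile and context.

```python
# Hypothetical sketch of the intuitive/cognitive split; names are illustrative.
from dataclasses import dataclass, field


@dataclass
class Percept:
    nonverbal: dict                         # e.g. {"smile": 0.8, "nod": True}
    utterance: str = ""                     # recognised speech, if any
    context: dict = field(default_factory=dict)


# Reflex small talk: verbal synchrony with conventional openings/closings.
REFLEXES = {
    "hello": "Hello! Nice to see you.",
    "thanks": "You're welcome!",
}


def intuitive_decision(p: Percept):
    """Fast path: mirror non-verbal cues and answer reflex small talk."""
    if p.nonverbal.get("smile", 0.0) > 0.5:
        return ("smile_back", None)         # synchrony of mimicry
    for trigger, reply in REFLEXES.items():
        if trigger in p.utterance.lower():
            return ("speak", reply)
    return None                             # defer to the cognitive path


def cognitive_decision(p: Percept, user_profile: dict):
    """Slow path: fuse context and user profile to pick a dialogue strategy."""
    if user_profile.get("likes_humor") and p.context.get("mood") == "relaxed":
        return ("speak", "Want to hear a joke?")      # humor strategy
    return ("speak", "Tell me more about that.")      # default empathic prompt


def decide(p: Percept, user_profile: dict):
    """Try the intuitive path first; otherwise reason cognitively."""
    return intuitive_decision(p) or cognitive_decision(p, user_profile)
```

In this toy version the intuitive path is simply tried first, standing in for the real-time priority the project gives to reflex behavior over deliberation.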
JOKER will react in real time, combining a robust perception module (sensing the user's facial expressions, gaze, voice, and speech style and content), a social interaction module modelling the user and the context and equipped with long-term memory, and a generation and synthesis module for maintaining social engagement with the user.
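The flow between the three modules can be sketched as a simple control loop. The interfaces below are assumptions for illustration only, not the project's real implementation: perception produces a percept, the social interaction module updates its long-term memory and chooses a state, and generation renders a response.

```python
# Illustrative control loop for the three modules named above; the module
# interfaces are assumptions, not the project's real design.
class Perception:
    def sense(self, raw):
        # Would fuse face, gaze, voice, and speech; here it just wraps input.
        return {"cues": raw}


class SocialInteraction:
    def __init__(self):
        self.memory = []                    # long-term memory of past turns

    def update(self, percept):
        self.memory.append(percept)         # remember the interaction history
        return {"strategy": "engage", "turn": len(self.memory)}


class Generation:
    def render(self, state):
        # Would drive speech synthesis and robot/ECA behavior.
        return f"turn {state['turn']}: {state['strategy']}"


def dialogue_step(raw_input, perception, interaction, generation):
    """One real-time cycle: perceive, model the interaction, generate."""
    percept = perception.sense(raw_input)
    state = interaction.update(percept)
    return generation.render(state)
```

The growing `memory` list stands in for the long-term memories the abstract mentions, which persist across turns and shape later strategy decisions.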
The research will provide a generic intelligent user interface usable with various platforms such as robots or ECAs, a collection of multimodal data covering different socially-oriented behavior scenarios in two languages (French and English), and an evaluation protocol for such systems. The database, collected in a human-machine context, will be used to incorporate cultural aspects of emotions and natural social interaction, including chat, jokes, and other informal socially-oriented behavior.