On the Design of the Huggable Robot Probo

Abstract: Nowadays robots are being created that interact with human beings in order to satisfy certain social needs. Following this trend, the development of the social robot Probo has started. The robot will be used in hospitals as a tele-interface for entertainment, communication and medical assistance. It therefore requires the ability to express emotions. To that end, an emotional interface has been developed to fully configure the display of emotions. These emotions, represented as a vector in an emotion space, are mapped to the degrees of freedom used in the robot. Besides emotions, the interface includes a control for the point of attention and a module to create and store animations. A 3D virtual model acts as a virtual replica of the robot, providing realistic visual feedback to evaluate the design choices for the facial expressions. This paper presents the objectives of this new robot and describes the concepts and design of the first prototype.


I. INTRODUCTION
A hospitalization has a serious physical and mental influence, particularly on children. It confronts them with situations that are completely different from those at home. In a hospital, children's experiences are limited by the closed and protective environment, leading to many difficulties [1].
Animal-assisted therapy (AAT) and animal-assisted activities (AAA) are becoming commonly used in hospitals, especially in the United States [2]. AAT and AAA are expected to have useful psychological, physiological and social effects. Some psychological studies have already shown that animals can be used to reduce heart and respiratory rate [3], lower levels of stress [4], and promote mood elevation and social facilitation. However, animals are difficult to control, they always have a certain unpredictability, and they are carriers of disease and allergies. The use of robots instead of animals therefore has more advantages and a better chance of being allowed in hospitals. Using these social pet robots for therapeutic purposes is termed robot-assisted therapy (RAT). For example, the seal robot Paro is used for pediatric therapy at university hospitals [5] [6]. Currently, Sony's dog robot Aibo [7], Philips' iCat [8] and Omron's Necoro [9] are also being tested for RAT.
As a part of the ANTY project, the development of the huggable robot Probo has started. The main goal for the robot Probo is to create a friend for children, acting as an interface between the real, sometimes hard, hospital world and the imaginary fantasy world in which children grow up. The robot will also be used as a multidisciplinary research platform, giving other researchers the opportunity to improve and explore the possibilities of RAT. Communication will be the first focus of this robot. Having a fully actuated head, the robot is capable of expressing a wide variety of facial expressions, in contrast with other comparable robots such as Paro, Huggable [10], Aibo and Necoro. Philips' iCat has facial expression of emotions, but lacks the huggable appearance and warm touch that attract children. Probo will emphasize its expression of emotions by using its nonsense affective speech. Probo must fulfill the specifications to operate in a hospital environment and guarantee a smooth interaction with children. Intrinsic safety in human-robot interaction is therefore of high priority.

II. PROBO

A. A huggable robotic imaginary animal
The name Probo is derived from the word Proboscidea, the order containing only one family of living animals, Elephantidae or the elephants, with three living species (African Bush Elephant, African Forest Elephant, and Asian Elephant) [11]. During the last ice age there were more species, now extinct, including a number of species of the elephant-like mammoths and mastodons.
The appearance of the robot (Figure 1) represents an imaginary animal based on the ancient mammoths. The main aspects are the huggable appearance, the attractive trunk or proboscis, and the interactive belly-screen. The internal mechanics of the robot will be surrounded by foam and a removable fur-jacket, in such a way that Probo looks and feels like a stuffed animal. The basic design of the robot is based on an imaginary animal, so that there is no exact similarity with a well-known creature. The combination of a caricatured and zoomorphic [12] representation of a mammoth-like animal is more useful and effective for accomplishing the goals than a complex, realistic representation. The color of Probo is green because this color evokes mainly positive emotions such as relaxation and comfort. In [13], the relationship between color and emotion was tested, and the color green attained the highest number of positive responses (95.9%), followed by the color yellow (93.9%). The majority of emotional responses for the green color indicated feelings of relaxation and calmness, followed by happiness, comfort, peace, hope, and excitement. The green color was associated with nature and trees, thus creating feelings of comfort and soothing emotions.

B. A tele-interface
The robot Probo will be used as a tele-interface focusing on entertainment, communication and medical assistance. A touch screen in the belly of the robot creates a window to the outside world and opens up a way to implement new and existing computer applications.
1) Entertainment: Young children have a strong need for distraction and entertainment. Providing them with a robotic user interface (RUI) will extend the possibilities of interactive game playing and add the capability of emotional feedback.
2) Communication: Hospitalized children are sometimes placed in an isolated environment, strongly reducing the communication with friends and family. The robot can function as an interface to communicate with other people using standard videoconferencing techniques. The eyes of the robot will house the cameras, whereas the screen in the belly will display the image, giving the opportunity to establish interactive video-communication.
3) Medical Assistance: The robot interface can be used by the medical staff to inform the children about medical routines or operations. In the same spirit, Probo can comfort children during difficult medical procedures. The unknown environment will be explored and examinations will be described in a child-friendly manner. By using predefined scenarios with pictures, video and sounds, children can pre-experience the medical routines, guided by Probo. A good preparation for the examinations will reduce the child's fear, providing the medical staff with better results when assessing the child's pain factor.

C. A social interface
Children will have some basic expectations as the robot represents a living animal, resulting in the necessity to react to primary stimuli and to have natural movements. In order to establish some bond with the children, Probo must be able to communicate. In daily life, people rely on face-to-face communication, and the face plays a very important role in the expression of character, emotion and/or identity [14]. Mehrabian [15] showed that only 7% of information is transferred by spoken language, that 38% is transferred by paralanguage and that 55% is transferred by facial expressions. Facial expression is therefore a major modality in human face-to-face communication. To start face-to-face communication with children, the robot is equipped with an intriguing trunk in the middle of its face, provoking children to interact with the trunk and stimulating them to maintain their focus on its face.
In [16], Breazeal defines four classes of social robots (socially evocative, social interface, socially receptive, sociable) in terms of (1) how well the robot can support the social model that is ascribed to it and (2) the complexity of the interaction scenario that can be supported. This project aims to start working with the robot Probo as a social interface, providing a natural interface by employing human-like social cues and communication modalities. In this first phase the focus is the construction of a physical prototype with an actuated head, trunk and facial expressions.

D. Operational Concept
At first, the prototype is used as a RUI (Figure 2) interacting with children and controlled by an operator. The operator can be anyone who wants to communicate with the child, in particular caregivers and researchers. The robot functions as an interface that performs preprogrammed scenarios and reacts to basic input stimuli. The input stimuli, coming from low-level perceptions, are derived from vision analysis, audio analysis and touch analysis. Those stimuli will influence the attention-system and emotion-system, used to set the robot's point of attention, current mood and corresponding facial expression. The vision analysis includes the detection of faces, objects and facial features. Audio analysis includes detecting the direction and intensity of sounds and the recognition of emotions in speech.
A specific behavior-based framework is being developed to process these input stimuli. The framework is based on earlier work of Ortony, Norman and Revelle [17], who focus on the interplay of affect, motivation and cognition in controlling behavior. Each is considered at three levels of information processing: the reactive level is primarily hard-wired and has to assure the quick responses of the robot to make it look alive; the routine level provides unconscious, un-interpreted scenarios and automatized activity; and the reflective level supports higher-order cognitive functions, including behavioral structures and full-fledged emotions, finally resulting in a sociable robot. Starting with a social interface, the reactive and routine levels are being implemented. Currently, control is shared between the operator, who configures behavior, emotions and scenarios, and the robot, which has basic autonomous reactions. Further research and development is required to enhance the robot's emotions and behavior by implementing a cognitive software architecture at the reflective level, in order to obtain a sociable robot in the end. To that end, a study and implementation of joint attention mechanisms for human-robot communication has been started.

E. Nonsense Affective Speech
The robot Probo will speak to the children using nonsense affective speech, which will be a cross-cultural language understandable for most children regardless of their native language. In one of the current approaches [18], this speech is produced by using a database with natural expressive speech samples and a database with neutrally spoken speech samples, both recorded with a professional speaker. From the neutral speech examples, carrier sentences of the non-existing language for Probo are produced by first segmenting the recorded utterances into nonsense syllables and then concatenating them in the same syllabic structure as the desired emotional prosodic template, which is selected from the expressive database. To produce emotional speech for that non-existing language, the pitch and timing structure found in the prosodic template is copied onto the nonsense carrier phrase, a process known as prosodic transplantation, which effectively gives the synthetic output the same intonation pattern as the natural expressive example.

III. MECHANICAL DESIGN
The first prototype of the robot has 20 Degrees Of Freedom (DOF) to obtain a fully actuated head and trunk (Figure 3). By moving its head (3 DOF), eyes (3 DOF), eyelids (2 DOF), eyebrows (4 DOF), ears (2 DOF), trunk (3 DOF) and mouth (3 DOF), the robot is able to express its emotions [19]. The trunk of the robot is the most remarkable element. When a child interacts with this trunk, the child's attention is drawn towards the face of the robot, placing the child within the scope of the onboard cameras and allowing proper vision analysis. Using these cameras, located in the eyes, the robot will be able to focus on a point of attention and follow it with natural eye-movements [20]. The robot will use eyebrows, ears and eyelids to express moods and feelings. Flexible materials and compliant actuators are being applied to ensure a safe interaction. Because of the high hospital requirements on hygiene, the fur of the robot can be easily replaced and washed prior to each visit. The prototype measures about 66 cm in height and 32 cm in width.
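The DOF budget listed above can be checked with a few lines of Python. This is only an illustrative tally of the numbers given in the text; the dictionary and function names are assumptions made for this sketch and are not part of the Probo software.

```python
# DOF per module, copied from the list in the text.
# The names DOF_PER_MODULE and total_dof are hypothetical.
DOF_PER_MODULE = {
    "head": 3,
    "eyes": 3,
    "eyelids": 2,
    "eyebrows": 4,
    "ears": 2,
    "trunk": 3,
    "mouth": 3,
}


def total_dof(modules):
    """Sum the degrees of freedom over all modules."""
    return sum(modules.values())


if __name__ == "__main__":
    # The per-module counts add up to the 20 DOF stated for the prototype.
    print(total_dof(DOF_PER_MODULE))
```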

A. Degrees of Freedom
For the display of the emotions, most of the DOF in the face are based on the Action Units (AU) defined by the Facial Action Coding System (FACS) developed by Ekman and Friesen [21]. The AU describe the motion of the mimic muscles as 44 kinds of basic operation, of which 14 AU are used to express the emotions of anger, disgust, fear, joy, sadness, and surprise. These are often supported as being the 6 basic emotions by evolutionary, developmental, and cross-cultural studies [22]. Because the robot does not have a human face, and in order to simplify the design, some of the AU are missing, others are replaced and some are added. The lack of a lower eyelid and the fixed upper lip lead to missing AU. The AU regarding the nose movements are replaced by the movement of the 3 DOF trunk. The movement of the ears and the greater visual influence of the trunk add extra gestures to express the emotions. Table I shows the DOF of Probo's robot head compared with some other non-android robot heads.

B. Soft Actuation
Most robots are actuated by electric drives, as these actuators are widely available and their control aspects are well known. Because of the high rotational speed of the shaft and the low torque of an electrical motor, a transmission unit is often required. Due to the high reflected inertia of the transmission unit, the joint must be seen as rigid. For safe and soft interaction the joints need to be flexible, which can be obtained by incorporating compliant actuation. Compliant actuators are gaining interest in the robotics community. Pneumatic artificial muscles [23] (such as McKibben muscles, Festo muscles, PPAM [24]), electric compliant actuators (such as VIA [25], AMASC [26] and MACCEPA [27]) and voice coil actuators [28] are some examples of compliant actuators. While some of them exhibit adaptable compliance, so that the stiffness of the actuated joint can be changed, this is not required in the Probo robot. Compliance is therefore introduced by placing elastic elements between the motor and the actuated robot joint. In this way the external forces on the joint are absorbed by the elastic elements, resulting in safe and flexible joints. Precise positioning is more complex with compliant actuators than with the classic stiff (non-compliant) high-precision actuators typically used in industrial applications; however, the intrinsic safety introduced in the system is of major importance.
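The effect of an elastic element between motor and joint can be illustrated with a simple model. The sketch below assumes a linear torsional spring, which is a simplification chosen here for illustration; the stiffness values and function names are hypothetical and are not taken from the Probo design.

```python
def spring_torque(stiffness, motor_angle, joint_angle):
    """Torque (N*m) transmitted to the joint by a linear torsional spring.

    Assumption: the elastic element behaves as an ideal spring with
    constant stiffness (N*m/rad); angles are in radians.
    """
    return stiffness * (motor_angle - joint_angle)


def deflection_for_torque(stiffness, external_torque):
    """Angular deflection (rad) of the joint under an external torque
    while the motor holds its position."""
    return external_torque / stiffness


if __name__ == "__main__":
    # A softer spring yields a larger deflection for the same external
    # torque, which is what makes the joint feel compliant.
    for k in (0.5, 5.0):  # hypothetical stiffness values
        print(k, deflection_for_torque(k, external_torque=0.1))
```

The same model also shows the positioning trade-off mentioned in the text: the joint angle differs from the motor angle whenever a load is present, so the joint position cannot be inferred from the motor position alone.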

C. Materials
At this stage of the development, most mechanical parts of the prototype are made of aluminum because it is a strong, lightweight and easily worked material. Some very specific and complex parts are manufactured using rapid prototyping. To comply with the design constraints stated earlier, the mechanical part of the robot is encapsulated in a foam layer. This layer of flexible polyurethane provides a soft touch, protects the robotics inside and gives the robot its final form. On top of the foam layer the robot will have a removable fur-jacket, which can be washed and disinfected. The fur-jacket, which is a 100% cotton fabric, complies with the European toy safety standards EN71-1, EN71-2 and EN71-3. The soft actuation principle and the careful design of the robot's filling and huggable fur are both essential to create Probo's soft touch feeling. To realize a full-body sense of touch, a sensitive skin will be used. A good example is being developed by Stiehl et al. [10] for a therapeutic robotic companion named the Huggable. In another approach, research has started on the use of photonic crystal fibers [29], which will be implemented in some parts of Probo, such as the trunk.

IV. MODULAR SYSTEM ARCHITECTURE
Besides the restrictions mentioned above, the prototype designer has to bear in mind the need for a modular mechanical system architecture to simplify assembly and maintenance. This approach leads to an effective development and realization of a robot prototype and requires the use of transferable mechanical and electronic components. Due to a lack of commercially available standard mechanical and electronic modules (e.g. eyes, eyebrows, trunk), prototype-dependent modules must be designed. In the next paragraphs the different modules, with the AU needed to display facial expressions, are described. Each module can be easily replaced without affecting the others.

A. Eyes and Eyebrows
Besides the role of the eyes to show some facial expressions, there are two additional reasons to equip a social robot with actuated eyes.
1) Eye-gaze based interaction: The phenomenon that occurs when two people cross their gaze is called eye contact [30]. Furthermore, people use eye-gaze to determine what interests each other. The same phenomenon will be used between robot and child to encourage human-robot interaction. By focusing the robot's gaze on a visual target, the person who interacts with the robot can use the robot's gaze as an indicator of its intentions. This facilitates the interpretation and readability of the robot's behavior, as the robot reacts specifically to what it is looking at [31]. This visual target will be referred to as the robot's point of attention (POA).
2) Active vision: When a robot is intended to interact with people, it requires an active vision system that can fulfill both a perceptual and a communicative function. An active vision system is able to interact with its environment by altering its viewpoint rather than passively observing it. Therefore, the designed eyes are hollow and contain small cameras. As these cameras can move, the range of the visual scene is not restricted to that of the static view. Although the aim is a pet-type robot, the design of the robot eyes is based on human anthropomorphic data. The imitation of anthropomorphic eyes gives the impression of being natural. Two candidate eye supports are shown in Figure 4. The support shown on the left holds the eyeball between two Teflon® parts with the same spherical curvature as the eyeball itself, resulting in three DOF, just like in a spherical joint, and a smooth rotation around the center of the sphere due to the low friction. Because there is no mechanical part that intersects the eyeball, the eyes can bulge out of the head. The second concept (on the right in Figure 4) consists of two rings and two axes. One rotation axis passes through the center point of the eye and holds the eye in the inner ring. This way the eye can rotate relative to the inner ring. A second rotation axis passes through the inner and outer ring, allowing the inner ring to rotate with respect to the outer ring. While panning the eye, the inner ring comes out of the plane of the outer ring, and the eye cannot bulge out as far as in the former support. Most of the other robot heads mentioned use the second support type or a variant of it, which in our case could lead to visible mechanical parts or the inability to let the eyes bulge out. For this reason, the first support type has been chosen.
The five DOF eyes module consists of two hollow eyeballs mounted with the chosen eye support, as shown in Figure 5. According to the chosen DOF, based on the AU mentioned earlier, the eyes can pan separately and tilt together, each eye can be covered by an upper eyelid, and the eyelids can blink separately. The eyebrows module fits on top of the eyes module. Each eyebrow has two DOF, meaning that both the vertical position and the angle of each eyebrow can be set independently. Nine off-the-shelf hobbyist servomotors, together with a Bowden cable mechanism, are used to power the eyes, eyelids and eyebrows. Axial springs and the use of flexible cables both introduce compliance. Using flexible Bowden cables creates the opportunity to group and isolate the different servos and to place them anywhere in the robot. That way heat and noise dissipation can be controlled and the head can be kept lightweight, both resulting in a safe design.

B. Trunk
The trunk or proboscis of Probo seems to be the most intriguing element, according to the results of a small survey amongst children aged 10-13. In this survey, it was observed that all the children first touched the trunk, and most of them also started playing with it. That is why the trunk is used to grab and maintain the child's attention. When the child's attention is focused on the trunk, the child's face fits within the scope of the onboard eye cameras. In this way the child's attention can be guided towards the face of the robot, to start face-to-face communication by using Probo's facial expressions.
The three DOF trunk, as shown in Figure 6, consists of a foam core with segmented extension discs. The trunk is cast in a silicone mold using FlexFoam-iT!™ X, a two-component flexible urethane foam with a density of 160 kg/m³. Parallel to the centerline, three flexible cables are guided through the discs and attached to the front disc. The end of each cable is attached to a wind-up pulley, resulting in a motion of the entire trunk. The motion of the trunk depends on the number of discs, the dimensions of the discs and the core, the flexibility of the cables and the composition of the foam. A high compliance and durability of the trunk is ensured by using a foam material actuated by flexible cables. Interaction with this trunk will be safe both for the child, who cannot be hurt, and for the motors, which cannot be broken. Three maxon brushless motors are used to actuate the trunk. Each motor is coupled with a worm and worm-wheel gear train to reduce the rotational speed and to increase the output torque. A worm drive is used because of its self-locking capability. If the trunk is grasped during interaction, it will follow the grasp motion until it is released. When released, the trunk will return to its set position, because all external forces on the trunk are stored and released by the elastic cables. Optical encoders are used to calculate the angular displacement of the pulleys in order to estimate the position of the trunk.
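The step from encoder readings to pulley displacement can be sketched as below. This is a minimal illustration under stated assumptions: the encoder is assumed to sit on the motor shaft ahead of the worm gear, and the counts per revolution, gear ratio and pulley radius are hypothetical values, since none of them are given in the text.

```python
import math


def pulley_angle(counts, counts_per_rev, gear_ratio):
    """Pulley angle in radians from encoder counts.

    Assumption: counts are measured on the motor shaft, and the gear
    train reduces the motor rotation by gear_ratio.
    """
    motor_revolutions = counts / counts_per_rev
    return 2.0 * math.pi * motor_revolutions / gear_ratio


def cable_displacement(angle_rad, pulley_radius):
    """Length of cable wound onto the pulley (same unit as the radius)."""
    return pulley_radius * angle_rad


if __name__ == "__main__":
    # Hypothetical numbers: 500 counts per revolution, 4:1 reduction,
    # 10 mm pulley radius.
    angle = pulley_angle(counts=2000, counts_per_rev=500, gear_ratio=4)
    print(angle, cable_displacement(angle, pulley_radius=0.010))
```

The three cable displacements obtained this way would still have to be related to the bending of the foam core, which depends on the disc geometry and foam properties listed above and is not modeled here.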

C. Mouth and Ears
The mouth and ears are both actuated to contribute to the robot's facial expressions. In addition to the expressions, the mouth also serves to enhance the affective speech by performing basic lip-sync movements. Probo's mouth has an upper lip and a lower lip. The middle of the upper lip is attached to the trunk, and the middle of the lower lip can move vertically so that the mouth can open. Both lips come together in the mouth's corners, which are actuated. The mechanism used for actuating the mouth corners is the same as that used in the ears module, shown in Figure 8. It consists of a brushed maxon motor with a planetary gear train. The first gear train is followed by a second one, which is a worm drive. Position measurement is performed by an absolute position sensor fixed on the output shaft. Either an ear or a mouth corner is attached to the output shaft. Compliance is introduced by the shape of the ear and mouth corners and by means of flexible materials. The actuated part is flexible in the perpendicular direction and stiff in the tangential direction. In comparison with [5], [13] and [18], Probo has fewer DOF in the mouth. Each ear has one DOF. The movement of the robotic ear is a rotation that consists of two combined rotations. The first rotation turns the entire ear, while the second rotation twists the ear axially. That way the ear's opening points to the front when the robot is attentive and points to the ground when the ear lies flat to the back.

V. ELECTRONICS AND CONTROL SOFTWARE
The maxon brushless motors, which are used to actuate the trunk, are driven by maxon's EPOS motor controllers. The maxon brushed motors, used in the mouth and the ears, are driven by Pololu's motor controllers with position feedback, and the hobbyist servo motors, for the eyes and eyebrows, are driven by Pololu's micro serial servo controllers. Figure 9 shows the architecture.
A Personal Computer (PC) is used to control the different motors. Two serial ports, using the RS232 protocol, are used to communicate with the motor controllers. The first serial port communicates with one of the three maxon EPOS motor controllers. This controller acts as a master in a master-slave setup with the two other maxon EPOS motor controllers (slaves). The communication between master and slaves is performed over a CAN bus. The second serial port communicates with all Pololu controllers. Despite the use of serial communication and the large number of motor positions and speeds that must be refreshed, the refresh time remains shorter than the response time imposed by the mechanical inertia and is consequently acceptable.
The control software running on the host PC is written in C# using the Microsoft® .NET framework. It sends the desired motor positions and speeds to the respective motor controllers. This software component is linked with the emotional interface, which provides real-time control for setting the emotions, setting a specific point of attention or displaying programmed animations, as well as visual feedback from the virtual model, which receives the same motor positions.

A. Emotional interface
Several theorists argue that a few select emotions are basic or primary: they are endowed by evolution because of their proven ability to facilitate adaptive responses to the vast array of demands and opportunities a creature faces in its daily life [22] [32]. To achieve a translation from emotions into facial expressions, emotions need to be parameterized. In the robot Kismet [34], facial expressions are generated using an interpolation-based technique over a three-dimensional, componential affect space (arousal, valence, and stance). In the model used for Probo, two dimensions, valence and arousal, are used to construct an emotion space, based on the circumplex model of affect defined by Russell [33], which has also been implemented in the robot Eddie [35]. In the emotion space a Cartesian coordinate system is used, where the x-coordinate represents the valence and the y-coordinate the arousal. Consequently, each emotion e(v, a) corresponds to a point in the valence-arousal plane (Figure 10). This way, the basic emotions can be specified on a unit circle, placing the neutral emotion e(0, 0) at the origin of the coordinate system. Each emotion can then also be represented as a vector with the origin of the coordinate system as initial point and the corresponding valence-arousal values as terminal point. The direction α of each vector defines the specific emotion, whereas the magnitude defines the intensity of the emotion. The intensity i can vary from 0 to 1, interpolating between the full emotion (i = 1) and the neutral emotion (i = 0).
Each DOF that influences the facial expression is related to the current angle α of the emotion vector. An adjustable interface has been developed to define the specific value of each DOF for each angle (0° to 360°). When selecting one DOF, a value for each basic emotion is set on the unit circle. To obtain a continuous relation, a linear interpolation between the configuration points is applied. By adding more (optional) points or values, the curve can be tuned to achieve smooth, natural transitions between the different emotions. An example is shown in Figure 11 for the DOF that controls the eyelid: extra points were added in the first half of the emotion space, which starts and ends with the happy emotion (α = 0° = 360°).
An emotional interface (Figure 12) has been developed in which the user can fully configure the facial expressions and use the emotion space to test the different emotions and transitions. The user obtains visual feedback from a virtual model of the robot. In addition to the facial expressions, this interface has been extended with a component controlling the point of attention. This component controls the eye and neck motion according to a specific point in three-dimensional space. The coordinates of that point can be altered in real time and are represented as a red cube in the virtual space. The coordinates are translated into rotation angles for the 4 DOF controlling the eyes (pan/tilt) and the head (pan/tilt). As part of the vision analysis, a face recognition component has been developed using Intel®'s OpenCV library. This component uses a webcam to capture the images and then calculates the center of the face as a Cartesian coordinate. This coordinate can then be used to control the point of attention in the virtual space.
Another component in this interface gives the user the ability to create, store, edit and play animations. Each animation consists of different key frames, which hold the values of the DOF at a given time. A linear interpolation between the key frames results in a continuous animation. The emotional interface can be used to easily insert emotions at a certain point in an animation. The animations are stored in a database and will be employed later to build scenarios for the robot.
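The mapping from an emotion vector to a DOF value described in this section can be sketched as follows. This is a minimal illustration, not the authors' C# implementation: the function names, the example configuration points and the choice of 0 as the neutral DOF value are all assumptions made for the sketch.

```python
import math
from bisect import bisect_right


def emotion_to_polar(valence, arousal):
    """Convert e(v, a) to (alpha in degrees within [0, 360), intensity in [0, 1])."""
    alpha = math.degrees(math.atan2(arousal, valence)) % 360.0
    intensity = min(1.0, math.hypot(valence, arousal))
    return alpha, intensity


def dof_value(points, alpha, intensity, neutral=0.0):
    """Value of one DOF for the emotion angle alpha and a given intensity.

    points: (angle_deg, value) configuration points on the unit circle,
    sorted by angle. Linear interpolation wraps around at 360 degrees.
    The result is blended with the neutral value by the intensity.
    """
    angles = [a for a, _ in points]
    nxt = bisect_right(angles, alpha) % len(points)  # next point, with wrap
    a1, v1 = points[nxt]
    a0, v0 = points[nxt - 1]                         # previous point
    span = (a1 - a0) % 360.0 or 360.0
    t = ((alpha - a0) % 360.0) / span
    on_circle = v0 + t * (v1 - v0)
    return neutral + intensity * (on_circle - neutral)


if __name__ == "__main__":
    # Hypothetical configuration points for one DOF (for example an eyelid).
    eyelid = [(0.0, 1.0), (90.0, 0.5), (180.0, 0.0), (270.0, 0.5)]
    alpha, i = emotion_to_polar(0.5, 0.5)
    print(alpha, i, dof_value(eyelid, alpha, i))
```

Adding more configuration points to the list refines the curve in the same way as the optional points mentioned in the text, without changing the interpolation code.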

B. Virtual model
A virtual model of Probo has been created to evaluate the design choices and to advance user testing without the need for an actual prototype. The virtual model combines the mechanical designs (made with Autodesk® Inventor®) with the visual exterior of the robot, represented by the skin (made with Autodesk® 3ds Max®). The mechanical parts are linked together to obtain kinematic movements for realistic visual motions of the model. The skin is attached to the mechanical parts using skinning techniques in 3ds Max®. The movements can be controlled by using sliders to set the desired angle for each DOF and simulate actuation of the parts (Figure 13). This model has also been implemented in the Microsoft® XNA™ framework, where it is linked to the emotional interface to simulate the motions of the robot. Another benefit of this virtual model is that the positions of its body parts are known at any time and are practically the same as those in the real robot. Position feedback will be implemented using potentiometers on the DOF of the robot to improve the accuracy of the virtual model.

VII. CONCLUSION AND FUTURE WORK
The first steps in the creation of a social interface have been successfully established. By using specific materials and compliant actuation, the durability and safety of the robot Probo are guaranteed. Based on the AU, a modular and efficient design for the DOF has been realized and implemented. The developed software controls the virtual model by using the emotion space, setting the point of attention and programming new animations. The virtual model provides visual feedback on every motion, as if it were the real robot. All the DOF of the physical prototype can be tested and configured. Using the emotional interface, all the emotions can be translated into values for each DOF. To fully cover all the emotions, the emotion space can be extended with a third dimension, stance, which will allow a better distinction between anger and fear. By combining techniques from CAD and animation software, a fully realistic virtual prototype was created. In the next steps the virtual model will be connected with the interface controlling the actual motors, resulting in a real-time control interface that can be used by an operator.

Fig. 2. The Robotic User Interface (RUI) between an operator and a child

Fig. 3. The prototype of the head of Probo

Fig. 11. Adjustable interface for defining the value of the DOF (controlling the position of the eyelid) for each emotion (angle α).

Fig. 12. Emotional interface for controlling facial expressions, point of attention and animations.

TABLE I. DOF AND RANGES OF THE ACTUATED JOINTS OF PROBO'S HEAD IN COMPARISON WITH OTHER PROMINENT NON-HUMANOID ROBOT HEADS