Exploring Embodied Resources in Gaze in Human-Robot Collaborative Environments

Among the various embodied resources available to humans, gaze, beginning with mutual gaze, plays a major role in embodied cognition. In addition to establishing relationships during interactions, gaze conveys information about the level of engagement in a dyadic interaction. Hence gaze and gaze-related behaviors, such as averted gaze, can be used as cues for making decisions about an interaction. This holds true for human-robot interaction (HRI) as well, so proactive robots could evaluate human gaze as a parameter for achieving situation-awareness. In this work, we present the outcomes of several experiments aimed at evaluating the gaze behavior of human-human nonverbal interactions and the other behaviors initiated as a result during dyadic interactions. The possibility of evaluating situations using such behavioral responses as cues is also examined. We further compared the gaze behavior of humans during HRI with that during human-human interaction (HHI). The gaze behaviors considered in this study were the existence and the aversion of gaze. The results indicate interesting tendencies in verbal and nonverbal human behavior when initiating an interaction in both HHI and HRI. The gaze-related behavioral patterns observed during the study were analyzed using statistical methods, and critical observations are highlighted. The potential of analyzing gaze behavior for displaying messages to the outside world during HRI is discussed.


Introduction
A robot's embodiment affects its emotional relationship with human companions. Hence robotic bodies with adequate affective qualities must be designed to foster synergies between various aspects of interactivity [1]. 'Gaze' plays an important role in sustaining communication in a co-operative fashion. Despite the complexity of human behavior, gaze is a simple but prominent cue observed in social human behavior [2,3]. Gaze is further considered a successful way of displaying the innermost intentions of engagement in an interaction to an outsider or the expected conversant [4]. With the growth of collaborative robots, their cognitive skills have to be improved accordingly [5]. Most of the modern robots we encounter today are specialized to perform a given task within a predefined model with limited constraints [6], and hence lack social intelligence. Cognitively and emotionally intelligent robots, in contrast, are expected to have a general sense of social context rather than being task-specific, and robots with such skills received much higher acceptance from their human users. Only then will robots be able to assist their users without causing any disturbance, and users will not have to constrain their behaviors to adapt to the limitations in a robot's perceptive skills. However, theoretical models in embodiment and cognition are at present too unspecific to explain a complete behavioral scenario [7].
According to Arkin [21], an intelligent agent has to be able to generate an abstract model of its current context and the people in it before it acts. To achieve an adequate sense of social context, robots require a proper interpretation of their surroundings, especially their user [8]. Psychological theories such as the theory of mind (ToM) [9] and previous studies in HRI have identified various observable human cues related to movement and behavior that display human interest in a particular subject: gaze, gestures, posture changes, involuntary movements, verbal responses and facial expressions, to name a few. A method to estimate the attention of a user is proposed in [10]. The authors used the pose and the speeds of specific angular joints of a human to predict a user's demand for nonverbal interaction. These two cues were evaluated using fuzzy logic to derive the user's level of interest towards a robot. Even so, there can be other parameters that portray attention more clearly, especially when the users are not engaged in a task.
The human study in [11] was designed to investigate human responses towards an interaction with a robot in various circumstances. As indicated in that investigation, human reactions to a robot-initiated interaction vary depending on the person's current activity. Several studies have suggested that a gaze from a robot is interpreted differently than a gaze from a human [12]; however, that work does not explore the characteristics of gaze behavior between proactive robots and humans. More recently, attempts have been made to understand the temporal characteristics of social gaze, but little research has been conducted in a natural setting with two interacting participants [13].
In [14], the interactivity of an encounter was measured using four types of connection events incorporating gesture and speech: directed gaze, mutual facial gaze, conversational adjacency pairs and backchannels. That work used initiator and responder gaze times to determine gaze, and gaze was used together with the other three parameters to determine user engagement during interaction. The trade-off between the user's and the robot's gaze was used as a cue to evaluate a situation in [15]. In [16], the 3D position of the human frustum and eye movement were continuously monitored to evaluate situation-awareness during a joint task. However, tracking eye movement requires the robot to move into close vicinity, so this evaluation cannot be performed at a distance prior to an interaction. Hence such robots may use 'gaze' instead of 'eye movement'. In addition, the authors of [20] identified gaze as a parameter that portrays user engagement with social robots.
The duration of consistent gaze proved to be an important parameter for determining conversational attention in multi-user environments [17]. The findings of that work confirmed that users' gaze directional cues can form a reliable source of input for conversational systems that need to establish whether the user is speaking or listening to them. It further discusses that the predictive power of gaze-related signals may depend on the individual and on the visual design of the overall conversational system. Mendez et al. [18] propose evaluating the diversity of gaze patterns in older and younger adults during HRI. The authors expect to compare differences in reaction times in response to gaze between the two age groups, although they do not expect to see much difference in preferences for the robots between the age groups. This research will go in line with the findings of our set of experiments in this regard.
An evaluation of personality traits in terms of gaze aimed at the eyes and the body during HHI and HRI is presented in [19]. The results confirmed that participants gaze more at the body in HRI than in HHI, and more at the partner's face in HHI than in HRI. The authors attributed these trends, respectively, to the robot's face only blinking by changing the color of its eyes, which does not provide any information, and to the participants' attempts to understand the meaning of the robot's gestures. From the above examples, we can observe that many robotic systems developed so far have considered 'gaze' as a parameter for evaluating a situation between a human and a robot. Therefore we tried to establish the role of 'gaze' during an interaction, how important it is to evaluate this feature, and which other facts can be explored by analyzing gaze. In this research we tried to find answers to the following questions.
1) How do humans use gaze-related features as actions during human-human interactions?
2) Are similar gaze-related features associated with human-robot interactions?
3) How similar are HHI and HRI in terms of gaze behavior?
For that, we conducted a series of experiments with human-human and human-robot dyads to find patterns in their gaze behavior and thereby answer the above questions.

Understanding human behavior
Robots in human-robot collaborative environments will have to co-operate with humans successfully to ensure their survival in the real world. Humans adopt various physical behaviors to display their intentions to the outside: changes in movement, pose, posture, gestures and the nature of speech, to name a few. Understanding human intents through visual cues is one critical ability such robots will have to master. In this paper, we take an ethological approach to analyze how human gaze behavior changes in the presence of another human and how gaze observations should be adopted in human-robot interactions.

Perception of behavior
Of the numerous cues that humans display, this paper analyzes gaze, which receives much higher attention than other cues during an interaction [22]. Furthermore, gaze can be used as a cue that displays our intentions to the outside world [3]. Hence we intend to identify gaze-related features that are worth analyzing. We expect that such an analysis will facilitate the development of cognitive models for HRI in the future.

Timescales of human actions
According to the timescale of human actions given in [23], actions that take place within 100 milliseconds (ms) to 10 seconds (s) fall under the cognitive band of actions. In our work, we selected 'gaze' behavior, which is also a cognitive act. Gaze can sometimes be an automatic behavior and sometimes a controlled behavior [24], where the reaction time may take up to about 1.3 seconds. Hence we adopted a timescale for observing gaze behavior in human-human and human-robot dyadic interactions.

Understanding gaze behavior
Figure 1 illustrates a typical human-human encounter and the change in the responder's gaze with time. Three gaze parameters were analyzed during this study: response time (delay), duration, and whether the gaze was averted. These occur at different timescales when initiating an interaction. The 'delay' represents the time taken by the responder to react once a change in the situation is detected; for gaze, the change in situation is the presence of an outsider who seeks attention. The 'duration of gaze' is the entire time the responder looks at the approacher. An 'averted gaze' occurs when the responder returns his or her gaze to the previous task. These gaze features can tell an outsider a great deal about the responder's inner intents and priorities. Hence these three features of gaze during HHI and HRI were investigated separately in our experiments.
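For concreteness, the three gaze features above can be represented per encounter as a simple annotation record. The sketch below is our own illustration; the names and structure are assumptions and are not taken from the study's tooling.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical per-encounter record for the three gaze features described
# above: delay, duration, and whether the gaze was averted.
@dataclass
class GazeObservation:
    delay_s: Optional[float]     # time from approach until gaze begins; None if no gaze
    duration_s: Optional[float]  # how long the gaze was held; None if no gaze
    averted: bool                # True if the responder returned to the previous task

def classify(obs: GazeObservation) -> str:
    """Label an encounter as 'no gaze', 'averted gaze', or 'continued gaze'."""
    if obs.delay_s is None:
        return "no gaze"
    return "averted gaze" if obs.averted else "continued gaze"

print(classify(GazeObservation(delay_s=1.1, duration_s=4.5, averted=True)))
```

A record like this, filled in from the video annotations, is all that is needed to reproduce the counts and averages reported in the results section.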

Setting
The experiment was conducted in study areas and laboratories inside the university with the participation of 47 subjects aged 25-51 years (mean 38.6, SD 11.31).
Nearly half of the participants (23) had no strong technical background. The experiment was conducted in two stages: HHI and HRI. Initially, participants were asked to be seated and to read the guidelines of the experiment in a document opened on a laptop on a table, as shown in Figure 2. While the participant was reading these guidelines, a person previously known to them, but not a family member or relative, approached him or her. The participant's reaction to the approacher was recorded. The role of the approacher was to walk towards the target person without initiating any verbal or nonverbal interaction; here, interactions refer to dialogs, gestures, facial expressions, etc. These observations were analyzed under stage 1, the human-human interactive scenario (HHI). The same procedure was repeated in stage 2, but this time the approacher was a robot. Stage 2 was conducted once a week over a period of 3 weeks to give the participants time to adapt to social robots. We terminated the experiments once we observed several trends in HRI that were similar to HHI.
Stage 2, the human-robot interactive scenario (HRI), was conducted on three occasions over 3 weeks: HRI-week 1, HRI-week 2 and HRI-week 3, in which the same experiment was conducted in the first, second and third week respectively. Only stage 2 was spread over 3 weeks because human-robot interaction is not yet as normalized as human-human interaction, so we gave participants time to adapt to human-robot collaborative environments by repeating the same experiment three times in consecutive weeks. The only difference in stage 2 was that the approacher was the robot, whereas the approacher was another human in stage 1.
Therefore, in the end, four occasions were considered, as follows.
• HHI-Both the approacher and the participant were humans.
• HRI-week 1-The approacher was the robot; conducted in the first week.
• HRI-week 2-The same experiment, repeated in the second week.
• HRI-week 3-The same experiment, repeated in the third week.
In all four occasions, the interaction was dyadic. We did not allow other people in the surroundings, as human responses might be influenced by the presence of outsiders [11]. The interaction was stopped before any conversation between the two parties began. In every occasion, the features associated with gaze: presence of gaze/averted gaze, delay in gaze and duration of gaze, were recorded using a video camera set up in front of the setting. The video was later analyzed to extract the required parameters of the interaction. Two example scenarios, from HHI and HRI, are shown in Figure 2.

Results of the experiments and discussion
First, we analyzed whether gaze existed in a scenario and, if it did, we recorded the parameters of the observed gaze behavior. Finally, we looked for other responses towards the interaction in addition to gaze. An in-depth analysis of our observations is given in the following paragraphs.
Several labels are used for ease of referring to the gaze-related parameters. These are defined below.
• Continued gaze-The user looked at the approacher and continued looking until the interaction ended.
• Averted gaze-The user looked at the approacher and averted the gaze after a while.
• Delay-The time taken by the participant to look at the approacher after the person approached him/her.
• Duration-The duration of the gaze before the participant averted it.
In the graphs, 0 and 1 denote the non-existence and existence of the considered parameter respectively. Figure 3 shows the presence of gaze parameters during HHI. As seen from the graph, 26 of the 47 participants (55%) maintained gaze with the approacher. Of them, 14 (54% of those who gazed, or 29.8% of all participants) averted their gaze and turned back to their work. Thus, even while engaged in a task, more than half of the participants paid attention to the approacher, which can be considered a friendly gesture, yet more than half of those later averted their gaze and returned to their work. It is interesting that most individuals tried to be interactive and, at the same time, not to be disturbed.
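The reported proportions follow directly from the counts; as a quick arithmetic check, using only the counts stated above:

```python
# Counts reported for the HHI stage: 47 participants, 26 gazed, 14 averted.
total, gazed, averted = 47, 26, 14

pct_gazed = 100 * gazed / total             # share of all participants who gazed
pct_averted_all = 100 * averted / total     # share of all participants who averted
pct_averted_gazers = 100 * averted / gazed  # share of gazers who averted

print(round(pct_gazed), round(pct_averted_all, 1), round(pct_averted_gazers))
# -> 55 29.8 54
```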
The percentage of encounters with continued/averted gaze in each of the four occasions is shown in Figure 4. In HHI this percentage was 55, and in HRI it was 81, 68 and 49 in the three consecutive weeks. A high percentage of individuals maintained gaze in HRI-week 1, but this reduced considerably (by 32 percentage points) by week 3, by which time HRI was only slightly different from HHI in gaze behavior.
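For reference, the reported percentages are consistent with the following head counts out of the 47 participants; the per-week counts are inferred from the percentages and are not stated in the text.

```python
# Inferred number of participants who maintained any gaze, per occasion (n = 47).
counts = {"HHI": 26, "HRI-week 1": 38, "HRI-week 2": 32, "HRI-week 3": 23}
pct = {occasion: round(100 * n / 47) for occasion, n in counts.items()}
print(pct)
# -> {'HHI': 55, 'HRI-week 1': 81, 'HRI-week 2': 68, 'HRI-week 3': 49}
```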
Considering the scenarios with averted gaze, the duration of gaze is plotted against the average delay of gaze for the four occasions in Figure 5. The durations of gaze were 4.45, 3.36, 2.97 and 1.8 seconds respectively, and the corresponding average delays were 1.06, 1.88, 3.23 and 3.35 seconds. The difference between the duration of gaze and the delay in establishing gaze was largest in HHI and in HRI-week 3, but in opposite directions: the duration of gaze exceeded the delay in HHI, whereas the delay exceeded the duration in HRI-week 3. From this it can be seen that, when humans are engaged in a task, robots are neglected considerably more than humans.
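The reversal described above can be made explicit by subtracting the average delay from the average duration for each occasion, using the values reported above:

```python
# Average gaze duration and delay (seconds), per occasion, from Figure 5.
occasions = ["HHI", "HRI-week 1", "HRI-week 2", "HRI-week 3"]
duration_s = [4.45, 3.36, 2.97, 1.80]
delay_s = [1.06, 1.88, 3.23, 3.35]

# Positive margin: gaze held longer than it took to establish; negative: the reverse.
margin = {o: round(d - l, 2) for o, d, l in zip(occasions, duration_s, delay_s)}
print(margin)
# -> {'HHI': 3.39, 'HRI-week 1': 1.48, 'HRI-week 2': -0.26, 'HRI-week 3': -1.55}
```

The margin shrinks week by week and turns negative, which is the "robots are neglected more" trend stated in the text.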
Figure 6 presents box plots of the gaze delays in encounters where a continued gaze was present, for the four occasions. The first and third quartiles and the mean are marked on the box-and-whisker plots, together with outliers and the minimum and maximum values recorded. Of the four occasions, HRI-week 1 recorded the smallest average delay in continued gaze. This value gradually increased by week 3, and HHI and HRI-week 3 resulted in similar average delays in continued gaze; that is, the trends in HHI and HRI in terms of gaze delay had converged by the end of the 3 weeks. A possible reason for the scatter of the data in week 2 is that people were becoming familiar with social robots in their environment; the scatter normalized again in week 3. This may also explain the negative delays in gaze that were observed: the unfamiliarity of humans with robots led them to pay much more attention to the robot's behavior as it approached.
Figure 7 presents box plots of the gaze delays in encounters where an averted gaze was present. The first and third quartiles and the mean for each occasion are marked on the box-and-whisker plots. As in the previous graphs, averted gaze recorded similar trends in HHI and HRI-week 3 in terms of the average values. When continued gaze delay and averted gaze delay were compared, the difference between HHI and HRI-week 3 was very small. A probable explanation is that interested users paid attention to both robots and humans, whereas users who were not interested in an interaction tended to neglect the robot for longer. Even though the mean values of averted gaze duration in HHI and HRI-week 3 are close, the actual durations in HRI were scattered over a wide range of values. Familiarity and past experience with robots, and personal opinions about robots, are probable reasons for this observation; humans still consider a robot a 'machine' rather than a 'companion', and such perceptions can also account for the observed behavior.
In addition to gaze-related behaviors, several other responses towards the interaction were observed during the study: voice responses such as greetings (e.g. "Hey", "Hello, how are you?"), smiles, changes in body pose and changes in hand pose. Figure 8 shows the behaviors adopted by the participants to react to the approacher in addition to gaze. Of the total participants, 44.68% used some kind of voice response to welcome the approacher, 36.17% smiled at the approacher, 23.40% changed their body pose voluntarily or involuntarily, and 17.02% changed their hand pose during the encounter. Therefore humans use several other responses to support an interaction even when engaged, and proactive robotic systems face the challenge of perceiving such behaviors in addition to analyzing gaze behavior.
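The response rates above correspond to whole-person counts out of 47; the counts below are inferred from the percentages and are not stated directly in the text.

```python
# Inferred counts of participants showing each additional response (n = 47).
responses = {"voice": 21, "smile": 17, "body pose": 11, "hand pose": 8}
pct = {r: round(100 * n / 47, 2) for r, n in responses.items()}
print(pct)
# -> {'voice': 44.68, 'smile': 36.17, 'body pose': 23.4, 'hand pose': 17.02}
```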

Finding answers
As mentioned in section 1, the objective of this set of experiments was to answer the following questions; the answers derived from the study are given below.
1) How do humans use gaze-related features as actions during human-human interactions?
During the study we observed the existence or nonexistence of gaze and, where gaze existed, its delay and duration. These gaze parameters were measured numerically as part of the study. In addition, four other responses were observed: voice, smile, pose changes and hand movements. From the results, gaze was the most frequently used of all the observed behaviors during HHI. Hence exploring gaze-related behaviors received much higher attention when analyzing the results.
2) Are similar features associated with human-robot interactions? Once humans were comfortable with social robots, the responses they use towards other humans could be observed towards robots as well; a comparison of the results in Figure 8 and [9] testifies to this. According to these two studies, when users were busy, robots received less attention than humans, and consequently fewer responses. Nevertheless, the behaviors observed in front of robots differed from those in front of humans only slightly, except in number.
3) How similar are HHI and HRI in terms of gaze behavior? Of the three parameters considered: continued gaze delay, averted gaze delay and averted gaze duration, we observed similarities between HHI and HRI in continued gaze delay and averted gaze delay, whereas averted gaze duration showed a significant difference between HHI and HRI. Therefore we can say that there are dissimilarities between HHI and HRI in terms of gaze. Identifying these differences provides insight for accurate cognitive models of future robot assistants and is also important in evaluating the human acceptance of social robots.
The answers to these questions were critically discussed under results and discussion in section 3.2. According to the findings, gaze plays an important role during interactions, in both HHI and HRI, with some unique features in each.

Implications of the study
Gaze is an important cue that determines the interactivity of a situation. Hence considering gaze in decision-making can be proposed as a design guideline for future robotics.
The second design guideline derived from our study is to analyze the properties associated with gaze rather than considering only its existence or nonexistence.
Human-robot interaction was not very different from human-human interaction in terms of gaze behavior. Therefore proactive robotic systems that are intended to initiate interactions with humans can adopt human rules and norms related to gaze. This is the third design guideline proposed by this study.
Robots are deployed in social environments to different degrees in different communities. In this study we considered younger and middle-aged Asian adults originating from Sri Lanka, where robots are rare in social environments. Therefore considering previous experience with robots will be our fourth design guideline.
In addition to the relationship between the two conversants, there can be other factors that contribute to differences in gaze parameters. For instance, factors such as age group and gender might have an influence at some level. Therefore considering other factors of interest will be beneficial for decision-making during HRI; this is the fifth design guideline derived from our experiments.
Even though gaze is a strong display of emotion during interaction, there are other factors that determine the emotions associated with an encounter as well. Therefore multimodal perceptive skills will be required for a robot to perceive a situation fully during social interaction, although considering gaze as an observable cue is crucial according to the results of this work. Development of such a perceptive ability is the sixth design guideline derived from the study.

Conclusions
We conducted a set of experiments on gaze behavior between human-human and human-robot dyads extending over a period of a few weeks. We analyzed similarities and differences in three gaze-related parameters: the delay in establishing any kind of gaze, the duration of gaze, and whether the gaze was averted. The results confirmed that 'gaze' plays an important role in HRI as well as in HHI, with some observable variations, and that there are differences between HRI and HHI with respect to the parameters associated with gaze. Therefore, before imitating humanlike behaviors, it is important to identify how and why gaze parameters differ between these two scenarios. Our experiments combined HHI and HRI in terms of gaze behavior to lay a justifiable basis for the development of multimodal approaches to a robot's cognition. In line with previous studies, gaze proved to have a larger impact on the interactivity of a scenario than the other behavioral responses. We evaluated and discussed how the existence or non-existence of gaze, gaze time and averted gaze differ in HRI compared with HHI. Our findings further suggest that, of the parameters considered, averted gaze showed a significant difference between HHI and HRI. These findings will help future proactive robots to make situation-aware decisions in human-robot collaborative environments. Trends in human-human interaction that could be adopted in human-robot interaction are suggested as outcomes of this work, helping future robots evaluate a situation satisfactorily in proactive HRI. In the future, we will explore the roles of other observable human cues, besides gaze, that have an impact on the interactivity of a situation.

Figure 2 .
Figure 2. Two scenarios during the experiment. (a) An HHI scenario: both the responder and the approacher were humans. (b) An HRI scenario: the approacher was a robot while the responder was a human. In both cases, the responses generated by the responder after noticing the presence of the approacher were recorded.

Figure 3 .
Figure 3. Existence or nonexistence of gaze during HHI, plotted against each user. Instances where the user averted his/her gaze are marked in light blue, while continued gaze is marked in dark blue.

Figure 4 .
Figure 4. The percentage (%) of encounters with any kind of gaze (continued or averted), out of the total encounters considered, plotted against the occasion: HHI, HRI-week 1, HRI-week 2 and HRI-week 3.

Figure 5. Figure 6.
Figure 5. Of the total encounters where averted gaze was observed, the average duration of gaze is plotted against the average delay. All units are in seconds (s).

Figure 7 .
Figure 7. The delays of encounters where an averted gaze was present. The box plots are drawn to identify the behavior of the parameters and their evolution with time and experience. The markers are drawn to the scale of the duration of gaze.

Figure 8 .
Figure 8. Responses observed from users during HHI, in addition to gaze, plotted against each individual.