Quantifying embodiment towards building better robots based on muscle-driven models

In recent years, researchers have investigated different methods to quantify embodiment for a variety of robotic systems, including robotic arms, grippers and legged robots. This paper discusses some of these methods, focusing on their potential contribution to designing robotic systems based on muscle-driven models. We start with a definition of embodiment based on the relational dynamics between the system and its environment, drawing upon the ideas of mutual perturbation and structural coupling between the two. We then discuss how such an understanding can provide potential approaches to quantifying embodiment. These include two information-theoretic measures which are particularly suitable for muscle-driven models. The two methods are based on (i) comparing controller and behaviour complexity and (ii) conditional mutual information, which compares the distribution of the next world state conditioned on both the previous state and the action with the distribution conditioned on the action alone. These methods have been used on muscle-driven, biologically realistic hopping models to quantify embodiment at different stages of the hopping gait. The results clearly demonstrate the contribution of the morphology of the muscle fibers at different points in the hopping cycle. Furthermore, these methods have been used in later studies to measure the contribution of embodiment across different levels of a hierarchical control system in a neuro-musculoskeletal model, and to quantify the effects of information cost during various actions in a muscle-driven robotic system. We discuss the practical implications as well as the limitations and future work in the application of these quantification methods.


Introduction
Traditionally, robotic systems have been dominated by control laws based on kinematics and dynamics. Recently, however, the importance of embodiment for the effective operation of intelligent machines has been recognised. Historically, the motivation for studying embodiment comes from studying a wide range of biological systems [1] which exhibit remarkable performance. Examples include systems such as insects [2], dogs and other mammals [3] [4]. To illustrate embodiment in biological systems of varying complexity, we start with the simplest one, a single-cell organism: the bacterium E. coli, see Figure 1. This remarkable organism does not possess a central nervous system and is simply propelled by its flagella to converge on nutrient sources. One could argue that only the embodiment of the organism is driving its locomotion behaviour. A much more complex example is the human body, which comprises hundreds of muscle-tendon complexes (MTC) [5], forming a complex neuro-muscular system. While neuronal structures are responsible for the initiation and control of movements, it has been pointed out that the physical properties of the MTCs, such as thickness, tendon stiffness, etc., contribute to the overall control as well [6]. The muscles provide low-level reflexes [7] [8] capable of stabilising the system against external perturbations, such as sudden changes in the terrain or obstacles [9]. These examples suggest that embodiment plays an important role in the control of muscle-driven movement. To study the extent of embodiment in muscle-driven systems quantitatively, a suitable measure is required. In recent years, various information-theoretic methods have been proposed by a number of researchers, including Ghazi-Zahedi et al. [10], Polani et al. [11], and Rückert and Neumann [12]. To the best of our knowledge, the methods by Ghazi-Zahedi et al.
[10] served as the main basis for a range of work on the quantification of embodiment in muscle-driven models, e.g., [13] [14]. Due to this central role, the aim of this paper is to review the methods by Ghazi-Zahedi et al. and to provide a critical examination of their use in non-trivial, realistic scenarios, by simulating a hopping gait on a muscle-driven model [15]. We also provide an overview of subsequent research applying these methods to more complex models.
In the next section, we start with a conceptual understanding of embodiment in the ontological context of robotics, primarily based on research by Quick et al. [16]. This is followed by a comparison of the two main information-theoretic quantification methods. We then discuss their applications by Haeufle et al. [14], in quantifying the extent of embodiment at various levels of hierarchy in a neuro-musculoskeletal control system, and by Polani et al. [13], in understanding the impact of embodiment on informational costs.

Embodiment in the context of robotics
There are several studies that have defined and examined different uses and interpretations of the notion of embodiment, from its theoretical grounding in philosophy [17] to its application in computer science and robotics [18] [6]. Some of the seminal research by Brooks et al. on intelligence [19] suggested that embodiment is almost invariably analysed in terms of the relational dynamics that exist between a system and its environment. In their research, the authors described robots implementing an abstract architecture, referred to as the subsumption architecture. This architecture embodies the fundamental ideas of decomposition into layers of task-achieving behaviours, and of composition through addressing the noisiness and unpredictability associated with the real world. However, no formal definition was provided that would allow any pathway towards the quantification of embodiment. More recently, on the other hand, Quick et al. [16] proposed a framework to understand and exploit embodiment. It is based on the idea of structural coupling, postulated in the research by Maturana and Varela [1], who introduced it as a process that occurs when two structurally plastic systems repeatedly perturb one another in a non-destructive fashion over a period of time, leading to a structural 'fit' between the two systems. In the context of embodiment, a plastic system is one that can be affected by external events and perturbed by another system. Building on this idea of structural coupling, Quick et al. [16] proposed a definition of embodiment:

Definition 2.1 (Embodiment). A system X is embodied in an environment E if perturbatory channels exist between the two. That is, X is embodied in E if for every time t at which both X and E exist, some subset of E's possible states have the capacity to perturb X's state, and some subset of X's possible states have the capacity to perturb E's state.
This definition also provides a sound basis for articulating a system-environment relationship within robotics. Take the example of a legged robot walking on a sandy surface (see Figure 2): the environment (sand) provides a normal (contact) force and friction as perturbations to the foot (body). Conversely, the environment (sand) is perturbed by the weight of the body and the shape of its foot as the robot walks. A muddy surface will produce different levels of perturbation than a sandy surface, due to different contact forces, friction, and compliance. In the next few sections, we look at methods to quantify embodiment based on this relationship between the body and the environment.

Figure 2: A robot with a walking gait and its foot showing contact and friction forces as perturbations in the body and environment [4] [20]
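This mutual perturbation can be illustrated with a toy numerical sketch, in which a one-dimensional 'body' state and 'environment' state each nudge the other at every time step until a structural 'fit' emerges. The dynamics and coefficients below are invented purely for illustration and carry no physical meaning.

```python
def step(body, env, k_body=0.5, k_env=0.3):
    """One update of two mutually perturbing plastic systems."""
    new_body = body + k_body * (env - body)  # environment perturbs body
    new_env = env + k_env * (body - env)     # body perturbs environment
    return new_body, new_env

body, env = 0.0, 1.0
for _ in range(50):
    body, env = step(body, env)

# Repeated non-destructive perturbation drives the two states together,
# mirroring Maturana and Varela's structural 'fit'.
print(abs(body - env) < 1e-6)  # True
```

In the sketch, perturbatory channels exist in both directions (each update reads the other system's state), which is exactly the condition in Definition 2.1.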

Quantifying Embodiment
In this section, we describe four properties that can potentially be used to quantify embodiment, as relevant to muscle-driven models. Quick et al. [16] introduced two of them, and Ghazi-Zahedi et al. [10] proposed the remaining two. To explain these properties, we will use examples from the context of a legged robot. Specifically, we will look at the robot Solo8 (Figure 3), an open-source, impedance-controlled robot with 8 degrees of freedom (DoF), developed by the Open Dynamic Robot Initiative (ODRI) [21] [22]. As a reminder for the reader, the concept of impedance Z(s) is central to this robot: it is defined as the dynamic relation between position perturbations and forces [23] [24]. The underlying principle of impedance control is to define a desired end-effector dynamic behaviour in response to unexpected disturbance forces. For the Solo8, an impedance controller regulates the stiffness and damping of the foot with respect to the hip, as per the control law in Equation 1:

τ = J^T [ K_d (X_d − X) + D_d (Ẋ_d − Ẋ) ]    (1)

where X_d and X are the desired and actual foot positions (Ẋ_d and Ẋ the corresponding velocities), K_d and D_d the desired stiffness and damping matrices, τ the joint torque vector, and J the foot Jacobian. Having provided an overview of the control law of the example legged robot, we will use this example to describe the four properties for quantifying embodiment, which are:

Size of the system's and environment's structure, where 'structure' is characterised in terms of its constituent components and their relationships. For a dynamical system like a legged robot, this covers the size of its state space. In the example of the impedance-controlled robot above, the state space would comprise parameters such as stiffness, damping, actuator torque, velocity and position. The size of the state space can simply be derived from the model. While it is straightforward to use for quantifying embodiment, it has limited value, because it provides no information beyond the size of the state space: it does not consider the environment and only provides a discrete quantification, i.e., a number of states.

Bandwidth of perturbatory channels, which essentially quantifies the effect the system and environment have on each other. Our example robot has 8 DoF (2 DoF per leg), with a force sensor on its foot. While walking, the impact of the environment on the body, i.e. the bandwidth, essentially comprises the forces propagating through the two joints, plus friction forces. For a robot designed for outdoor inspection, such as ANYmal [4], the bandwidth additionally comprises the forces exerted on the body by wind, on top of the normal and friction forces from the ground. This quantification can also be derived directly from the model and provides a useful measure of the number of parameters that are perturbed. However, depending on the type of perturbation, it might be non-trivial to obtain an appropriate means of measurement.

Contribution of Morphological Computation (MC), where 'morphology' refers to the agent's body, explicitly including all its physical properties (shape, sensors, actuators, friction, mass distribution, etc.). In the context of embodied intelligence, MC refers to processes conducted by the body (and environment) that would otherwise have to be performed by the brain [25]. MC, in this sense, captures the idea that control is partially performed by the controlled system interacting with the environment. This measure quantifies the contribution of MC as compared to active control for a specific gait or trajectory. For example, adding compliance to the body of a given robot, or adapting existing compliant elements in the body, increases the contribution of MC, as demonstrated by Mohseni et al. [5].

Level of controller complexity, which essentially defines a factor of complexity of the dynamic relationship between the system and the environment [26]. In the legged robot example, the complexity results from the impedance control law operating over a total of 8 DoF, along with the inner closed-loop control system for the actuators. In contrast, a legged robot with an imaging system would add another layer of complexity, with the requirement of cameras and machine-vision systems.
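To make the Cartesian impedance control law of Equation 1 concrete, the following sketch maps a desired stiffness and damping at the foot into joint torques; a damping term acting on the foot velocity is included, as implied by the damping matrix D_d. The Jacobian, gains, and poses are illustrative placeholders, not parameters of the actual Solo8 controller.

```python
import numpy as np

def impedance_torques(J, K_d, D_d, x_d, x, xdot_d, xdot):
    """Joint torques realising a desired Cartesian stiffness/damping at
    the foot: tau = J^T (K_d (x_d - x) + D_d (xdot_d - xdot))."""
    f = K_d @ (x_d - x) + D_d @ (xdot_d - xdot)  # virtual force at the foot
    return J.T @ f                               # map into joint space

# One 2-DoF leg of a Solo8-like robot; all numbers are made up.
J = np.array([[0.16, 0.08],      # foot Jacobian at some configuration
              [0.05, 0.12]])
K_d = np.diag([150.0, 150.0])    # desired stiffness [N/m]
D_d = np.diag([3.0, 3.0])        # desired damping [N s/m]

tau = impedance_torques(J, K_d, D_d,
                        x_d=np.array([0.00, -0.25]),  # desired foot position
                        x=np.array([0.01, -0.24]),    # actual foot position
                        xdot_d=np.zeros(2),
                        xdot=np.array([0.0, 0.05]))
print(tau.shape)  # (2,) -> one torque per joint
```

Note that no force sensing is needed: the compliance emerges from commanding torques proportional to the position and velocity errors, which is what makes the leg behave like a spring-damper at the foot.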
Having discussed the four properties, we now turn to methods of realising the quantification. This paper focuses on the last two properties, i.e. MC and controller complexity, since they provide more insight than the other two and they are the only ones that have been practically implemented so far. Interestingly, both methods are based on information theory [27]. The next sections describe these methods in more detail.

Information-theoretic methods
In recent years, several attempts have been made to incorporate measures of embodiment into the information processing of artificial agents. Polani [11] minimises controller complexity in a reinforcement-learning setting. Rückert and Neumann [12] quantify MC indirectly, by measuring how changes in morphology reduce or increase the required controller complexity in the context of stochastic optimal control. Ghazi-Zahedi et al. [10] proposed a total of seven methods, based on understanding the contribution of morphology as the system moves from one state to the next. All the above methods rest on the principle of decomposing the mutual information of key random variables, relating to the interaction of the body and the environment, into unique, shared, and synergistic information. The difference is that the methods proposed by Ghazi-Zahedi et al. are focused on direct quantification, whereas Polani et al. and Rückert and Neumann quantify MC indirectly, by measuring controller complexity. To the best of our knowledge, the methods of direct quantification have been the main basis for quantifying MC in muscle-driven models. Furthermore, in subsequent work, Ghazi-Zahedi et al. [15] described two of the seven methods as suitable for operating on the full system state, which enables the investigation of realistic muscle models without the approximation errors that could arise from limited sensor information. In contrast, the other five methods derive partial information from sensors and actuators, and could therefore introduce errors when quantifying MC in muscle-driven models.
Before elaborating on the quantification methods, it is necessary to explain the concept of the sensorimotor loop. According to Ghazi-Zahedi et al. [10], the sensorimotor loop [28] is the basic control loop used in robotics, comprising a controller which sends signals to the system's actuators, thereby affecting the system's environment. The body and environment are therefore encapsulated in a single random variable named world. The loop involves three (stochastic) processes S(t), A(t), W(t) (see Figure 4), which take values s, a, w in the sensor, actuator, and world state spaces. The directed edges reflect causal dependencies between these random variables in discrete time steps. Random variables without a time index refer to some fixed time t and primed variables to time t + 1, i.e., the two variables W, W′ refer to W(t), W(t + 1).
Starting with the initial distribution over world states, denoted by p(w), the sensorimotor loop for reactive systems is given by three conditional probability distributions, β, α, π, also referred to as kernels.The sensor kernel, which determines how the agent perceives the world, is denoted by β(s|w), the agent's controller or policy is denoted by π(a|s), and finally, the world dynamics kernel is denoted by α(w ′ |w, a).The world dynamics kernel α(w ′ |w, a) captures the influence of the actuator signal A and the previous world state W on the next world state W ′ .
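As a minimal illustration of the three kernels, the following sketch samples a trajectory through a binary-state sensorimotor loop. The probability tables are invented for illustration and are not taken from the hopping model.

```python
import numpy as np

rng = np.random.default_rng(0)

# The three kernels of the sensorimotor loop, for binary states:
beta = np.array([[0.9, 0.1],    # sensor kernel beta(s|w): row w, col s
                 [0.2, 0.8]])
pi = np.array([[0.7, 0.3],      # policy pi(a|s): row s, col a
               [0.4, 0.6]])
alpha = np.zeros((2, 2, 2))     # world dynamics alpha[w, a, w']
alpha[0, 0] = [0.8, 0.2]
alpha[0, 1] = [0.3, 0.7]
alpha[1, 0] = [0.5, 0.5]
alpha[1, 1] = [0.1, 0.9]

def rollout(steps, w=0):
    """Sample (w, s, a, w') transitions through the loop."""
    traj = []
    for _ in range(steps):
        s = rng.choice(2, p=beta[w])        # agent perceives the world
        a = rng.choice(2, p=pi[s])          # controller picks an action
        w_next = rng.choice(2, p=alpha[w, a])  # world updates
        traj.append((w, s, a, w_next))
        w = w_next
    return traj

traj = rollout(1000)
print(len(traj))  # 1000 sampled transitions
```

Empirical joint distributions over (W′, W, A, S) estimated from such trajectories are what the two MC measures below operate on.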
To measure MC, we need to determine the extent to which the system's behaviour results from the world dynamics (i.e., the body's internal dynamics and its interaction with its world) and how much of the behaviour is determined by the controller policy π. Using the above conditional probability distributions and kernels, Ghazi-Zahedi et al. simulated the hopping gait of a legged robot and measured MC [15] using two complementary methods. The first method, based on conditional mutual information, captures the change in the world dynamics kernel to measure the contribution of MC; the second measures the difference between controller and behaviour complexity. Both are described in the next sections.

Method 1 - MC as conditional mutual information (MC_W)
The principle underlying this method is based on the change to the world dynamics kernel α(w′|w, a), depending on the extent of MC in the system. A complete lack of MC would mean that the behaviour of the system is entirely determined by the system's controller, and hence by the actuator state A. In this case, the world dynamics kernel reduces to p(w′|a). Every deviation from this assumption means that the previous world state W had an influence, and hence information about W changes the distribution over the next world states W′. The discrepancy between these two distributions can be measured with the average Kullback-Leibler divergence D_KL(α(w′|w, a)||p(w′|a)), which is also known as the conditional mutual information I(W′; W|A). This is computed by:

MC_W = I(W′; W | A) = Σ_{w′,w,a} p(w′, w, a) log [ p(w′|w, a) / p(w′|a) ]    (2)
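A plug-in estimate of this quantity can be sketched directly from a joint distribution p(w′, w, a). The function below is a minimal illustration of Equation 2, not the authors' implementation; the two test distributions are constructed so that the value is zero when W′ depends on A alone and maximal when W′ copies W regardless of A.

```python
import numpy as np

def mc_w(p_joint):
    """MC_W = I(W'; W | A) in bits, from a joint distribution array
    indexed as p_joint[w', w, a]. Plug-in estimate; illustration only."""
    p_wa = p_joint.sum(axis=0)    # p(w, a)
    p_w1a = p_joint.sum(axis=1)   # p(w', a)
    p_a = p_wa.sum(axis=0)        # p(a)
    mi = 0.0
    for w1, w, a in np.ndindex(p_joint.shape):
        p = p_joint[w1, w, a]
        if p > 0:
            # p(w'|w,a) / p(w'|a) = p(w',w,a) p(a) / (p(w,a) p(w',a))
            mi += p * np.log2(p * p_a[a] / (p_wa[w, a] * p_w1a[w1, a]))
    return mi

# If W' is independent of W given A, MC_W is zero ...
indep = np.ones((2, 2, 2)) / 8.0
print(round(mc_w(indep), 10))  # 0.0

# ... and if W' copies W regardless of A, MC_W is maximal (1 bit here).
copy = np.zeros((2, 2, 2))
for w in range(2):
    for a in range(2):
        copy[w, w, a] = 0.25
print(round(mc_w(copy), 10))  # 1.0
```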

Method 2 - MC as comparison of behaviour and controller complexity (MC_MI)
This method is based on the assumption that, for a given behaviour, MC decreases with an increasing effect of the action A on the next world state W′. The measure compares the complexity of the behaviour with the complexity of the controller. The complexity of the behaviour can be measured by the mutual information of consecutive world states, I(W′; W), and the complexity of the controller by the mutual information of sensor and actuator states, I(A; S). The mutual information I(W′; W) is high if the system shows diverse but non-random behaviour, which is what we would like to see in an embodied system. On the other hand, a system with high MC should produce complex behaviour from a controller with low complexity. Hence, we want a low mutual information I(A; S), which means either that the policy has a low diversity in its output (low entropy over actuator states H(A)), or that there is only a very low correlation between sensor states S and actuator states A (high conditional entropy H(A|S)). Therefore, this measure is defined as the difference of these two terms:

MC_MI = I(W′; W) − I(A; S)

In the experiments of Ghazi-Zahedi et al. [15], the random variables W, W′, A and S were created as per the algorithms described in the paper. The robot hopped periodically, always to the same height of 1.07 m. Figure 5 shows MC_W measured in two different phases of the hopping gait, indicated by the vertical lines: first, the flight phase, during which the robot does not touch the ground, and second, the stance deceleration phase, which occurs after landing. During flight, the behaviour of the system is governed only by the interaction of the body (mass, velocity) and the environment (gravity), and not by the actuator models; all actuator control signals are constant during flight. As shown in Figure 5, MC drops as soon as the system touches the ground. Shortly after touchdown, the system shows a strong decline in MC, followed by a strong rise during the deceleration by the muscle. Furthermore, it was found that MC_W and MC_MI have similar values, and therefore only MC_W is shown in the plot.
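A corresponding sketch for this second measure computes the two mutual-information terms from joint distributions. Again, this is a minimal plug-in estimate for illustration, not the authors' code; the example distributions are chosen to show the ideal case of complex, deterministic behaviour driven by a controller whose actions carry no information about the sensor.

```python
import numpy as np

def mutual_information(p_xy):
    """I(X; Y) in bits from a joint distribution array p_xy[x, y]."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    with np.errstate(divide='ignore', invalid='ignore'):
        terms = p_xy * np.log2(p_xy / (p_x * p_y))
    return np.nansum(terms)  # 0 * log 0 terms are dropped

def mc_mi(p_w1w, p_as):
    """MC_MI = I(W'; W) - I(A; S): complex behaviour produced by a
    simple controller yields a high value."""
    return mutual_information(p_w1w) - mutual_information(p_as)

# Deterministic behaviour (W' copies W) with a controller whose actions
# ignore the sensor: maximal MC_MI for binary states.
p_w1w = np.array([[0.5, 0.0],
                  [0.0, 0.5]])   # I(W'; W) = 1 bit
p_as = np.ones((2, 2)) / 4.0     # I(A; S) = 0 bits
print(mc_mi(p_w1w, p_as))  # 1.0
```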

Discussion of the two information-theoretic methods
The numerical results of the two quantification methods, MC_W and MC_MI, complement other findings, such as [29], which showed that the minimum information required to generate hopping is reduced by the material properties of the non-linear muscle fibers. The higher MC in the muscle model can be attributed to the force-velocity characteristics of the muscle fibers [30]. These non-linear contraction dynamics reduce the influence of the controller on the actual hopping kinematics [29], implying that the kinematic trajectory is largely predetermined by the material properties.
One of the points raised by Ghazi-Zahedi et al. [15] is whether there is a preference for MC_W or MC_MI. Both methods are useful: MC_W to evaluate the extent of the role that morphology plays, and MC_MI to understand controller complexity. The latter can help to simplify the controller design, where possible.
Other related work using information-theoretic approaches for the quantification of embodiment

While the examples provided are interesting, the chosen simulation setups are, in general, rather simple. It is of interest to understand how embodiment could be measured in hierarchical structures with more complexity. The first example where this has been applied is Haeufle et al.
[14], who investigated this by simulating point-to-point and oscillatory human arm movements with a neuro-musculoskeletal model. The model (Figure 6) was developed to give access to all signals, i.e., the low-level controller, control signal, muscles' active state (biochemistry), force, joint torques, and the resulting joint angles. MC was then quantified at the different hierarchy levels. The results show that morphological computation is highest for the most central level u_central of the modelled control hierarchy, where movement initiation and timing are encoded [14]. Furthermore, they show that the lowest neuronal control layer u, the muscle stimulation input (see Figure 6), exploits the morphological computation of the biochemical and biophysical muscle characteristics to generate smooth dynamic movements. This study suggests that the system's design, in its mechanical as well as its neurological structure, can take over important contributions to control which would otherwise need to be performed by the higher control levels.
Another related work was carried out by Montúfar et al. [31], who quantified the advantages of providing a well-chosen embodiment. The authors link the quality of an embodiment to the informational cost required to achieve it; the core message is that a good embodiment is one that makes decisions (informationally) cheaper, less complex, and more robust. The work is based on the concept of relevant information and identifies how and when embodiment affects the decision density in a minimalist scenario. Taking this research further, Polani et al. [13] investigated how embodiment affects information costs when scripts, defined as predefined action sequences, are introduced instead of atomic actions. It was demonstrated that there are cases in which the introduction of scripts allows the agent to save decisions without processing more information for an average decision, thereby reducing information costs.

Key considerations in quantifying embodiment
The information-theoretic methods have been widely applied in various studies and investigations, as demonstrated by the examples in the previous section. However, one of the underlying considerations is that the quantification is based on a simulated neuro-musculoskeletal model that resembles organ-level dynamics. In reality, each muscle-tendon unit consists of many motor units that need to be controlled, and we cannot rule out an effect on the overall MC. Nevertheless, this model represents the basic functional unit, capturing the main dynamic properties relevant for the passive contribution of muscles. Furthermore, whilst the results have been quantified on the non-trivial hopping gait using a deterministic system, the studies so far have not applied these methods to high-dimensional data collected from natural systems.
Another consideration is whether MC results in the desired behaviour. Montúfar et al. [32] characterised Good MC as body-environment interactions which contribute to the desired behaviour in a way that reduces the required controller complexity, and Bad MC as body-environment interactions which make the desired behaviour more difficult to maintain (e.g., an object slipping out of the hand due to the hand's compliance). The authors provide a way of extracting information from MC_W by using dimensionality-reduction methods to identify the characteristics that result in Bad MC. In that regard, quantifying the informational flows within the sensorimotor network, as described in the seminal research by Schmidt et al. [33], will also provide powerful insight into the expected behaviour.

Future work and potential
The above quantification methods have been tested on the hopping gait and, for a robotic arm, on a point-to-point motion trajectory and a grasping action. Future work involves testing these methods on other gaits, such as walking and running, and comparing the results with the relevant bio-mechanical models. Furthermore, to the best of our knowledge, the research in this area has been conducted using different simulated muscle models, and the methods still need to be tested on a real robot. Notwithstanding the above, these methods provide a solid basis for testing the extent of MC as part of the design process. This can be used to evaluate a design by augmenting the system with physical elements such as springs, dampers and other components, using the system's morphology to its advantage. This area of research shows promise, with a recently developed legged robot by Mohseni et al. [5] using a blended approach of virtual and physical impedance control. Another recent example is an under-actuated soft-bodied arm developed by Mahdi et al. [34] to take advantage of the morphology.
Another interesting area of research where the quantification of embodiment will be useful is the development of highly re-configurable intelligent systems with adaptive morphologies [35]. Inspired by biology, where many animals have evolved morphable bodies to extend their operational range, several terrestrial and aerial robots have been developed. Examples are GoQBot [36], which morphs before rolling, and RHex [37], with legs that adapt their stiffness when morphed to run effectively over a broad range of terrains, among others. These systems are developed using programmable matter [35], which is concerned with the theory, design, and manufacturing of systems that reconfigure by assuming different morphological configurations. Programmable matter is developed in a modular way as building blocks, starting from the simplest elements. As these elements are developed, their embodiment can be quantified to evaluate their behaviour before further building blocks are added, lowering the risk in the design process. Further research by Hauser et al. [38] discusses resilient machines with adaptive morphologies that could enable robots to recover more effectively from damage.

Figure 1 :
Figure 1: A free-swimming E. coli bacterium with flagella clearly visible (photo by CDC on Unsplash)

Figure 3 :
Figure 3: The Solo8 [22] 8-DoF robot based on impedance control, and one leg showing the Cartesian impedance

Figure 4 :
Figure 4: Sensorimotor loop showing the intrinsic and world states, and the directed graph showing the causal nature of the sensorimotor loop [15]

Experiment and results of the two information-theoretic methods. Ghazi-Zahedi et al. [15] simulated various muscle-driven, biologically realistic hopping models and quantified MC_W and MC_MI. The data was discretised into uni-variate random variables.

Figure 5 :
Figure 5: Extent of MC_W at various points in the hopping gait, showing that MC increases during flight and is maximal at the highest point. In the stance phase, the actuator control comes into play and MC is low.

Figure 6 :
Figure 6: Different levels of hierarchy in the neuro-musculoskeletal model, from the initiation of the movement to the joint angle trajectory [14]