SQU Journal for Science, 2015, 20(2), 78-87 ©2015 Sultan Qaboos University

Rolling Optimization Method for Humanoid Robots

Riadh Zaier

Department of Mechanical and Industrial Engineering, College of Engineering, Sultan Qaboos University, P.O. Box: 33, Al Khod, PC 123, Muscat, Oman. Email: zaier@squ.edu.om.

ABSTRACT: This paper considers a method of optimizing the rolling amplitude needed for a stable and smooth walking movement of a humanoid robot. The optimization algorithm is based on minimizing a cost function defined by the rolling overshoot. The amplitude of the rolling during locomotion was calculated using the lateral zero moment point (ZMP) position. The initial value of the rolling was the static rolling that corresponds to the position of the ZMP at the center of the support polygon. The algorithm consisted of performing a ZMP calculation at two points that correspond to single support phases. With the robot simplified as an inverted pendulum, the gyro feedback controller parameters were tuned to obtain a passive-like walking motion and a faster response of the robot state to the equilibrium point in the single support phase. Experimental results, using the HOAP-3 robot of Fujitsu, showed that the algorithm was successfully implemented along with the locomotion controller. With the optimal rolling technique, the humanoid robot could exhibit a stable and smooth walking movement.

Keywords: Humanoid robot; Locomotion control; Rhythmic motion; Rolling optimization; Zero Moment Point (ZMP).

1. Introduction

In recent years, humanoid robots, and biped robots in particular, have drawn the attention of many researchers. The majority of research on biped robots simplifies the robot as an inverted pendulum [1], uses the zero moment point (ZMP) [2-6], and controls the ZMP to keep it inside the support polygon. In this approach, the humanoid robot and its surrounding environment are accurately modeled and differential equations are solved. However, the modeling becomes difficult if there are unknown elements such as gear backlash or belt tension changes. Moreover, since solving differential equations is time-consuming, it is difficult to perform real-time control.

More recently, biologically inspired control strategies have been proposed to generate autonomously adaptable rhythmic movement. These are based on a neural network, termed a central pattern generator (CPG) [7-16], that is capable of generating a rhythmic pattern of motor activity in the absence of sensory input signals. Taga [8, 9] has demonstrated that bipedal locomotion can be realized as a global limit cycle generated through entrainment between the neural network, consisting of a neural oscillator, and the physical system.
These approaches aim to control the robot efficiently so that it can perform various motions with stability, while eliminating the need to model the humanoid robot or its surrounding environment. However, these control approaches lack robust stability, since no sensory feedback is used other than that used for the CPG entrainment. In other words, any small disturbance acting on the humanoid robot during locomotion may cause it to fall. To address these problems, the robot walking control includes feedback control based on the rotation angle or on a gyro sensor placed on the upper body of the robot. To find the proper parameters of the gyro feedback loop, details of the robot model should be provided, as mentioned in [1, 2].

Locomotion controllers using a piecewise linear oscillator have been proposed in [17-19]. However, the rolling was not optimized, and a rolling overshoot can therefore be observed during locomotion. In this paper, the proposed approach consists of reducing the control input by tuning the amplitude of the rolling during locomotion. Indeed, by reducing the fluctuation or overshoot of the ZMP attributed to the feedback control, it becomes possible to improve the walking performance of the robot. Moreover, if the feedback control frequently causes fluctuation or overshoot of the ZMP, the motors used in the walking control are overworked. Thus, reducing the feedback control as much as possible is also key to reducing motor fatigue.

The rest of this paper is organized as follows.
Section 2 describes the locomotion controller, Section 3 presents the method of adjusting the rolling amplitude, Section 4 describes the optimization algorithm, Section 5 presents the experimental results, and Section 6 concludes the paper.

2. Locomotion Controller

The humanoid robot considered in this paper is made by Fujitsu Laboratories Ltd. [20, 21]. It has 6 actuated joints in each leg and one at the torso, as shown in Figure 1. The gyro sensor is placed on the upper body of the robot, and 4 force sensors are placed under the sole plate of each leg. Figure 2 shows the implementation of the gyro sensor in a feedback loop that controls the hip and ankle joints.

Figure 1. Humanoid robot structure. (Labels: body, gyro sensor, pitch waist joint; for each leg: yaw hip, roll hip, pitch hip, pitch knee, pitch ankle, and roll ankle joints; sole plate, force sensors.)

Figure 2. Sensor implementation and control.

The controller is of the proportional derivative (PD) type, which can be regarded as a spring-damper system. In addition, to ensure a smooth landing of each leg with a flat foot on flat ground, the swing leg should touch the ground softly. To satisfy this condition without using an impact model, we simply use two PD controllers that work as a spring and damper and take their inputs from the force sensors located under each leg. The outputs of these PD controllers are fed to the hip, knee, and ankle joints of the landing leg. For this, consider the mass-spring-damper model expressed as follows:

$$ m \frac{d^2 y}{dt^2} + b_s \frac{dy}{dt} + k_s y = F_y \tag{1} $$

where $b_s$ is the coefficient of friction, $k_s$ is the spring coefficient, $y$ is the displacement of the mass $m$ along the vertical axis, and $F_y$ is the external force acting on the supporting leg.
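
The behavior of the landing model in Equation (1) can be illustrated numerically. The following is a minimal sketch, not the controller used on the robot: the function name `simulate_landing`, the parameter values, and the constant force are hypothetical and chosen only to show a critically damped response settling at $F_y / k_s$.

```python
def simulate_landing(m, b_s, k_s, force, y0=0.0, v0=0.0, dt=0.001, steps=3000):
    """Integrate m*y'' + b_s*y' + k_s*y = F_y(t) with semi-implicit Euler.

    force is a callable returning F_y at time t (an assumption of this sketch).
    Returns the list of displacements y at each time step.
    """
    y, v = y0, v0
    trajectory = []
    for n in range(steps):
        t = n * dt
        a = (force(t) - b_s * v - k_s * y) / m  # acceleration from Equation (1)
        v += a * dt                             # update velocity first
        y += v * dt                             # then position (semi-implicit)
        trajectory.append(y)
    return trajectory


# Hypothetical values: b_s = 2*sqrt(k_s*m) gives critical damping.
ys = simulate_landing(m=1.0, b_s=20.0, k_s=100.0, force=lambda t: 10.0)
print(round(ys[-1], 4))  # settles near F_y / k_s
```

With these illustrative values the displacement approaches the static deflection without sustained oscillation, which is the soft-landing behavior the spring-damper analogy is meant to capture.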
To satisfy the constraint of parallel landing of the sole plate on flat ground, under the assumption that the thigh and shank of the robot have the same length, and with respect to the angle definitions in [20], the motor commands are set as follows:

$$
\begin{aligned}
\theta^{p}_{am}(t) &= \theta_l(t) + \theta_s(t) \\
\theta_{km}(t) &= -2\,\theta_l(t) \\
\theta^{p}_{hm}(t) &= \theta_l(t) - \theta_s(t)
\end{aligned}
\tag{2}
$$

where $\theta^{p}_{am}$, $\theta_{km}$, and $\theta^{p}_{hm}$ are the pitching motor commands to the ankle, the knee, and the hip, respectively. The three commands sum to zero, which keeps the sole plate parallel to the ground. The overall control structure is depicted in Figure 3.

Figure 3. Overall control system of the robot locomotion. (Blocks: central control unit, memory, communication interface, gyro sensor control unit and gyro sensor, joint control units and joints, force sensor control units and force sensors.)

The optimization algorithm is implemented in the central control unit, which takes input from the gyro sensor and force sensor control units as well as commands from upper-level control via the communication interface unit. A detailed diagram of the central control unit, which contains a motion generating unit and a rolling amplitude adjustment unit, is illustrated in Figure 4. The latter unit calculates the optimized rolling value and adapts the motion generator accordingly.

Figure 4. Central control unit structure.

Figure 5. State flow of stepping motion with compliance controllers Ci and gyro feedback G. (States: rolling in the double support phase; lifting and landing in the single support phases; switching between states, with controllers C1, C2, C3 and G acting on the mass M.)

Figure 5 shows the state flow of the stepping motion, which consists of three sequential phases: rolling, lifting, and landing. The locomotion pattern is designed based on the rolling motion pattern: the robot starts rolling toward one of its legs, then lifts the other leg and moves it forward to make the stride.
Once the swing leg lands on the ground, the robot rolls toward the other leg and repeats the same sequence. It is assumed that each leg lands with a flat foot on flat ground. At the end of the single support phase, the swing leg should touch the ground softly. To satisfy this condition without using an impact model, two oscillators that work as dampers and take their inputs from the force sensors located under each leg are used. The outputs of these oscillators are fed to the hip, knee, and ankle joints of the landing leg. The compliance controllers Ci and the gyro feedback G that stabilize the movement are designed for each phase of the walking. The feedback controllers and the motion phases are switched simultaneously, such that the overall stability of the robot is maintained.

3. Method of Adjusting the Rolling Amplitude

Figure 6 explains how the rolling amplitudes are adjusted. At the time of shifting from the lifting motion to the landing motion, the rolling to the left and to the right reaches its maximum; the corresponding rolling amplitudes are $a_1$ and $a_2$. The rolling amplitude adjusting (RAA) unit tunes the rolling amplitudes $a_1$ and $a_2$ so that the gyro feedback control is reduced as much as possible at the points of time when the rolling to the left and right reaches its maximum.

Figure 6. Method of adjusting rolling amplitude by a gyro sensor feedback.

Figure 7 illustrates the relationship between the sideways moving amount $X_{ZMP}$ and the sideways moving velocity $V_{ZMP}$ of the ZMP. Figure 7-a represents the case in which the rolling amplitudes are not adjusted, and Figure 7-b represents the case in which the rolling amplitudes are adjusted to their optimum values.
When the rolling amplitudes are not adjusted, fluctuation or overshoot occurs in $X_{ZMP}$ due to the gyro feedback control performed at the point of time when $X_{ZMP} = a_1$ or $X_{ZMP} = a_2$. In comparison, as illustrated in Figure 7-b, the optimum adjustment of the rolling amplitudes eliminates the need to perform gyro feedback control at the point of time when $X_{ZMP} = a_1$ or $X_{ZMP} = a_2$.

Figure 7. Phase portrait of the ZMP; (a) with rolling overshoot, (b) without rolling overshoot.

More particularly, the RAA unit calculates the value of a cost function $J$ that represents the amount of gyro feedback control for the rolling motion of each single cycle and then adjusts the rolling amplitudes so that the value of $J$ is minimized. The cost function $J$ is defined as follows:

$$ J = \int_{t_1}^{t_2} \left( X_{ZMP} - X_{av} \right) dt \tag{3} $$

where $X_{av}$ represents the average value of $X_{ZMP}$, $t_1$ represents the point of time at which the maximum rolling occurs, and $t_2$ represents the point of time at which the rolling starts to decrease. Moreover, $t_1$ and $t_2$ can be determined according to the cycle of the rolling motion. Figure 8 shows the fluctuation of the rolling during locomotion, and Figure 9 illustrates the cost function $J$ and its minimization procedure, where $\alpha$ represents an adjustment amount calculated for each cycle by the RAA unit.

Figure 8. Fluctuation of the rolling motion during locomotion.

Figure 9. Cost function versus rolling amplitude.

The adjustment amount $\alpha$ is defined as follows:

$$ \alpha = \beta \, \frac{V(roll_{max})}{V_{max}} \tag{4} $$

where $V_{max}$ represents the maximum value of the sideways moving velocity of the ZMP, $V(roll_{max})$ represents the sideways moving velocity of the ZMP at the point of time at which the maximum rolling occurs, and $\beta$ represents an experimentally obtained constant with a default value of 1.
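
The two quantities computed by the RAA unit for each cycle can be sketched as follows. This is an illustrative sketch only: it assumes that the cost in Equation (3) is evaluated from uniformly sampled ZMP data with the trapezoid rule, that the average $X_{av}$ is supplied by the caller, and that Equation (4) reads as $\beta$ times the ratio of the ZMP velocity at maximum rolling to the maximum ZMP velocity. The function names `rolling_cost` and `adjustment_amount` are hypothetical.

```python
def rolling_cost(x_zmp, x_av, dt):
    """Approximate J, the integral of (X_ZMP - X_av) from t1 to t2.

    x_zmp holds the lateral ZMP samples between t1 and t2, spaced dt apart.
    The trapezoid rule is an assumption of this sketch.
    """
    total = 0.0
    for left, right in zip(x_zmp, x_zmp[1:]):
        total += 0.5 * ((left - x_av) + (right - x_av)) * dt
    return total


def adjustment_amount(v_at_max_roll, v_max, beta=1.0):
    """Adjustment amount alpha = beta * V(roll_max) / V_max."""
    if v_max == 0:
        raise ValueError("V_max must be non-zero")
    return beta * v_at_max_roll / v_max


# Hypothetical samples: a ZMP held 0.5 above its average for 0.2 s.
print(rolling_cost([1.0, 1.0, 1.0], x_av=0.5, dt=0.1))
print(adjustment_amount(v_at_max_roll=0.5, v_max=2.0))
```

When the ZMP velocity at maximum rolling is small compared with its peak value, the adjustment amount is small as well, so the amplitude correction shrinks as the phase portrait approaches the overshoot-free case.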
In this way, the RAA unit calculates the cost function $J$ and the adjustment amount $\alpha$ for the rolling motion of each single cycle, and then adjusts the rolling amplitudes so that the value of $J$ is minimized. This reduces the gyro feedback control at the points of time when the rolling to the left and right is at its maximum. The correcting unit in Figure 4 is a processing unit that corrects the start time of the rolling angle by using the outputs of the compliance control unit and the feedback control unit.

4. Optimization Algorithm

Figure 10 shows a flowchart of the rolling amplitude adjustment process performed by the RAA unit. The RAA unit calculates the sideways moving amount $X_{ZMP}$ using the force sensor data (Step S1) and calculates the value of the cost function $J$ (Step S2). Then, the RAA unit calculates the sideways moving velocity $V_{ZMP}$ (Step S3) and calculates the adjustment amount $\alpha$ (Step S4). Subsequently, the RAA unit determines whether the calculated value of the cost function $J$ is smaller than the value $J_O$ of the cost function that was calculated for the previous cycle of the rolling motion (Step S5). If the value of $J$ is not smaller than $J_O$, then the RAA unit inverts the sign of $\alpha$ (Step S6). The initial value of $J_O$ is the value of the cost function $J$ calculated for the first cycle.

Then, the RAA unit corrects the rolling amplitudes $a_1$ and $a_2$ by subtracting the adjustment amount $\alpha$ from them (Step S7) and passes the corrected rolling amplitudes to the motion generating unit (Step S8). Upon receiving the corrected rolling amplitudes $a_1$ and $a_2$, the motion generating unit reflects them in the control information generated for the subsequent cycle. Subsequently, the RAA unit determines whether the robot has come to a halt (Step S9).
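
Steps S5 to S7 amount to a sign-switching descent step, which can be sketched as below. This is a hedged illustration rather than the implementation on the robot: the function name `update_rolling_amplitudes`, the persistent `direction` flag used to carry the inverted sign between cycles, the fixed adjustment amount, and the quadratic `toy_cost` are all assumptions introduced for the example.

```python
def update_rolling_amplitudes(a1, a2, alpha, j_current, j_previous, direction):
    """One pass through Steps S5 to S7.

    direction is +1 or -1 and is carried between cycles; keeping the sign
    in a separate flag is an assumption of this sketch.
    """
    if j_current >= j_previous:   # Step S5: the cost did not decrease
        direction = -direction    # Step S6: invert the sign of alpha
    step = direction * alpha
    return a1 - step, a2 - step, direction  # Step S7: subtract the adjustment


def toy_cost(a):
    """Hypothetical stand-in for the measured cost J, minimal at a = 3."""
    return (a - 3.0) ** 2


a1 = a2 = 5.0
direction = 1
j_previous = toy_cost(a1)  # initial J_O: the cost of the first cycle
for _ in range(60):
    j_current = toy_cost(a1)
    a1, a2, direction = update_rolling_amplitudes(
        a1, a2, 0.1, j_current, j_previous, direction
    )
    j_previous = j_current  # keep the cost for the next comparison
print(round(a1, 2))
```

In this toy setting the amplitude moves toward the minimizer and then oscillates within one step of it, which suggests why a small adjustment amount near the optimum is desirable.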
If the robot has not come to a halt, then the RAA unit sets the value of the cost function $J$ calculated for the current cycle as the value $J_O$ (Step S10) and returns to Step S1 to perform the rolling amplitude adjustment for the subsequent cycle. On the other hand, if the robot has come to a halt, then the process is terminated. In this way, since the RAA unit performs the rolling amplitude adjustment for each cycle of the rolling motion, it becomes possible to reduce the gyro feedback control.

Figure 10. Algorithm of adjusting rolling amplitude.

As described above, the motion generating unit generates the control information for a walking motion that has no movement in the front-back direction; the compliance control unit performs the compliance control based on the force sensor data; and the feedback control unit performs the ZMP feedback control based on the force sensor data and performs the gyro feedback control, based on the gyro sensor data, at the points of time when the rolling to the left and right reaches its maximum. Then, the RAA unit calculates the value of the cost function $J$ and the adjustment amount $\alpha$ for each cycle of the rolling motion, and adjusts the rolling amplitudes so that the value of $J$ becomes smaller. The motion generating unit reflects the changes of the rolling amplitudes in the control information for the subsequent motion cycle. Such a configuration reduces the gyro feedback control and improves the robot's walking while reducing motor exhaustion, since the motors are switched less often by the gyro feedback control.

5. Experiment

Figure 11. Sole reaction forces on the foot and ZMP. (Labels: forces F1 to F4 at the sole sensors, sensor coordinates x1, x2, y1, y2, foot dimensions D, L, and l, the ZMP, axes x and y, and the front of the foot.)

Instead of writing equilibrium equations of the forces and moments acting on the robot body, the analysis simply uses the sole reaction forces measured by the sole sensors, as shown in Figure 11.
The idea is based on evaluating the shift in the ZMP position using the foot force sensor data. Let $D(x_m, y_m)$ be the ZMP. At any arbitrary point $M(x_i, y_i)$ inside the supporting polygon, the reaction torque $T(T_x, T_y)$ for one foot can be written as follows:

$$ T_x = (y_1 - y_i)(F_3 + F_4) + (y_2 - y_i)(F_1 + F_2) \tag{5} $$

$$ T_y = (x_1 - x_i)(F_4 + F_2) + (x_2 - x_i)(F_1 + F_3) \tag{6} $$

where all forces are defined in Figure 11.

Figure 12. Structure of the motion control system. (Block labels: user application, application program interface, motion pattern generator, socket, USB thread, shared memories SM 1 and SM 2, high-speed FIFOs, slow-speed FIFO, Linux kernel, Windows/Linux.)

If we consider the supporting polygon for the two feet, then we have

$$ x_m = \frac{x_1 \left( F^r_1 + F^r_3 + F^l_2 + F^l_4 \right) + x_2 \left( F^r_2 + F^r_4 + F^l_1 + F^l_3 \right)}{\sum_{i=1}^{4} \left( F^r_i + F^l_i \right)} \tag{7} $$

$$ y_m = \frac{y_1 \left( F^r_3 + F^r_4 + F^l_3 + F^l_4 \right) + y_2 \left( F^r_1 + F^r_2 + F^l_1 + F^l_2 \right)}{\sum_{i=1}^{4} \left( F^r_i + F^l_i \right)} \tag{8} $$

where $F^r_i$ represents the sole reaction force applied on the right foot, $F^l_i$ represents the sole reaction force applied on the left foot, and $i$ varies from 1 to 4.

The ZMP is thus calculated under normal conditions, that is, when no large perturbation is present. The data has to be recorded at each single support phase. A perturbation is considered large when the deviation in the angular velocity, measured with the gyro sensor, exceeds an experimentally defined threshold. In addition, four postures are defined (learned in advance) to which the robot shifts its pose when it stops walking. These postures consist of moving the leg to the front, back, right, or left, according to the ZMP position. Then, a feedback controller is activated at the final posture to control the waist joint and the legs of the robot.

The experiment is conducted using the humanoid robot HOAP-3 of Fujitsu [20], which has 28 degrees of freedom, is 60 cm tall, and weighs 8.8 kg.
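
The two-foot ZMP computation of Equations (7) and (8) can be sketched in a few lines. This is a minimal sketch under stated assumptions: the grouping of sensor indices with the coordinates $x_1$, $x_2$, $y_1$, $y_2$ follows one reading of the printed equations, and the function name `zmp_from_sole_forces` and the numeric values are hypothetical.

```python
def zmp_from_sole_forces(f_right, f_left, x1, x2, y1, y2):
    """Estimate the ZMP (x_m, y_m) from the eight sole reaction forces.

    f_right and f_left are sequences (F1, F2, F3, F4) for each foot.
    The index grouping is an assumption read from Equations (7) and (8).
    """
    r1, r2, r3, r4 = f_right
    l1, l2, l3, l4 = f_left
    total = sum(f_right) + sum(f_left)
    if total <= 0:
        raise ValueError("no ground reaction force measured")
    x_m = (x1 * (r1 + r3 + l2 + l4) + x2 * (r2 + r4 + l1 + l3)) / total
    y_m = (y1 * (r3 + r4 + l3 + l4) + y2 * (r1 + r2 + l1 + l2)) / total
    return x_m, y_m


# Hypothetical sensor coordinates and an evenly loaded stance.
even = zmp_from_sole_forces((1, 1, 1, 1), (1, 1, 1, 1), 2.0, -2.0, 3.0, -1.0)
print(even)
```

Because the result is a force-weighted average of the sensor coordinates, an evenly loaded stance places the estimate midway between the coordinates, and shifting load toward one group of sensors moves the estimate toward that group.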
The real-time control algorithms are implemented in real-time threads running in the RT-Linux kernel space, as shown in Figure 12. Kernel-mode shared memory (SM) is constructed for the communication between real-time threads. The control period is 1 ms, and the interface between the motion pattern generator and the robot uses a real-time USB driver thread.

Figure 13. Plot of the ZMP during normal walking. (Axes: ZMP position (cm) versus time (s); curves: $x_m$ and $y_m$.)

Figure 13 shows the ZMP location during walking of HOAP-3 in the absence of disturbance, calculated using (7) and (8). Figure 14 plots the hip rolling joint output and the sole reaction force acting on the left leg. It shows how the robot starts lifting using the virtual spring energy. Moreover, it can be seen that the rolling angle is almost zero when the robot is lifting. It is also zero at the landing time. When landing, the robot relies on gravity only.

Figure 14. Hip rolling joint output and vertical sole reaction force on the left leg. (Axes: angular position (deg) and left leg vertical sole reaction force (N) versus time (s); annotations: standing phase, hip rolling motion, lifting, landing.)

Figure 15. Gyro sensor outputs: (a) without rolling tuning, (b) when using the optimization algorithm. (Axes: angular velocity (deg/s) versus time (s).)

There are no constraints on the ZMP to be satisfied. Figure 15 shows the gyro sensor output, which represents the oscillation speed of the upper body of the humanoid robot. The upper graph (a) shows the case before using the proposed optimization algorithm, while the lower graph (b) shows the case when using the algorithm.
This contrasts with the results reported in [17-19], where the locomotion controller was designed using a piecewise linear oscillator but the attenuation of the upper-body oscillation was not discussed, and a rolling overshoot was therefore observed during locomotion.

6. Conclusion

The rolling amplitude needed for stable locomotion of a humanoid robot was obtained by minimizing its rolling overshoot. The amplitude of the rolling during locomotion was calculated using the lateral ZMP position. The optimization algorithm consisted of performing a ZMP calculation at two points that correspond to the single support phases, when the rolling is at its maximum value. As a result, the robot could roll and walk with the minimum gyro feedback control. Experimental results, using the HOAP-3 robot of Fujitsu, showed that the algorithm was successfully implemented along with the locomotion controller. The robot had a passive-like walking motion.

7. Acknowledgements

The experimental work was conducted at Fujitsu Laboratories Limited, Japan.

References

1. Kajita, S., Kanehiro, F., Kaneko, K., Yokoi, K. and Hirukawa, H. The 3D linear inverted pendulum mode: A simple modeling for a biped walking pattern generation, Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, October 29 - November 3, 2001, Maui, USA.
2. Huang, Q., Yokoi, K., Kajita, S., Kaneko, K., Arai, H., Koyachi, N. and Tanie, K. Planning walking patterns for a biped robot, IEEE Trans. on Robotics and Automation, 2001, 17(3), 280-289.
3. Vukobratović, M. and Borovac, B. Zero moment point - thirty five years of its life, J. Humanoid Robotics, 2004, 1(1), 161-162.
4. Kajita, S., Matsumoto, O. and Saigo, M. Real-time 3D walking pattern generation for a biped robot with telescopic legs, Proc. of the 2001 IEEE International Conference on Robotics and Automation, May 21-26, 2001, Seoul, Korea.
5. Kajita, S., Kanehiro, F., Kaneko, K., Fujiwara, K., Yokoi, K. and Hirukawa, H. A real-time pattern generator for biped walking, Proc. of the 2002 IEEE International Conference on Robotics and Automation, May 11-15, 2002, Washington, DC, USA.
6. Kagami, S., Nishiwaki, K., Kitagawa, T., Sugihara, T., Inaba, M. and Inoue, H. A fast generation method of a dynamically stable humanoid robot trajectory with enhanced ZMP constraint, Proc. of IEEE International Conference on Humanoid Robotics, September 7-8, 2000, Cambridge, USA.
7. Grillner, S. Neurobiological bases of rhythmic motor acts in vertebrates, J. Science, 1985, 228, 143-149.
8. Taga, G. A model of the neuro-musculo-skeletal system for human locomotion, I. Emergence of basic gait, J. Biol. Cybern., 1995, 73, 97-111.
9. Taga, G., Yamaguchi, Y. and Shimizu, H. Self-organized control of bipedal locomotion by neural oscillators in unpredictable environment, J. Biological Cybernetics, 1991, 65, 147-159.
10. Ijspeert, A.J. Central pattern generators for locomotion control in animals and robots: a review, Neurobiology of CPGs, J. Neural Networks, 2008, 21, 642-653.
11. Li, C., Lowe, R., Duran, B. and Ziemke, T. Humanoids that crawl: Comparing gait performance of iCub and NAO using CPG architecture, Proc. of the 2011 IEEE International Conference on Computer Science and Automation Engineering, June 10-12, 2011, Shanghai, China.
12. Li, C., Lowe, R., Duran, B. and Ziemke, T. Modelling walking behaviors based on CPGs: A simplified bio-inspired architecture from animals to animats, Lecture Notes in Computer Science, 2012, 7426, 156-166.
13. Li, C., Lowe, R., Duran, B. and Ziemke, T. Humanoids learning to walk: a natural CPG-actor-critic architecture, J. Frontiers in Neurorobotics, 2013, published online.
14. Matsuo, T., Sonoda, T. and Ishii, K. A design method of CPG network using energy efficiency to control a snake like robot, Proc. of the Fifth IEEE International Conference on Emerging Trends in Engineering and Technology, November 5-7, 2012, Himeji, Japan.
15. Nakamura, Y., Mori, T., Sato, M. and Ishii, S. Reinforcement learning for a biped robot based on a CPG-actor-critic method, J. Neural Networks, 2007, 20(6), 723-35.
16. Tomoyuki, T., Azuma, Y. and Shibata, T. Acquisition of energy-efficient bipedal walking using CPG-based reinforcement learning, Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, October 11-15, 2009, St. Louis, MO, USA.
17. Zaier, R. and Nagashima, F. Motion pattern generator and reflex system for humanoid robots, Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, October 9-15, 2006, Beijing, China.
18. Zaier, R. and Kanda, S. Piecewise-linear pattern generator and reflex system for humanoid robots, Proc. of the 2007 IEEE International Conference on Robotics and Automation, April 10-14, 2007, Rome, Italy.
19. Zaier, R. and Kanda, S. New approach of neural network for controlling locomotion and reflex of humanoid robot, in Humanoid Robots, INTECH International Publishers, book edited by Ben Choi, ISBN 978-953-7619-44-2, 2009, 365-388.
20. Murase, Y., Yasukawa, Y., Sakai, K. and Ueki, M. Design of compact humanoid robot as a platform, Proc. of the 19th Annual Conference of the Robotics Society of Japan, September 18-20, 2001, Tokyo, Japan.
21. HOAP-3, Fujitsu Automation Ltd, http://home.comcast.net/~jtechsc/HOAP-3_Spec_Sheet.pdf.

Received 1 July 2014
Accepted 15 September 2014