Hybrid Internal Model for Legged Locomotion

¹OpenRobotLab, Shanghai AI Laboratory; ²Zhejiang University; ³Tsinghua University

Our locomotion policy drives robots across challenging terrains and under external disturbances. It is powered by the Hybrid Internal Model, which uses the robot's historical internal states, with the help of the successor state, to simulate the implicit system response and estimate the robot's velocity, so that the policy can infer disturbances arising from environmental dynamics.


This paper presents a Hybrid Internal Model (HIM) based method for legged locomotion control in quadruped robots. The method addresses limitations of existing learning-based locomotion control paradigms, which suffer from information loss, noisy observations, low sample efficiency, and difficulty in developing general locomotion policies for robots with different sensor configurations. HIM relies on joint encoders and an Inertial Measurement Unit (IMU) as the only sensors for predicting robot states. Our framework consists of two components: the information extractor HIM and the policy network. Unlike previous methods that explicitly model environmental observations such as ground elevation, friction, and restitution, HIM explicitly estimates only the velocity and simulates the system response implicitly as a latent embedding. Given the velocity and this embedding, the policy can estimate the environmental disturbance and then perform robust locomotion control. The embedding is learned through contrastive learning, which enhances robustness and adaptability in disturbed and unpredictable environments. The method is validated in simulation on different terrains and in real-world experiments on Unitree robots. The results demonstrate that HIM achieves substantial agility over challenging terrains with minimal sensors and fast convergence.

Internal Model Control Modeling

Classical Internal Model Control (IMC) suggests that robust control is possible without directly modeling the disturbance. As shown in the figure above, it uses an internal model to simulate the system response and, from that, estimate the system disturbance, which increases closed-loop stability. The more accurate the internal model is, the more robust the resulting control.
In the context of locomotion, the system disturbance from the environment can be estimated from the response of the robot. We therefore treat external environmental properties such as elevation maps, ground friction, and ground restitution as disturbances and do not model them explicitly. As shown in the figure, we modify the original IMC for the locomotion task. The commands contain the reference velocity of the robot, but there is also an underlying command that requires the robot to remain stable throughout the whole process. To close the control loop, we need feedback consisting of the robot's velocity and an implicit response indicating stability, neither of which can be read directly from the robot. Following the principles of the IMC framework, we build an internal model that simulates both the robot's velocity and this implicit response. With this model, we can estimate the disturbance introduced by the environment and perform robust locomotion control.
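The classical IMC loop described above can be illustrated on a toy problem. The following is a minimal sketch, not the controller used in this work: it assumes a scalar first-order plant with a constant additive output disturbance, and the function name `simulate_imc`, the plant and model coefficients, and the static-gain controller are all hypothetical choices made for illustration. The disturbance is never modeled directly; it is recovered as the gap between the measured output and the internal model's prediction.

```python
def simulate_imc(reference, disturbance, steps=200,
                 plant=(0.9, 0.1), model=(0.9, 0.1)):
    """Toy discrete-time IMC loop (illustrative assumption, not the paper's method).

    Plant:           x[k+1]   = a   * x[k]   + b   * u[k],   y[k] = x[k] + disturbance
    Internal model:  x_m[k+1] = a_m * x_m[k] + b_m * u[k]
    """
    a, b = plant
    a_m, b_m = model
    # Steady-state gain of the internal model; the controller inverts it.
    model_gain = b_m / (1.0 - a_m)

    x = 0.0      # true plant state
    x_m = 0.0    # internal model state
    d_hat = 0.0  # disturbance estimate
    for _ in range(steps):
        y = x + disturbance            # measured output, disturbance included
        d_hat = y - x_m                # response mismatch = estimated disturbance
        u = (reference - d_hat) / model_gain
        x = a * x + b * u              # real system responds
        x_m = a_m * x_m + b_m * u      # internal model simulates the response
    return x + disturbance, d_hat


if __name__ == "__main__":
    y_final, d_est = simulate_imc(reference=1.0, disturbance=0.3)
    print(f"final output: {y_final:.6f}, estimated disturbance: {d_est:.6f}")
```

In this toy setting the output settles at the reference even though the controller is never told the disturbance value. In HIM the analogous role is played by the learned velocity estimate and the implicit response embedding rather than by an analytic model.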

Framework Pipeline

The policy network receives partial observations together with the hybrid internal embedding. This embedding is optimized to be close to the robot's successor state, in which the response of the robot system is naturally embedded. We use contrastive learning for this optimization in order to exploit batch-level information and to cope with noise.
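To make the idea of pulling an embedding toward its successor state at batch level concrete, here is a minimal sketch in plain Python. It uses a generic InfoNCE-style loss as a stand-in, which is an assumption: the text above says only that contrastive learning is used and does not specify the exact objective. The names `info_nce` and `l2_normalize` and the `temperature` parameter are hypothetical. Each predicted embedding is scored against every successor-state embedding in the batch, with its own successor as the positive and the others as negatives.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def info_nce(pred_embeddings, target_embeddings, temperature=0.1):
    """InfoNCE-style loss over a batch (illustrative stand-in objective).

    pred_embeddings[i]   : embedding predicted from the i-th observation history
    target_embeddings[i] : embedding of the i-th successor state (the positive)
    All other targets in the batch act as negatives.
    """
    preds = [l2_normalize(p) for p in pred_embeddings]
    targets = [l2_normalize(t) for t in target_embeddings]
    total = 0.0
    for i, p in enumerate(preds):
        # Cosine similarity to every target in the batch, scaled by temperature.
        logits = [sum(a * b for a, b in zip(p, t)) / temperature for t in targets]
        # Numerically stable log-sum-exp.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]   # cross-entropy with the positive at index i
    return total / len(preds)


if __name__ == "__main__":
    batch = [[1.0, 0.0], [0.0, 1.0]]
    print("aligned pairs :", info_nce(batch, batch))
    print("shuffled pairs:", info_nce(batch, batch[::-1]))
```

The loss is small when each prediction matches its own successor state and large when the pairs are shuffled, which is how batch-level information enters the objective.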

The robot chases a boat on Huangpu River.

The robot climbs a 63-degree slope and walks on it under disturbance.

This robot gets kicked by an adult while walking on the stairs.

This robot gets dragged by an adult on its leg while walking on the stairs.

This robot gets dragged by an adult on its back while walking on the stairs.

The robot moves on stairs in different ways.

The robot steps down from a very high platform.

We also applied our framework to the control of Cassie, a bipedal robot. The video shows the result in simulation.


          author    = {Junfeng Long and Zirui Wang and Quanyi Li and Jiawei Gao and Liu Cao and Jiangmiao Pang},
          title     = {Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response},
          journal   = {arXiv},
          year      = {2023},