Learning adaptive policies that transfer to unseen environments remains challenging in generalizable visual reinforcement learning (RL). Existing methods commonly learn a robust policy via data augmentation and domain randomization for better generalization. Because the target environment is unobservable during training, these methods cannot exploit reward signals to fine-tune the model and adaptively transfer it to new scenarios. In this work, we first investigate how a visual RL agent can benefit from test-time adaptation. Surprisingly, we find that optimizing the Batch Normalization layers alone can significantly improve the generalization of visual RL. We therefore propose Mix Test-time Batch Normalization (MixTBN), a lightweight test-time adaptation algorithm that adaptively transfers the learned policy to unseen environments without introducing any additional parameters. By solely mixing the statistics of the Batch Normalization layers, our method achieves state-of-the-art performance on two robotic manipulation tasks. Extensive ablation experiments demonstrate the effectiveness of each component of our method.
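The core idea of mixing Batch Normalization statistics at test time can be sketched as follows. This is a minimal illustration, not the paper's implementation: the mixing coefficient `alpha` and the linear interpolation rule between stored training statistics and current test-batch statistics are assumptions, since the abstract does not specify the exact mixing scheme.

```python
import numpy as np

def mix_tbn_normalize(x, running_mean, running_var, alpha=0.5, eps=1e-5):
    """Normalize a test batch by mixing stored (training) BN statistics
    with statistics computed on the current test batch.

    alpha = 1.0 uses only the stored training statistics (standard BN
    at inference); alpha = 0.0 uses only the test batch's own statistics.
    The interpolation rule here is a hypothetical sketch of the idea.
    """
    batch_mean = x.mean(axis=0)
    batch_var = x.var(axis=0)
    mean = alpha * running_mean + (1.0 - alpha) * batch_mean
    var = alpha * running_var + (1.0 - alpha) * batch_var
    return (x - mean) / np.sqrt(var + eps)

# Toy usage: features from a shifted "test environment" whose
# distribution differs from the training statistics (mean 0, var 1).
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(64, 8))
y = mix_tbn_normalize(x, running_mean=np.zeros(8),
                      running_var=np.ones(8), alpha=0.0)
```

Note that no new parameters are introduced: adaptation happens purely by re-estimating and blending normalization statistics, which matches the abstract's claim of a parameter-free transfer mechanism.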
MixTBN: A Fully Test-Time Adaptation Method for Visual Reinforcement Learning on Robotic Manipulation
2023-10-11
2618619 bytes
Conference paper
Electronic Resource
English
Reinforcement learning for robotic manipulation