Xingxin on Bug

HKUST PhD Chronicle, Week 40, POMDP Trap in Robot Learning

May 23, 2026

Observation Aliasing and POMDP

This was the biggest lesson of the week. According to Saining Xie, when an experiment doesn’t work as expected, that is a signal. The experiments I designed around the blender were:

  • open the blender lid
  • close the blender lid
  • prepare an item and put it into the blender
  • use a tool to turn on the blender

The last one is special and yielded somewhat unexpected results. I collected demonstrations via teleoperation as usual. Each demonstration followed four steps:

  • 1: Robot starts from A.
  • 2: Robot goes from A to B to pick up a tool on the table.
  • 3: Robot goes from B to C to press the button using the tool.
  • 4: Robot goes from C back to B and releases the tool on the table.

[Figure: process-to-divided.webp]

The trajectories are simple and easy to understand. However, when I deployed the policy after fine-tuning (SFT) 📄GR00T N1: An Open Foundation Model for Generalist Humanoid Robots on these demonstrations, the behavior looked wrong.

After picking up the tool, the robot went up and instantly released it. Then it tried to pick up the tool again, going back and forth, as if trapped in a loop. I realized the policy was denoising actions from the exact same visual observation (the robot at B, holding the tool), which could logically lead to either the 3rd step (going to the button) or the end of the 4th step (releasing the tool).

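It turns out this is a well-studied problem known as observation aliasing in POMDPs (partially observable Markov decision processes): two different underlying states produce the same observation, so a policy that only sees the current observation cannot tell them apart.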
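To make the aliasing concrete, here is a minimal toy sketch in Python. It is not the GR00T pipeline: the observation is a hypothetical `(location, gripper state)` pair standing in for the camera image, the action names are made up for illustration, and the "policy" is just a table of action counts. It encodes the four demonstration steps above and checks which policy inputs are paired with more than one action.

```python
from collections import Counter, defaultdict

# One teleoperated demonstration as (observation, action) pairs.
# Observation = (robot location, gripper state): a hypothetical stand-in
# for the camera image. Action names are illustrative only.
DEMO = [
    (("A", "empty"), "go_to_B"),
    (("B", "empty"), "grasp_tool"),
    (("B", "holding"), "go_to_C_and_press"),  # start of step 3
    (("C", "holding"), "return_to_B"),
    (("B", "holding"), "release_tool"),       # end of step 4
]


def fit_counts(demo, use_history):
    """Count-based behavior cloning: tally the actions seen per policy input."""
    table = defaultdict(Counter)
    prev_action = None
    for obs, action in demo:
        # With history, the policy input also includes the previous action.
        key = (obs, prev_action) if use_history else obs
        table[key][action] += 1
        prev_action = action
    return table


def aliased_inputs(table):
    """Return the policy inputs that map to more than one distinct action."""
    return {key: dict(counts) for key, counts in table.items() if len(counts) > 1}


memoryless = aliased_inputs(fit_counts(DEMO, use_history=False))
with_history = aliased_inputs(fit_counts(DEMO, use_history=True))

print("memoryless:", memoryless)
print("with history:", with_history)
```

In this toy, the memoryless table has two equally likely actions for `("B", "holding")`, which mirrors the press-or-release ambiguity I saw on the robot. Adding the previous action to the input removes the ambiguity here. Conditioning on history is one common remedy for aliasing in general; I have not verified that it fixes the real policy.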

Gaussian Splatting

To close the visual sim2real gap, I was actively looking for solutions that could “render” my physical lab in simulation. I recently found a framework called 📄GSWorld: Closed-Loop Photo-Realistic Simulation Suite for Robotic Manipulation that perfectly addresses my needs.

In short, it applies Gaussian Splatting to the simulation via ManiSkill. The visual observations look much more realistic.

Cosmos

To increase the size of my dataset, I also tried NVIDIA Cosmos, specifically the cosmos-transfer2.5 model.

Unfortunately, the result was quite… creepy 😂. The “disco light” artifacts were pretty horrible 👻.
