HKUST PhD Chronicle, Week 45, Offline GCRL and Windsurfing

Offline GCRL

This week marks the end of my exploration into potential solutions for offline goal-conditioned reinforcement learning (GCRL). Honestly, I haven’t figured out how to integrate GCRL into my problem formulation yet.

Specifically, at inference time the policy

$$
\pi(a \mid s, g): \mathcal{S} \times \mathcal{S} \to \Delta(\mathcal{A})
$$

needs a goal $g$, and I still haven’t worked out what that goal should be. Is it an “imagined” future image? Or is it an “imagined” state? Despite these unanswered questions, I still think offline GCRL is a lot of fun.
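
To make the signature above concrete, here is a minimal tabular sketch. It assumes a small discrete state space where goals are simply states, which is what $\mathcal{S} \times \mathcal{S}$ suggests. The class name `TabularGoalConditionedPolicy` and the random logits are placeholders made up for illustration. This is not a trained policy and not any library's API.

```python
import math
import random


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TabularGoalConditionedPolicy:
    """Hypothetical pi(a | s, g) over discrete states, goals, and actions."""

    def __init__(self, n_states, n_actions, seed=0):
        rng = random.Random(seed)
        self.n_actions = n_actions
        # logits[s][g] holds one logit per action; random values stand in
        # for whatever an offline GCRL method would actually learn.
        self.logits = [
            [[rng.gauss(0.0, 1.0) for _ in range(n_actions)]
             for _ in range(n_states)]
            for _ in range(n_states)
        ]

    def action_probs(self, s, g):
        """Return the distribution over actions for state s and goal g."""
        return softmax(self.logits[s][g])

    def act(self, s, g, rng):
        """Sample one action index from pi(. | s, g)."""
        probs = self.action_probs(s, g)
        return rng.choices(range(self.n_actions), weights=probs, k=1)[0]


# At inference the caller has to supply g explicitly.
policy = TabularGoalConditionedPolicy(n_states=5, n_actions=3)
action = policy.act(s=0, g=4, rng=random.Random(1))
```

The sketch shows where my open question sits: `act` cannot be called without a `g`, so something upstream has to produce that goal, whether it is an imagined image or an imagined state.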

Reinforcement Learning

To better understand GCRL, I had to fill in a few knowledge gaps in foundational reinforcement learning. Specifically, I took a deep dive into value functions and policy evaluation. Here are my two learning notes:

Windsurfing

From day one at HKUST, I planned to learn windsurfing. After nearly a year of entering the course lottery, I finally managed to get a spot. Blessed with good weather, I passed the basic windsurfing exam, using “luff up” and “bear away” to sail upwind!

I am now a certified novice windsurfer!
