Another quibble with the paper is that it doesn't cite the seminal work of Moore & Atkeson on prioritised sweeping: https://t.co/Trxm03F4vA. Their approach learns a backtracking model and uses it to speed up planning in model-based RL.