New preprint: Data-Efficient RL with Momentum Predictive Representations (https://t.co/AN8St4eSpC)
— Ankesh Anand (@ankesh_anand) July 14, 2020
In 100K steps (<2 hrs) on Atari, using self-predictions via a latent model & data aug, MPR:
* improves SOTA human-norm’d score from 26.8% to 44.4%
* exceeds human scores on 6/26 games pic.twitter.com/C0nnoCa665
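
For readers unfamiliar with the metric: a human-normalized score is conventionally computed per game as (agent - random) / (human - random), so 0 matches a random policy and 1 matches the human reference. Below is a minimal sketch of that arithmetic. The game names and raw scores are hypothetical placeholders, not results from the preprint, and the tweet does not say whether the headline number is a mean or a median across games, so the sketch prints both.

```python
from statistics import mean, median


def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Return (agent - random) / (human - random) for a single game."""
    if human == random:
        raise ValueError("human and random baselines must differ")
    return (agent - random) / (human - random)


# Hypothetical raw scores, NOT taken from the preprint:
# game name -> (agent score, random-policy score, human score)
RAW_SCORES = {
    "game_a": (60.0, 10.0, 110.0),
    "game_b": (300.0, 100.0, 200.0),
    "game_c": (25.0, 0.0, 100.0),
}

# Per-game normalized scores.
normalized = {
    game: human_normalized_score(*scores) for game, scores in RAW_SCORES.items()
}

# A game counts as "exceeding human" when its normalized score is above 1.0.
superhuman = [game for game, score in normalized.items() if score > 1.0]

print("per-game:", normalized)
print("median:", median(normalized.values()))
print("mean:", mean(normalized.values()))
print(f"exceeds human on {len(superhuman)}/{len(normalized)} games")
```

With real data, `RAW_SCORES` would hold one entry per game in the benchmark (26 here, per the tweet), and the same comparison against 1.0 gives the "exceeds human" count.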
from Twitter https://twitter.com/ankesh_anand
July 13, 2020 at 06:06PM