Thoughts: Favorite tweets

Wednesday, August 14, 2019

In our latest AI safety blog post, we explore principled solutions to the reward tampering problem, in which a reinforcement learning agent actively changes its reward function to maximise reward.

Blog post: https://t.co/UAPa3b71bK
Paper: https://t.co/cxksIx5kwU pic.twitter.com/HRnoYBHBYA
— DeepMind (@DeepMindAI) August 14, 2019

from Twitter https://twitter.com/DeepMindAI

August 14, 2019 at 08:47AM
via IFTTT
