Preference-based reinforcement learning: a formal framework and a policy iteration algorithm

Overview of attention for article published in Machine Learning, August 2012

About this Attention Score

In the top 25% of all research outputs scored by Altmetric
Good Attention Score compared to outputs of the same age (75th percentile)
High Attention Score compared to outputs of the same age and source (90th percentile)

Mentioned by

X: 3 users

Patents: 1 patent

Citations

Dimensions: 61 citations

Readers on

Mendeley: 131 readers

So far, Altmetric has seen 3 X posts from 3 X users, with an upper bound of 2,109 followers.

@DrJimFan Hey Jim! Eureka is really cool :) Just wanted to point out that Preference-based RL (or RLHF to the NLP community) has actually been around long before OpenAI/DM's paper. For ex: https://t.co/7bsoer1XJB and lots of other work by Akrour circa

26 Oct 2023

[Reinforcement learning] / “Preference-based reinforcement learning: a formal framework and a policy iteration algorithm, Johannes Fürn…” http://t.co/zYKrmeeZ

20 Aug 2012

[Reinforcement learning] / “Preference-based reinforcement learning: a formal framework and a policy iteration algorithm, Johannes Fürn…” http://t.co/zYKrmeeZ

20 Aug 2012
