Channel: GameDev.net

UCT + Policy


Hello,

I want to add heuristics to an MCTS implementation, but I still want MCTS to “take over” and make the final decision. Say I have a function policy() that returns a value between 0.0 and 1.0 reflecting the strength of a move as determined by heuristics. Is this modification to the UCT formula correct?

UCT = move.rewards/move.visits + exploration_rate * policy(move) * sqrt(log(totalSiblingVisits) / move.visits)
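
To make the formula concrete, here is a minimal Python sketch of it exactly as written above. The `Move` class, its `prior` field (standing in for whatever policy() computes), the default `exploration_rate` of 1.4, and the choice to return infinity for unvisited moves are all assumptions made for illustration; none of them come from an actual implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Move:
    rewards: float  # sum of rewards backed up through this move
    visits: int     # number of times this move has been selected
    prior: float    # hypothetical: heuristic strength in [0.0, 1.0]


def policy(move: Move) -> float:
    """Placeholder heuristic: just returns the stored prior (assumption)."""
    return move.prior


def uct_with_policy(move: Move, total_sibling_visits: int,
                    exploration_rate: float = 1.4) -> float:
    """UCT score with the exploration term scaled by policy(move)."""
    if move.visits == 0:
        # Assumption: try every move once, which also avoids dividing by zero.
        return math.inf
    exploitation = move.rewards / move.visits
    exploration = (exploration_rate * policy(move)
                   * math.sqrt(math.log(total_sibling_visits) / move.visits))
    return exploitation + exploration
```

Two properties follow directly from the formula. The heuristic only scales the exploration term, which shrinks as move.visits grows, so the average reward dominates in the long run. On the other hand, a move for which policy() returns 0.0 gets no exploration bonus at all and is scored purely on its average reward.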

