
[Question] A3C Actor Gradient

Author: longlyeagle (長鷹寶寶實驗室) 2017-10-08 10:31:47

I'm working on A3C deep reinforcement learning.
Since I was too lazy to change the last layer of my NN to softmax,
I apply a softmax filter to the linear layer's output, so that the
linear layer directly targets the softmax output.
The algorithm works on my test cases for now,
but it might go wrong when the reward is on a different scale.
Could anyone help me check whether my implementation is correct?
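
For reference, here is a minimal sketch of the standard actor gradient taken with respect to the linear layer's outputs (the logits) when a softmax is applied on top. It is not the code behind the link above. The function names, the example logits, and the use of a scalar advantage are all assumptions made for illustration. It also shows why the reward scale matters: the gradient is linear in the advantage.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def actor_grad_wrt_logits(logits, action, advantage):
    # Gradient of loss = -advantage * log(softmax(logits)[action])
    # with respect to each logit: advantage * (p_i - 1[i == action]).
    probs = softmax(logits)
    return [advantage * (p - (1.0 if i == action else 0.0))
            for i, p in enumerate(probs)]

# Hypothetical logits from the linear layer for three actions.
logits = [0.5, -0.2, 0.1]
g1 = actor_grad_wrt_logits(logits, action=0, advantage=1.0)
g10 = actor_grad_wrt_logits(logits, action=0, advantage=10.0)
# g10 is exactly 10x g1: a larger reward scale means a larger update
# unless the learning rate or the advantage is rescaled to match.
```

If the advantage is computed from unnormalized returns, multiplying every reward by a constant multiplies this gradient by the same constant, which is one place a change of reward scale could break a setup that was tuned on a single test case.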
https://goo.gl/FV8sFu

Author: longlyeagle (長鷹寶寶實驗室) 2017-11-05 22:07:00

It turns out that the current test case makes the correct result target 1 after softmax and the wrong result target 0. That's why the reward works at its current scale.
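
One possible reading of this, sketched under assumptions: if the post-softmax target is 1 for the correct action and 0 for the others, the gradient with respect to the logits is `p - one_hot`, which is the same as the actor gradient of `-log pi(a)` with an advantage of exactly 1. The helper names and example values below are hypothetical, and the check uses a finite-difference approximation rather than the linked implementation.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot_target_grad(logits, correct):
    # Gradient w.r.t. the logits when the softmax output is pushed
    # toward 1 for the correct action and 0 for the others
    # (cross-entropy against a one-hot target).
    probs = softmax(logits)
    return [p - (1.0 if i == correct else 0.0) for i, p in enumerate(probs)]

def neg_log_prob(logits, action):
    # Actor loss with advantage fixed to 1.
    return -math.log(softmax(logits)[action])

def numeric_grad(f, x, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = []
    for i in range(len(x)):
        hi = list(x); hi[i] += eps
        lo = list(x); lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2.0 * eps))
    return grad

logits = [0.3, -1.0, 0.7]   # hypothetical linear-layer outputs
analytic = one_hot_target_grad(logits, correct=2)
numeric = numeric_grad(lambda z: neg_log_prob(z, 2), logits)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
```

Here `max_diff` should be close to zero, meaning the two gradients agree. With rewards on another scale the effective advantage is no longer 1, so the 1/0 target view would stop matching the actor gradient.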
