Found: 2
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences.
- Published in:
- Journal of Machine Learning Research, 2022, v. 23, p. 1
- Publication type:
- Article
On Generalized Bellman Equations and Temporal-Difference Learning.
- Published in:
- Journal of Machine Learning Research, 2018, v. 19, n. 26-55, p. 1
- Publication type:
- Article