OpenURL Connection Search - EBSCO

Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences.
Published in:
Journal of Machine Learning Research, 2022, v. 23, p. 1
By:
- Chan, Alan;
- Silva, Hugo;
- Lim, Sungsu;
- Kozuno, Tadashi;
- Mahmood, A. Rupam;
- White, Martha
Publication type:
Article

On Generalized Bellman Equations and Temporal-Difference Learning.
Published in:
Journal of Machine Learning Research, 2018, v. 19, n. 26-55, p. 1
By:
- Yu, Huizhen;
- Mahmood, A. Rupam;
- Sutton, Richard S.
Publication type:
Article