A Course in Reinforcement Learning

Rating: 5.0 (1 review)

Dimitri Bertsekas

June 2023 · Athena Scientific

Ebook · 421 pages

About this ebook

These lecture notes were prepared for use in the 2023 ASU research-oriented course on Reinforcement Learning (RL) that I have offered in each of the last five years. Their purpose is to give an overview of the RL methodology, particularly as it relates to problems of optimal and suboptimal decision and control, as well as discrete optimization.

There are two major methodological RL approaches: approximation in value space, where we approximate in some way the optimal value function, and approximation in policy space, whereby we construct a (generally suboptimal) policy by using optimization over a suitably restricted class of policies. The lecture notes focus primarily on approximation in value space, with limited coverage of approximation in policy space. However, they are structured so that they can be easily supplemented by an instructor who wishes to go into approximation in policy space in greater detail, using any of a number of available sources, including the author's 2019 RL book.

While in these notes we deemphasize mathematical proofs, there is considerable related analysis, which supports our conclusions and can be found in the author's recent RL and DP books. These books also contain additional material on off-line training of neural networks, on the use of policy gradient methods for approximation in policy space, and on aggregation.

About the author

Dimitri Bertsekas' undergraduate studies were in engineering at the National Technical University of Athens, Greece. He obtained his MS in electrical engineering at the George Washington University, Washington, D.C., in 1969, and his Ph.D. in system science in 1971 at the Massachusetts Institute of Technology (M.I.T.).

Dr. Bertsekas has held faculty positions with the Engineering-Economic Systems Dept., Stanford University (1971-1974) and the Electrical Engineering Dept. of the University of Illinois, Urbana (1974-1979). From 1979 to 2019 he was with the Electrical Engineering and Computer Science Department of M.I.T., where he served as McAfee Professor of Engineering. Since 2019 he has been Fulton Professor of Computational Decision Making and a full-time faculty member at the School of Computing and Augmented Intelligence at Arizona State University (ASU), Tempe. He has served as a consultant to various private companies, and as editor for several scientific journals. In 1995 he founded a publishing company, Athena Scientific, which has published, among others, all of his books since that time. In 2023 he was appointed Chief Scientific Advisor of Bayforest Technologies, a London-based quantitative investment company.

Professor Bertsekas' research spans several fields, including optimization, control, large-scale computation, reinforcement learning, and artificial intelligence, and is closely tied to his teaching and book authoring activities. He has written numerous research papers, and twenty books and research monographs, several of which are used as textbooks in MIT and ASU classes.

Professor Bertsekas was awarded the INFORMS 1997 Prize for Research Excellence in the Interface Between Operations Research and Computer Science for his book "Neuro-Dynamic Programming", the 2001 ACC John R. Ragazzini Education Award, the 2009 INFORMS Expository Writing Award, the 2014 ACC Richard E. Bellman Control Heritage Award for "contributions to the foundations of deterministic and stochastic optimization-based methods in systems and control," the 2014 Khachiyan Prize for Life-Time Accomplishments in Optimization, the SIAM/MOS 2015 George B. Dantzig Prize, and the 2022 IEEE Control Systems Award. Together with his coauthor John Tsitsiklis, he was awarded the 2018 INFORMS John von Neumann Theory Prize, for the contributions of the research monographs "Parallel and Distributed Computation" and "Neuro-Dynamic Programming". In 2001, he was elected to the United States National Academy of Engineering for "pioneering contributions to fundamental research, practice and education of optimization/control theory, and especially its application to data communication networks."

Dr. Bertsekas' recent books are "Introduction to Probability: 2nd Edition" (2008), "Convex Optimization Theory" (2009), "Dynamic Programming and Optimal Control," Vol. I (2017) and Vol. II (2012), "Convex Optimization Algorithms" (2015), "Nonlinear Programming" (2016), "Reinforcement Learning and Optimal Control" (2019), "Rollout, Policy Iteration, Distributed Reinforcement Learning" (2020), "Abstract Dynamic Programming" (2022, 3rd edition), "Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control" (2022), and "A Course in Reinforcement Learning" (2023), all published by Athena Scientific.