TY  - CONF
T2  - 16th IFAC World Congress
Y1  - 2005///
UR  - http://www.ifac-papersonline.net/Detailed/28368.html
A1  - Patrinos, Panagiotis
A1  - Sarimveis, Haralambos
ID  - eprints1035
N2  - This paper presents a neuro-dynamic programming methodology for the control of Markov decision processes. The proposed method can be considered a variant of optimistic policy iteration, in which radial basis function (RBF) networks provide a compact representation of the cost-to-go function and λ-LSPE is used for policy evaluation. We also emphasize the reformulation of the Bellman equation around the post-decision state, which circumvents the calculation of the expectation. The proposed algorithm is applied to a retailer-inventory management problem.
TI  - An RBF based neuro-dynamic approach for the control of stochastic dynamic systems
AV  - none
M2  - Prague, Czech Republic
ER  -