We present a variant of the Q-learning algorithm with automatic control of the exploration rate by a competition scheme. The theoretical approach is accompanied by systematic simulations of a chaos control task. Finally, we give interpretations of the algorithm in the context of computational ecology and neural networks.
|Number of pages||4|
|Journal||Nonlinear Theory and Its Applications (NOLTA)|
|Publication status||Published - 1996|