Self-Organization of Exploratory Control

Georg Martius, Frank Hesse, J. Michael Herrmann

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

We study homeokinesis as a viable approach to active learning. The approach comprises the simultaneous maximization of the sensitivity of an agent with respect to its stimuli and of the predictability of these stimuli. Sensitization leads to exploratory behavior, which is counterbalanced by the requirement of predictability by means of an internal model. As a theoretically provable consequence, the motor activity becomes distributed over many degrees of freedom of a robotic system. The homeokinetic approach leads to flexible, versatile and body-specific behaviors while starting from scratch, minimizing the assumptions on rewards, gains, and initializations (see robot.informatik.uni-leipzig.de).

Among a number of implementations of the learning scheme we focus here on a simulated spherical robot, see Fig. 1. Initially the robot does not move, but later a regular rolling behavior is executed, which breaks down infrequently to give way to different movement patterns.

As a result of the homeokinetic learning rule, a flexible coordination of movements is observed to appear in more complex robot morphologies. External reward-like information can be used to modulate the learning process such that exploration becomes biased towards desired behaviors.

Further applications of homeokinesis include control of myoelectric prostheses and the interaction of adaptive agents. Of particular interest is also the composition of more complex goal-directed behaviors based on elementary sensorimotor relations that are extracted from the waxing and waning of the emergent behaviors during homeokinetic learning.

On the slower time scale of the learning of the model, however, these modes tend to be destabilized again, such that a number of behaviors are sequentially activated and learned, limited only by the complexity of the internal model. The extractable behaviors, negotiated between the dynamics of the robot and the internal model, are well-suited as a set of elementary behaviors for use in symbolic higher-order learning. Eventually, by a concatenation of several elementary behaviors, regions in the environment become reachable that are unlikely to be found by random or quasi-random exploration.

Fig. 1. Spherical robot exploring behavioral modes. The three internal masses are each actuated along their own axis. The axis orientations serve as sensors. (a) Typical behaviors: rolling modes about each of the three internal axes (A-C), keeping one weight still, and intermittent rotation about any other (unstable) axis (D); (b) Screenshot taken from the computer simulation; (c) Amplitudes of the motor value oscillations (y1, y2, y3) and the objective function E_TLE (time loop error), averaged over 10 seconds and scaled for better visibility.
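The trade-off between sensitivity and predictability described in the abstract can be illustrated with a small sketch of a time loop error for a one-dimensional sensorimotor loop. The abstract does not state the formula. The form used here (prediction error divided by the sensitivity of the loop, then squared) follows a formulation commonly found in the homeokinesis literature and should be read as an assumption. The function name `time_loop_error`, the tanh controller, the linear model, and all parameter names are hypothetical and chosen only for illustration.

```python
import math


def time_loop_error(x, x_next, c, h, a, b):
    """Illustrative time loop error for a scalar sensorimotor loop.

    Assumed setup (not taken from the abstract):
      controller:      y = tanh(c * x + h)
      internal model:  x_pred = a * y + b
    """
    y = math.tanh(c * x + h)            # motor value produced by the controller
    xi = x_next - (a * y + b)           # prediction error of the internal model
    sensitivity = a * (1.0 - y * y) * c # derivative of x_pred with respect to x
    # Small prediction error and large sensitivity both reduce the error.
    return (xi / sensitivity) ** 2


# Same prediction error, two different controller gains.
low_gain = time_loop_error(x=0.0, x_next=0.1, c=1.0, h=0.0, a=1.0, b=0.0)
high_gain = time_loop_error(x=0.0, x_next=0.1, c=2.0, h=0.0, a=1.0, b=0.0)
print(low_gain > high_gain)
```

In this sketch, a more sensitive loop yields a lower error for the same prediction error, while a poorer prediction raises it, which is one way to read the balance between exploration and predictability that the abstract describes.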
Original language: English
Number of pages: 2
Journal: Frontiers in Computational Neuroscience
Issue number: 30
Publication status: Published - 2010
