Policy learning for time-bounded reachability in Continuous-Time Markov Decision Processes via doubly-stochastic gradient ascent

Ezio Bartocci, Luca Bortolussi, Tomás Brázdil, Dimitrios Milios, Guido Sanguinetti

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution
