This work applies reinforcement learning to the optimal resistive control of a point absorber. The model-free Q-learning algorithm is selected to maximise energy absorption in each sea state. The controller makes step changes to its damping and observes the outcome of each: a penalty for excessive motions, or a reward for a gain in absorbed power. Because gravity waves are broadly periodic, the absorbed power is averaged over a time horizon lasting several wave periods. The performance of the algorithm is assessed through numerical simulation of a point absorber moving in heave in both regular and irregular waves. The algorithm is found to converge towards the optimal controller damping in each sea state. Additionally, because the approach is model-free, the algorithm can adapt to changes in the device hydrodynamics over time and is not biased by modelling errors.
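The learning loop described above can be illustrated with a minimal tabular Q-learning sketch. It is not the implementation used in this work, and everything numerical in it is an assumption made for illustration: the non-dimensional device parameters, the damping grid `DAMPINGS`, the amplitude limit `X_MAX`, the penalty value, and the learning settings (`alpha`, `gamma`, `epsilon`). The environment is a linear, single-degree-of-freedom heaving body in a regular wave, for which the steady-state mean absorbed power is available in closed form; that expression stands in for the power averaged over several wave periods. The state is the current damping setting, and the actions are a step down, no change, or a step up.

```python
import math
import random

# Hypothetical, non-dimensional parameters (assumptions, not from the paper).
OMEGA = 1.0       # regular-wave angular frequency
MASS = 1.0        # body mass plus added mass
STIFFNESS = 4.0   # hydrostatic stiffness
B_RAD = 0.5       # radiation damping
F_EXC = 1.0       # excitation force amplitude
X_MAX = 0.25      # assumed limit on heave amplitude
PENALTY = -1.0    # assumed reward when the limit is exceeded

DAMPINGS = [0.5 * k for k in range(1, 17)]  # candidate PTO dampings 0.5 ... 8.0
ACTIONS = (-1, 0, 1)                        # step down, hold, step up


def heave_amplitude(b_pto):
    """Steady-state heave amplitude of the linear toy model."""
    detuning = STIFFNESS - MASS * OMEGA ** 2
    return F_EXC / math.hypot(detuning, OMEGA * (B_RAD + b_pto))


def reward(b_pto):
    """Mean absorbed power, or a penalty if the motion limit is exceeded."""
    amp = heave_amplitude(b_pto)
    if amp > X_MAX:
        return PENALTY
    # Closed-form mean of b_pto * velocity**2 over whole wave periods.
    return 0.5 * b_pto * (OMEGA * amp) ** 2


def train(episodes=3000, steps=50, alpha=0.5, gamma=0.5, epsilon=0.3, seed=0):
    """Epsilon-greedy tabular Q-learning over the damping grid."""
    rng = random.Random(seed)
    n_states, n_actions = len(DAMPINGS), len(ACTIONS)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(n_states)  # start each episode at a random damping
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)                     # explore
            else:
                a = max(range(n_actions), key=lambda i: q[s][i])  # exploit
            # Step change in damping, clipped to the grid.
            s_next = min(max(s + ACTIONS[a], 0), n_states - 1)
            r = reward(DAMPINGS[s_next])
            # Standard Q-learning update.
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
            s = s_next
    return q


def greedy_damping(q, start, steps=50):
    """Follow the learned greedy policy and return the final damping."""
    s = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        s = min(max(s + ACTIONS[a], 0), len(DAMPINGS) - 1)
    return DAMPINGS[s]


if __name__ == "__main__":
    q_table = train()
    print(greedy_damping(q_table, start=0))
    print(greedy_damping(q_table, start=len(DAMPINGS) - 1))
```

For this toy model the unconstrained optimum of a purely resistive controller is sqrt(B_RAD^2 + ((STIFFNESS - MASS*OMEGA^2)/OMEGA)^2), roughly 3.04 with the assumed values, so the greedy policy would be expected to settle near the 3.0 grid point from either end of the grid. The agent never uses this expression; it only sees the reward returned after each step change, which is what allows the same loop to be attached to a time-domain simulation or a physical device whose hydrodynamics are not known exactly.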