In face-to-face conversation, communicators orchestrate multimodal contributions that meaningfully combine the linguistic resources of spoken language and the visuo-spatial affordances of gesture. In this paper, we characterise this meaningful combination in terms of the COHERENCE of gesture and speech. Descriptive analyses illustrate the diverse ways in which gesture interpretation can supplement and extend the interpretation of prior gestures and accompanying speech. We draw parallels with the inventory of COHERENCE RELATIONS found between successive sentences in discourse. In both domains, we suggest, interlocutors make sense of multiple communicative actions in combination by using these coherence relations to link the actions' interpretations into an intelligible whole. Descriptive analyses also emphasise the improvisation of gesture: the abstraction and generality of meaning in gesture allow communicators to interpret gestures in open-ended ways in new utterances and contexts. Here we draw parallels with interlocutors' reasoning about underspecified linguistic meanings in discourse. In both domains, we suggest, coherence relations facilitate meaning-making by RESOLVING the meaning of each communicative act through constrained inference over information made salient in the prior discourse. Our approach to gesture interpretation lays the groundwork for formal and computational models that go beyond previous approaches based on compositional syntax and semantics by better accounting for both the flexibility and the constraints found in the interpretation of speech and gesture in conversation. At the same time, it shows that gesture provides an important source of evidence for sharpening the general theory of coherence in communication.