The savanna ecosystem is one of the most dominant and complex terrestrial biomes, deriving from a distinct vegetative surface composed of co-dominant tree and grass populations. Although these two vegetation types co-exist functionally, demographically they are not static but change dynamically in response to environmental forces such as annual fire events and rainfall variability. Modelling savanna environments with the current generation of terrestrial biosphere models (TBMs) has presented many problems, particularly in describing fire frequency and intensity, phenology, leaf biochemistry of C3 and C4 photosynthetic vegetation, and root water uptake. To better understand why TBMs perform so poorly in savannas, we conducted a model inter-comparison of six TBMs and assessed their performance at simulating latent energy (LE) and gross primary productivity (GPP) for five savanna sites along a rainfall gradient in northern Australia. Performance in predicting LE and GPP was measured using an empirical benchmarking system, which ranks models by their ability to utilise meteorological driving information to predict the fluxes. On average, the TBMs performed as well as a multi-linear regression of the fluxes against solar radiation, temperature and vapour pressure deficit, but were outperformed by a more complicated nonlinear response model that also included the leaf area index (LAI). This indicates that the TBMs are not utilising their input information effectively in determining savanna LE and GPP, and highlights that savanna dynamics cannot be calibrated into models and that there are problems in underlying model processes. We identified key weaknesses in the models' ability to simulate savanna fluxes and their seasonal variation, related to root water uptake and to how the models represent vegetation. We frame these weaknesses as three critical areas for development. 
First, prescribed tree-rooting depths must be deep enough to allow the extraction of deep soil water stores, which maintain photosynthesis and transpiration during the dry season. Second, models must treat grasses as a co-dominant interface for water and carbon exchange, rather than as secondary to trees. Third, models need a dynamic representation of LAI that captures the phenology of savanna vegetation and its response to interannual rainfall variability. We believe this study is the first to assess how well TBMs simulate savanna ecosystems, and that these results will be used to improve the representation of savanna ecosystems in future global climate model studies.
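
The idea behind the multi-linear regression benchmark can be illustrated with a minimal sketch. It is not the benchmarking system used in the study: the data are synthetic, the regression is fitted and evaluated on the same samples, root-mean-square error is assumed as the only metric, and all function and variable names (`fit_linear_benchmark`, `le_model`, and so on) are hypothetical. It shows only how a flux regressed on solar radiation, temperature and vapour pressure deficit yields a reference against which a model's prediction can be compared.

```python
import numpy as np


def fit_linear_benchmark(drivers, flux):
    """Least-squares fit of flux = b0 + b1*x1 + b2*x2 + ...; returns [b0, b1, ...]."""
    design = np.column_stack([np.ones(len(flux)), drivers])
    coef, *_ = np.linalg.lstsq(design, flux, rcond=None)
    return coef


def predict_linear_benchmark(coef, drivers):
    """Apply fitted coefficients to a (n_samples, n_drivers) array."""
    design = np.column_stack([np.ones(drivers.shape[0]), drivers])
    return design @ coef


def rmse(pred, obs):
    """Root-mean-square error between prediction and observation."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def beats_benchmark(model_pred, bench_pred, obs):
    """True if the model's RMSE is no worse than the benchmark's."""
    return rmse(model_pred, obs) <= rmse(bench_pred, obs)


# Synthetic illustration only: none of these numbers come from the study.
rng = np.random.default_rng(0)
n = 500
sw = rng.uniform(0.0, 1000.0, n)   # solar radiation (W m-2), assumed range
tair = rng.uniform(15.0, 40.0, n)  # air temperature (deg C), assumed range
vpd = rng.uniform(0.5, 5.0, n)     # vapour pressure deficit (kPa), assumed range
drivers = np.column_stack([sw, tair, vpd])

# Made-up "observed" latent energy flux with a linear dependence plus noise.
le_obs = 0.3 * sw + 2.0 * tair - 10.0 * vpd + rng.normal(0.0, 20.0, n)

coef = fit_linear_benchmark(drivers, le_obs)
le_bench = predict_linear_benchmark(coef, drivers)

# Hypothetical TBM output that uses only part of the driver information.
le_model = 0.25 * sw + rng.normal(0.0, 40.0, n)

print("benchmark RMSE:", rmse(le_bench, le_obs))
print("model RMSE:", rmse(le_model, le_obs))
print("model beats benchmark:", beats_benchmark(le_model, le_bench, le_obs))
```

In this toy setting the hypothetical model ignores temperature and vapour pressure deficit, so the regression, which uses all three drivers, serves as a bar the model fails to clear. A model that only matches such a benchmark is extracting no more from its inputs than a simple statistical fit would.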