Much effort has been invested in developing snow models over several decades, resulting in a wide variety of empirical and physically based snow models. For the most part, these models are built on similar principles. The greatest differences are found in how each model parameterizes individual processes (e.g., surface albedo and snow compaction). Parameterization choices naturally span a wide range of complexities. In this study, we evaluate the performance of different snow model parameterizations for hydrological applications using an existing multimodel energy-balance framework and data from two well-instrumented alpine sites with seasonal snow cover. We also include two temperature-index snow models and an intensive, physically based multilayer snow model in our analyses. Our results show that snow mass observations provide useful information for evaluating the ability of a model to predict snowpack runoff, whereas snow depth data alone do not. For snow mass and runoff, the energy-balance models appear transferable between our two study sites, a behavior that is not observed for snow surface temperature predictions because turbulent heat transfer formulations are site-specific. Errors in the input and validation data, rather than model formulation, seem to be the greatest factor affecting model performance. The three model types show similar ability to reproduce daily observed snowpack runoff when appropriate model structures are chosen. Model complexity was not a determining factor in predicting daily snowpack mass and runoff reliably. Our study shows the usefulness of the multimodel framework for identifying appropriate models under given constraints such as data availability, properties of interest, and computational cost.