Acquiring scenery depth is a fundamental task in computer vision, with many applications in manufacturing, surveillance, or robotics relying on accurate scenery information. Time-of-flight cameras can provide depth information in real-time and overcome short-comings of traditional stereo analysis. However, they provide limited spatial resolution and sophisticated upscaling algorithms are sought after. In this paper, we present a sensor fusion approach to time-of-flight super resolution, based on the combination of depth and texture sources. Unlike other texture guided approaches, we interpret the depth upscaling process as a weighted energy optimization problem. Three different weights are introduced, employing different available sensor data. The individual weights address object boundaries in depth, depth sensor noise, and temporal consistency. Applied in consecutive order, they form three weighting strategies for time-of-flight super resolution. Objective evaluations show advantages in depth accuracy and for depth image based rendering compared with state-of-the-art depth upscaling. Subjective view synthesis evaluation shows a significant increase in viewer preference by a factor of four in stereoscopic viewing conditions. To the best of our knowledge, this is the first extensive subjective test performed on time-of-flight depth upscaling. Objective and subjective results proof the suitability of our approach to time-of-flight super resolution approach for depth scenery capture.