Two contrasting views of visual attention in scenes are the visual salience and the cognitive relevance hypotheses. They differ fundamentally in how they conceptualize the visuospatial representation over which attention is directed. According to the saliency model, this representation is image-based, whereas the cognitive relevance framework advocates an object-based representation. Previous research has shown that (1) viewers prefer to look at objects over background and that (2) the saliency model predicts human fixation locations significantly better than chance. However, it could be that saliency mainly acts through objects. To test this hypothesis, we investigated where people fixate within real objects and within saliency proto-objects. To this end, we recorded eye movements of human observers while they inspected photographs of natural scenes under different task instructions. We found a preferred viewing location (PVL) close to the center of objects within naturalistic scenes. Compared to the PVL for real objects, there was less evidence for a PVL for human fixations within saliency proto-objects. There was no evidence for a PVL when the analysis was restricted to saliency proto-objects that did not spatially overlap with annotated real objects. The results suggest that saccade targeting and, by inference, attentional selection in scenes are object-based.
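
As an illustration only, a within-object landing-position analysis of the kind summarized above might be sketched as follows. The sketch assumes that each object (or proto-object) is annotated as an axis-aligned bounding box `(left, top, width, height)` in pixels and that fixations are `(x, y)` pixel coordinates; these conventions and the function names `normalized_landing_position` and `landing_positions` are hypothetical and are not taken from the study itself.

```python
def normalized_landing_position(fix_x, fix_y, box):
    """Offset of a fixation from the box center, scaled by box size.

    Returns (dx, dy), where 0.0 is the center and -0.5 / +0.5 are the
    box edges. The bounding-box format is an assumption of this sketch.
    """
    left, top, width, height = box
    center_x = left + width / 2
    center_y = top + height / 2
    return ((fix_x - center_x) / width, (fix_y - center_y) / height)


def landing_positions(fixations, boxes):
    """Collect normalized landing positions for fixations inside any box.

    Each fixation is assigned to the first box that contains it;
    fixations that fall on background are skipped.
    """
    positions = []
    for fix_x, fix_y in fixations:
        for box in boxes:
            left, top, width, height = box
            inside_x = left <= fix_x <= left + width
            inside_y = top <= fix_y <= top + height
            if inside_x and inside_y:
                positions.append(normalized_landing_position(fix_x, fix_y, box))
                break
    return positions


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    boxes = [(0, 0, 100, 100)]
    fixations = [(50, 50), (75, 50), (500, 500)]
    print(landing_positions(fixations, boxes))
```

Under these assumptions, a distribution of normalized offsets that peaks near zero would be consistent with a PVL close to the object center, whereas a flat distribution would indicate no preferred location within the region.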