%0 Generic
%D 2014
%T Predicting Saliency Beyond Pixels
%A Juan Xu
%A Ming Jiang
%A Shuo Wang
%A Mohan Kankanhalli
%A Qi Zhao
%X <p>Many previous models for predicting where people look in natural scenes have focused on pixel-level image attributes. To bridge the semantic gap between the predictive power of computational saliency models and human behavior, we propose a new saliency architecture that incorporates information at three layers: pixel-level image attributes, object-level attributes, and semantic-level attributes. Object- and semantic-level information is frequently ignored, or is discussed only for a few sample object categories, an approach for which scaling to a large number of object categories is neither feasible nor neurally plausible. To address this problem, this work constructs a principled vocabulary of basic attributes to describe object- and semantic-level information, so that the model is not restricted to a limited number of object categories. We build a new dataset of 700 images with eye-tracking data from 15 viewers and annotation data for 5551 segmented objects with fine contours and 12 semantic attributes. Experimental results demonstrate the importance of object- and semantic-level information in predicting visual attention.</p>
%8 01/2014
%U http://www.ece.nus.edu.sg/stfpage/eleqiz/predicting.html