Title | Predicting Saliency Beyond Pixels |
Publication Type | Dataset |
Year of Publication | 2014 |
Authors | Xu, J, Jiang, M, Wang, S, Kankanhalli, M, Zhao, Q |
Date Published | 01/2014 |
Abstract | Many previous models for predicting where people look in natural scenes have focused on pixel-level image attributes. To bridge the semantic gap between the predictive power of computational saliency models and human behavior, we propose a new saliency architecture that incorporates information at three layers: pixel-level image attributes, object-level attributes, and semantic-level attributes. Object- and semantic-level information is frequently ignored, or only a few sample object categories are discussed, in which case scaling to a large number of object categories is neither feasible nor neurally plausible. To address this problem, this work constructs a principled vocabulary of basic attributes to describe object- and semantic-level information, and is therefore not restricted to a limited number of object categories. We build a new dataset of 700 images with eye-tracking data from 15 viewers and annotation data for 5551 segmented objects with fine contours and 12 semantic attributes. Experimental results demonstrate the importance of object- and semantic-level information in the prediction of visual attention. |
URL | http://www.ece.nus.edu.sg/stfpage/eleqiz/predicting.html |
Citation Key | 390 |