Image interpretation above and below the object level
Year of Publication
Ben-Yosef, G, Ullman, S
Computational models of vision have advanced rapidly in recent years, rivalling human-level performance in some areas. Much of the progress to date has focused on analysing the visual scene at the object level, that is, the recognition and localization of objects in the scene. Human understanding of images is richer and deeper, both ‘below’ the object level, such as identifying and localizing object parts and sub-parts, and ‘above’ the object level, such as identifying object relations, and agents with their actions and interactions. In both cases, understanding depends on recovering meaningful structures in the image, together with their components, properties and inter-relations, a process referred to here as ‘image interpretation’. In this paper, we describe recent directions, based on human and computer vision studies, towards human-like image interpretation beyond the reach of current schemes. We consider interpretation below the object level, as well as some aspects of interpretation at the level of meaningful configurations beyond the recognition of individual objects, in particular interactions between two people in close contact. In both cases, the recognition process depends on the detailed interpretation of so-called ‘minimal images’, and at both levels recognition depends on combining ‘bottom-up’ processing, proceeding from lower to higher levels of a processing hierarchy, with ‘top-down’ processing, proceeding from higher to lower stages of visual analysis.