|Title||Image interpretation above and below the object level|
|Publication Type||CBMM Memos|
|Year of Publication||2018|
|Authors||Ben-Yosef, G, Ullman, S|
|Keywords||Interaction Recognition, minimal images, Social Interactions, Visual interpretation, visual recognition|
Computational models of vision have advanced rapidly in recent years, rivaling human-level performance in some areas. Much of the progress to date has focused on analyzing the visual scene at the object level: the recognition and localization of objects in the scene. Human understanding of images is richer and deeper, both ‘below’ the object level, such as identifying and localizing object parts and sub-parts, and ‘above’ the object level, such as identifying object relations, and agents with their actions and interactions. In both cases, understanding depends on recovering meaningful structures in the image, together with their components, properties, and inter-relations, a process referred to here as ‘image interpretation’.
In this paper we describe recent directions, based on human and computer vision studies, towards human-like image interpretation beyond the reach of current schemes. We consider interpretation below the object level, as well as some aspects of interpretation at the level of meaningful configurations beyond the recognition of individual objects, in particular interactions between two people in close contact. In both cases the recognition process depends on the detailed interpretation of so-called ‘minimal images’, and at both levels recognition depends on combining ‘bottom-up’ processing, proceeding from lower to higher levels of a processing hierarchy, with ‘top-down’ processing, proceeding from higher to lower stages of visual analysis.
- CBMM Funded