%0 Conference Proceedings %B Advances in Neural Information Processing Systems 30 %D 2017 %T Learning to See Physics via Visual De-animation %A Jiajun Wu %A Lu, Erika %A Kohli, Pushmeet %A William T. Freeman %A Joshua B. Tenenbaum %E I. Guyon %E U. V. Luxburg %E S. Bengio %E H. Wallach %E R. Fergus %E S. Vishwanathan %E R. Garnett %X
3D object reconstruction from a single image is a highly under-determined problem, requiring strong prior knowledge of plausible 3D shapes. This introduces challenges for learning-based approaches, as 3D object annotations in real images are scarce. Previous work chose to train on synthetic data with ground truth 3D information, but suffered from the domain adaptation issue when tested on real data. In this work, we propose an end-to-end trainable framework, sequentially estimating 2.5D sketches and 3D object shapes. Our disentangled, two-step formulation has three advantages. First, compared to full 3D shape, 2.5D sketches are much easier to recover from a 2D image, and to transfer from synthetic to real data. Second, for 3D reconstruction from the 2.5D sketches, we can easily transfer the learned model on synthetic data to real images, as rendered 2.5D sketches are invariant to object appearance variations in real images, including lighting, texture, etc. This further relieves the domain adaptation problem. Third, we derive differentiable projective functions from 3D shape to 2.5D sketches, making the framework end-to-end trainable on real images, requiring no real-image annotations. Our framework achieves state-of-the-art performance on 3D shape reconstruction.
%B Advances in Neural Information Processing Systems 30 %I Curran Associates, Inc. %C Long Beach, CA %P 540–550 %8 12/2017 %G eng %U http://papers.nips.cc/paper/6657-marrnet-3d-shape-reconstruction-via-25d-sketches.pdf
%0 Conference Proceedings %B Advances in Neural Information Processing Systems 30 %D 2017 %T Shape and Material from Sound %A Zhang, Zhoutong %A Qiujia Li %A Zhengjia Huang %A Jiajun Wu %A Joshua B. Tenenbaum %A William T. Freeman %E I. Guyon %E U. V. Luxburg %E S. Bengio %E H. Wallach %E R. Fergus %E S. Vishwanathan %E R. Garnett %X
What can we infer from hearing an object falling onto the ground? Based on knowledge of the physical world, humans are able to infer rich information from such limited data: rough shape of the object, its material, the height of falling, etc. In this paper, we aim to approximate such competency. We first mimic the human knowledge about the physical world using a fast physics-based generative model. Then, we present an analysis-by-synthesis approach to infer properties of the falling object. We further approximate human past experience by directly mapping audio to object properties using deep learning with self-supervision. We evaluate our method through behavioral studies, where we compare human predictions with ours on inferring object shape, material, and initial height of falling. Results show that our method achieves near-human performance, without any annotations.
%B Advances in Neural Information Processing Systems 30 %C Long Beach, CA %P 1278–1288 %8 12/2017 %G eng %U http://papers.nips.cc/paper/6727-shape-and-material-from-sound.pdf