Many of the most interesting and salient activities that humans perform involve manipulating objects, which frequently involves using hands to grasp and move objects. Unfortunately, object detectors tend to fail at precisely this critical juncture where the salient part of the activity occurs because the hand occludes the object being grasped. At the same time, while a hand is manipulating an object, the hand is being significantly deformed making it more difficult to recognize. We are addressing both of these problems simultaneously, jointly tracking hands and objects while determining their relationships and the events they participate in.