CSE Colloquium: Understanding the Dynamic Visual World: from Motion to Semantics
Abstract: We live in a dynamic world, which is continuously in motion. As the famous psychologist James J. Gibson suggested, "We must perceive in order to move, but we must also move in order to perceive." On the one hand, the motion contained in videos, arising from camera movement, independently moving objects, and scene geometry, carries abundant information, revealing the structure and complexity of our dynamic visual world. For example, objects that are closer to the camera tend to move faster in the image plane. In this talk, Huaizu will discuss how to use this motion information as supervision to train deep neural networks, avoiding the tedious effort of collecting manual annotations. On the other hand, understanding the motion of its surroundings is also critical for an autonomous agent (e.g., a self-driving car) to move around safely. Huaizu will introduce an approach where the agent learns to solve multiple tasks together, instead of focusing on just a single one at a time. Such a holistic scene understanding approach leads to a more efficient, compact, and accurate 3D motion estimation model. Finally, Huaizu will introduce SuperSloMo, which can convert standard frame-rate video into arbitrarily slow-motion versions by estimating the motion of every single pixel. It helps people see movements that are otherwise hard to observe clearly, allowing them to record their life experiences more easily.
Biography: Huaizu Jiang is a fifth-year Ph.D. student in the College of Information and Computer Sciences at the University of Massachusetts Amherst. He has broad research interests, including computer vision, computational photography, natural language processing, and machine learning. His work, SuperSloMo, was recognized as one of the 10 coolest papers from CVPR 2018. He received an Adobe Fellowship and an NVIDIA Graduate Fellowship, both in 2019.
Event Contact: Robert Collins