CSE Colloquium: Designing Video Models for Human Behavior Understanding

Zoom Information: Join from PC, Mac, Linux, iOS or Android: https://psu.zoom.us/j/94128390729?pwd=UWU0Sk9DeFNyQWJMMHplamFnQkdCQT09 Password: 435568

Or iPhone one-tap (US Toll): +13017158592,94128390729# or +13126266799,94128390729#

Or Telephone: Dial:
+1 301 715 8592 (US Toll)
+1 312 626 6799 (US Toll)
+1 646 876 9923 (US Toll)
+1 253 215 8782 (US Toll)
+1 346 248 7799 (US Toll)
+1 669 900 6833 (US Toll)
Meeting ID: 941 2839 0729
Password: 435568
International numbers available: https://psu.zoom.us/u/aep2iTbjU

ABSTRACT: Many modern computer vision applications require extracting core attributes of human behavior such as attention, action, or intention. Extracting such behavioral attributes requires powerful video models that can reason about human behavior directly from raw video data. To design such models, we need to answer the following three questions: how do we (1) model videos, (2) learn from videos, and lastly, (3) use videos to predict human behavior?

In this talk, I will present a series of methods to answer each of these questions. First, I will introduce TimeSformer, the first convolution-free architecture for video modeling built exclusively with self-attention. It achieves the best reported numbers on major action recognition benchmarks at 1/10th of the cost of state-of-the-art 3D CNNs. Afterwards, I will present COBE, a new large-scale framework for learning contextualized object representations in settings involving human-object interactions. Our approach exploits automatically transcribed speech narrations from instructional YouTube videos, and it does not require manual annotations. Lastly, I will introduce a self-supervised learning approach for predicting a basketball player's future motion trajectory from an unlabeled collection of first-person basketball videos.

BIOGRAPHY: Gedas Bertasius is a postdoctoral researcher at Facebook AI working on computer vision and machine learning problems. His current research focuses on video understanding, first-person vision, and multi-modal deep learning. He received his Bachelor’s Degree in Computer Science from Dartmouth College, and a Ph.D. in Computer Science from the University of Pennsylvania. His recent work was nominated for the CVPR 2020 best paper award.

Event Contact: Robert Collins

About

The School of Electrical Engineering and Computer Science was created in the spring of 2015 to allow greater access to courses offered by both departments for undergraduate and graduate students in exciting collaborative research fields.

We offer B.S. degrees in electrical engineering, computer science, computer engineering and data science and graduate degrees (master's degrees and Ph.D.'s) in electrical engineering and computer science and engineering. EECS focuses on the convergence of technologies and disciplines to meet today’s industrial demands.
