Introduction
The 5th LSVOS challenge will be held in conjunction with ICCV 2023 in Paris, France. In this edition of the workshop and challenge, we are combining the classic YouTube-VOS benchmark with the newly introduced VOST dataset. VOST focuses on complex object transformations, such as egg cracking or molding of clay, which break the assumptions of existing methods and require rethinking their basic design principles. The combined challenge will be held in conjunction with the video instance segmentation and referring video object segmentation YouTube-VOS competitions. In addition, we will hold a series of talks by leading experts in video understanding. The workshop will culminate in a round table discussion, in which the speakers will debate the future of video object representations.
Dates
- May 26th: CodaLab websites open for registration. Training and validation data are released.
- Aug 1st - 10th: Test data is released and the submission of test results opens.
- Aug 17th: The final competition results are announced, and top teams are invited to give oral/poster presentations at our ICCV 2023 workshop.
- Oct 2nd: The workshop takes place in conjunction with ICCV in room S02, Paris Convention Center.
Invited Speakers
Kristen Grauman
Professor in the Department of Computer Science at the University of Texas at Austin
[Homepage]
Carl Vondrick
Associate Professor of Computer Science at Columbia University
[Homepage]
Cordelia Schmid
INRIA Research Director, Head of the THOTH project-team
[Homepage]
Adam Harley
Postdoc in Prof. Leonidas Guibas’ lab, at Stanford University
[Homepage]
Laura Leal-Taixé
Senior Research Manager at NVIDIA, Adjunct Professor at the Technical University of Munich (TUM)
[Homepage]
Thomas Kipf
Senior Research Scientist at Google Brain
[Homepage]
Benjamin Peters
Marie Curie Fellow at the University of Glasgow
[Email]
Workshop schedule
Oct 2, 2023, 8:30 - 17:20 GMT+2 @room S02
Zoom link (code: 720009)
Time (GMT+2) | Event | Speaker |
---|---|---|
8:30 | Opening remarks | Host |
8:40 | Invited Talk #1: VIS-à-MOT: A face-to-face of video tracking benchmarks | Tim Meinhardt (NVIDIA) |
9:10 | VOS Track introduction | Host |
9:20 | Video Object Segmentation under Transformations problem introduction | Host |
9:30 | VOS winning teams talks | Challenge participants |
10:00 | Invited Talk #2: All the Ways to Track Occluded Objects | Carl Vondrick (Columbia University) |
10:30 | Coffee Break | |
10:50 | Invited Talk #3: Dynamic object vision in humans and machines: Bridging human cognitive science and computer vision | Benjamin Peters (University of Glasgow and Columbia University) |
11:20 | Video Instance Segmentation track introduction | Host |
11:30 | VIS winning teams talks | Challenge participants |
12:00 | Invited Talk #4: TBD | Kristen Grauman (UT Austin and Meta) |
12:30 | Lunch Break | |
13:30 | Invited Talk #5: Large-Scale Fine-Grained Tracking | Adam Harley (Stanford University) |
14:00 | Referring VOS track introduction | Host |
14:10 | RVOS winning teams talks | Challenge participants |
14:40 | Invited Talk #6: Dense video object captioning | Cordelia Schmid (Inria and Google Research) |
15:10 | Coffee Break | |
15:30 | Invited Talk #7: TBD | Thomas Kipf (Google DeepMind) |
16:00 | Roundtable discussion | |
17:00 | Closing remarks | Host |
17:20 | Workshop ends | |
Organizers
Ning Xu Apple Inc. |
Pavel Tokmakov Toyota Research Institute |
Linjie Yang ByteDance Inc. |
Yuchen Fan Meta Reality Labs |
Jie Li NVIDIA |
Achal Dave Toyota Research Institute |
Adrien Gaidon Toyota Research Institute |
Joon-Young Lee Adobe Research |
Seonguk Seo SNU, Korea |