Introduction
The 5th LSVOS challenge will be held in conjunction with ICCV 2023 in Paris, France. In this edition of the workshop and challenge, we are combining the classic YouTube-VOS benchmark with the newly introduced VOST dataset. VOST focuses on complex object transformations, such as egg cracking or molding of clay, which break the assumptions of existing methods and require rethinking their basic design principles. The combined challenge will be held in conjunction with the video instance segmentation and referring video object segmentation YouTube-VOS competitions. In addition, we will hold a series of talks by leading experts in video understanding. The workshop will culminate in a round table discussion, in which the speakers will debate the future of video object representations.
Dates
- May 26th: CodaLab websites open for registration. Training and validation data are released.
- Aug 1st - 10th: Test data is released and the submission of test results opens.
- Aug 17th: The final competition results are announced, and top teams are invited to give oral/poster presentations at our ICCV 2023 workshop.
- Oct 2nd: The workshop takes place in conjunction with ICCV in room S02, Paris Convention Center.
Invited Speakers
Kristen Grauman
Professor in the Department of Computer Science at the University of Texas at Austin
[Homepage]
Carl Vondrick
Associate Professor of Computer Science at Columbia University
[Homepage]
Cordelia Schmid
INRIA Research Director, Head of the THOTH project-team
[Homepage]
Adam Harley
Postdoc in Prof. Leonidas Guibas’ lab, at Stanford University
[Homepage]
Laura Leal-Taixé
Senior Research Manager at NVIDIA, Adjunct Professor at the Technical University of Munich (TUM)
[Homepage]
Thomas Kipf
Senior Research Scientist at Google Brain
[Homepage]
Benjamin Peters
Marie Curie Fellow at the University of Glasgow
[Email]
Workshop schedule
Oct 2, 2023, 8:30 - 17:20 GMT+2 @room S02
Zoom link (code: 720009)
Time (GMT+2) | Event | Speaker |
---|---|---|
8:30 | Opening remarks | Host |
8:40 | Invited Talk #1: VIS-à-MOT: A face-to-face of video tracking benchmarks | Tim Meinhardt (NVIDIA) |
9:10 | VOS Track introduction | Host |
9:20 | Video Object Segmentation under Transformations problem introduction | Host |
9:30 | VOS winning teams talks | Challenge participants |
10:00 | Invited Talk #2: All the Ways to Track Occluded Objects | Carl Vondrick (Columbia University) |
10:30 | Coffee Break | |
10:50 | Invited Talk #3: Dynamic object vision in humans and machines: Bridging human cognitive science and computer vision | Benjamin Peters (University of Glasgow and Columbia University) |
11:20 | Video Instance Segmentation track introduction | Host |
11:30 | VIS winning teams talks | Challenge participants |
12:00 | Invited Talk #4: TBD | Kristen Grauman (UT Austin and Meta) |
12:30 | Lunch Break | |
13:30 | Invited Talk #5: Large-Scale Fine-Grained Tracking | Adam Harley (Stanford University) |
14:00 | Referring VOS track introduction | Host |
14:10 | RVOS winning teams talks | Challenge participants |
14:40 | Invited Talk #6: Dense video object captioning | Cordelia Schmid (Inria and Google Research) |
15:10 | Coffee Break | |
15:30 | Invited Talk #7: TBD | Thomas Kipf (Google DeepMind) |
16:00 | Roundtable discussion | |
17:00 | Closing remarks | Host |
17:20 | Workshop ends | |
Organizers
Ning Xu Apple Inc. |
Pavel Tokmakov Toyota Research Institute |
Linjie Yang ByteDance Inc. |
Yuchen Fan Meta Reality Labs |
Jie Li NVIDIA |
Achal Dave Toyota Research Institute |
Adrien Gaidon Toyota Research Institute |
Joon-Young Lee Adobe Research |
Seonguk Seo SNU, Korea |