News
- “The 5th Large-scale Video Object Segmentation Challenge” has started! It will be held in conjunction with ICCV 2023 in Paris, France! Call for participants!
- We have a joint challenge track on long video VIS with the Multiple Object Tracking and Segmentation in Complex Environments Workshop at ECCV 2022. Call for participation!
- Due to maintenance issues with the old CodaLab website, we have migrated the VOS and VIS evaluation servers of the 2019 challenge to the new CodaLab site. Details
What is YouTube-VOS
YouTube-VOS is the first large-scale benchmark that supports multiple video object segmentation tasks:
- Semi-supervised Video Object Segmentation
- Video Instance Segmentation
- Referring Video Object Segmentation
It also has the following features:
- 5000+ high-resolution YouTube videos
- 90+ semantic categories
- 7800+ unique objects
- 190k+ high-quality manual annotations
- 340+ minutes duration
Research papers
Please cite the following papers if you find our dataset useful.
Semi-supervised video object segmentation
- YouTube-VOS: A Large-Scale Video Object Segmentation Benchmark. arXiv 2018
- YouTube-VOS: Sequence-to-Sequence Video Object Segmentation. ECCV 2018
Video instance segmentation
Referring video object segmentation
Dataset examples