What is YouTube-VOS

YouTube-VOS is the first large-scale benchmark that supports multiple video object segmentation tasks:

  • Semi-supervised Video Object Segmentation
  • Video Instance Segmentation
  • Referring Video Object Segmentation

It also has the following features:

  • 5000+ high-resolution YouTube videos
  • 90+ semantic categories
  • 7800+ unique objects
  • 190k+ high-quality manual annotations
  • 340+ minutes of total duration

Research papers

Please cite the following papers if you find our dataset useful.

Semi-supervised video object segmentation

Video instance segmentation

Referring video object segmentation

Dataset examples