Code & Datasets – CVML @ NUS

Assembly101

4321 videos of people assembling and disassembling 101 toy vehicles. The multi-view sequences (8 static + 4 egocentric views) are annotated with 100K coarse and 1M fine-grained action segments and 18M 3D hand poses. Featured in our CVPR’22 paper “Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities”.

Tasty Videos

4022 videos of unique cooking recipes. The individual steps of each recipe text are temporally aligned to the video through annotations. Featured in our ICCV’19 paper “Zero-shot anticipation for instructional activities”.

Bonn Furniture Styles

~90k furniture images spanning 6 categories and 17 styles. Featured in our GCPR’18 paper “Learning Style Compatibility for Furniture”.