SYNPLAY: IMPORTING REAL-WORLD DIVERSITY FOR A SYNTHETIC HUMAN DATASET

Date

2024

Abstract

In response to the growing demand for large-scale training data, synthetic datasets have emerged as practical solutions. However, existing synthetic datasets often fall short of replicating the richness and diversity of real-world data. Synthetic Playground (SynPlay) is introduced as a new synthetic human dataset that aims to bring out the diversity of human appearance found in the real world. In this thesis, we focus on two factors to achieve a level of diversity not yet seen in previous works: i) realistic human motions and poses, and ii) multiple camera viewpoints of human instances. We first use a game engine and its library-provided elementary motions to create games in which virtual players can perform less-constrained, natural movements while following the game rules (i.e., rule-guided motion design as opposed to detail-guided design). We then augment the elementary motions with real human motions captured with a motion capture device. To render varied human appearances in the games from multiple viewpoints, we use seven virtual cameras encompassing ground and aerial views, capturing abundant aerial-vs-ground and dynamic-vs-static attributes of the scene. Through extensive and carefully designed experiments, we show that using SynPlay in model training leads to higher accuracy than existing synthetic datasets for human detection and segmentation. Moreover, the benefit of SynPlay becomes even greater for tasks in the data-scarce regime, such as few-shot and cross-domain learning. These results demonstrate that SynPlay can serve as an essential dataset with rich attributes of complex human appearances and poses, suitable for model pretraining.
