OpenDance: Multimodal Controllable 3D Dance Generation Using Large-scale Internet Data

Anonymous Submission
Project Teaser Image

Overview of the OpenDance framework.


Abstract

Music-driven dance generation offers significant creative potential yet faces considerable challenges. The absence of fine-grained multimodal data and the difficulty of flexible multi-conditional generation limit previous works in generation controllability and diversity. In this paper, we build OpenDance5D, an extensive human dance dataset comprising over 101 hours of data across 14 distinct genres. Each sample has five modalities to facilitate robust cross-modal learning: RGB video, audio, 2D keypoints, 3D motion, and fine-grained textual descriptions from human artists. Furthermore, we propose OpenDanceNet, a unified masked-modeling framework for controllable dance generation conditioned on music and arbitrary combinations of text prompts, keypoints, or character positioning. Comprehensive experiments demonstrate that OpenDanceNet achieves high-fidelity generation and flexible controllability.



OpenDance5D Dataset


Data Distribution of OpenDance5D in terms of (a) dancers and (b) genres. The violin plot shows the number of samples per dancer in raw video data, while the sunburst chart illustrates the distribution of samples across 14 dance genres.



OpenDanceNet


The masked-modeling-based dance generation framework OpenDanceNet. During training, multiple user-customized conditions are provided to generate controllable dance results. Transformer-based diffusion networks are employed as the body diffuser and the hand diffuser, respectively.
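Conceptually, each training step masks part of the motion token sequence and asks the diffusers to recover it given whatever conditions the user supplies (music always; text or keypoints optionally). The sketch below illustrates only this data-flow idea in NumPy; all names, shapes, the mask ratio, and the `MASK_ID` sentinel are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

MASK_ID = -1  # hypothetical sentinel standing in for the [MASK] token


def mask_motion_tokens(tokens, mask_ratio, rng):
    """Replace a random fraction of motion tokens with MASK_ID.

    Mirrors the masked-modeling training objective: the network is asked
    to recover the masked positions given the conditioning signals.
    """
    tokens = np.asarray(tokens).copy()
    n_mask = int(round(mask_ratio * tokens.size))
    idx = rng.choice(tokens.size, size=n_mask, replace=False)
    tokens[idx] = MASK_ID
    return tokens, idx


def assemble_conditions(music_feat, text_feat=None, keypoint_feat=None):
    """Concatenate whichever user-customized conditions are present.

    Music is always required; text and keypoint features are optional,
    matching the arbitrary-combination conditioning described above.
    """
    optional = [f for f in (text_feat, keypoint_feat) if f is not None]
    return np.concatenate([music_feat] + optional, axis=-1)


rng = np.random.default_rng(0)
motion = np.arange(16)                       # toy sequence of motion token ids
masked, idx = mask_motion_tokens(motion, 0.5, rng)
cond = assemble_conditions(np.ones(8), text_feat=np.zeros(4))
print(int((masked == MASK_ID).sum()), cond.shape)
```

In the actual framework, the masked sequence and the condition vector would feed the body and hand diffusers; this sketch only shows how masking and flexible condition assembly compose.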


Dataset Demos

Visualization case #1


Visualization case #2


Visualization case #3


Visualization case #4

Generation Performance

Coming soon...

BibTeX


      @misc{zhang2025opendancemultimodalcontrollable3d,
        title={OpenDance: Multimodal Controllable 3D Dance Generation Using Large-scale Internet Data}, 
        author={Jinlu Zhang and Zixi Kang and Yizhou Wang},
        year={2025},
        eprint={2506.07565},
        archivePrefix={arXiv},
        primaryClass={cs.CV},
        url={https://arxiv.org/abs/2506.07565}, 
      }