Published on Fri Jun 25 2021

Transflower: probabilistic autoregressive dance generation with multimodal attention

Guillermo Valle-Pérez, Gustav Eje Henter, Jonas Beskow, André Holzapfel, Pierre-Yves Oudeyer, Simon Alexanderson

Abstract

Dance requires skillful composition of complex movements that follow the rhythmic, tonal and timbral features of music. Formally, generating dance conditioned on a piece of music can be expressed as a problem of modelling a high-dimensional continuous motion signal, conditioned on an audio signal. In this work we make two contributions to tackle this problem. First, we present a novel probabilistic autoregressive architecture that models the distribution over future poses with a normalizing flow conditioned on previous poses as well as music context, using a multimodal transformer encoder. Second, we introduce the currently largest 3D dance-motion dataset, obtained with a variety of motion-capture technologies and including both professional and casual dancers. Using this dataset, we compare our new model against two baselines, via objective metrics and a user study, and show that both the ability to model a probability distribution and the ability to attend over a large motion and music context are necessary to produce interesting, diverse, and realistic dance that matches the music.
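To make the architecture description above concrete, here is a minimal sketch, assuming PyTorch and illustrative module sizes, of an autoregressive model in this spirit: a transformer encoder attends over past-pose and music tokens, and its pooled output conditions an affine-coupling normalizing flow over the next pose. This is an illustration of the idea, not the authors' implementation; all class names, dimensions, and the mean-pooling of the context are assumptions.

```python
# Illustrative sketch (not the authors' code): a transformer encoder over
# motion + music tokens conditions a normalizing flow over the next pose.
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    """One affine coupling layer: half the pose dims are rescaled and shifted
    using parameters predicted from the other half plus the context vector."""
    def __init__(self, pose_dim, ctx_dim, hidden=256):
        super().__init__()
        self.half = pose_dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half + ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (pose_dim - self.half)),
        )

    def forward(self, x, ctx):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        log_s, t = self.net(torch.cat([x1, ctx], dim=-1)).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)                    # keep scales well-behaved
        y2 = x2 * torch.exp(log_s) + t
        return torch.cat([x1, y2], dim=-1), log_s.sum(dim=-1)

    def inverse(self, y, ctx):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        log_s, t = self.net(torch.cat([y1, ctx], dim=-1)).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)
        x2 = (y2 - t) * torch.exp(-log_s)
        return torch.cat([y1, x2], dim=-1)

class TransflowerSketch(nn.Module):
    def __init__(self, pose_dim=69, audio_dim=80, d_model=256, n_flows=4):
        super().__init__()
        self.pose_in = nn.Linear(pose_dim, d_model)    # motion tokens
        self.audio_in = nn.Linear(audio_dim, d_model)  # music tokens
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.flows = nn.ModuleList(
            ConditionalAffineCoupling(pose_dim, d_model) for _ in range(n_flows))
        self.base = torch.distributions.Normal(0.0, 1.0)
        self.pose_dim = pose_dim

    def context(self, past_poses, music):
        # Multimodal context: concatenate projected motion and music tokens,
        # let the encoder attend across both, then pool into one vector.
        tokens = torch.cat([self.pose_in(past_poses), self.audio_in(music)], dim=1)
        return self.encoder(tokens).mean(dim=1)

    def log_prob(self, next_pose, past_poses, music):
        ctx = self.context(past_poses, music)
        z = next_pose
        log_det = next_pose.new_zeros(next_pose.shape[0])
        for flow in self.flows:
            z, ld = flow(z, ctx)
            z = z.flip(-1)            # vary which dims the next layer transforms
            log_det = log_det + ld
        return self.base.log_prob(z).sum(dim=-1) + log_det

    @torch.no_grad()
    def sample(self, past_poses, music):
        ctx = self.context(past_poses, music)
        z = torch.randn(past_poses.shape[0], self.pose_dim, device=past_poses.device)
        for flow in reversed(self.flows):
            z = flow.inverse(z.flip(-1), ctx)
        return z
```

Under these assumptions, training would maximise log_prob on ground-truth next poses, and generation would proceed autoregressively, appending each sampled pose to the motion context before predicting the next one.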

Mon Nov 11 2019
Machine Learning
Generative Autoregressive Networks for 3D Dancing Move Synthesis from Music
This paper proposes a framework which is able to generate a sequence of three-dimensional human dance poses for a given piece of music. The proposed framework consists of three components: a music feature encoder, a pose generator, and a music genre classifier.
Sun Feb 02 2020
Computer Vision
Music2Dance: DanceNet for Music-driven Dance Generation
Synthesizing human motions from music, i.e., music to dance, is appealing and attracts a lot of research interest. In this paper, we propose a novel autoregressive generative model, DanceNet, to take the style, rhythm and melody of music as the control signals.
Tue Aug 18 2020
Computer Vision
Learning to Generate Diverse Dance Motions with Transformer
Thu Jun 11 2020
Machine Learning
Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning
Dancing to music has been one of humanity's innate abilities since ancient times. In machine learning research, synthesizing dance movements from music is a challenging problem. This paper formalizes music-conditioned dance generation as a sequence-to-sequence learning problem.
Thu Jan 21 2021
Computer Vision
Learn to Dance with AIST++: Music Conditioned 3D Dance Generation
In this paper, we present a transformer-based learning framework for 3D dance generation conditioned on music. The critical components include a deep cross-modal transformer, which effectively learns the correlation between music and dance motion.
Thu Mar 18 2021
Artificial Intelligence
DanceNet3D: Music Based Dance Generation with Parametric Motion Transformer
DanceNet3D first generates key poses on beats of the given music and then predicts the in-between motion curves. The decoders are constructed on MoTrans, a transformer tailored for motion generation.
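A rough sketch of the two-stage idea described in this snippet, assuming librosa for beat tracking, a placeholder key_pose_model, and plain linear interpolation standing in for the learned in-between motion curves and MoTrans decoders:

```python
# Rough two-stage sketch (assumptions throughout): 1) place key poses on
# musical beats, 2) fill in the motion between them.
import numpy as np
import librosa

def generate_dance(audio_path, key_pose_model, pose_dim=69, fps=60):
    # Stage 0: locate beats in the music with librosa's standard beat tracker.
    y, sr = librosa.load(audio_path)
    _, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
    beat_times = librosa.frames_to_time(beat_frames, sr=sr)

    # Stage 1: one key pose per beat. key_pose_model is a placeholder that
    # maps the audio and a beat time to a pose vector of length pose_dim.
    key_poses = np.stack([key_pose_model(y, sr, t) for t in beat_times])

    # Stage 2: in-betweening. Linear interpolation is used here only as a
    # stand-in for the learned motion-curve prediction.
    frame_times = np.arange(0.0, beat_times[-1], 1.0 / fps)
    motion = np.empty((len(frame_times), pose_dim))
    for d in range(pose_dim):
        motion[:, d] = np.interp(frame_times, beat_times, key_poses[:, d])
    return frame_times, motion
```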
Wed Dec 12 2018
Neural Networks
A Style-Based Generator Architecture for Generative Adversarial Networks
We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes and stochastic variation in the generated images.
Fri Sep 28 2018
Machine Learning
Large Scale GAN Training for High Fidelity Natural Image Synthesis
We train Generative Adversarial Networks at the largest scale yet attempted. We find that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick". Our modifications lead to models which set the new state of the art in class-conditional image synthesis.
Sun Apr 08 2018
Artificial Intelligence
DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
A longstanding goal in character animation is to combine data-driven specification of behavior with a system that can execute a similar behavior in a physical simulation. We show that well-known reinforcement learning (RL) methods can be adapted to learn robust control policies.
Mon Jun 12 2017
NLP
Attention Is All You Need
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
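For reference, the scaled dot-product attention at the core of the Transformer fits in a few lines; this is a single-head, unmasked NumPy sketch, not the full multi-head architecture:

```python
# Scaled dot-product attention (single head, no masking), minimal sketch.
import numpy as np

def attention(Q, K, V):
    # Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted sum of values
```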
Thu Dec 05 2019
Machine Learning
Normalizing Flows for Probabilistic Modeling and Inference
Normalizing flows provide a general mechanism for defining expressive probability distributions. We place special emphasis on the fundamental principles of flow design. We discuss foundational topics such as expressive power and computational trade-offs.
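The change-of-variables principle these flows rest on fits in a short sketch; here a one-dimensional affine map f(z) = a*z + b stands in for the deep invertible transforms used in practice, and the parameter values are arbitrary:

```python
# Change of variables behind normalizing flows: push a standard normal
# through an invertible map and correct the density with the Jacobian term.
import numpy as np

a, b = 2.0, 1.0                                # parameters of the invertible map

def sample(n):
    z = np.random.randn(n)                     # draw from the base distribution
    return a * z + b                           # x = f(z)

def log_prob(x):
    z = (x - b) / a                            # invert the map: z = f^{-1}(x)
    log_base = -0.5 * (z ** 2 + np.log(2 * np.pi))   # log N(z; 0, 1)
    return log_base - np.log(abs(a))           # + log |d f^{-1} / dx|
```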
Thu Mar 04 2021
Computer Vision
Perceiver: General Perception with Iterative Attention
The Perceiver is a model that builds upon Transformers and makes few architectural assumptions about the relationship between its inputs. The model leverages an asymmetric attention mechanism to distill inputs into a tight latent bottleneck, allowing it to scale to handle very large inputs.
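A minimal sketch of that asymmetric attention, assuming PyTorch and illustrative sizes: a small set of learned latents queries the full input via cross-attention, so the cost grows with the number of latents rather than quadratically with the input length.

```python
# Perceiver-style asymmetric attention sketch: learned latents attend to a
# long input, distilling it into a fixed-size bottleneck. Names/sizes are
# illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class LatentBottleneck(nn.Module):
    def __init__(self, input_dim, n_latents=64, d_latent=256, n_heads=8):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(n_latents, d_latent))
        self.input_proj = nn.Linear(input_dim, d_latent)
        self.cross_attn = nn.MultiheadAttention(d_latent, n_heads, batch_first=True)

    def forward(self, inputs):                     # inputs: (batch, n_inputs, input_dim)
        kv = self.input_proj(inputs)               # project inputs to latent width
        q = self.latents.expand(inputs.shape[0], -1, -1)
        distilled, _ = self.cross_attn(q, kv, kv)  # latents query the inputs
        return distilled                           # (batch, n_latents, d_latent)
```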