Audio2Gesture
Overview
Audio2Gesture (A2G) is a neural network trained to generate body motion derived entirely from an audio source. It offers a range of animation styles and can animate either the full body or the upper body only. Connect your character with the automatic Retargeting tool. A2G provides a high-quality, efficient way to generate body motion for characters in dialogue-heavy scenarios.
Getting Started in Audio2Gesture
When the Audio2Gesture extension loads, it presents the following pipeline options.
Option | Effect |
---|---|
A2G Offline Pipeline | Loads the “Regular” Audio Player, which uses audio wave files to generate animation clips. |
A2G Streaming Pipeline | Loads the “Streaming” Audio Player and enables a runtime workflow for text-to-speech (TTS) audio streaming. |
Base Skeleton | Loads the main skeleton that A2G manipulates to drive performances. The Base Skeleton loads by default when you build a new pipeline. It is provided here as a convenient retargeting reference in case you encounter problems setting up your character retarget. |
Audio Players
See Audio2Face documentation links below.
A2G Offline Pipeline
Target Skeleton
This field displays any valid SkelRoot found in the current stage.
Skeleton Connected
A green check mark means the selected skeleton has retargeting set up and is connected.
Skeleton Not Connected
Indicates the currently assigned skeleton is not ready for retargeting. Clicking the icon exposes the Run AutoRetarget command.
Auto Retarget
On success, a green check mark is displayed. On failure, you are prompted to open the Retargeting window. For a comprehensive look at the Retargeting tool, refer to the Animation Retargeting documentation.
Open Retargeting Window
Opens the Retargeting tool for more comprehensive character setup. For more details, refer to the Animation Retargeting documentation.
Run A2G
Runs an optimization algorithm to find the best-suited animations for the current audio source and parameters, then puts A2G in a run state, ready to receive audio. A progress bar is displayed during the process.
Note
Every time you change the Audio2Gesture parameters, you must click “Run A2G” again so the neural network can process the new parameters.
Style
Provides a variety of animation styles to suit different spoken-word scenarios.
Neutral (default)
Big Gestures
Calm Speech
Public Speech
Public Speech - casual
Public Speech - behind a table
Animation Mode
A post-processing feature that lets you choose between a full-body animation performance and an upper-body-only performance.
Animation Option
A2G presents a number of motion-type options that best suit the processed audio file and defaults to the best, or “top”, option. You can switch between the other options to explore alternative character performances.
Advanced Settings
After changing any of these settings, you must click “Run A2G” again.
Option | Effect |
---|---|
Num Epochs | A2G performs iterative optimization for each new audio track. More iterations generate better quality. |
Num Samples | On each iteration, A2G generates a number of sample animations. More samples generate better quality. |
Smoothing Time Span | Controls the duration of the smoothing applied to source animations as they are stitched together. |
Audio Sync Strength | Animation smoothing can affect audio synchronization. This option controls the balance between smoothing and accuracy. |
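To build intuition for why higher Num Epochs and Num Samples values tend to improve results while taking longer to process, here is a toy sketch. It is not the A2G optimizer: the function name `best_of_random_search` and the random scoring are hypothetical stand-ins, assuming only that each iteration generates several candidates and the best one found so far is kept.

```python
import random


def best_of_random_search(num_epochs, num_samples, seed=0):
    """Toy best-of-N search; a hypothetical stand-in, NOT the A2G algorithm.

    Returns the best score found and the number of candidates evaluated.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    best_score = float("-inf")
    evaluated = 0
    for _ in range(num_epochs):
        # Each iteration draws `num_samples` candidate scores in [0, 1).
        candidates = [rng.random() for _ in range(num_samples)]
        best_score = max(best_score, max(candidates))
        evaluated += num_samples
    return best_score, evaluated


if __name__ == "__main__":
    for epochs, samples in [(1, 4), (10, 4), (10, 16)]:
        score, cost = best_of_random_search(epochs, samples)
        print(f"epochs={epochs} samples={samples} best={score:.3f} evaluated={cost}")
```

In this toy model the best score can only stay the same or improve as either value grows, and the work grows as `num_epochs * num_samples`. The real processing cost in A2G may scale differently.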
Animation Graph Setup
Option | Effect |
---|---|
Character | Select a character from the current stage. |
Translation Var | Select a translation variable from an anim graph in the current stage. |
Rotation Var | Select a rotation variable from an anim graph in the current stage. |
Animation Recording
Destination path
Specify the folder on disk where your animation clip is written. Click the folder icon to select the folder in a browser window. Click the link button to open the folder in the file explorer.
Take Name
Specify a name for your animation clip. The output USD file is written to {destination_path}/{target_prim_name}_{take_name}.usd and contains one SkelAnimation named after the Take Name.
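The naming pattern can be sketched in Python as follows, for example to predict where a recorded clip will land before post-processing it. The helper `output_usd_path` is hypothetical and not part of the extension, and the example values are made up. Only the pattern itself comes from this page.

```python
from pathlib import PurePosixPath


def output_usd_path(destination_path, target_prim_name, take_name):
    """Build {destination_path}/{target_prim_name}_{take_name}.usd.

    Hypothetical helper that mirrors the documented naming pattern.
    """
    file_name = f"{target_prim_name}_{take_name}.usd"
    return PurePosixPath(destination_path) / file_name


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(output_usd_path("/projects/anim", "Character", "take_01"))
```

`PurePosixPath` is used so the result uses forward slashes on every platform. Swap in `pathlib.Path` if you want native separators.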
Export FPS
Sets the frame rate, in frames per second, at which the animation data is recorded. Defaults to 60 fps.
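As a rough guide to how the frame rate affects clip size, the number of recorded frames is approximately the audio duration multiplied by the export FPS. The helper below is a hypothetical sketch. Whether the exporter writes an additional end frame is not stated here, so treat the result as an estimate.

```python
def approx_frame_count(duration_seconds, fps=60.0):
    """Estimate the frame count for a clip (hypothetical helper).

    Assumes frames are sampled uniformly at `fps` for the whole duration.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return round(duration_seconds * fps)


if __name__ == "__main__":
    # A 2.5 second clip at the default 60 fps, then a 10 second clip at 30 fps.
    print(approx_frame_count(2.5))
    print(approx_frame_count(10, fps=30))
```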
Record
Clicking Record executes the “run” command and starts playing the audio, producing a clean output of the full animation that matches the duration of the audio clip.