What kinds of videos work best?
Any video with a single person performing the movement and a static (non-moving) camera. Phone recordings, screen captures, gameplay clips, and YouTube or social media links all work. For best results, make sure the full body is visible and the lighting is decent — heavy shadows or silhouettes can reduce extraction quality.
Does the person in the video need to be wearing a mocap suit or markers?
No. Uthana extracts motion from ordinary videos. No suits, markers, or special equipment are required.
How long can my video be?
Videos can be 2 to 60 seconds long, run at 24 to 120 fps, and range from 300 to 4096 pixels in resolution.
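If you want to check a clip against these limits before uploading, here is a minimal sketch that uses ffprobe (part of FFmpeg, assumed to be installed) to read the duration, frame rate, and dimensions. The limits come from this FAQ; the function and everything else is illustrative.

```python
import json
import subprocess

# Limits from the FAQ: 2-60 s, 24-120 fps, 300-4096 px.
MIN_SEC, MAX_SEC = 2, 60
MIN_FPS, MAX_FPS = 24, 120
MIN_PX, MAX_PX = 300, 4096

def check_video(path: str) -> list[str]:
    """Return a list of constraint violations (empty means the clip looks OK)."""
    probe = json.loads(subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=width,height,avg_frame_rate:format=duration",
         "-of", "json", path],
        capture_output=True, text=True, check=True).stdout)
    stream = probe["streams"][0]
    duration = float(probe["format"]["duration"])
    num, den = stream["avg_frame_rate"].split("/")  # e.g. "30000/1001"
    fps = float(num) / float(den)
    width, height = stream["width"], stream["height"]

    problems = []
    if not MIN_SEC <= duration <= MAX_SEC:
        problems.append(f"duration {duration:.1f}s outside {MIN_SEC}-{MAX_SEC}s")
    if not MIN_FPS <= fps <= MAX_FPS:
        problems.append(f"frame rate {fps:.1f} fps outside {MIN_FPS}-{MAX_FPS} fps")
    if not all(MIN_PX <= d <= MAX_PX for d in (width, height)):
        problems.append(f"resolution {width}x{height} outside {MIN_PX}-{MAX_PX} px")
    return problems
```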
Is video-to-motion available through the API?
Yes! Video-to-motion is available via the Uthana API. You can call it programmatically, with inference speeds and download options similar to those of the web app.
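As a rough illustration of what a programmatic call can look like, here is a minimal Python sketch using the requests library. The base URL, route, field name, and auth scheme are placeholders, not the documented Uthana API; consult the official API reference for the real endpoints and parameters.

```python
import requests

API_BASE = "https://api.uthana.example/v1"  # placeholder URL, not the real host
API_KEY = "YOUR_API_KEY"

def video_to_motion(video_path: str) -> dict:
    """Upload a reference video and return the job metadata (illustrative sketch)."""
    with open(video_path, "rb") as f:
        response = requests.post(
            f"{API_BASE}/video-to-motion",                   # assumed route
            headers={"Authorization": f"Bearer {API_KEY}"},  # assumed auth scheme
            files={"video": f},                              # assumed field name
        )
    response.raise_for_status()
    return response.json()  # e.g. a job id you could poll for the finished motion
```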
Can I apply the motion to my own custom character?
Yes. Uthana's IK retargeting automatically maps extracted motion to any bipedal rig — your custom models, our built-in characters, or a character you generate yourself with Uthana. No manual retargeting required.
How accurate is the motion data compared to the original video?
Video-to-motion produces motion that closely matches the reference, and its quality is best when the video has a single character, good lighting, and a static camera. For performances that require frame-perfect fidelity, you can download the resulting motion and fine-tune the keyframes in your DCC.
Can I use a video with multiple people in it?
Currently, videos should contain one person. If your reference has multiple people in frame, the extraction may not isolate the movement you want. Trim or crop your video to feature a single performer for the best results.
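If you need to prepare such a clip, the sketch below trims and crops with ffmpeg (assumed installed) via Python's subprocess module. The timestamps and crop box are placeholder values you would adapt to your footage.

```python
import subprocess

def isolate_performer(src: str, dst: str, start: str, end: str,
                      w: int, h: int, x: int, y: int) -> None:
    """Trim to [start, end] and crop to a w x h box at (x, y) using ffmpeg."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", src,
         "-ss", start, "-to", end,         # keep only the segment you want
         "-vf", f"crop={w}:{h}:{x}:{y}",   # crop down to the single performer
         "-c:a", "copy", dst],
        check=True)

# Example placeholder values: keep seconds 2-10 of a 1920x1080 source and a
# 720x1080 region starting at x=600.
isolate_performer("raw.mp4", "solo.mp4", "2", "10", 720, 1080, 600, 0)
```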
