AI finally aces motion capture without all the mocap gadgets

WHY THIS MATTERS IN BRIEF

Motion capture is set to become even more important in fields such as content creation, healthcare, sports tracking, and Metaverse avatars, and this breakthrough makes it much easier.

 


Motion capture (Mocap), the process of recording people’s movements so they can be used for a variety of purposes, such as in games and movies, traditionally requires special equipment, cameras, and software. But now researchers at the Max Planck Institute and Facebook Reality Labs claim they’ve developed a machine learning algorithm — PhysCap — that works with any off-the-shelf DSLR camera running at 25 frames per second.

 


 

In a paper expected to be published soon in the journal ACM Transactions on Graphics, the team details what they say is the first real-time, physically plausible 3D motion capture system of its kind that accounts for environmental constraints like floor placement. PhysCap ostensibly achieves state-of-the-art accuracy on existing benchmarks and qualitatively improves stability at training time.

 


 

Motion capture is a core part of modern film, game, and app development, and attempts to make motion capture practical for amateur videographers have ranged from a $2,500 suit to a commercially available framework that leverages Microsoft’s depth-sensing Kinect. But these are imperfect — even the best human pose-estimating systems struggle to produce smooth animations, yielding 3D models with improper balance, inaccurate body leaning, and other artifacts of instability. PhysCap, on the other hand, reportedly captures physically and anatomically correct poses that adhere to physics constraints.

 


 

In experiments, the researchers tested PhysCap with a Sony DSC-RX0 camera and a PC with 32GB of RAM, a GeForce RTX 2070 graphics card, and an eight-core Ryzen 7 processor, which they used to capture and process six motion sequences in scenes acted out by two performers. The study co-authors found that while PhysCap generalized well across scenes with different backgrounds, it sometimes mispredicted foot contact and therefore foot velocity. Other limitations were the need for a calibrated floor plane and for a ground plane in the scene, which the researchers note is harder to find outdoors.

To address these limitations, the team plans to investigate modelling hand-scene interactions and contacts between a person’s legs and body in seated and reclining poses.
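
To see why a wrong foot-contact prediction carries over into foot velocity, consider the toy sketch below. It is not PhysCap's algorithm: the `FootSample` type, the `constrained_foot_velocity` function, and the rule of forcing velocity to zero on contact are assumptions made purely for illustration. Only the 25 frames per second figure comes from the article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FootSample:
    """One frame of a hypothetical foot track (illustrative, not from the paper)."""
    height: float     # foot height above the floor plane, in metres
    in_contact: bool  # predicted foot-floor contact label for this frame


def constrained_foot_velocity(samples: List[FootSample], fps: float = 25.0) -> List[float]:
    """Return per-frame vertical foot velocity in metres per second.

    Assumed rule: when a frame is labelled as in contact with the floor,
    the foot is treated as planted and its velocity is set to zero.
    """
    if not samples:
        return []
    dt = 1.0 / fps            # time between frames (0.04 s at 25 fps)
    velocities = [0.0]        # no previous frame to difference against
    for prev, cur in zip(samples, samples[1:]):
        raw = (cur.height - prev.height) / dt  # finite-difference estimate
        velocities.append(0.0 if cur.in_contact else raw)
    return velocities


if __name__ == "__main__":
    # The third frame is wrongly labelled as "in contact" while the foot is
    # still rising, so its velocity is forced to zero.
    track = [
        FootSample(height=0.00, in_contact=True),
        FootSample(height=0.04, in_contact=False),
        FootSample(height=0.08, in_contact=True),
    ]
    print(constrained_foot_velocity(track))
```

In this sketch a single wrong contact label wipes out the velocity of a foot that is actually moving, which is the kind of knock-on error the researchers describe.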

 


 

“Since the output of PhysCap is environment-aware and the returned root position is global, it is directly suitable for virtual character animation, without any further post-processing,” the researchers wrote. “Here, applications in character animation, virtual and augmented reality, telepresence, or human-computer interaction are only a few examples of high importance for graphics.”
