Some recent research published by Xuan Luo and colleagues has introduced notable improvements to how depth is estimated and depicted in video captured for augmented reality.

As summarized nicely in a recent Two Minute Papers video on YouTube, for AR objects to be integrated convincingly and naturally into a given scene (say, one captured on a mobile phone), a decent depth map is needed. This is essentially a per-pixel record of how far each point in the scene sits from the camera, and it looks a bit like one of those infrared heat maps.
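
For a rough sense of what that means in data terms, here's a toy sketch (it has nothing to do with the paper's actual pipeline): a depth map is just a 2D array of distances, and rendering it with a colour scale gives the heat-map look.

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy depth map for illustration only: a floor receding from the camera,
# plus a nearer box-shaped object in the middle of the frame.
h, w = 240, 320
depth = np.tile(np.linspace(2.0, 10.0, h)[:, None], (1, w))  # distances in metres
depth[90:170, 120:220] = 1.5                                  # a nearby object

plt.imshow(depth, cmap="inferno")
plt.colorbar(label="distance from camera (m)")
plt.title("Toy depth map")
plt.show()
```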

The thing is, most of the tech that has produced these depth maps until now tends to leave a fair amount of visual artefacts: patchy areas of blurry, unresolved detail that flicker from frame to frame. Since the depth map tells AR objects how they should behave in a scene, more artefacts like this ultimately make for more jarring animations of those objects.
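
One crude way to see that flicker numerically (purely an illustration, not anything from the paper) is to stack a clip's per-frame depth maps and look at how much each pixel's depth jumps between consecutive frames of an otherwise static scene.

```python
import numpy as np

def flicker_map(depth_frames):
    """Mean absolute frame-to-frame depth change per pixel, for a (T, H, W) stack."""
    d = np.asarray(depth_frames, dtype=float)
    return np.abs(np.diff(d, axis=0)).mean(axis=0)

# High values mark the patchy, unstable regions; a temporally consistent
# estimator should keep this near zero wherever nothing in the scene moves.
```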

Enter this team's research: consistent video depth estimation. It produces significantly higher-quality, more accurately detailed depth maps, which means some pretty notable improvements to the animation of AR objects and their on-screen interactions with real-life objects (like the snowflakes that get caught in the dude's hair).

Basically, smoother depth maps mean smoother, more natural integration of AR components into video-captured footage. With this technique, a globally, geometrically consistent depiction of depth is retained throughout an entire video. For an idea of what this all looks like, the researchers have offered some videos showcasing the effects, viewable here.
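
For the curious, here's a hedged sketch of what "geometrically consistent" can mean between two frames: lift a pixel from one frame into 3D using its depth, carry it into the other frame's camera using the relative pose, and check that its depth there agrees with that frame's own depth map. The intrinsics matrix K and relative pose (R, t) are assumed known here, and the paper's actual losses and optimisation are considerably more involved than this.

```python
import numpy as np

def reprojection_depth_error(depth_i, depth_j, K, R, t):
    """Mean absolute depth disagreement when frame i's pixels are viewed from frame j."""
    h, w = depth_i.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N homogeneous pixels

    # Back-project frame i's pixels to 3D points in frame i's camera coordinates.
    pts_i = np.linalg.inv(K) @ pix * depth_i.reshape(1, -1)

    # Move the points into frame j's camera and project them back onto its image plane.
    pts_j = R @ pts_i + t.reshape(3, 1)
    z_j = pts_j[2]
    in_front = z_j > 1e-6
    uv_j = (K @ pts_j)[:2, in_front] / z_j[in_front]
    uj = np.round(uv_j[0]).astype(int)
    vj = np.round(uv_j[1]).astype(int)

    # Keep only points that land inside frame j's image bounds.
    ok = (uj >= 0) & (uj < w) & (vj >= 0) & (vj < h)
    # A geometrically consistent pair of depth maps keeps this error small.
    return np.mean(np.abs(z_j[in_front][ok] - depth_j[vj[ok], uj[ok]]))
```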