3D scans within reach of your hand!

In the last few years, 3D scanning technologies such as photogrammetry, LiDAR and NeRFs (an AI-based approach) have made huge leaps forward. Creating a 3D scan is cheaper and easier than it used to be, and the results keep getting better.

That being said, most techniques are still time-consuming and require some skill. Photogrammetry, for example, requires many overlapping pictures taken from different angles.

During SIGGRAPH Asia 2022, which took place last December, we discovered an innovative approach: hold an object in your hand, film it, and you’re done! The technique then extracts a 3D mesh of the object from the video. The main advantage compared to, say, photogrammetry, is that there is basically no learning curve. You don’t need a lot of hardware either: your smartphone will be perfect for this task.

The paper, titled Reconstructing Hand-Held Objects from Monocular Video, gives us some information about the technique:

  • first, the user films the object, holding it in their hand and turning it in front of the camera.
  • the algorithm then tracks the hand to recover the 3D hand pose and the camera motion relative to it. This also makes it possible to recover the object’s position and rotation. At the same time, a segmentation technique is used to create semantic maps: in a nutshell, the idea is to separate the hand from the object.
  • the algorithm then reconstructs a 3D object from this information. Since the hand and the object have been analyzed separately, the algorithm is able to extract the mesh of the object alone.
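
To make the hand-tracking step above more concrete, here is a minimal sketch in plain Python. It is not the authors’ code. It assumes a hypothetical hand tracker that returns, for each frame, a 4x4 rigid transform from the hand frame to the camera frame, and that the object does not move in the hand. Under those assumptions, inverting each transform gives the camera pose as seen from the hand, as if the camera were orbiting a static object. The names invert_rigid, rotation_y and camera_poses_in_hand_frame are made up for this illustration.

```python
import math

def invert_rigid(T):
    """Invert a 4x4 rigid transform [R | t; 0 0 0 1] given as nested lists."""
    R = [row[:3] for row in T[:3]]
    t = [row[3] for row in T[:3]]
    # The inverse of a rotation matrix is its transpose.
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    # The inverse translation is -R^T t.
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def rotation_y(angle_rad, translation):
    """Build a hand-to-camera transform: rotation about Y plus a translation."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    tx, ty, tz = translation
    return [[c, 0.0, s, tx],
            [0.0, 1.0, 0.0, ty],
            [-s, 0.0, c, tz],
            [0.0, 0.0, 0.0, 1.0]]

def camera_poses_in_hand_frame(hand_to_camera_per_frame):
    """Turn per-frame hand poses into camera poses expressed in the hand frame."""
    return [invert_rigid(T) for T in hand_to_camera_per_frame]

# Hypothetical tracker output (assumption): the hand turns 30 degrees per
# frame while staying 0.4 m in front of the camera.
hand_poses = [rotation_y(math.radians(30 * i), (0.0, 0.0, 0.4)) for i in range(4)]
camera_poses = camera_poses_in_hand_frame(hand_poses)

for i, pose in enumerate(camera_poses):
    position = [round(pose[k][3], 3) for k in range(3)]
    print(f"frame {i}: camera position in hand frame = {position}")
```

In this toy setup the camera stays 0.4 m away from the hand in every frame and simply moves around it, which is the kind of multi-view input a reconstruction method can work from.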

Furthermore, three additional modules are proposed by the research team:

  • “The pose adjustment to compensate for imprecise hand pose tracking”.
  • “The deformation field to model the relative motion between the hand and object”.
  • “The semantics-guided sampling to improve object reconstruction quality”.
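
The article does not detail how these modules work. As an illustration only, here is one plausible reading of what semantics-guided sampling could look like: when picking pixels to train on, favor the ones labeled as object in the semantic map. The label values, the weights and the sample_pixels function are assumptions made for this sketch, not the paper’s implementation.

```python
import random

# Hypothetical label values for the semantic map (assumption).
BACKGROUND, HAND, OBJECT = 0, 1, 2

def sample_pixels(semantic_map, n_samples, weights, seed=0):
    """Draw pixel coordinates, weighting each pixel by its semantic label."""
    rng = random.Random(seed)
    coords = [(r, c) for r, row in enumerate(semantic_map) for c, _ in enumerate(row)]
    pixel_weights = [weights[semantic_map[r][c]] for r, c in coords]
    return rng.choices(coords, weights=pixel_weights, k=n_samples)

# Toy 4x4 semantic map: the object covers only 4 of the 16 pixels.
semantic_map = [
    [0, 0, 0, 0],
    [0, 2, 2, 0],
    [1, 2, 2, 0],
    [1, 1, 0, 0],
]

# Arbitrary illustrative weights: object pixels are drawn far more often.
weights = {BACKGROUND: 0.1, HAND: 0.2, OBJECT: 1.0}

samples = sample_pixels(semantic_map, 1000, weights)
object_share = sum(semantic_map[r][c] == OBJECT for r, c in samples) / len(samples)
print(f"share of samples on the object: {object_share:.2f}")
```

Even though the object fills only a quarter of this toy image, most samples land on it, so the reconstruction effort would be spent where it matters most.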

The video below will give you more details about the process. A short introduction presents the project and a few existing techniques. The video then (0’54”) details the new approach, the challenges faced by the team and the overall technical process.

The video also showcases (2’55”) a few examples of what can be achieved thanks to this new approach. The results are compared to other techniques. It should be noted that, as showcased in the video, this technique also works on objects that are not textured.

Reconstructing Hand-Held Objects from Monocular Video is a paper by Di Huang, Xiaopeng Ji, Xingyi He, Jiaming Sun, Tong He, Qing Shuai, Wanli Ouyang, Xiaowei Zhou (The University of Sydney, Shanghai AI Laboratory, Image Derivative Inc., State Key Lab of CAD&CG, Zhejiang University).

Code and data will soon be made available on the project page: you’ll be able to experiment on your own with this 3D scan technique. We can also hope that this innovative approach will soon inspire tools that will allow a wide audience to experiment and 3D scan objects.

Comparison between this research project and other techniques, as well as ground truth (“GT”).

