A new way to create realistic 3D shapes using generative AI

Creating realistic 3D models for applications like virtual reality, filmmaking, and engineering design can be a cumbersome process requiring lots of manual trial and error.

While generative artificial intelligence models for images can streamline artistic processes by enabling creators to produce lifelike 2D images from text prompts, these models are not designed to generate 3D shapes. To bridge the gap, a recently developed technique called Score Distillation leverages 2D image generation models to create 3D shapes, but its output often ends up blurry or cartoonish.

MIT researchers explored the relationships and differences between the algorithms used to generate 2D images and 3D shapes, identifying the root cause of lower-quality 3D models. From there, they crafted a simple fix to Score Distillation, which enables the generation of sharp, high-quality 3D shapes that are closer in quality to the best model-generated 2D images.
 

A rotating robotic bee in color; as a 3D model; and silhouette.Rotating strawberryScore Distillation Sampling (SDS) in 2022 that uses a pretrained diffusion model to combine 2D images into a 3D representation.

The technique involves starting with a random 3D representation, rendering a 2D view of a desired object from a random camera angle, adding noise to that image, denoising it with a diffusion model, then optimizing the random 3D representation so it matches the denoised image. These steps are repeated until the desired 3D object is generated.

However, 3D shapes produced this way tend to look blurry or oversaturated.

“This has been a bottleneck for a while. We know the underlying model is capable of doing better, but people didn’t know why this is happening with 3D shapes,” Lukoianov says.

The MIT researchers explored the steps of SDS and identified a mismatch between a formula that forms a key part of the process and its counterpart in 2D diffusion models. The formula tells the model how to update the random representation by adding and removing noise, one step at a time, to make it look more like the desired image.

Since part of this formula involves an equation that is too complex to be solved efficiently, SDS replaces it with randomly sampled noise at each step. The MIT researchers found that this noise leads to blurry or cartoonish 3D shapes.

An approximate answer

Instead of trying to solve this cumbersome formula precisely, the researchers tested approximation techniques until they identified the best one. Rather than randomly sampling the noise term, their approximation technique infers the missing term from the current 3D shape rendering.

“By doing this, as the analysis in the paper predicts, it generates 3D shapes that look sharp and realistic,” he says.

In addition, the researchers increased the resolution of the image rendering and adjusted some model parameters to further boost 3D shape quality.

In the end, they were able to use an off-the-shelf, pretrained image diffusion model to create smooth, realistic-looking 3D shapes without the need for costly retraining. The 3D objects are similarly sharp to those produced using other methods that rely on ad hoc solutions.

“Trying to blindly experiment with different parameters, sometimes it works and sometimes it doesn’t, but you don’t know why. We know this is the equation we need to solve. Now, this allows us to think of more efficient ways to solve it,” he says.

Because their method relies on a pretrained diffusion model, it inherits the biases and shortcomings of that model, making it prone to hallucinations and other failures. Improving the underlying diffusion model would enhance their process.

In addition to studying the formula to see how they could solve it more effectively, the researchers are interested in exploring how these insights could improve image editing techniques.

Artem Lukoianov’s work is funded by the Toyota–CSAIL Joint Research Center. Vincent Sitzmann’s research is supported by the U.S. National Science Foundation, Singapore Defense Science and Technology Agency, Department of Interior/Interior Business Center, and IBM. Justin Solomon’s research is funded, in part, by the U.S. Army Research Office, National Science Foundation, the CSAIL Future of Data program, MIT–IBM Watson AI Lab, Wistron Corporation, and the Toyota–CSAIL Joint Research Center.

发布者:Dr.Durant,转转请注明出处:https://robotalks.cn/a-new-way-to-create-realistic-3d-shapes-using-generative-ai/

(0)
上一篇 4 12 月, 2024
下一篇 5 12 月, 2024

相关推荐

发表回复

您的电子邮箱地址不会被公开。 必填项已用 * 标注

联系我们

400-800-8888

在线咨询: QQ交谈

邮件:admin@example.com

工作时间:周一至周五,9:30-18:30,节假日休息

关注微信
社群的价值在于通过分享与互动,让想法产生更多想法,创新激发更多创新。