Colorization of 3D Objects Based on Geometric Generation Model

Colorization of gray images is a challenging and widely discussed research topic, with applications in photo restoration and artistic work. Many colorization approaches exist: manual colorization came first, and methods for automatic colorization appeared later, most of them based on deep learning. Previous studies have focused mainly on 2D images; the colorization of 3D objects is rarely discussed. In this paper, we introduce the colorization of 3D objects based on a geometric generation model. Unlike 2D images, 3D objects contain different information, which raises different problems in colorization. Colorization of a 2D image can be regarded as a mapping from grayscale texture to color information. This mapping can be simulated by a neural network, such as a Variational Autoencoder or a Generative Adversarial Network. For 3D objects, we can use a similar method for automatic colorization. In this paper, we introduce a generation model for the colorization of 3D objects.


Introduction
In the past, many methods for colorization have been proposed. They can be divided into three classes: scribble-based, exemplar-based, and deep learning-based methods. The first two classes require users to provide the necessary information. The last class builds a model trained on a large amount of data that can automatically convert grayscale images into chromatic images.

Scribble-Based Methods
Scribble-based methods require users to provide color scribbles as color indicators on the grayscale image. They assume that adjacent and similar pixels have similar colors, so that the color scribbles can be spread across the image. Levin et al. [1] used a least-squares optimization algorithm to spread the color scribbles. In order to reduce the number of user-input scribbles, Luan et al. [2] and Qu et al. [3] extended the similarity measure so that color propagation is carried out over texture. In order to reduce color-bleeding artifacts, Huang et al. [4] used the edge information of the grayscale image obtained by an edge detection algorithm. Moreover, Yatziv et al. [5] proposed a colorization method based on luminance-weighted chrominance blending and fast intrinsic distance computations. This class of methods requires a large amount of user input, which makes colorization inefficient.
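As a concrete illustration of least-squares scribble propagation in the spirit of [1], the sketch below spreads scribbled chrominance over a tiny grayscale image (a simplified, dense-solve version for clarity; function and parameter names are our own, and real implementations use sparse solvers):

```python
import numpy as np

def propagate_scribbles(gray, scribble_mask, scribble_vals, sigma=0.05):
    """Spread user-scribbled chrominance values over a grayscale image by
    solving a least-squares system in the spirit of Levin et al. [1].
    Dense solve for clarity; real implementations use sparse solvers."""
    H, W = gray.shape
    n = H * W
    idx = np.arange(n).reshape(H, W)
    A = np.zeros((n, n))
    b = np.zeros(n)
    for y in range(H):
        for x in range(W):
            i = idx[y, x]
            if scribble_mask[y, x]:
                A[i, i] = 1.0          # scribbled pixel: hard constraint
                b[i] = scribble_vals[y, x]
                continue
            # Neighbors with similar luminance get larger affinity, so
            # color propagates within regions but not across edges.
            nbrs = [(y + dy, x + dx)
                    for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]
                    if 0 <= y + dy < H and 0 <= x + dx < W]
            w = np.array([np.exp(-(gray[y, x] - gray[ny, nx]) ** 2
                                 / (2.0 * sigma ** 2)) for ny, nx in nbrs])
            w /= w.sum()
            A[i, i] = 1.0              # enforce u_i - sum_k w_k u_k = 0
            for (ny, nx), wk in zip(nbrs, w):
                A[i, idx[ny, nx]] = -wk
    return np.linalg.solve(A, b).reshape(H, W)

# Tiny example: a 4x4 image with a luminance edge and one scribble per side.
gray = np.array([[0.0, 0.0, 1.0, 1.0]] * 4)
mask = np.zeros((4, 4), bool)
vals = np.zeros((4, 4))
mask[0, 0], vals[0, 0] = True, 0.2     # scribble on the dark region
mask[0, 3], vals[0, 3] = True, 0.8     # scribble on the bright region
chroma = propagate_scribbles(gray, mask, vals)
```

Because the luminance edge suppresses affinity across it, the solve fills the dark half with the first scribble's value and the bright half with the second's.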

Exemplar-Based Methods
These methods colorize a grayscale image by transferring color from a reference image to the target image. Welsh et al. [6] transferred color between pixels having similar luminance values and neighborhood statistics in the reference and target images. Irony et al. [7] extracted color from the reference image and then applied a scribble-based method. Tai et al. [8] transferred color between two images by probabilistic segmentation. Charpiat et al. [9] estimated the chromatic probability of each pixel. Liu et al. [10] presented a method robust to illumination differences between target and reference images. Chia et al. [11] proposed a method to obtain suitable reference images from the internet using semantic information. This class of methods is more efficient than the previous one, but it depends on the availability of suitable reference images, which is a limitation.
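The core of the pixel-matching transfer in [6] can be sketched as follows (a simplified illustration that matches by luminance only, omitting the neighborhood statistics of [6]; names are our own):

```python
import numpy as np

def transfer_color(target_gray, ref_gray, ref_ab):
    """Transfer chrominance from a reference image in the spirit of
    Welsh et al. [6], reduced to its core: each target pixel copies the
    chrominance (ab) of the reference pixel with the closest luminance."""
    t = target_gray.ravel()[:, None]
    r = ref_gray.ravel()[None, :]
    nearest = np.abs(t - r).argmin(axis=1)          # closest luminance
    ab = ref_ab.reshape(-1, ref_ab.shape[-1])[nearest]
    return ab.reshape(target_gray.shape + (ref_ab.shape[-1],))

# Toy example: a 2-pixel reference, dark pixel -> ab (10, 10),
# bright pixel -> ab (20, 20).
ref_gray = np.array([[0.0, 1.0]])
ref_ab = np.array([[[10.0, 10.0], [20.0, 20.0]]])
out = transfer_color(np.array([[0.1, 0.9]]), ref_gray, ref_ab)
```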

Deep Learning Methods
Colorization methods based on deep learning have been proposed recently. These methods require a large number of grayscale and color image pairs as training data and can colorize automatically. Cheng et al. [12] first proposed a colorization method based on a deep neural network. They take a semantic feature descriptor as the input of the network and the chrominance value as the output, refining the output at the end. Iizuka et al. [13] presented a novel technique based on a convolutional neural network that combines global priors and local image features. Zhang et al. [14] posed colorization as a classification task and used class rebalancing at training time to increase the diversity of colors in the result. Larsson et al. [15] exploited both low-level and semantic representations and trained their model to predict histograms for every pixel; [14] and [15] both predict color from these histograms. Zhang et al. [16] and He et al. [17] combined the first two classes of methods on the basis of deep learning. Recently, the popularity of GAN [18] led researchers to colorize grayscale images with GANs. Isola et al. [19] released an image translation framework called pix2pix, which uses a conditional GAN to learn the mapping from input image to output image; notably, they combined an L1 loss with the GAN loss during training. Nazeri et al. [20] attempted to generalize colorization to high resolution using a DCGAN and made training fast and stable. Some researchers focused on diverse colorization, such as Cao et al. [21] using a GAN and Deshpande et al. [22] using a VAE. This class of methods realizes automatic colorization and makes the coloring process simple and fast.
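The combined objective used in pix2pix-style training can be sketched as follows (a minimal NumPy illustration of the loss only, not the authors' implementation):

```python
import numpy as np

def generator_loss(disc_fake, fake_ab, real_ab, lam=100.0):
    """Pix2pix-style generator objective: a non-saturating GAN term that
    pushes the discriminator's score on generated chrominance toward 1,
    plus an L1 term keeping predictions near the ground truth. lam=100
    follows [19]; this NumPy version only illustrates the objective."""
    eps = 1e-8                              # avoid log(0)
    gan = -np.mean(np.log(disc_fake + eps))
    l1 = np.mean(np.abs(fake_ab - real_ab))
    return gan + lam * l1

# With perfect reconstruction the L1 term vanishes and only the
# adversarial term remains.
real_ab = np.zeros((2, 2, 2))
loss = generator_loss(np.array([0.9, 0.8]), real_ab, real_ab)
```

The L1 term anchors the colors to the ground truth while the adversarial term encourages outputs the discriminator cannot distinguish from real images.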

Least-Squares Conformal Maps
The colorization in this paper focuses on 3D human bodies, for which Least-Squares Conformal Maps [23] are used to transform the 3D shape of the human body into an image on a circular planar domain. The Least-Squares Conformal Map (LSCM) generates a conformal map by adding constraints. Let S be a discrete 3D surface mesh and U: S→(u,v) a smooth target map. U is conformal on S if and only if the Cauchy-Riemann equation (1) holds on the whole of S.
However, the conformal condition usually cannot be strictly satisfied on the whole of S, so the conformal map is constructed by least squares. Let d be a triangle on the mesh S and suppose U is linear on d; summing the conformal energy over all triangles then yields a quadratic expression in the vertex coordinates, where M = (M_f, M_p) is a sparse m×n complex matrix whose columns correspond to free and pinned vertices. The resulting problem is the least-squares minimization in (4). We can therefore use LSCM to map a 3D surface to a 2D domain.
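The numbered equations follow the standard LSCM formulation of [23]; in the notation above they read (reconstructed from [23]):

```latex
% (1) Conformality (Cauchy--Riemann) condition for U = u + i v,
%     in local coordinates (x, y) on the surface:
\frac{\partial U}{\partial x} + i\,\frac{\partial U}{\partial y} = 0

% (2) Conformal energy of a triangle d with area A_d,
%     with U linear on d:
C(d) = \left|\frac{\partial U}{\partial x}
     + i\,\frac{\partial U}{\partial y}\right|^{2} A_d

% (3) Summing over all triangles and splitting the vertex vector into
%     free (U_f) and pinned (U_p) parts gives, with M = (M_f, M_p):
C(S) = \sum_{d \in S} C(d)
     = \bigl\lVert M_f\, U_f + M_p\, U_p \bigr\rVert^{2}

% (4) The conformal map is obtained by least-squares minimization
%     over the free vertices:
\min_{U_f}\; \bigl\lVert M_f\, U_f + M_p\, U_p \bigr\rVert^{2}
```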

Steepest Descent Algorithm
The minimization of (4) and the solution of the harmonic map over the vertices are implemented by the steepest descent algorithm [24]. When we compute a map f : M1→M2, we minimize the energy function E(f). This problem can be solved by the steepest descent algorithm, which iteratively updates f along the negative gradient of E(f).
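A minimal sketch of the steepest descent iteration, applied here to a quadratic energy of the same least-squares form as (4) (a fixed step size is assumed for simplicity; names are our own):

```python
import numpy as np

def steepest_descent(grad, x0, step=0.1, tol=1e-8, max_iter=10_000):
    """Minimize an energy E by repeatedly stepping along -grad E, as in
    the steepest descent algorithm [24]. Fixed step size for simplicity;
    practical implementations typically use a line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:     # gradient small: converged
            break
        x = x - step * g
    return x

# Example: E(x) = ||A x - b||^2, the same least-squares form as (4);
# its gradient is 2 A^T (A x - b).
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 3.0])
x_min = steepest_descent(lambda x: 2.0 * A.T @ (A @ x - b), [0.0, 0.0])
```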

Geometric Generation Model
The geometric generative model (GGM) used in the following colorization scheme is shown in Figure 1. X is the image space, Z is the feature space after dimension reduction, and Z̃ is the feature space after statistical uniformization. The geometric generative model is implemented in two steps:
• The first step maps samples from the image space X to the feature space Z using deep neural networks; in this way we can analyze the dimension of the data.
• The second step transforms the probability measure. The information contained in images obeys a certain distribution, and we would like to map this original distribution to a given target distribution. We achieve this step by geometric methods.
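As a one-dimensional illustration of the measure-transform step (an illustration only; the GGM performs the analogous transform geometrically in the latent space), the monotone map obtained by composing the empirical CDF with a target inverse CDF sends samples of an arbitrary distribution onto the target distribution:

```python
import numpy as np

def measure_transform(samples, target_icdf):
    """Map 1-D samples from an unknown source distribution onto a target
    distribution by composing the empirical CDF with the target's
    inverse CDF. In one dimension this is the monotone transport map;
    it only illustrates the measure transform that the GGM performs
    geometrically in the latent space."""
    ranks = np.argsort(np.argsort(samples))     # rank of each sample
    u = (ranks + 0.5) / len(samples)            # empirical CDF in (0, 1)
    return target_icdf(u)

rng = np.random.default_rng(0)
z = rng.normal(2.0, 3.0, size=10_000)           # source: some Gaussian
uniform = measure_transform(z, lambda u: u)     # target: uniform on [0, 1]
```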

The Colorization of 3D Human Body Based on VAE-GGM
A variational autoencoder (VAE) is used to learn the corresponding features of the color image. To make the coloring results diverse, we use the features learned by the VAE to train a mixture density network (MDN) and finally sample from the MDN to obtain a variety of coloring results. At the same time, to maintain the consistency of the coloring effect, the scheme first subsamples the higher-resolution image and then uses another reconstruction network to restore the subsampled image to the original resolution, achieving realistic coloring of 3D human shapes.
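Sampling diverse results from the MDN can be sketched as follows (a minimal Gaussian-mixture sampler; the decoding of each latent code through the VAE decoder is omitted, and the mixture parameters are hypothetical):

```python
import numpy as np

def sample_mdn(weights, means, sigmas, n_samples, rng):
    """Draw latent codes from the Gaussian mixture predicted by a
    mixture density network: pick a component by its weight, then sample
    from that Gaussian. Each code would decode (through the VAE decoder,
    omitted here) to a different plausible colorization."""
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()
    comps = rng.choice(len(weights), size=n_samples, p=weights)
    return np.array([rng.normal(means[k], sigmas[k]) for k in comps])

# Hypothetical 2-component mixture over a 1-D latent coordinate.
rng = np.random.default_rng(0)
codes = sample_mdn([0.3, 0.7], [-1.0, 1.0], [0.1, 0.1], 5, rng)
```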
The input is generally a black-and-white 3D human body shape, and the VAE can automatically colorize such shapes as shown in Figure 2, where LSCM denotes the least-squares conformal map used for 3D-to-2D dimension reduction and LSCM-1 denotes its inverse, which maps the 2D result back to the 3D surface. The steps of the scheme are as follows:
• Training step 1: the inputs of the VAE are the color image processed by the deep learning network and the black-and-white image after multimodal decomposition.
• Training step 2: the result of step 1 is sent to the geometric generation model for colorization training.
• Coloring stage: the image is processed by multimodal decomposition, then sampled, and finally colored by the trained geometric generation model. The Twindom DBA Human 3D Body Model dataset will be used to train the model on a GPU server equipped with an RTX 3090.

Conclusion
In this paper, we propose a method for coloring 3D objects. Unlike combinations of GAN and VAE, we use the GGM as the generative model. This is a new idea, and the GGM can be regarded as a kind of GAN model. Preliminary experiments with GGM have been conducted in the field of two-dimensional image generation, and we have applied GGM to the three-dimensional domain through dimensionality reduction. We will continue to explore the effectiveness of this method.