Comparison and analysis of various deep learning models for super-resolution reconstruction of turbulent flows

Single image super-resolution (SR) has become a promising research topic, with many deep learning-based models invented to reconstruct high-fidelity high-resolution (HR) images from low-resolution (LR) images. Motivated by the large amount of turbulent flow field data collected by experimental measurements and numerical simulation, researchers have begun investigating the application of these data-driven deep learning models to SR reconstruction of LR flow field data. Due to the limitations of experimental equipment and computing power, researchers can sometimes only obtain LR data. However, deep learning models can quickly reconstruct HR spatial-temporal turbulent data from LR data so that researchers can easily conduct further qualitative and quantitative analyses. This article reviews the development of flow field data SR reconstruction models and the problems encountered, from the two aspects of network structure and loss function definition. Finally, we propose applying the conditional generative adversarial network (cGAN) to turbulent flow SR reconstruction as a research direction, since few studies have been conducted in this field.


Introduction
The study of turbulent flows can be applied to many engineering fields, and its importance is self-evident. Research on turbulence has never stopped, and the methods can be mainly divided into three categories: (a) experimental measurements such as Particle Image Velocimetry (PIV); (b) theoretical analysis applied to simplify complex flow phenomena; (c) numerical simulations, including Direct Numerical Simulation (DNS), Large Eddy Simulation (LES), and Reynolds-Averaged Navier-Stokes (RANS) models. However, laboratory experiments are costly, as the PIV method requires a high camera resolution, and the specifically designed configuration lacks reconfigurability. Simulations are usually too slow for real-time control [1]. For complex geometries and high Reynolds number flows, the large number of grid points required is a big challenge for current computational resources.
So far, a large amount of experimental measurement data and numerical simulation data have been accumulated for turbulent flow problems. This provides the opportunity for the application and development of deep learning models in this field, which can extract informative features from data and conduct self-learning. Brunton et al. [2] divide the current applications of machine learning in fluid mechanics into "flow modeling" and "flow optimization and control." In flow modeling, flow field visualization or data visualization has been investigated intensively in recent years. Typically, the data come from numerical simulations, which can be analyzed from a visualization perspective to be easily understood. To improve visualization quality, SR reconstruction of turbulent flows from LR data attracts extraordinary attention. Both supervised and unsupervised deep learning algorithms have been applied to achieve this goal.
Convolutional neural networks (CNNs) and generative adversarial networks (GANs) are commonly used in supervised learning. Deep-learning algorithms perform better than traditional interpolation-based methods such as bicubic interpolation. Since deep learning models are data-driven, they learn to infer HR images from LR ones, and they can even help to remove noise and predict turbulent flow fields based on statistical inference. For example, in PIV measurement, if there is an immersed object in the flow, the object's surface will irregularly reflect the laser sheet. The object's edges will create halation and shadow, making particle images incomplete in regions close to the object. Morimoto et al. [3] propose a supervised deep-learning model that predicts complete flow fields by taking incomplete particle images as inputs and recovering the corresponding missing velocity fields. Many researchers continue improving and inventing new deep-learning models for SR reconstruction of turbulent flows. This work presents a study and comparison of some of these deep-learning models and attempts to propose a future direction for model improvement.
The remainder of this study is organized as follows. Section 2 provides a literature review of the current applications of neural networks in fluid data SR reconstruction. Section 3 analyzes and compares different deep learning models and the problems or challenges accompanying them. Finally, section 4 gives the conclusion of this study.

Literature Review
HR visualization of turbulent flow field data is necessary for researchers to conduct qualitative and quantitative analyses. However, limited storage space, measurement equipment, and computational resources make only LR spatial-temporal fluid data readily available. Hence, the SR reconstruction of turbulent flows becomes a vitally important process, and the SR reconstruction model can be further applied to denoise LR fluid data [4].
In deep learning, CNN is a fast-developing algorithm. Dong et al. [5] first proposed an SR convolutional neural network (SRCNN) for single image SR. They constructed an end-to-end mapping from LR images as input to reconstructed SR images as output. Since the CNN model can also be applied to extract fluid dynamics features, Fukami et al. [6] leveraged SRCNN to perform SR reconstruction of a two-dimensional laminar cylinder wake and two-dimensional homogeneous turbulence. They extended their model to the hybrid Downsampled Skip-Connection/Multi-Scale (DSC/MS) model to capture the multi-scale structures in the LR fluid data. Nevertheless, the improved model's SR reconstruction still blurred as the resolution ratio increased. Moreover, the model's performance depended on how the LR data were collected, since its reconstruction for max pooled LR data was poor. Since then, many excellent deep-learning models have been developed for SR reconstruction. Liu et al. [7] classified Fukami's model [6] as a static convolutional neural network (SCNN), since it only utilized the spatial information of the LR flow field at a specific instant t to reconstruct the HR flow field. Since turbulent flows are time-space coupled, they proposed the multiple temporal paths convolutional neural network (MTPC), which takes a series of LR flow fields in a time span [t - d, t + d] as input. The extracted extra temporal information made MTPC perform better on small-scale spatial structures, but it required more time to perform SR than the static model. In addition to CNN-based models, GAN-based models have also aroused researchers' interest.
Deng et al. [8] demonstrated that the SR generative adversarial network (SRGAN) and the enhanced SRGAN (ESRGAN) work well for reconstructing HR velocity fields. Yousif et al. [9] applied a multi-scale enhanced SR generative adversarial network (MS-ESRGAN) to reconstruct HR velocity fields at different Reynolds numbers; MS-ESRGAN adds a multi-scale part before the final convolution layer in the generator. Their results showed that even if only paired training data at specific Reynolds numbers were used for training, the trained model could still reconstruct LR data at other Reynolds numbers within the training range. This model showed excellent interpolation ability and broadened SR reconstruction to Reynolds numbers not used in the training process. Most of the above models for SR reconstruction of turbulent flows were inspired by models for single image SR reconstruction of life scene pictures in color channels, which are more difficult to reconstruct due to their complexity and irregularity. Specific turbulent fields, however, comply with physical laws. Raissi et al. [10] introduced physics-informed neural networks, whereby physical laws described by general nonlinear partial differential equations can be incorporated as physical constraints into ordinary deep learning models. By applying the continuity equation and the conservation of momentum as a physics loss term, Subramaniam et al. [11] found that both a physics-informed CNN and a physics-informed GAN could enrich the LR fields and recover a large portion of the missing finer scales. Incorporating prior physics knowledge into deep learning models can also help to alleviate HR data requirements [4].
For GAN-based models, in order to make the reconstructed turbulent fields correspond to the input LR ones, a content loss, such as the traditional pixel-wise Mean Squared Error (MSE) loss or the improved VGG loss between the synthetic SR data and the corresponding ground truth data, is essential to the generator's loss function. However, Qiao et al. [12] and Gao et al. [13] observed that GAN-based models still generate some artifacts when reconstructing life scene pictures, while the conditional GAN (cGAN) model can help to alleviate the mismatch between the output and the target ground truth. Nevertheless, there is insufficient research on the performance of cGAN-based models in turbulent flow SR reconstruction.
All models mentioned above are supervised learning models, since they require humans to indicate the paired LR data and HR reference data that form training data pairs. In order to broaden the application scenarios, Kim et al. [14] applied a cycle-consistent generative adversarial network (CycleGAN) trained on unpaired turbulence data for SR reconstruction. Their work demonstrated good results when only unpaired data exist, such as SR reconstruction from LES fields in a channel flow. Furthermore, compared to supervised learning models under the condition where paired data are available, the unsupervised CycleGAN also achieved comparable performance.

Discussion
Usually, the inputs of single image SR reconstruction models are color images with three channels (RGB or YCbCr) or grayscale images with one channel. For RGB images, each color is divided into 256 grades from 0 to 255 according to its brightness. Since the color of a pixel is represented by three RGB values, the pixel matrix corresponds to three color matrices, i.e., three channels. In contrast, the channels for turbulent flow fields vary depending on the researcher's interest: they could be the velocity fields in the x, y, and z directions, the pressure field, or the vorticity field. For example, if a two-dimensional turbulent velocity field is the fluid data to be reconstructed, then both the input and output should contain two channels: the velocity field matrices for the streamwise velocity u in the x direction and the spanwise velocity v in the y direction. Lastly, color contours of velocity u and velocity v are generated, respectively.
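As a rough illustration of this channel layout (the grid size, pooling factor, and random data are illustrative assumptions, not from any dataset cited here), a two-channel velocity snapshot can be stored as a (channels, ny, nx) array and coarsened by average pooling to produce an LR training input:

```python
import numpy as np

# Hypothetical 2-D velocity snapshot: channel 0 is streamwise u, channel 1 is spanwise v.
ny, nx = 64, 64
rng = np.random.default_rng(0)
hr = rng.standard_normal((2, ny, nx))  # stand-in for an HR ground-truth field

def average_pool(field, k):
    """Create an LR field by k x k average pooling of each channel."""
    c, h, w = field.shape
    return field.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

lr = average_pool(hr, 4)  # LR input with the same two channels, 1/4 resolution
```

Average pooling is only one of the downsampling choices mentioned in this review; as noted above for the DSC/MS model, reconstruction quality can depend on whether the LR data come from average pooling, max pooling, or measurement.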

CNN-based Models

CNN performs well in detecting and extracting features from images through convolution operations.
As the network goes deeper, the convolutional layers detect and extract higher-level features. Dong et al. [5] propose a CNN for single image SR that can process three color channels simultaneously. They name it SRCNN; it contains three layers for patch extraction and representation, nonlinear mapping, and reconstruction. However, since no padding is applied, the input LR images need to be upscaled by bicubic interpolation as a pre-processing step. Moreover, the output only reconstructs the central pixels of the pre-processed input, though it utilizes information from the whole pre-processed input. Inspired by the SRCNN model, Fukami et al. [6] use a padding operation incorporating the periodic boundary condition to avoid both the pre-processing and the size mismatch between the output and ground truth data. The original structure of the CNN built to perform two-dimensional turbulent flow SR reconstruction is demonstrated in Figure 1(a). It contains three layers, the same as SRCNN. Figure 1(b) is a simplified schematic diagram of the convolution operation and nonlinear activation in one convolutional layer; for convenience of representation, the number of filters is set to 3. Based on the SRCNN model, Fukami et al. [6] further apply data compression, skipped connections, and multi-scale models to enhance performance, as in the DSC/MS model demonstrated in Figure 1(c). "Skipped connections" and "multi-scale models" are also commonly used in GANs; for example, Yousif et al. [9] applied residual-in-residual dense blocks and three parallel convolutional sub-models with different kernel sizes in the generator of their MS-ESRGAN model. J. Kim et al.
[15] demonstrate that a very deep network can utilize more contextual information in an image and model more complex functions by adding nonlinear layers. Hence, deeper networks usually perform better than shallow ones. When training a deep neural network, "skipped connections" are applied to avoid the "vanishing gradients problem" [15] and the "degradation problem" [16]. In contrast to the over-fitting problem, where the training error is low while the test error is high, degradation occurs when both the training error and the test error increase as the network depth increases. To solve this problem, He et al. [16] construct an explicit identity mapping parallel to a residual mapping; the combination of the two forms the eventual desired mapping. Without adding extra parameters, the solvers thereby overcome the difficulty of approximating identity mappings when the identity mappings are optimal. An image usually contains low-level features such as edges, and more semantically meaningful high-level features such as objects, their states, and events [17]. In image translation problems, much low-level information is shared between the input and the output [18]. Since the LR image correlates significantly with the target HR image, residual learning lets the Very Deep Super Resolution (VDSR) network converge in a much shorter time [15]. Because LR and HR fluid data also have a strong correlation, "skipped connections" are commonly used in deep CNN-based models and in the generators of GAN-based models for turbulent flow SR reconstruction.
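The periodic padding idea of Fukami et al. [6] discussed above can be sketched with NumPy's wrap-mode padding; the toy kernel and field below are illustrative assumptions, and the loop-based convolution is a didactic stand-in for a CNN layer, not an efficient implementation:

```python
import numpy as np

def periodic_pad(x, p):
    """Pad a 2-D field with periodic (wrap-around) boundaries."""
    return np.pad(x, p, mode="wrap")

def conv2d_same(x, kernel):
    """'Same'-size 2-D filtering (cross-correlation, as in CNN layers) using
    periodic padding, so the output matches the input grid without any
    bicubic pre-upscaling or cropping of border pixels."""
    k = kernel.shape[0]
    p = k // 2
    xp = periodic_pad(x, p)
    h, w = x.shape
    out = np.empty_like(x, dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * kernel)
    return out

field = np.arange(16.0).reshape(4, 4)        # toy 4x4 field
smoothed = conv2d_same(field, np.full((3, 3), 1 / 9))  # 3x3 averaging filter
```

With periodic boundaries and a uniform averaging kernel, the filtered field keeps the same size and mean as the input, which is exactly the size-matching property the text attributes to this padding choice.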
"Multi-scale models" aim to extract multi-scale features from an image by applying convolutional filters of different sizes in parallel. Du et al. [19] note that each image feature has its optimal scale, so filters of different sizes help to extract small-scale structures with short-range contextual information and large, smooth regions with long-range contextual information. Turbulent flow field data have the same characteristic. Hence, the "multi-scale" technique can also enhance the performance of SR reconstruction of turbulent flows.
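The parallel-branch idea can be sketched by running box filters of several sizes over the same field and stacking the responses; the kernel sizes and the periodic box filter below are illustrative assumptions, and learned convolutions would replace the fixed averaging kernels in a real model:

```python
import numpy as np

def box_filter_wrap(x, k):
    """k x k box filter with periodic boundaries, built from np.roll shifts."""
    acc = np.zeros_like(x, dtype=float)
    r = k // 2
    for di in range(-r, r + 1):
        for dj in range(-r, r + 1):
            acc += np.roll(np.roll(x, di, axis=0), dj, axis=1)
    return acc / (k * k)

def multiscale_features(field, kernel_sizes=(3, 5, 7)):
    """Parallel filters with different receptive fields: small kernels keep
    short-range structure, large kernels capture long-range smooth regions."""
    return np.stack([box_filter_wrap(field, k) for k in kernel_sizes])

field = np.arange(64.0).reshape(8, 8)
feats = multiscale_features(field)  # one response map per scale
```

A multi-scale network would feed such parallel responses into subsequent layers, which is the structure the DSC/MS and MS-ESRGAN generators use with learned filters.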
The input data for turbulent flow SR reconstruction can be velocity, pressure, or vorticity fields. Each channel, itself a two-dimensional matrix, can be regarded as a vector in a high-dimensional space. For a CNN-based model, the loss function in each channel is usually defined as the L2 norm, or Euclidean distance, between the reconstructed SR data I^SR and the ground truth HR data I^HR. This type of loss function is also called the Mean Squared Error (MSE) loss function:

L_MSE = (1/m) Σ_{i=1}^{m} (1/(W_i H_i)) ||I_i^SR - I_i^HR||_2^2    (1)

where m is the number of training samples in a batch, and W_i and H_i are the dimensions of the i-th pair of HR and SR data, which can be viewed as two-dimensional matrices. If there are multiple channels, the loss function can be the arithmetic mean of the MSE over all channels. The model is trained to find the optimal parameters that minimize the MSE characterized in Equation (1). Fukami et al. [6] found that the traditional bicubic interpolation method performs worse than the deep learning-based SRCNN and DSC/MS models, since bicubic interpolation over-smooths the flow field rather than reconstructing small-scale structures in the velocity and vorticity fields. The DSC/MS model outperforms the original SRCNN model under the same pixel-wise loss function. Although the SRCNN model may achieve a lower L2 error norm than bicubic interpolation, the flow field reconstructed by SRCNN is pixelized. The DSC/MS model achieves the lowest L2 error norms and reconstructs the most details, but the reconstructed fields are still not comparable to the ground truth HR flow fields. Moreover, the reconstructed fields are still likely to be pixelized if the inputs are very low resolution max pooled data from the HR ground truth. The reason is that minimizing the L2 error only reduces the overall pixel-wise difference between the reconstructed data and the ground truth data, while the reconstructed fields are not realistic enough. SR reconstruction is an underdetermined inverse problem in which several plausible reconstructions correspond to the input LR data. Hence, solely minimizing the L2 error averages these multiple potential solutions into an overly smooth and unrealistic reconstruction [20]. The texture and structure details, in other words the high-frequency details where the pixel intensity changes drastically, are blurred by applying MSE alone. Kim et al. [14] also find that when the resolution ratio is greater than four, the SR velocity fields reconstructed by an MSE-based CNN show only slight improvement over the outputs of bicubic interpolation, which blurs the small-scale structures of the target flow fields. Hence, an adversarial loss is needed to help reconstruct detailed structures.
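The averaging effect of pixel-wise MSE can be shown on a toy example. The 1-D "edges" below are illustrative assumptions standing in for high-frequency flow structures: given two equally plausible sharp reconstructions of the same LR input, the single output minimizing the summed L2 error is their pointwise mean, a smoothed edge matching neither:

```python
import numpy as np

def mse_loss(sr_batch, hr_batch):
    """Pixel-wise MSE in the spirit of Equation (1), for a batch of
    equal-size single-channel fields of shape (m, H, W)."""
    return float(np.mean((np.asarray(sr_batch) - np.asarray(hr_batch)) ** 2))

# Two equally plausible sharp reconstructions of the same LR input:
cand_a = np.array([0., 0., 1., 1.])
cand_b = np.array([0., 1., 1., 1.])

# The minimizer of ||x - a||^2 + ||x - b||^2 is the pointwise mean of a and b.
l2_optimal = 0.5 * (cand_a + cand_b)
```

This is the mechanism behind the over-smoothing described above: the L2-optimal output interpolates between plausible solutions instead of committing to one sharp structure.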

Theory of the GAN-based Models
GAN is commonly used to generate new data with the same statistics as the training data. The network contains two models: the generator G and the discriminator D. In single image SR applications, the generator G takes an input sampled from a noise distribution and outputs a target image that is as realistic as possible, without any constraints. Suppose all training data are sampled from a specific distribution P_sample, the distribution of authentic images. The discriminator D is then a binary classifier that outputs the probability that its input is sampled from the distribution of authentic images rather than from the generator output's distribution P_G. Suppose the input training data is y, and the noise vector is denoted by z. First, the parameters of the generator are fixed, and the objective function of the discriminator is:

V(D) = E_{y~P_sample}[log D(y)] + E_{z~P_z}[log(1 - D(G(z)))]    (2)

Maximizing this objective function amounts to measuring the Jensen-Shannon divergence between the probability distribution P_sample and the generator output's probability distribution P_G. Subsequently, the parameters of the discriminator are fixed, and the generator is updated as:

G* = arg min_G E_{z~P_z}[log(1 - D(G(z)))]    (3)

After sufficient training iterations, the generator learns the authentic sample probability distribution P_sample and outputs more realistic, or more perceptually satisfying, images with high-frequency details.
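Equations (2) and (3) can be evaluated numerically from batches of discriminator outputs; the batch values below are illustrative assumptions, standing in for D's sigmoid outputs on real and generated samples:

```python
import numpy as np

def discriminator_objective(d_real, d_fake):
    """Monte-Carlo estimate of Equation (2): E[log D(y)] + E[log(1 - D(G(z)))],
    from batches of discriminator outputs in (0, 1). D maximizes this."""
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

def generator_objective(d_fake):
    """Equation (3): the generator is updated to minimize E[log(1 - D(G(z)))]."""
    return float(np.mean(np.log(1.0 - d_fake)))
```

A confident discriminator (scoring real samples near 1 and fakes near 0) yields a higher value of Equation (2) than an uninformed one that outputs 0.5 everywhere, which is why training D is framed as a maximization.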
For the SR reconstruction of turbulent flows, rather than generating arbitrary new data residing on the manifold of authentic samples, it is essential that the generated, more realistic SR fluid data are also faithful to the specific target ground truth. Meanwhile, the generator still learns the characteristic distribution of authentic flow fields to make its outputs more realistic. Hence, the generator's inputs are the LR data instead of noise vectors. Suppose a batch has m samples of paired LR and HR data: the given LR data are X = {x_1, ..., x_m}, and the corresponding HR data are Y = {y_1, ..., y_m}. In the training process, the objective function of the discriminator is:

L(D) = (1/m) Σ_{i=1}^{m} [log D(y_i) + log(1 - D(G(x_i)))]    (4)

Initially, the parameters of the generator are fixed, and the parameters of the discriminator are updated by maximizing Equation (4); this is equivalent to training the discriminator to judge whether its input is the generated SR data G(x_i) or the given HR data y_i. Secondly, the discriminator's parameters are fixed, and the first term in Equation (4) can be dropped because it does not change with updates of the generator's parameters. If the generator only aimed to generate more realistic data, it would be trained to minimize Equation (4) alone. However, to ensure that the generated data are faithful to the corresponding HR data, a content loss such as the L2 loss defined by Equation (1) is added to form the generator's total loss. Hence, the generator is trained to minimize:

L(G) = L_MSE + λ (1/m) Σ_{i=1}^{m} log(1 - D(G(x_i)))    (5)

where λ is a hyper-parameter chosen by researchers.
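A minimal sketch of this combined objective follows; placing λ on the adversarial term is one reading of Equation (5), chosen here to match the SRGAN convention (λ = 10^-3) quoted in the next subsection, and the batch values are illustrative assumptions:

```python
import numpy as np

def generator_total_loss(d_fake, sr_batch, hr_batch, lam=1e-3):
    """Total generator loss in the spirit of Equation (5): pixel-wise content
    loss (Equation (1)) plus a lambda-weighted adversarial term."""
    content = float(np.mean((sr_batch - hr_batch) ** 2))
    adversarial = float(np.mean(np.log(1.0 - d_fake)))
    return content + lam * adversarial
```

With a small λ, the content loss dominates early training and anchors the output to the target HR field, while the adversarial term nudges the generator toward realistic small-scale structure.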

SRGAN and ESRGAN
Ledig et al. [20] first introduced SRGAN, whose generator loss function modifies Equation (5) in two ways. Firstly, instead of a pixel-wise loss, the content loss is switched to the VGG-based content loss, which better evaluates perceptual differences in content. It is defined as the Euclidean distance between the feature representations of a reconstructed SR image G(I^LR) and the ground truth HR image I^HR:

l_VGG = (1/(W_{i,j} H_{i,j})) Σ_{x=1}^{W_{i,j}} Σ_{y=1}^{H_{i,j}} (φ_{i,j}(I^HR)_{x,y} - φ_{i,j}(G(I^LR))_{x,y})^2    (6)

where φ_{i,j} represents the feature map obtained by the j-th convolution (after activation) before the i-th max-pooling layer within the VGG19 network [21], and W_{i,j} and H_{i,j} are the dimensions of the corresponding feature maps. Secondly, when training the generator, the adversarial term is modified to have better gradient behavior:

l_adv = (1/m) Σ_{i=1}^{m} [-log D(G(x_i))]    (7)

This term is called the adversarial loss since it is derived from the GAN model. The hyper-parameter λ is set to λ = 10^-3 [8][22]. Hence, the total loss for the generator is:

L(G) = l_VGG + λ l_adv    (8)

The parameters of the generator are updated during training to minimize this total loss. Noticeably, the generator architecture uses the parametric ReLU as its nonlinear activation function, which adaptively learns the parameters of the rectifier and improves accuracy without much computational cost. When applying SRGAN to the SR reconstruction of turbulent flows, Deng et al. [8] demonstrate that it can accurately capture and reconstruct the fine flow structures in the ground truth data rather than blurring them. Figures 2(a) and 2(b) show the model's detailed structure. They also report a mean squared error (MSE) much smaller than that of the DSC/MS model from Fukami et al. [6], which shows that the SRGAN model outperforms the CNN-based models when reconstructing fluid data.

Furthermore, Wang et al. [22] improve the SRGAN model and propose the ESRGAN model, which mainly contains four modifications. Firstly, all the generator's Batch Normalization (BN) layers are removed to avoid unpleasant artifacts. Secondly, the new Residual-in-Residual Dense Block (RRDB) replaces the original residual block in SRGAN, as shown in Figure 2(c). The RRDB structure contains several alternating convolutional and ReLU layers, and its skipped connections help to stabilize training; the deeper structure with these identity mappings enhances the model's performance. Thirdly, a relativistic discriminator is applied, which predicts the probability that an authentic sample image is relatively more realistic than a generated fake one. When training the discriminator, both the first and second terms in Equation (4) become relative to the ground truth data and the generator's outputs. Hence, both the generated and authentic data can be utilized for training the generator, which helps to output sharper edges and detailed textures [22]. Fourthly, ESRGAN uses VGG features before the activation layer to compute the VGG-based content loss defined by Equation (6), which helps to avoid sparse activation.
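The feature-space distance of Equation (6) reduces to a normalized squared Euclidean distance once the feature maps are in hand; in the sketch below the feature maps are passed in directly as arrays (in SRGAN they would come from a fixed pretrained VGG19 layer, which is not reproduced here):

```python
import numpy as np

def vgg_content_loss(feat_hr, feat_sr):
    """Equation (6): squared Euclidean distance between feature maps
    phi_{i,j}(I_HR) and phi_{i,j}(G(I_LR)), normalized by the feature-map
    dimensions W_{i,j} x H_{i,j}."""
    w, h = feat_hr.shape
    return float(np.sum((feat_hr - feat_sr) ** 2) / (w * h))
```

Because the distance is computed on learned feature maps rather than raw pixels, two fields with slightly shifted but equally sharp structures score closer together than under pixel-wise MSE, which is the perceptual advantage the text describes.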
Deng et al. also test ESRGAN's performance in reconstructing turbulent flow fields. They find that both SRGAN and ESRGAN can recover HR instantaneous velocity fields with high-frequency details. However, when they investigate the mean flow field computed from 1000 consecutive instantaneous flow fields reconstructed by the two models, they observe that SRGAN blurs the contour edges of the time-averaged flow field, while ESRGAN provides better reconstructions with sharper edges, as shown in Figure 3. Nevertheless, the MSE of ESRGAN is higher than that of SRGAN in most cases. Since MSE only evaluates the pixel-wise difference, it may even favor over-smoothed edges. This observation is consistent with Ledig et al. [20], where over-smoothed high-frequency details achieve a higher Peak Signal-to-Noise Ratio (PSNR) when reconstructing life scene images.
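PSNR's bias toward smooth outputs follows directly from its definition as a monotone function of pixel-wise MSE; a minimal sketch (the data range R = 1 is an illustrative assumption):

```python
import numpy as np

def psnr(sr, hr, data_range=1.0):
    """Peak Signal-to-Noise Ratio: PSNR = 10 * log10(R^2 / MSE). Because it
    is a decreasing function of pixel-wise MSE, it can rank an over-smoothed
    reconstruction above a sharper one with a slightly higher MSE."""
    mse = float(np.mean((sr - hr) ** 2))
    return 10.0 * np.log10(data_range ** 2 / mse)
```

This is why the ESRGAN-versus-SRGAN comparison above cannot be settled by MSE or PSNR alone: both metrics reward the very over-smoothing that the adversarial loss is meant to eliminate.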

Conditional GAN Model (cGAN)
Deng et al. [8] report that SRGAN may generate artifacts when reconstructing turbulent flow field data, and Wang et al. [22] also find that artifacts are generated when SRGAN reconstructs life scene pictures in RGB channels. This phenomenon is understandable, since GAN was originally designed to take random noise as input and transform it, creatively and artificially, into a wide variety of perceptually satisfying data. However, when GAN is applied to SR reconstruction of images or turbulent flows, two problems must be addressed.
The first problem is mode collapse, which is common in GAN-based models: the generator repeatedly produces a single type of output regardless of differences in the input. The reason is that the generator finds a certain kind of data that always scores highly under the discriminator's judgment. When the generator overfits the discriminator in this way, the SR output no longer varies with changes in the LR input.
The second problem is the ambiguous relation between the generator's input and output, which makes it hard to control the output through the input. For GAN-based SR models, the content loss mitigates both problems by forcing the output SR data to be faithful to the corresponding ground truth HR data, especially in smooth areas with low-frequency information. However, some artifacts are still introduced, and mode collapse may still cause a mismatch between the LR inputs and the HR outputs in single image SR reconstruction [12][13]. These artifacts increase the MSE loss of the GAN-based model. Ledig et al. [20] attribute the artifacts to a compromise between the adversarial loss and the content loss. Furthermore, Qiao et al. [12] explain why SRGAN achieves a lower PSNR than a model trained to minimize MSE only [20]: the generator constructs some fictitious details in the images. Hence, on top of the explicit alignment introduced by the content loss, the implicit alignment introduced by cGAN becomes a possible way to reduce the artifacts generated by the adversarial loss.
In the original cGAN model, the generator and discriminator are conditioned on auxiliary information such as class labels or data, which constrains the generator's output. Usually, the generator in the cGAN model has two inputs: the conditioning information and a noise vector. When using cGAN to output images, the conditioning information usually controls the underlying structures or objects in the image, while different noise vectors give the generated images various styles, such as different textures and backgrounds [23][24]. However, as in the discussion of the GAN-based models in section 3.2.1, using cGAN for image SR reconstruction does not need the noise vector to bring variation. The LR image is the only input to the generator, and it is also the conditioning information for the generator. The discriminator recognizes two types of errors: unrealistic images, and realistic images with unmatched conditioning information; the second type represents perceptually plausible pictures with artifacts. The discriminator's input is a real HR or reconstructed SR image concatenated with the conditioning information, which can be an LR image [13] or a real HR image [12][25]. Suppose HR images are taken as the conditioning information, and a batch has m samples of paired LR images (X) and HR images (Y) as {(x_1, y_1), ..., (x_m, y_m)}. In the training process, the objective function of the discriminator is:

L(D) = (1/m) Σ_{i=1}^{m} [log D(y_i, y_i) + log(1 - D(y_i, G(x_i)))]    (9)

The new adversarial loss function is:

l_adv = (1/m) Σ_{i=1}^{m} [-log D(y_i, G(x_i))]    (10)

This new adversarial loss helps to eliminate the artifacts brought by the traditional adversarial loss. Both Gao et al. [13] and Xiaolu et al. [25] observe that the cGAN-based model achieves the highest PSNR values among other state-of-the-art single image SR algorithms, including SRGAN. The highest PSNR values represent the lowest MSE loss. Therefore, cGAN-based models can produce outputs that are most faithful to the corresponding authentic HR image content while still full of accurate details. Introducing conditioning information may help to solve the problem faced by Deng et al. [8], namely that ESRGAN produces finer flow field structures but a higher MSE than SRGAN. Kim et al. [14] find that cGAN can generate high-quality velocity fields similar to the DNS ones even when the resolution ratio reaches sixteen. They regard cGAN as a benchmark, representing the best performance among some supervised learning models, against which to compare the performance of unsupervised learning models on SR reconstruction of turbulent flows. However, there is still a lack of research on the output improvement of cGAN-based models over various GAN-based models, and on the comparison of their computational cost, when performing flow field data SR reconstruction.
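The conditioning mechanism amounts to concatenating the condition with the sample along the channel axis before it enters the discriminator; the shapes below and the ordering (condition first, then sample, matching the D(y_i, ·) notation above) are illustrative assumptions:

```python
import numpy as np

def cgan_pair(condition, sample):
    """Build the cGAN discriminator input: conditioning information stacked
    with the (real or generated) sample along the channel axis; both are
    assumed to be (C, H, W) arrays on the same grid."""
    return np.concatenate([condition, sample], axis=0)

def cgan_adversarial_loss(d_fake_pairs):
    """Equation (10): -log D(y_i, G(x_i)), averaged over the batch of
    discriminator outputs on generated pairs."""
    return float(np.mean(-np.log(d_fake_pairs)))
```

Because the discriminator sees the condition alongside every sample, a realistic output that does not match its condition can be penalized, which is the implicit alignment discussed above.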
Based on Equation (9), a third input consisting of realistic HR data with mismatched conditioning information may be applied to further align the reconstructed SR data with the corresponding ground truth HR data. The discriminator is explicitly trained to penalize pairs of realistic data and unmatched conditioning information. Suppose other samples of HR data are taken from the training data sets as {ŷ_1, ..., ŷ_m}. The objective function of the discriminator is:

L(D) = (1/m) Σ_{i=1}^{m} log D(y_i, y_i) + (1/m) Σ_{i=1}^{m} log(1 - D(y_i, G(x_i))) + (1/m) Σ_{i=1}^{m} log(1 - D(y_i, ŷ_i))    (11)
CONF-CIAP 2023, Journal of Physics: Conference Series 2634 (2023) 012046

The training steps remain the same. After maximizing L(D) in Equation (11), the parameters of the discriminator are fixed, and the generator is trained to minimize the total loss consisting of the adversarial loss characterized by Equation (10) and the VGG-based content loss characterized by Equation (6). Reducing artifacts is vital for SR reconstruction of turbulent flows, but further exploration of cGAN and its derived models for SR reconstruction of fluid data is still needed. When cGAN is used for turbulent flow field data reconstruction, the results are likely to contain more high-frequency information and to match the HR target data better, without artifacts.
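The three-term objective of Equation (11) can be sketched directly from batches of discriminator scores; the score values are illustrative assumptions standing in for D's outputs on matched real pairs, generated pairs, and realistic-but-mismatched pairs:

```python
import numpy as np

def cgan_mismatch_objective(d_matched, d_generated, d_mismatched):
    """A sketch of Equation (11): the discriminator scores matched real pairs
    high, and scores both generated pairs and realistic HR samples paired
    with the WRONG conditioning information low, so faithfulness to the
    condition is enforced explicitly."""
    return float(
        np.mean(np.log(d_matched))
        + np.mean(np.log(1.0 - d_generated))
        + np.mean(np.log(1.0 - d_mismatched))
    )
```

A discriminator that separates the three input types scores higher under this objective than an uninformed one, which is the sense in which the mismatched pairs add an extra training signal.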

Conclusion
Deep learning models help to save enormous time, cost, and computational power when performing SR simulation or reconstruction of turbulent flow fields. These models outperform traditional interpolation-based methods, which cannot reconstruct high-frequency details. Inspired by the development of deep learning models for reconstructing life scene pictures in color channels, researchers have adapted these models to reconstruct LR turbulent flow fields. This paper first points out the unique features of flow field reconstruction. It then reviews the development of deep learning models for SR reconstruction of turbulent flows and the motivations behind these developments. We mainly compare two categories: CNN-based models and GAN-based models. Since CNN-based models only focus on reducing the pixel-wise difference between two images, some details are blurred or even pixelized when the reconstructed data are represented as contour plots. Introducing an adversarial loss in GAN-based models ensures that the reconstructed results are more realistic, meaning small-scale structures can be reconstructed; combined with a content loss, such a model can output relatively good results. However, GAN-based models may cause the reconstructed flow field data to mismatch the input flow field data. Hence, we propose a possible research direction: using cGAN-based models for data reconstruction to strengthen the dependency of the reconstructed data on the given conditioning information. Conditional GAN has already been shown to achieve reconstruction results similar to the DNS ones [14]. Nevertheless, this paper only analyzes several CNN-based and GAN-based models as general benchmarks for SR reconstruction of turbulent flows. Other models, such as the efficient sub-pixel CNN (ESPCN) and the temporally coherent GAN (TecoGAN), have also been applied to turbulent flow reconstruction [26]. Besides, physics-informed models [4][9][27] and the
unsupervised models [14] are also current research directions. Since many turbulent flow reconstruction models are based on image SR reconstruction in color channels, many general CNN-based models [28] and GAN-based models [29] used for single image SR could also be applied to turbulent flow reconstruction. Lastly, when the resolution ratio between the target SR data and the input LR data is large, cGAN can still output high-quality results similar to the DNS ones. However, the comparison with various GAN-based models and the trade-off in computational cost still require more in-depth research.

Figure 1. (a) Original structure of the CNN used to reconstruct the LR two-dimensional turbulent flow. (b) Schematic diagram of the convolution operation and nonlinear activation in one convolutional layer with three filters. (c) Schematic diagram of the DSC/MS model. [6]

Figure 2. (a) The structure of the generator network in SRGAN. (b) The structure of the discriminator network in SRGAN. (c) The construction of RRDB without BN layers in ESRGAN. [8]

Figure 3. Contour plots of the time-averaged flow field derived from different models: the streamwise velocity reconstructed by (a) SRGAN (x4), (b) ESRGAN (x4), (c) SRGAN (x8), (d) ESRGAN (x8), (e) ground truth; the spanwise velocity reconstructed by (f) SRGAN (x4), (g) ESRGAN (x4), (h) SRGAN (x8), (i) ESRGAN (x8), and (j) ground truth. [8]