2.3 Data Preparation and Training
The training samples consist of two datasets: (1) low-resolution
sub-images of micro-CT with a size of 64×64 and (2) high-resolution
sub-images of SEM with a size of 1024×1024. Because only a limited
number of real SEM images are available, we extract the high-resolution
sub-images with a 70% overlap between consecutive sub-images (a sliding
stride of 500 pixels) to increase the number of training samples. In
total, there are 371 high-resolution SEM samples. The
micro-CT data are sufficient in number, so we randomly extract the
low-resolution sub-images without any overlap. In total, there are
200,000 low-resolution samples of micro-CT images.
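The overlapping extraction step can be sketched with a simple sliding window; the function and the stand-in image below are illustrative, not taken from the authors' code:

```python
import numpy as np

def extract_subimages(image, size, stride):
    """Slide a size x size window over `image` with the given stride."""
    patches = []
    h, w = image.shape
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            patches.append(image[y:y + size, x:x + size])
    return np.stack(patches)

# High-resolution SEM: 1024x1024 windows with a stride of 500 pixels,
# so consecutive windows overlap along each axis.
sem = np.random.rand(2048, 2048)      # stand-in for a real SEM mosaic
hr_patches = extract_subimages(sem, 1024, 500)

# Fraction of each window shared with its neighbour along one axis:
overlap = (1024 - 500) / 1024
```

For the micro-CT data, the same routine with size 64 and a stride equal to the window size yields non-overlapping sub-images.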
The training of the networks includes two steps. We start with the
StyleGAN2-ADA training for SEM data augmentation. It aims to learn the
underlying probability distribution (or manifold) where the real SEM
sub-images lie. The StyleGAN2-ADA is trained in parallel on four
Nvidia A100 GPUs, each with 40 GB of memory. It takes approximately 24 hours
to converge. After training, we use the generator of the StyleGAN2-ADA
to simulate 200,000 high-resolution sub-images to keep the number
comparable with that of the low-resolution images. This data
augmentation helps the subsequent CycleGAN training avoid overfitting.
In the second step, we train the CycleGAN; after training, its generator
\(G_{\text{LH}}\) transfers low-resolution images of size 64×64 to the
high-resolution domain of size 1024×1024. The CycleGAN training takes
approximately 24 hours on one Nvidia A100 GPU.
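At inference, applying the trained generator to a low-resolution patch is a single forward pass. The stub below only mimics the input/output geometry of \(G_{\text{LH}}\) (64×64 in, 1024×1024 out) using nearest-neighbour upsampling; the real generator is a learned CycleGAN network, not shown:

```python
import numpy as np

def g_lh_stub(lr_patch):
    """Stand-in for the trained CycleGAN generator G_LH.

    The real G_LH is a learned network; here we only mimic its
    input/output geometry: 64x64 in, 1024x1024 out (16x per axis).
    """
    return np.kron(lr_patch, np.ones((16, 16)))  # nearest-neighbour 16x

lr = np.random.rand(64, 64)   # one low-resolution micro-CT sub-image
hr = g_lh_stub(lr)            # mapped into the high-resolution domain
```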
Results
Figures 2a and 2b show the high-resolution images extracted from the
real SEM data and those simulated by the trained StyleGAN2-ADA with
different global styles but constant noise as input, respectively. The
simulated images capture the fine details of microstructures in the SEM
data and are visually indistinguishable from the real samples. Moreover,
most pore types, i.e., inter- and intra-granular pores and
microfractures from the nano- to micron scale, are accurately recovered.
This indicates that undesirable mode collapse did not occur during
training and that the trained StyleGAN2-ADA preserves the diversity of
image generation.
To quantitatively evaluate the quality of the generated images, we compute
the porosity, specific surface area and two-point correlation of both
the real and simulated images. As shown in Figure 3, the distributions
of the synthetic and real samples are consistent. This also indicates that
the microstructural details of the SEM data are well captured by the
StyleGAN2-ADA. The correlation curves shown in Figure 3c are close to
the exponential model defined as \(R\left(h\right)=e^{-h/\lambda}\),
where \(h\) is the lag distance and \(\lambda\) is the correlation
length. The correlation length \(\lambda\) of each sample in Figure 3d
is obtained by fitting the exponential model \(R\left(h\right)\) to
the two-point correlation curve. As can be seen from the histograms of
correlation lengths, the synthetic samples, while broadly consistent
with the true samples, do tend to have slightly longer correlation
lengths, indicating that some of the generated synthetic samples may be
smoother than the true images. Figure S4 in the Supporting Information
shows the synthetic images generated with fixed styles but different
noise. The generated samples look similar in global features but differ
in local details, and they are very close in terms of the
porosity, specific surface area and two-point correlation (Figure S5 in
the Supporting Information). These contrasting behaviors under constant
versus changing global style vectors indicate that the style-based GAN
separates well the global and local styles underlying the training
images. Thanks to the disentangled representation of the latent
space, we can generate new images by interpolation in the latent style
space. As shown in Figure S6 in the Supporting Information, the generated
images are smoothly transformed from one end-member to another with
progressive interpolation between the latent space end-members.
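The evaluation metrics used above (porosity, two-point correlation, and the fitted correlation length) can be reproduced with a short numpy sketch; the synthetic binary image below stands in for a segmented SEM sub-image, and the helper names are illustrative:

```python
import numpy as np

def two_point_correlation(img, max_lag):
    """Normalized two-point correlation of a binary image along the x-axis.

    R(0) = 1 and R(h) decays toward 0 as pore occupancy decorrelates.
    """
    phi = img.mean()                               # porosity
    R = np.empty(max_lag + 1)
    for h in range(max_lag + 1):
        s2 = (img[:, :img.shape[1] - h] * img[:, h:]).mean()
        R[h] = (s2 - phi**2) / (phi - phi**2)
    return phi, R

def fit_correlation_length(R, lags):
    """Fit R(h) = exp(-h/lambda) by least squares on log R (positive R only)."""
    mask = R > 0.05                                # avoid log of noise near zero
    slope = np.polyfit(lags[mask], np.log(R[mask]), 1)[0]
    return -1.0 / slope

# Synthetic binary microstructure: row-smoothed noise thresholded at its median.
rng = np.random.default_rng(0)
field = rng.standard_normal((256, 256))
kernel = np.ones(9) / 9
field = np.apply_along_axis(lambda r: np.convolve(r, kernel, "same"), 1, field)
binary = (field > np.median(field)).astype(float)

lags = np.arange(33)
phi, R = two_point_correlation(binary, 32)
lam = fit_correlation_length(R, lags)
```

Fitting in log space is one simple choice; a nonlinear least-squares fit of \(e^{-h/\lambda}\) directly would weight small lags less heavily.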