2.3 Data Preparation and Training
The training samples consist of two datasets: (1) low-resolution micro-CT sub-images of size 64×64 and (2) high-resolution SEM sub-images of size 1024×1024. Because only a limited number of real SEM images are available, we extract the high-resolution sub-images with a 70% overlap between consecutive sub-images (a sliding stride of 500 pixels) to increase the number of training samples, yielding 371 high-resolution SEM samples in total. The micro-CT data are abundant, so we randomly extract the low-resolution sub-images without overlap, yielding 200,000 low-resolution micro-CT samples in total.
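The two extraction strategies above can be sketched as follows; the function name, array shapes, and the mock input arrays are illustrative, not part of the original pipeline:

```python
import numpy as np

def extract_patches(image, patch_size, stride=None, n_random=None, seed=None):
    """Extract square sub-images from a 2-D array.

    If `stride` is given, slide a window with that step (overlapping
    extraction, as used for the scarce SEM data); if `n_random` is given,
    crop patches at random positions (as used for the abundant micro-CT data).
    """
    h, w = image.shape
    patches = []
    if stride is not None:
        for top in range(0, h - patch_size + 1, stride):
            for left in range(0, w - patch_size + 1, stride):
                patches.append(image[top:top + patch_size,
                                     left:left + patch_size])
    elif n_random is not None:
        rng = np.random.default_rng(seed)
        for _ in range(n_random):
            top = rng.integers(0, h - patch_size + 1)
            left = rng.integers(0, w - patch_size + 1)
            patches.append(image[top:top + patch_size,
                                 left:left + patch_size])
    return np.stack(patches)

# 1024x1024 windows with a 500-pixel stride on a mock 2048x2048 SEM mosaic
hr = extract_patches(np.zeros((2048, 2048)), 1024, stride=500)
# random 64x64 crops from a mock 4096x4096 micro-CT slice
lr = extract_patches(np.zeros((4096, 4096)), 64, n_random=100, seed=0)
```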
The networks are trained in two steps. We start with StyleGAN2-ADA training for SEM data augmentation, which aims to learn the underlying probability distribution (or manifold) on which the real SEM sub-images lie. The StyleGAN2-ADA is trained in parallel on four Nvidia A100 GPUs, each with 40 GB of memory, and takes approximately 24 hours to converge. After training, we use the StyleGAN2-ADA generator to simulate 200,000 high-resolution sub-images, keeping their number comparable with that of the low-resolution images. This data augmentation helps the subsequent CycleGAN training avoid overfitting. The CycleGAN is then trained on the two unpaired datasets, which takes approximately 24 hours on one Nvidia A100 GPU. After training, we can use its generator \(G_{\text{LH}}\) to transfer low-resolution images of size 64×64 to the high-resolution domain with a size of 1024×1024.
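The two-step inference pipeline can be illustrated with lightweight stand-ins for the trained networks; only the latent dimension and image sizes come from the text, while the toy generator bodies below are placeholders for the real StyleGAN2-ADA generator and CycleGAN generator \(G_{\text{LH}}\):

```python
import numpy as np

def stylegan_generator(z):
    """Stand-in for the trained StyleGAN2-ADA generator:
    maps a 512-d latent vector to a 1024x1024 high-resolution image."""
    return np.tanh(np.outer(z[:32], z[:32]).repeat(32, axis=0).repeat(32, axis=1))

def g_lh(lr_image):
    """Stand-in for the CycleGAN generator G_LH:
    maps a 64x64 low-resolution image to the 1024x1024 domain."""
    return np.kron(lr_image, np.ones((16, 16)))

rng = np.random.default_rng(0)
# Step 1: augment the SEM set by sampling the StyleGAN2-ADA latent space.
augmented = [stylegan_generator(rng.standard_normal(512)) for _ in range(4)]
# Step 2: after CycleGAN training, translate a micro-CT patch to the HR domain.
hr_image = g_lh(rng.standard_normal((64, 64)))
```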
Results
Figures 2a and 2b show the high-resolution images extracted from the real SEM data and those simulated by the trained StyleGAN2-ADA with different global styles but constant noise as input, respectively. The simulated images capture the fine details of microstructures in the SEM data and are visually indistinguishable from the real samples. Moreover, most pore types, i.e., inter- and intra-granular pores and microfractures from the nanometer to the micrometer scale, are accurately recovered. This indicates that undesirable mode collapse does not occur during training and that the trained StyleGAN2-ADA preserves the diversity of image generation. To quantitatively evaluate the quality of the generated images, we compute the porosity, specific surface area, and two-point correlation of both the real and simulated images. As shown in Figure 3, the distributions of the synthetic and real samples are consistent, which again indicates that the microstructural details of the SEM data are well captured by the StyleGAN2-ADA. The correlation curves shown in Figure 3c are close to the exponential model defined as \(R\left(h\right)=e^{-h/\lambda}\), where \(h\) is the lag distance and \(\lambda\) is the correlation length. The correlation length \(\lambda\) of each sample in Figure 3d is obtained by fitting the exponential model \(R\left(h\right)\) to the two-point correlation curve. As can be seen from the histograms of correlation lengths, the synthetic samples, while broadly consistent with the true samples, tend to have slightly longer correlation lengths, indicating that some of the generated synthetic samples may be smoother than the true images. Figure S4 in the Supporting Information shows the synthetic images generated with fixed styles but different noise. The generated samples look similar in global features but differ in local features, and they are very close in terms of porosity, specific surface area, and two-point correlation (Figure S5 in the Supporting Information).
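A minimal sketch of the two-point-correlation metric and the exponential fit used above is given below. For simplicity it estimates \(R(h)\) along the x-axis only and fits \(\lambda\) by least squares on \(\log R(h)\); the actual evaluation may differ in these details:

```python
import numpy as np

def two_point_correlation(binary, max_lag):
    """Normalized two-point correlation R(h) of a binary (pore/grain) image,
    estimated along the x-axis, so that R(0) = 1 and R(h) -> 0 at large h."""
    phi = binary.mean()  # porosity
    s2 = np.array([(binary[:, :binary.shape[1] - h] * binary[:, h:]).mean()
                   for h in range(max_lag + 1)])
    return (s2 - phi ** 2) / (phi - phi ** 2)

def correlation_length(r, lags):
    """Fit R(h) = exp(-h / lambda) via least squares on log R (positive part)."""
    mask = r > 1e-6
    slope = np.polyfit(lags[mask], np.log(r[mask]), 1)[0]
    return -1.0 / slope

# Sanity check on a random binary image and a synthetic exponential curve
img = np.random.default_rng(0).random((64, 64)) > 0.5
r_img = two_point_correlation(img, 20)
lags = np.arange(50)
lam = correlation_length(np.exp(-lags / 10.0), lags)  # recovers lambda = 10
```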
The different behaviors observed with constant versus changing global style vectors indicate that the style-based GAN effectively separates the global and local styles underlying the training images. Thanks to this disentangled representation of the latent space, we can generate new images by interpolation in the latent style space. As shown in Figure S6 in the Supporting Information, the generated images are smoothly transformed from one end-member to another with progressive interpolation between the latent-space end-members.
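The latent-space interpolation can be sketched as a simple linear walk between two latent end-members; the function name and the 512-d latent size are illustrative assumptions, and each interpolated code would be rendered by the trained generator:

```python
import numpy as np

def interpolate_latents(z0, z1, n_steps):
    """Linear interpolation between two latent end-members z0 and z1,
    returning n_steps latent codes from z0 (inclusive) to z1 (inclusive)."""
    t = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - t) * z0[None, :] + t * z1[None, :]

rng = np.random.default_rng(0)
z0, z1 = rng.standard_normal(512), rng.standard_normal(512)
path = interpolate_latents(z0, z1, 8)  # 8 codes tracing z0 -> z1
# Feeding each row to the generator yields the morphing sequence of Figure S6.
```

In practice, spherical interpolation is sometimes preferred over the linear form for Gaussian latent priors, since it keeps intermediate codes at a typical norm; the linear version above is the simplest illustration.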