2.2.2 Self-Organizing Maps
Self-organizing maps (SOMs) are a non-linear artificial neural network technique based on unsupervised training that enables multivariate analysis. SOMs preserve the original data’s topology while projecting it onto a two-dimensional map for simplification. This allows complex patterns in any set of variables used in training to be visualized, classified, grouped, and detected simultaneously (Liu et al., 2006). SOMs have been used to characterize long-term currents and to study the possible hydrodynamic conditions in specific regions. For example, Liu & Weisberg (2005, 2007) obtained the current patterns on the West Florida Shelf and established a relationship between local winds and coastal upwelling and downwelling processes, and Vilibić et al. (2016) used SOMs in a forecasting system for surface currents. More recently, Orfila et al. (2021) used SOMs to establish the patterns and seasonal dynamics of the southern CS. These examples show the versatility of the method as a robust technique for pattern recognition and feature extraction in variables where non-linearity is important, as may be the case in oceanographic processes. Further details of the SOM method are given in Liu & Weisberg (2011).
In this study, we determined the current patterns by applying SOMs to the HYCOM 25-year climatology. The method uses a neighborhood function, a unit search radius, and a linear initialization process. The training algorithm employed a group series approach, with the parameters analyzed carefully to ensure the lowest quantization and topological errors, following the practices outlined by Meza-Padilla et al. (2019) and Liu et al. (2006). Before training, each variable was spatially and temporally normalized to prevent any single component from dominating the map organization when its magnitude is disproportionately higher than that of the other components. This normalization ensured that all variables contributed equally to the SOMs, leading to a more balanced representation of the data. After training, the components were denormalized and further analyzed in terms of each variable. Determining the map size (the number of clusters) is a subjective and empirical process that depends on the level of detail desired for the analysis (Liu, Weisberg, Lenes, et al., 2016; Liu, Weisberg, Vignudelli, et al., 2016; Meza-Padilla et al., 2019; Weisberg & Liu, 2017; Zeng et al., 2015). After a series of sensitivity tests, we chose the smallest map size that did not lose essential pattern variation. The sensitivity tests were based on the quantization error (QE), which measures how much detail is being learned by the SOMs, on the topological error (TE), which measures how well the properties of the data space are preserved, and on the variation percentage of each pattern. This empirical procedure depends intrinsically on the study (Pölzlbauer, 2004). We determined the optimal number of patterns by quantifying the QE and TE for different cluster arrangements: 2x2, 2x3, 3x3, 3x4, and 4x4.
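As an illustration of the two error measures, the following minimal Python sketch computes the QE as the mean distance between each sample and its best-matching unit, and the TE as the fraction of samples whose two closest units are not neighbors on the map. It is not the toolbox implementation used in this study: the function names, the rectangular lattice with four-neighbor adjacency, and the toy codebooks are assumptions made only for the example.

```python
import math

def two_best_units(sample, codebook):
    """Return the indices of the closest and second-closest map units."""
    order = sorted(range(len(codebook)),
                   key=lambda k: math.dist(sample, codebook[k]))
    return order[0], order[1]

def som_quality(data, codebook, n_cols):
    """Return (QE, TE) for a rectangular map whose units are stored row by row.

    Assumption: two units are neighbors when they are one step apart
    horizontally or vertically on the grid (four-neighbor adjacency).
    """
    qe_sum = 0.0
    te_count = 0
    for sample in data:
        first, second = two_best_units(sample, codebook)
        # QE: distance from the sample to its best-matching unit
        qe_sum += math.dist(sample, codebook[first])
        # TE: count the sample if its two closest units are not adjacent
        r1, c1 = divmod(first, n_cols)
        r2, c2 = divmod(second, n_cols)
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            te_count += 1
    return qe_sum / len(data), te_count / len(data)

# Toy data (hypothetical): two samples in a two-variable space
data = [[0.1, 0.0], [1.0, 0.8]]

# 2x2 map whose weights follow the grid layout (topology preserved)
ordered = [[0, 0], [1, 0], [0, 1], [1, 1]]
# Same weights assigned to different grid positions (topology broken)
scrambled = [[0, 0], [1, 1], [0, 1], [1, 0]]

qe_ordered, te_ordered = som_quality(data, ordered, n_cols=2)
qe_scrambled, te_scrambled = som_quality(data, scrambled, n_cols=2)
```

In this toy case both codebooks give the same QE, because the set of weight vectors is identical, but only the scrambled map has a non-zero TE. This is why the two measures are examined together when comparing map sizes.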
The results showed that the QE decreases with the 3x3 cluster and that, although the TE increases with the number of patterns, it remains small and acceptable in terms of space preservation. As a result, the spatial characterization was done for a cluster of nine patterns (3x3 cluster). An additional recommendation when using SOMs to integrate trajectories is to maximize the number of spatial patterns so that the temporal variability given by the best-matching units (BMUs) correctly approximates the temporal variability of the data. At some point, the method will detect that further spatial patterns do not provide any additional information. In our case, the 3x3 cluster contains one pattern with 0% frequency, which means that eight spatial patterns are enough to extract the dominant patterns and that additional patterns would not improve the spatial or temporal representation of the data. We used the SOM Toolbox for MATLAB developed by Vesanto & Alhoniemi (2000) at the Laboratory of Computer and Information Science of the Helsinki University of Technology (Laboratory of Computer and Information Science, Adaptative and Informatics Research Center, 2015).
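The frequency of occurrence mentioned above can be sketched as follows: each time step is assigned to its best-matching unit (BMU), and the share of time steps per unit is reported as a percentage. This is a minimal Python illustration under stated assumptions, not the MATLAB toolbox routine. The function names and the three-unit toy codebook are hypothetical.

```python
import math
from collections import Counter

def best_matching_unit(sample, codebook):
    """Return the index of the map unit closest to the sample."""
    return min(range(len(codebook)),
               key=lambda k: math.dist(sample, codebook[k]))

def pattern_frequencies(data, codebook):
    """Return the BMU time series and the frequency (%) of every unit."""
    bmus = [best_matching_unit(sample, codebook) for sample in data]
    counts = Counter(bmus)
    # Units that never win keep a frequency of 0%
    freqs = [100.0 * counts.get(k, 0) / len(data)
             for k in range(len(codebook))]
    return bmus, freqs

# Toy example (hypothetical values): three patterns, four time steps
codebook = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
data = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.0], [1.1, 0.1]]

bmus, freqs = pattern_frequencies(data, codebook)
```

Here the third unit is never selected, so its frequency is 0%. This is the same signal that, in the 3x3 case described above, indicates that an additional pattern adds no information.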