Lei Ye - Authorea

Despite advances in hydrological Deep Learning (DL) models based on Single Task Learning (STL), such models may not fully capture the intricate relationships among multiple hydrological components and model inputs. This study used a Long Short-Term Memory (LSTM) neural network and the CAMELS dataset to develop a Multi-Task Learning (MTL) model that predicts streamflow and evapotranspiration across multiple basins. The best-performing multi-task loss weight ratio was selected manually during the validation phase for all 591 selected basins, each with streamflow data gaps under 5%. During the test period, the MTL model achieved median Nash-Sutcliffe Efficiency (NSE) values of 0.69 for streamflow and 0.92 for evapotranspiration, consistent with the two corresponding STL models. The strength of MTL appeared when predicting a non-target variable, surface soil moisture, using probes derived from the LSTM cell states, which represent the internal workings of the DL model. This prediction reached a median correlation coefficient of 0.90, surpassing the 0.88 and 0.89 achieved by the streamflow and evapotranspiration STL models, respectively. This outcome suggests that MTL models can learn additional rules aligned with hydrological processes by exploiting the inherent correlations among multiple hydrological variables, thereby enhancing their reliability. We term this “variable synergy”: MTL can simultaneously predict multiple targets with performance comparable to STL while building a more robust internal representation. Harnessing this synergy, MTL promises improved predictions for variables that are costly to observe and a step toward a more comprehensive hydrological model.
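As a rough illustration of two quantities mentioned above, the following sketch computes the Nash-Sutcliffe Efficiency and a weighted two-task loss. It is not the authors' implementation. The abstract does not specify the form of the multi-task loss, so the weighted sum of per-task mean squared errors, the function names `nse` and `weighted_mtl_loss`, and the weight parameter `w_q` are all assumptions made for illustration.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations.

    Time steps with missing observations (NaN) are skipped.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    mask = ~np.isnan(obs)
    obs, sim = obs[mask], sim[mask]
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def masked_mse(obs, sim):
    """Mean squared error over time steps with valid observations."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    mask = ~np.isnan(obs)
    return float(np.mean((obs[mask] - sim[mask]) ** 2))


def weighted_mtl_loss(q_obs, q_sim, et_obs, et_sim, w_q=0.5):
    """Hypothetical two-task loss (assumed form, not taken from the paper).

    w_q is the weight on the streamflow term; (1 - w_q) is the weight on
    the evapotranspiration term. Varying w_q corresponds to varying the
    loss weight ratio between the two tasks.
    """
    return w_q * masked_mse(q_obs, q_sim) + (1.0 - w_q) * masked_mse(et_obs, et_sim)
```

Under these assumptions, manually selecting a loss weight ratio would amount to training with several values of `w_q` and comparing the median validation NSE of both targets across basins. An NSE of 1 indicates a perfect fit, while an NSE of 0 means the model is no better than predicting the mean of the observations.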