where \(\alpha_{t}\) is the balancing factor that mitigates class imbalance, \(\gamma\) is the focusing factor that controls how strongly the model concentrates on hard negative samples, and \(p_{t}\) is the positive-class probability predicted by the model.
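A minimal sketch of a binary focal loss with these parameters is shown below; the default values of \(\alpha\) and \(\gamma\) are illustrative assumptions, not values reported here.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    alpha=0.25 and gamma=2.0 are common defaults, assumed for illustration.
    """
    t = targets.float()
    p = torch.sigmoid(logits)                 # predicted positive probability
    p_t = p * t + (1 - p) * (1 - t)           # probability of the true class
    alpha_t = alpha * t + (1 - alpha) * (1 - t)
    # BCE-with-logits equals -log(p_t), so scaling it yields the focal loss.
    bce = F.binary_cross_entropy_with_logits(logits, t, reduction="none")
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()
```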
The dataset is divided into training, validation, and test sets at a ratio of 8:1:1 for model fine-tuning. The parameters of the backbone model and the newly constructed MLP layers are then optimized by supervised learning. After the output values are mapped to the interval [0, 1] by a Sigmoid activation function, the model produces a superiority probability that is used to evaluate each reaction.
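One way to realize the 8:1:1 split, assuming a labeled reaction dataset held in a PyTorch `Dataset` object (the name `dataset` and the fixed seed are placeholders):

```python
import torch
from torch.utils.data import random_split

n = len(dataset)
n_train, n_val = int(0.8 * n), int(0.1 * n)
train_set, val_set, test_set = random_split(
    dataset,
    [n_train, n_val, n - n_train - n_val],   # 8:1:1 split
    generator=torch.Generator().manual_seed(42),  # fixed seed for reproducibility
)
```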

2.3.4 Details of model implementation

The pre-training stage (Fig. 1(c)) uses the SGD47 optimizer to update the parameters of the backbone encoder and the projection head. The initial learning rate is set to 0.01 with cosine learning rate decay and one warm-up epoch. The weight decay is set to 0.0005 and the momentum to 0.9, which improves prediction accuracy and learning efficiency. Pre-training runs for 8 epochs with a batch size of 512, providing the initial parameters of the backbone model.
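A sketch of this optimizer configuration, assuming `model` holds the backbone encoder plus projection head and a linear warm-up with start factor 0.1 (the warm-up shape is an assumption; only its one-epoch length is stated above):

```python
import torch
from torch.optim.lr_scheduler import LinearLR, CosineAnnealingLR, SequentialLR

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,            # initial learning rate
    momentum=0.9,
    weight_decay=5e-4,
)
# One warm-up epoch, then cosine decay over the remaining 7 of the 8 epochs.
warmup = LinearLR(optimizer, start_factor=0.1, total_iters=1)
cosine = CosineAnnealingLR(optimizer, T_max=7)
scheduler = SequentialLR(optimizer, schedulers=[warmup, cosine], milestones=[1])
```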
The fine-tuning stage (Fig. 1(d)) uses the Adam47 optimizer for gradient descent. The initial learning rate is set to 0.001 with cosine learning rate decay, and the weight decay is set to 0.00001. The total number of epochs is controlled by an early-stopping strategy: training terminates when validation accuracy fails to improve for 20 consecutive epochs. The batch size is 256.
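The early-stopping loop could look like the following sketch; `max_epochs`, `train_one_epoch`, and `evaluate` are hypothetical names standing in for the training and validation routines:

```python
import torch

max_epochs = 200  # upper bound; actual length is set by early stopping
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=max_epochs)

best_acc, patience, wait = 0.0, 20, 0
for epoch in range(max_epochs):
    train_one_epoch(model, train_loader, optimizer)  # hypothetical helper
    scheduler.step()
    val_acc = evaluate(model, val_loader)            # hypothetical helper
    if val_acc > best_acc:
        best_acc, wait = val_acc, 0
        torch.save(model.state_dict(), "best.pt")    # keep the best checkpoint
    else:
        wait += 1
        if wait >= patience:   # 20 consecutive epochs without improvement
            break
```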
The backbone model is constructed with a depth of 5 GINE layers and a dropout probability of 0.1. Its hidden dimension is set to 300 and its readout dimension to 512. The fully connected layers added during fine-tuning, with a hidden dimension of 256 and a dropout rate of 0.5, enable the model to predict the superiority probability of a reaction.
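A minimal sketch of this architecture using PyTorch Geometric's `GINEConv` is given below. The input feature sizes (`node_dim`, `edge_dim`), the ReLU activations, and the mean-pooling readout are assumptions; only the layer count, dimensions, and dropout rates come from the description above.

```python
import torch
from torch import nn
from torch_geometric.nn import GINEConv, global_mean_pool

class Backbone(nn.Module):
    """5-layer GINE encoder: hidden dim 300, readout dim 512, dropout 0.1."""
    def __init__(self, node_dim, edge_dim, hidden=300, readout=512):
        super().__init__()
        self.embed = nn.Linear(node_dim, hidden)
        self.convs = nn.ModuleList(
            GINEConv(nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden)),
                     edge_dim=edge_dim)
            for _ in range(5)
        )
        self.dropout = nn.Dropout(0.1)
        self.readout = nn.Linear(hidden, readout)

    def forward(self, x, edge_index, edge_attr, batch):
        h = self.embed(x)
        for conv in self.convs:
            h = self.dropout(torch.relu(conv(h, edge_index, edge_attr)))
        # Pool node embeddings to a 512-dim graph-level representation.
        return self.readout(global_mean_pool(h, batch))

# Fine-tuning head: hidden dim 256, dropout 0.5, sigmoid output probability.
head = nn.Sequential(
    nn.Linear(512, 256), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(256, 1), nn.Sigmoid(),
)
```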