where \(\text{sim}\left(u,\ v\right)\) denotes the cosine similarity between vectors \(u\) and \(v\), \(l_{i,\ j}\) denotes the loss between samples \(i\) and \(j\), and \(\tau\) denotes the temperature coefficient, which controls the confidence of the SoftMax prediction.
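
To make the roles of \(\text{sim}\), \(l_{i,\ j}\), and \(\tau\) concrete, the following is a minimal sketch of a temperature-scaled contrastive pair loss. It assumes the widely used NT-Xent form, in which the positive pair's similarity is normalized by a SoftMax over all other samples in the batch; the exact batch construction used in this work may differ. The function names `cosine_similarity` and `pair_loss` and the toy embeddings `z` are hypothetical and chosen for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity sim(u, v) between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_loss(z, i, j, tau=0.5):
    """Loss l_{i,j} for the positive pair (i, j) among the embeddings z.

    Assumed NT-Xent form:
        l_{i,j} = -log( exp(sim(z_i, z_j) / tau)
                        / sum_{k != i} exp(sim(z_i, z_k) / tau) )
    """
    # Temperature-scaled similarities between sample i and every k != i.
    logits = {
        k: cosine_similarity(z[i], z[k]) / tau
        for k in range(len(z))
        if k != i
    }
    # Numerically stable log-sum-exp for the SoftMax denominator.
    m = max(logits.values())
    log_denom = m + math.log(sum(math.exp(s - m) for s in logits.values()))
    return log_denom - logits[j]


# Toy batch (illustrative values): samples 0 and 1 share a direction,
# samples 2 and 3 share another.
z = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

matched = pair_loss(z, 0, 1)            # similar pair
mismatched = pair_loss(z, 0, 2)         # dissimilar pair
sharper = pair_loss(z, 0, 1, tau=0.1)   # same pair, lower temperature
```

In this toy batch the similar pair receives a smaller loss than the dissimilar pair, and lowering \(\tau\) sharpens the SoftMax so that the loss of an already well-aligned pair shrinks further.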

2.3.3 Supervised learning fine-tuning model

The Reaction Superiority Classification fine-tuning model consists of the pre-trained backbone model followed by an MLP with a Sigmoid activation function, which generates the reaction superiority probability. Because focal loss allows the model to focus more on hard negative samples, the binary focal loss46 is selected as the fine-tuning loss function to improve the classification accuracy of the model. The binary focal loss (FL) is given in Eq. (4).