where \(\text{sim}\left(u,\ v\right)\) denotes the cosine similarity
between vectors \(u\) and \(v\), \(l_{i,\ j}\) denotes the loss
between \(i\) and \(j\), and \(\tau\) denotes the temperature
coefficient that controls the confidence of the softmax prediction.
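To make the role of \(\tau\) concrete, the following minimal sketch computes a pairwise contrastive loss from cosine similarities. It assumes the commonly used NT-Xent form, \(l_{i,\ j} = -\log\left(\exp(\text{sim}(z_i, z_j)/\tau) / \sum_{k \neq i} \exp(\text{sim}(z_i, z_k)/\tau)\right)\), which may differ in detail from the loss used here. The function names `cosine_sim` and `pair_loss` and the toy embeddings are hypothetical and chosen only for illustration.

```python
import math


def cosine_sim(u, v):
    """Cosine similarity sim(u, v) between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_loss(z, i, j, tau):
    """Loss l_{i,j} for the positive pair (i, j), assuming the NT-Xent form.

    z is a list of embedding vectors; every k != i contributes to the
    softmax denominator (an assumption, not taken from the source).
    """
    logits = [cosine_sim(z[i], z[k]) / tau for k in range(len(z)) if k != i]
    positive = cosine_sim(z[i], z[j]) / tau
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - positive


# Toy embeddings: vectors 0 and 1 are nearly parallel (the positive pair).
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.2]]
loss_sharp = pair_loss(embeddings, 0, 1, tau=0.1)  # low temperature
loss_soft = pair_loss(embeddings, 0, 1, tau=1.0)   # high temperature
```

For this well-aligned positive pair, the lower temperature sharpens the softmax distribution, so the pair is predicted with higher confidence and `loss_sharp` is smaller than `loss_soft`.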
2.3.3 Supervised learning fine-tuning model
The Reaction Superiority Classification fine-tuning model consists of
the pre-trained backbone model followed by an MLP with a Sigmoid
activation function, which generates the reaction superiority probability.
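The structure of the classification head can be sketched as follows. This is only an illustration under stated assumptions: the backbone output is replaced by a fixed feature vector, and the single hidden layer, the ReLU activation, and all weights are hypothetical values that do not come from the model described here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def linear(x, weights, bias):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def mlp_head(features, w1, b1, w2, b2):
    """MLP head: hidden layer with ReLU (assumed), then a Sigmoid output."""
    hidden = [max(0.0, h) for h in linear(features, w1, b1)]
    logit = linear(hidden, w2, b2)[0]
    return sigmoid(logit)


# Stand-in for the embedding produced by the pre-trained backbone.
features = [0.5, -1.2, 0.3]
# Hypothetical weights: 3 input features -> 2 hidden units -> 1 output.
w1 = [[0.2, -0.4, 0.1], [-0.3, 0.1, 0.5]]
b1 = [0.0, 0.1]
w2 = [[0.7, -0.6]]
b2 = [0.05]

probability = mlp_head(features, w1, b1, w2, b2)
```

The Sigmoid output lies strictly between 0 and 1, so it can be read directly as the predicted reaction superiority probability and thresholded to obtain a class label.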
Because focal loss lets the model focus more on hard negative
samples, the binary focal loss\(^{46}\) is selected as the
loss function for fine-tuning to improve the classification accuracy of
the model. The binary focal loss (FL) is given in Eq. (4).