2.3 | Estimation of model accuracy (EMA) for oligomeric targets.
The EMA category has been an integral part of every CASP experiment
starting with CASP741-48. It has attracted
the attention of many developers, with over 70 methods tested in the
previous CASP experiment48. The emphasis on the importance of this
category has led to very positive developments in protein structure
prediction, as modelers now routinely integrate quality estimates into
their modeling pipelines. In particular, the CASP14-winning AlphaFold2
method offers reliable estimates of the global and local accuracy of its
models10,11.
In CASP15, the focus of the EMA category shifted from predicting the
accuracy of models of single proteins to that of multi-molecular
complexes.
2.3.1 | Model accuracy prediction format (https://predictioncenter.org/casp15/index.cgi?page=format#QA).
For global (whole-model) accuracy
prediction (QMODE1), participants are asked to submit a fold similarity
score (SCORE, in the 0-1 range), which estimates the similarity of the
model’s overall fold to the target’s, and an interface similarity score
(QSCORE, also in the 0-1 range), which evaluates the reliability of
quaternary structure interfaces. Submitting the QSCORE is optional, and
predictors can skip it by putting an ‘X’ symbol in the corresponding
place of a QA prediction (see the link above). In QMODE2 (local
accuracy), in addition
to the QMODE1 scores, the predictors are asked to assign confidence
scores to the interface residues of the model, indicating their
likelihood of being present in the native structure’s interface.
Interface residues are identified as those in contact with at least one
residue from a different chain, where contact is defined by a Cβ-Cβ
distance not exceeding 8 Å (Cα for glycine).
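The Cβ-Cβ (Cα for glycine) contact rule above can be sketched as follows; the chain/residue data layout and helper names are illustrative assumptions, not part of the CASP format:

```python
# Sketch of the interface-residue rule: a residue is an interface residue
# if its Cβ atom (Cα for glycine) lies within 8 Å of the Cβ/Cα atom of at
# least one residue in a different chain.
import math

CUTOFF = 8.0  # Å, Cβ-Cβ distance (Cα for glycine)

def interface_residues(chains):
    """chains: {chain_id: [(res_id, xyz_of_CB_or_CA_for_Gly), ...]} (assumed layout)."""
    interface = set()
    chain_ids = list(chains)
    for i, ci in enumerate(chain_ids):
        for cj in chain_ids[i + 1:]:          # every pair of distinct chains
            for res_i, xyz_i in chains[ci]:
                for res_j, xyz_j in chains[cj]:
                    if math.dist(xyz_i, xyz_j) <= CUTOFF:
                        interface.add((ci, res_i))
                        interface.add((cj, res_j))
    return interface

# toy two-chain example: A:1 and B:1 are 5 Å apart, A:2 is far away
chains = {
    "A": [(1, (0.0, 0.0, 0.0)), (2, (20.0, 0.0, 0.0))],
    "B": [(1, (5.0, 0.0, 0.0))],
}
print(sorted(interface_residues(chains)))  # [('A', 1), ('B', 1)]
```

For real structures, the same rule would be applied to coordinates parsed from the model files; a spatial index (e.g., a cell grid) replaces the quadratic loop for large complexes.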
Examples of EMA predictions in QMODE1 and QMODE2 are provided in Example
5 on the CASP15 format page.
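As an illustration of the value constraints described above (SCORE in 0-1; QSCORE in 0-1 or ‘X’ when skipped), a minimal validation sketch is shown below. The helper name and return convention are assumptions; the authoritative record layout is the CASP15 format page linked above.

```python
# Hedged sketch of the QMODE1 range checks implied by the text: SCORE must
# lie in [0, 1]; QSCORE is either in [0, 1] or the literal 'X' when the
# predictor skips the optional interface score.
def validate_qmode1(score_str, qscore_str):
    score = float(score_str)
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"SCORE out of range: {score}")
    if qscore_str == "X":          # optional QSCORE explicitly skipped
        return score, None
    qscore = float(qscore_str)
    if not 0.0 <= qscore <= 1.0:
        raise ValueError(f"QSCORE out of range: {qscore}")
    return score, qscore

print(validate_qmode1("0.85", "0.70"))  # (0.85, 0.7)
print(validate_qmode1("0.85", "X"))     # (0.85, None)
```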
2.3.2 | Submission collecting process. EMA predictions
in CASP15 are requested for all (and only) multimeric targets. In
contrast with previous CASPs, EMA targets are released only after all
models (not only server models) have been collected for the
corresponding structure prediction target. A tarball with assembly
predictions from all CASP groups is created the day after the TS target
closes, and
a link to the tarball file is pushed to the EMA servers and posted at
the CASP15 website. All EMA groups, regardless of their type (i.e.,
‘server’ or ‘human’) have 2 days to return accuracy estimates for TS
models included in the tarball file. The predictions are checked with
the verification scripts, and successful predictions are saved for
subsequent evaluation.
2.3.3 | EMA evaluation measures. Global predictions
were compared against established evaluation metrics possessing the
desired attributes: the oligomeric Template Modeling score (TM-score)49
for overall topology (SCORE) and the contact-based, interface-centric
QS-score50 (QSCORE). To ensure a comprehensive evaluation, these metrics
were supplemented with additional measures. An oligomeric GDT-like
score, referred to as oligo-GDTTS, was employed for overall topology
analysis, together with a variant of the interface-centric DockQ
score51. Notably, DockQ
evaluates pairwise interfaces, necessitating the introduction of a
weighted average metric—termed DockQ-wave—to effectively score
higher-order complexes. Local predictions were compared against the
per-residue lDDT17 and CAD (AA-variant)52 scores, which assess the
accuracy of relative atom positions in a residue’s neighborhood,
including neighboring chains. Conceptually, both scores are
contact-based but do not penalize added contacts, which is relevant in
the case of incorrect interfaces. To address this limitation, two novel
local variants of the QS-score and DockQ have been introduced: PatchQS
and PatchDockQ. All evaluation metrics are described in detail in the
CASP15 EMA Assessment paper6.
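The DockQ-wave construction mentioned above, a weighted average of pairwise DockQ values over all interfaces of a complex, can be sketched as follows. Weighting each interface by its contact count is an illustrative assumption here; the exact definition is given in the cited assessment paper.

```python
# Sketch of a DockQ-wave-style score: combine per-interface DockQ values
# of a higher-order complex into one weighted average. The weights used
# here (interface contact counts) are an assumption for illustration.
def dockq_wave(interfaces):
    """interfaces: list of (dockq_score, n_contacts) per pairwise interface."""
    total = sum(n for _, n in interfaces)
    if total == 0:
        return 0.0  # no interfaces to score
    return sum(score * n for score, n in interfaces) / total

# three pairwise interfaces of a trimer; the large interface dominates
print(dockq_wave([(0.9, 100), (0.5, 50), (0.1, 10)]))
```

A plain (unweighted) mean would let a tiny, poorly modeled interface drag down the score of an otherwise accurate large complex; the weighting keeps each interface’s influence proportional to its size.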