Methods
We assessed the change in CoE between the original and updated Cochrane
systematic reviews, which reported rating of CoE as per the Grading of
Recommendations Assessment, Development and Evaluation (GRADE) system
for critical appraisal of medical evidence.6 We used
GRADE as this has been widely recognized as the most advanced system for
operationalization of fundamental principles of EBM and critical
evaluation of medical evidence.1,7,8 GRADE was
developed in the first decade of the 21st century after a critical appraisal
of 106 systems for rating the quality of medical research evidence
showed that none of them was capable of distinguishing low from high
quality evidence.1,9,10
We focused on the assessment of systematic reviews, rather than on individual
trials, because the second important EBM principle is that assessment of
the true effects of health interventions is best accomplished by
evaluating total evidence on the topic rather than based on a study
selected to favor a particular claim.1 GRADE is also
considered a suitable method to assess certainty of evidence at the level
of systematic review/meta-analysis.8 Thus, the unit of
our analysis was a systematic review/meta-analysis (SR/MA).
Cochrane Reviews are regularly updated, providing a unique opportunity to
assess whether and when the assessment of CoE changes between the
original and updated reviews as a result of new evidence generated
between the two reviews. Since 2013, Cochrane Reviews have mandated the use
of GRADE Summary of Findings (SoF) tables11 to summarize the CoE
and magnitude of effects of the interventions that the reviews assessed. We
evaluated all Cochrane reviews published in the last 5 years in the
Cochrane Database of Systematic Reviews
(https://www.cochranelibrary.com/cdsr/about-cdsr).
We used the SoF tables from the original and
updated reviews to extract data for the primary outcome related to CoE
and to assess the magnitude and direction of effect. (In the case of
multiple primary outcomes, the data were extracted from the first one
listed in the SoF table that contained data in both the original and updated
review.) Eligible SR/MAs were divided into 5 groups; data were extracted
from each group by pairs of independent reviewers. Interrater agreement
regarding CoE was calculated for each pair using the kappa statistic. As explained, we
recorded CoE according to GRADE criteria (very low, low, moderate,
high).1,12
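As an illustration of the agreement calculation, the following is a minimal sketch of an unweighted Cohen's kappa for two reviewers rating CoE on the four GRADE levels. The text does not state whether an unweighted or weighted kappa was used, so the unweighted form is an assumption, and the function name `cohens_kappa` and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters (assumed variant; hypothetical helper)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of reviews")
    n = len(ratings_a)
    # Observed proportion of reviews on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    if expected == 1.0:
        # Both raters used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up ratings for four reviews, for illustration only.
reviewer_1 = ["low", "low", "high", "moderate"]
reviewer_2 = ["low", "moderate", "high", "moderate"]
print(round(cohens_kappa(reviewer_1, reviewer_2), 3))
```

Because the GRADE levels are ordered, a weighted kappa would also be a defensible choice; the sketch above treats all disagreements as equally serious.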
We also extracted the following data for the primary
outcome from each pair of reviews: point estimates, measures of dispersion
(e.g., 95% confidence interval), the metric used (e.g., relative risk, odds
ratio, hazard ratio, standardized mean difference), the number of
trials per meta-analysis, the number of participants, the type of comparator
(active vs placebo/no treatment), the type of treatment (pharmaceutical vs
non-pharmaceutical), whether the authorship of the original and updated
reviews changed (to capture potential differences in judgment of CoE by
the review team), and the type of studies (randomized controlled trials vs
observational studies) that were meta-analyzed.
We converted all effect estimates into odds ratios (ORs). We also
aligned all effect sizes in the same direction, with OR<1
indicating a reduction of undesirable outcomes (i.e., a more beneficial
treatment). Because GRADE separates recommendations into strong vs weak
based on the CoE,13 typically endorsing strong vs weak
(conditional) recommendations based on moderate/high vs low/very low CoE,
respectively,4,14 our key analysis focused on the
differences in effect sizes between these subgroups. We conducted
McNemar’s test for paired (before vs after) data to test the null
hypothesis that the proportion of reviews rated very low/low vs
moderate/high CoE was the same in the original and updated reviews. To test for a linear trend in
the change of CoE over all categories (from very low to high), we employed a
symmetry test together with marginal homogeneity tests (which reduce to
McNemar’s test for two categories of paired observations).
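The paired comparison of dichotomized CoE can be sketched as follows. This is a minimal illustration, not the Stata routine used in the analysis: it assumes the exact (binomial) form of McNemar's test, which the text does not specify, and the function name `mcnemar_exact` and the example ratings are hypothetical.

```python
from math import comb

# GRADE levels treated as the "moderate/high" group; the rest are "very low/low".
HIGHER = {"moderate", "high"}


def mcnemar_exact(before, after):
    """Exact two-sided McNemar test on CoE dichotomized as very low/low vs moderate/high.

    Returns (upgrades, downgrades, p_value). Only discordant pairs, i.e. reviews
    that crossed the low/moderate boundary between versions, carry information.
    """
    if len(before) != len(after):
        raise ValueError("before and after must be paired")
    pairs = list(zip(before, after))
    upgrades = sum(b not in HIGHER and a in HIGHER for b, a in pairs)
    downgrades = sum(b in HIGHER and a not in HIGHER for b, a in pairs)
    n = upgrades + downgrades
    if n == 0:
        return upgrades, downgrades, 1.0
    # Under the null hypothesis, each discordant pair is an upgrade with probability 0.5.
    k = min(upgrades, downgrades)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return upgrades, downgrades, min(1.0, 2.0 * tail)


# Made-up paired ratings (original review, updated review), for illustration only.
original = ["low"] * 6 + ["high", "high", "moderate"]
updated = ["high"] * 5 + ["low", "low", "high", "moderate"]
print(mcnemar_exact(original, updated))
```

With many discordant pairs, the chi-square approximation gives nearly the same answer; the exact form is used here only because it needs no external library.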
To assess differences in the magnitude of effect size between
original and updated evidence as a function of change in the assessment
of CoE, we calculated the ratio of odds ratios (ROR) across meta-analytic
estimates.15 The ROR compares intervention effects in
meta-analyses of trials with very low/low CoE vs those with moderate/high
CoE (or vice versa).15 Thus, if the comparison of ORs
from meta-analyses with very low/low CoE vs those with moderate/high CoE
yields ROR<1, this would mean that treatment effects were
more beneficial in meta-analyses of trials with very low/low CoE, while
ROR>1 would indicate the opposite.15,16 A test of interaction was performed to assess the hypothesis of no
difference between the subgroups (i.e., treatment effects in very
low/low vs moderate/high
CoE).17 Because the treatment effects being compared
were assumed to be correlated, we
calculated standard errors for the ROR by accounting for the correlation between the effect sizes
observed in the original vs updated reviews.17 We
obtained the values for correlation coefficients from the data. We
performed sensitivity analyses by: a) assuming a single correlation
coefficient between effect sizes in the original vs updated reviews,
and b) calculating correlation coefficients for each subgroup according
to the direction of treatment effects (i.e., we calculated separate
correlation coefficients for the subgroups showing a positive change, a negative change, and
no change in the direction of effects between the original vs updated
review, for three correlation coefficients in total). We also repeated all
analyses assuming no correlation between the effect sizes. Since the
results did not differ regardless of the assumption made,
we report the default analysis based on the calculation with
three different correlation coefficients.
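The ROR and its correlation-adjusted standard error can be illustrated with a short sketch. It assumes the standard variance formula for a difference of two correlated log ORs, Var = SE_a² + SE_b² − 2·r·SE_a·SE_b, and a normal-approximation z test for the interaction; the function names and the numerical inputs are hypothetical and are not taken from the study data.

```python
from math import erfc, exp, log, sqrt


def se_from_ci(lower, upper, z=1.959964):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (log(upper) - log(lower)) / (2.0 * z)


def ratio_of_odds_ratios(or_a, se_a, or_b, se_b, r=0.0):
    """ROR = OR_a / OR_b with a standard error that allows for correlation r.

    se_a and se_b are standard errors on the log-OR scale; r is the assumed
    correlation between the two log ORs (r = 0 gives the independent case).
    """
    log_ror = log(or_a) - log(or_b)
    variance = se_a**2 + se_b**2 - 2.0 * r * se_a * se_b
    if variance <= 0:
        raise ValueError("non-positive variance; check r and the standard errors")
    se = sqrt(variance)
    z = log_ror / se
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided p from the standard normal
    return {
        "ror": exp(log_ror),
        "se_log_ror": se,
        "ci95": (exp(log_ror - 1.959964 * se), exp(log_ror + 1.959964 * se)),
        "z": z,
        "p": p_value,
    }


# Made-up inputs: OR 0.60 in one review version vs OR 0.80 in the other.
print(ratio_of_odds_ratios(0.60, 0.20, 0.80, 0.20, r=0.5))
print(ratio_of_odds_ratios(0.60, 0.20, 0.80, 0.20, r=0.0))
```

A positive correlation shrinks the standard error of the ROR relative to the independence assumption, which is why the choice of correlation coefficient is examined in the sensitivity analyses.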
Our hypothesis was that the ROR would differ between the subgroups; in
addition, we expected that the effect size would be larger if CoE
changed from moderate/high to very low/low than the other way around.
The analyses were based on the random-effects Sidik-Jonkman model. We
assessed heterogeneity, i.e., the dispersion of effect sizes across the
meta-analytic estimates, by calculating the τ (tau) statistic.16 We used the I² statistic to
assess inconsistency; I² represents the
estimated proportion of the observed variance that reflects variance in true effect sizes
across individual meta-analyses rather than sampling
error;16 it depends both on heterogeneity and total
variation in the estimates between the analyses.1618
We complemented the assessment of heterogeneity with calculation of the
absolute deviation of treatment effects (aROR) as a function of change
in CoE.19 By definition, aROR is always ≥1 and
reflects the x-fold deviation of treatment effect from OR=1 on the OR
scale. Thus, if ROR=0.8 or ROR=1.25, the absolute deviation is equal to
aROR=1.25. aROR across all SR/MAs was expressed as the (unweighted) median
and interquartile range (IQR).19 We also evaluated
how the precision of the estimates changed by calculating the ratio of
standard errors for each subgroup, summarized as the (unweighted) median and
IQR.19 Values >1 indicate larger
standard errors (less precision) associated with a given category (e.g.,
very low/low vs moderate/high) of CoE.19
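The aROR definition and its summary can be sketched in a few lines. The sketch assumes aROR = exp(|ln ROR|), which reproduces the worked example in the text (ROR=0.8 and ROR=1.25 both give 1.25); the function names, the quartile method, and the input values are hypothetical choices.

```python
from math import exp, log
from statistics import quantiles


def absolute_ror(ror):
    """Absolute deviation of the ROR from 1 on the ratio scale: exp(|ln ROR|)."""
    return exp(abs(log(ror)))


def se_ratio(se_a, se_b):
    """Ratio of standard errors; values > 1 mean the first estimate is less precise."""
    return se_a / se_b


def median_iqr(values):
    """Unweighted median and interquartile range (inclusive quartile method)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)


# Made-up RORs, for illustration only.
rors = [1.0, 0.8, 1.5, 0.5, 0.25]
arors = [absolute_ror(r) for r in rors]
print(median_iqr(arors))
```

Quartile conventions differ between software packages, so the IQR bounds from this sketch may not match Stata's output exactly for small samples.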
A number of subgroup analyses, all defined a priori and published
in the protocol, which provides further methodological
details,20 were performed. These included assessment
of differences between patient-oriented outcomes (e.g., mortality, quality of
life) and disease-oriented outcomes (e.g., disease response,
laboratory outcomes), the effect of a change in authorship between the
original and updated reviews, the effect of the comparator intervention (active
treatment vs placebo/no treatment control), and the type of treatment
category (pharmaceutical vs non-pharmaceutical). Finally, in some cases,
the SRs included observational studies along with randomized controlled
trials (RCTs), and in others the conversion
from standardized mean differences generated implausibly large ORs. We further assessed these
results by performing sensitivity analyses that excluded SRs with
observational studies and SRs with large ORs.
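To show how converting standardized mean differences can produce very large ORs, the sketch below uses the widely used logistic approximation ln(OR) = SMD × π/√3. The text does not state which conversion formula was applied, so this choice is an assumption; the function names and the cut-off used to flag "large" ORs are likewise hypothetical.

```python
from math import exp, pi, sqrt

# Conversion factor of the logistic approximation (assumed method): pi / sqrt(3).
SMD_TO_LOG_OR = pi / sqrt(3.0)


def smd_to_or(smd, se_smd):
    """Convert an SMD and its standard error to (OR, SE of log OR)."""
    return exp(smd * SMD_TO_LOG_OR), se_smd * SMD_TO_LOG_OR


def is_implausibly_large(odds_ratio, threshold=100.0):
    """Flag ORs beyond a hypothetical cut-off in either direction."""
    return odds_ratio > threshold or odds_ratio < 1.0 / threshold


# A moderate SMD maps to a modest OR; a very large SMD maps to an extreme OR.
for smd in (0.5, 1.0, 3.0):
    odds_ratio, se_log_or = smd_to_or(smd, se_smd=0.2)
    print(smd, round(odds_ratio, 2), is_implausibly_large(odds_ratio))
```

Because the conversion is exponential in the SMD, each additional unit of SMD multiplies the OR by roughly 6, which is why a handful of large SMDs can dominate pooled results on the OR scale.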
This paper is reported per PRISMA guidelines.21 All
analyses were conducted with the Stata statistical
package, version 17.22