Statistical analysis
We first selected a set of cytokines most closely related to the viral
immune response based on our previous analyses and published literature.
These included IFN-γ, IL-1β, IL-6, IL-8, IL-10, IL-12, MIP-1β, MCP-1, and
IP-10 (10, 11). Because both the cytokine concentrations and the
semi-quantitative SARS-CoV-2 RT-PCR values varied over time, we
standardized the values and rescaled them to the range 0 to 1 within
each week. The BAL SARS-CoV-2 RT-PCR
Ct values and the prespecified set of cytokines in BAL and plasma were
included in the clustering analyses. We used a hierarchical clustering
on principal components approach, applied separately to plasma and BAL
with complete linkage and Euclidean distances, and with the number of
clusters predefined as 2 and 3. The number of clusters was specified a
priori given the known partitioning into 2 clusters of subjects with
COVID-19 ARDS and our hypothesis on the role of semi-quantitative
SARS-CoV-2 RT-PCR in differentiating a unique cluster. Hierarchical
clustering is a common unsupervised machine learning technique that aims
to group similar subjects by measuring the distance between them. The
variables of interest (i.e., SARS-CoV-2 Ct values and cytokine
concentrations) define a multidimensional space where the distances
between subjects are measured. We used principal component analysis to
identify the variables that explained the variance between the clusters
and carried the most information in the data. This was important because
the pro-inflammatory cytokines are correlated with each other. We retained
the first 3 principal components, and the explanatory variables were
represented graphically as component loadings. The clinical and laboratory
data, immune characteristics, and outcomes of the clusters were listed.
We compared the BAL and the plasma clusters by using the Dunn index.
Finally, we juxtaposed the semi-quantitative SARS-CoV-2 RT-PCR and
cytokine concentrations per week for both plasma and BAL. Analyses were
performed in R 4.0.3 (R Core Team. R: A language and environment for
statistical computing. R Foundation for Statistical Computing, Vienna,
Austria).
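
The workflow described above (rescaling within each week, principal component analysis, complete-linkage hierarchical clustering on Euclidean distances, and the Dunn index) can be illustrated with a minimal sketch. The analyses themselves were performed in R; the Python code below is not the authors' code and only approximates the steps under stated assumptions. All function names are hypothetical, the data are synthetic random numbers rather than study measurements, "rescaled from 0 to 1 per week" is assumed to mean min-max scaling within each week, and the principal components are computed by a plain singular value decomposition.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform


def rescale_per_week(values, weeks):
    """Min-max rescale each column to [0, 1] separately within each week."""
    values = np.asarray(values, dtype=float)
    weeks = np.asarray(weeks)
    scaled = np.zeros_like(values)
    for week in np.unique(weeks):
        rows = weeks == week
        lo = values[rows].min(axis=0)
        span = values[rows].max(axis=0) - lo
        span[span == 0] = 1.0  # a column that is constant in a week maps to 0
        scaled[rows] = (values[rows] - lo) / span
    return scaled


def pca_scores(values, n_components=3):
    """Project centered data onto its leading principal components (SVD)."""
    centered = values - values.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    weights = vt[:n_components].T  # unscaled weight of each variable per component
    return centered @ weights, weights


def cluster_scores(scores, n_clusters):
    """Complete-linkage hierarchical clustering on Euclidean distances."""
    tree = linkage(scores, method="complete", metric="euclidean")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def dunn_index(scores, labels):
    """Smallest between-cluster distance over largest within-cluster distance."""
    dist = squareform(pdist(scores, metric="euclidean"))
    same = labels[:, None] == labels[None, :]
    off_diagonal = ~np.eye(len(labels), dtype=bool)
    diameter = dist[same & off_diagonal].max()
    separation = dist[~same].min()
    return separation / diameter


# Synthetic stand-in data: 40 samples over 2 weeks, 10 variables
# (one Ct value plus nine cytokines), with two artificial groups.
rng = np.random.default_rng(0)
weeks = np.repeat([1, 2], 20)
group = np.tile(np.repeat([0, 1], 10), 2)
raw = rng.normal(size=(40, 10)) + 8.0 * group[:, None]

scaled = rescale_per_week(raw, weeks)
scores, weights = pca_scores(scaled, n_components=3)
for k in (2, 3):  # the two prespecified cluster counts
    labels = cluster_scores(scores, k)
    print(k, dunn_index(scores, labels))
```

A higher Dunn index indicates clusters that are more compact and better separated, which is why it can be used to compare the BAL and plasma solutions. In this sketch the same functions would be run once on the BAL variables and once on the plasma variables.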