Data Modeling
We applied multiple linear regression to the simulated data set
generated for COVID-19 subjects to understand the impact of each
parameter on the outcome of disease severity. We chose a multiple
regression model because both the outcomes and the predictors were
numeric. We used regression models to establish a predictive transfer
function and evaluated the significance of the results. In
this model, the relationship between the independent variables
($x_1, x_2, \ldots, x_n$) and the dependent variable ($y$) can be
expressed by the equation $y = f(x_1, x_2, \ldots, x_n)$. This is the
transfer function that is
derived through analysis. The validity of the model was established
using goodness of fit and ANOVA. The statistical significance of the
model was tested by evaluating the residuals and the F ratio in one-way
ANOVA, with acceptance criteria of p < 0.05 for significance and an
adjusted R² > 90% for goodness of fit. This analysis assumed that the
parameters were mutually independent. However, when real patient
datasets are subjected to this type of analysis, there may be
multicollinearity among the parameters, which should be addressed
using dimensionality reduction methods [14, 15].
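
The fit and the two acceptance statistics described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in this study: the three predictors, their ranges, the coefficients in `true_beta`, and the noise level are all hypothetical placeholders, and the helper names `solve_linear_system` and `fit_mlr` are introduced here for illustration only. The sketch uses only the Python standard library so that it runs without dependencies. It reports the F ratio but not its p-value, which would be obtained from the F distribution with (k, n - k - 1) degrees of freedom.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(xs, y):
    """Ordinary least squares fit of y on xs.

    Returns (coefficients with intercept first, adjusted R^2, F ratio).
    """
    n, k = len(xs), len(xs[0])  # n subjects, k predictors
    design = [[1.0] + list(row) for row in xs]  # prepend intercept column
    p = k + 1
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)

    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)

    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    # F ratio: regression mean square over residual mean square.
    f_ratio = ((ss_tot - ss_res) / k) / (ss_res / (n - k - 1))
    return beta, adj_r2, f_ratio


# Hypothetical synthetic data: 45 subjects, 3 independent predictors.
rng = random.Random(0)
true_beta = [1.0, 2.0, -0.5, 0.8]  # placeholder intercept and slopes
xs = [[rng.uniform(0.0, 10.0) for _ in range(3)] for _ in range(45)]
y = [
    true_beta[0]
    + sum(b * v for b, v in zip(true_beta[1:], row))
    + rng.gauss(0.0, 0.5)
    for row in xs
]

beta, adj_r2, f_ratio = fit_mlr(xs, y)
print("coefficients:", [round(b, 3) for b in beta])
print("adjusted R^2:", round(adj_r2, 4))
print("F ratio:", round(f_ratio, 1))
print("meets adjusted R^2 > 90% criterion:", adj_r2 > 0.90)
```

Solving the normal equations directly is adequate for a few well-conditioned predictors; with strongly collinear predictors the system becomes ill-conditioned, which is one practical reason to address multicollinearity before fitting.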
Because the limited dataset of 45 subjects may not contain many of the
possible parameter combinations, we used Monte Carlo simulation as a
resampling method to achieve a better density of combinations. The
simulation resampled the transfer function over 2000 runs, and
convergence was achieved after multiple runs. The simulation was
performed to understand the impact of possible parameter combinations
on clinical outcomes. Monte Carlo simulation draws random variates from
a selected range of values to model the impact of the progression of
events leading to outcomes.
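
The resampling step can be sketched as follows, under stated assumptions. The transfer function, its coefficients, and the parameter ranges are hypothetical placeholders, and uniform sampling within each range is an assumption made here, since the sampling distribution is not specified above. The running mean of the simulated outcome is tracked as one simple way to inspect convergence over the 2000 runs.

```python
import random


def transfer_function(x):
    """Hypothetical linear transfer function; coefficients are placeholders."""
    intercept, coefs = 1.0, (2.0, -0.5, 0.8)
    return intercept + sum(c * v for c, v in zip(coefs, x))


def monte_carlo(ranges, n_runs=2000, seed=0):
    """Evaluate the transfer function on randomly sampled parameter combinations."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_runs):
        # Assumption: each parameter is drawn uniformly from its range.
        x = [rng.uniform(lo, hi) for lo, hi in ranges]
        outcomes.append(transfer_function(x))
    return outcomes


def running_means(values):
    """Mean of the first i values, for i = 1..len(values)."""
    total, means = 0.0, []
    for i, v in enumerate(values, start=1):
        total += v
        means.append(total / i)
    return means


ranges = [(0.0, 10.0)] * 3  # placeholder range for each of 3 parameters
outcomes = monte_carlo(ranges)
means = running_means(outcomes)

for n in (100, 500, 1000, 2000):
    print(f"mean outcome after {n} runs: {means[n - 1]:.3f}")
```

A running mean that stops changing appreciably as runs accumulate is only one informal convergence check; the spread or selected percentiles of the simulated outcomes could be tracked in the same way.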