Data Modeling
We applied multiple linear regression to the simulated dataset of COVID-19 subjects to understand the impact of each parameter on the outcome, disease severity. We chose a multiple regression model because both the outcomes and the predictors were numeric. We used the regression models to establish a predictive transfer function and to evaluate the significance of the results. In this model, the relationship between the independent variables (x1, x2, …, xn) and the dependent variable (y) can be expressed by the equation y = f(x1, x2, …, xn). This is the transfer function derived through the analysis. The validity of the model was established using goodness of fit and ANOVA. The statistical significance of the model was tested by evaluating the residuals and the F ratio in one-way ANOVA, with the criteria p < 0.05 and goodness of fit with adjusted R-squared (Adj. RSq) > 90%. This analysis assumed that the parameters were independent of one another. However, when real patient datasets are subjected to this type of analysis, there may be multicollinearity among the parameters, which should be addressed using dimensionality reduction methods [14, 15].
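As an illustration of the fitting and validity checks described above, the following minimal sketch fits a linear transfer function by ordinary least squares and reports the adjusted R-squared and the F ratio. It is not the analysis code used in this study: the function name `fit_mlr`, the three predictors, the coefficient values, and the noise level are all hypothetical, and the data are randomly generated for demonstration only. Converting the F ratio to a p-value requires the F distribution (for example `scipy.stats.f.sf`), which is omitted here to keep the sketch dependent on NumPy alone.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares fit of y = b0 + b1*x1 + ... + bn*xn."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape                              # n subjects, p predictors
    A = np.column_stack([np.ones(n), X])        # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef                        # residuals for diagnostics
    ss_res = float(resid @ resid)               # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum()) # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    # Regression ANOVA: mean square of the model over mean square error.
    f_ratio = ((ss_tot - ss_res) / p) / (ss_res / (n - p - 1))
    return {"coef": coef, "adj_r2": adj_r2, "f_ratio": f_ratio,
            "residuals": resid}

# Hypothetical data: 45 subjects, 3 independent numeric parameters.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(45, 3))
true_coef = np.array([2.0, -1.0, 0.5])          # assumed, illustration only
y = 1.0 + X @ true_coef + rng.normal(0.0, 0.05, size=45)

fit = fit_mlr(X, y)
print("coefficients:", fit["coef"])
print("adjusted R-squared:", fit["adj_r2"])
print("F ratio:", fit["f_ratio"])
```

Under these assumptions, the fitted coefficients define the transfer function, and the adjusted R-squared can be compared directly against the 90% criterion.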
Because the limited dataset of 45 subjects may not cover many combinations of parameter values, we used resampling by Monte Carlo simulation to achieve a better density of combinations. The simulation resampled the transfer function over 2000 runs, and convergence was achieved after multiple runs. It was performed to understand the impact of possible parameter combinations on clinical outcomes. Monte Carlo simulation draws random variates from a selected range of values to model the impact of the progression of events leading to outcomes.
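The resampling step can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation used in this study: the coefficients, the parameter ranges, the choice of a uniform distribution for the random variates, and the names `transfer` and `monte_carlo` are all hypothetical. The running mean of the simulated outcome is tracked as one simple way to judge convergence over the 2000 runs.

```python
import random
import statistics

def transfer(x, coef):
    """Linear transfer function y = b0 + b1*x1 + ... + bn*xn."""
    return coef[0] + sum(b * xi for b, xi in zip(coef[1:], x))

def monte_carlo(coef, ranges, runs=2000, seed=0):
    """Draw each parameter from its range and evaluate the transfer function."""
    rng = random.Random(seed)
    outcomes, running_mean = [], []
    total = 0.0
    for i in range(1, runs + 1):
        # Assumption: parameters are independent and uniformly distributed.
        x = [rng.uniform(lo, hi) for lo, hi in ranges]
        y = transfer(x, coef)
        outcomes.append(y)
        total += y
        running_mean.append(total / i)   # mean outcome after i runs
    return outcomes, running_mean

coef = [1.0, 2.0, -1.0, 0.5]             # hypothetical fitted coefficients
ranges = [(0.0, 1.0)] * 3                # hypothetical parameter ranges
outcomes, running_mean = monte_carlo(coef, ranges)

print("mean outcome:", statistics.mean(outcomes))
print("outcome spread (SD):", statistics.stdev(outcomes))
# Small drift in the running mean over the last 500 runs suggests convergence.
print("drift:", abs(running_mean[-1] - running_mean[-500]))
```

A flat running mean toward the end of the simulation indicates that additional runs would change the estimated outcome distribution very little.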