A Real-Time Identification Algorithm and Epidemic Model Associated with Social Distancing Index for Control Applications: COVID-19 Analysis and Simulations in the United States

The COVID-19 pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is currently the most worrying health problem worldwide. The virus spread rapidly across countries, prompting many researchers to look for ways to mitigate the disease’s effects. While vaccines are not available on a large scale, mathematical modeling has allowed the scientific community to produce forecasts that support decision-making and social distancing policies aimed at slowing COVID-19 transmission. However, dynamic models must represent the reality of the pandemic, which requires validation criteria and an appropriate procedure that ensures realistic simulations. Only then is an epidemiological model useful for studying and analyzing practical effects. From a control theory point of view, representative models underpin optimal solutions and allow a more reliable and robust control strategy. Therefore, this paper proposes a parameter identification algorithm for a dynamic model of pandemic disease spread, including a new time-varying parameter for control applications. The developed framework uses an epidemiological model, denoted here as SIRD+ψ, capable of associating the real pandemic dynamics with its biological parameters and with the population's social mobility in real time, which can be steered by an appropriate optimal control strategy. Simulations and forecasts are performed and compared with official data.


Introduction
The COVID-19 pandemic is the most severe public health problem the world has faced in the last hundred years. Although health professionals know this type of subject well [2,3], other scientific communities from mathematics, physics, and engineering have proposed strategies for forecasting, decision-making, and social distancing to slow the disease transmission [4][5][6]. In this context, mathematical modeling [7] appears as an efficient technique to analyze and develop practical solutions for dynamic systems such as an epidemic. The first mathematical model in theoretical epidemiology was proposed by Kermack and McKendrick [8], whose differential equations represent the dynamics of susceptible individuals (not yet infected with the disease), active infected individuals, and individuals who have recovered and become immune to the virus. This model is known as the SIR model, and researchers have used its properties and structure to describe the COVID-19 epidemic. In the work of Wu et al. [9], the SIR model was used to analyze the real transmission and death dynamics of the coronavirus in Wuhan, China. Furthermore, official data were employed in the model to estimate clinical severity and risks, which could support public health decision-making. Postnikov [10] demonstrated that the SIR model could be simplified to a logistic function whose simulation results agreed well with real data. This model is applied to different forecast scenarios for the current epidemic in India and four of its cities, as presented by Malavika et al. [11].
In addition, equivalent models and extensions of the SIR model are used in epidemiological studies of COVID-19. Annas et al. [12] analyzed the stability of a SEIR model and simulated scenarios for Indonesia, considering the epidemic effects of isolation and vaccination. Rajagopal et al. [13] proposed a SEIRD model with fractional derivatives to represent the disease in Italy and predict the outbreak's peaks, basing the results on real data. Thus, as classes of individuals, dynamic parameters, and intervention measures are included, these extended models become more complex [14][15][16]. Besides, other types of models, which use signals and time series or deterministic and stochastic approaches, can be found in the literature [17][18][19].
It is clear that these models are indispensable for understanding the epidemic dynamics in order to avoid new infections and deaths, and for promoting governmental policies that increase the number of recovered individuals and reduce the negative effects of COVID-19.
However, validation procedures are required before the models can be used in practice. Generally, the recent literature has presented COVID-19 epidemiological models with incomplete validation criteria or with numerical algorithms that distort the real parameters. On the other hand, epidemiological models that represent realistic scenarios lead to practical and feasible solutions. Hence, engineering applications in particular, such as simulation [20], design [21], education [22], and optimization and control [23], bring an important perspective for facing this disease.
Since few vaccines are expected to be available to the population, especially in developing nations, social distancing is a fundamental measure adopted by governments to counter the pandemic spread.
The intention is to lower the probability of virus transmission and slow the growth in the number of infected people. Flattening the curve of active infected cases prevents the collapse of health systems caused by large numbers of COVID-19 patients under treatment and helps to contain the spread of the disease. The critical question is when to apply social distancing and for how long. In this sense, control engineering aided by epidemiological models can determine social distancing strategies that balance health and economic aspects [24][25][26][27]. For this, a social distancing variable has to be incorporated into the epidemiological models. Based on the preceding, an identification algorithm for epidemic control is proposed in this work. The procedure is composed of two layers: the first uses a discrete analytical equation to calculate the exact parameters that represent the system dynamics, and the second, an optimization formulation used as a slight adjustment, brings the epidemic curves as close as possible to the real data in order to fit the dynamic model. Moreover, a social distancing variable is incorporated into the algorithm to relate its real effects to the epidemic, allowing proper predictions and the investigation of control strategies.
The paper is organized as follows: in Section 2, the proposed parametric identification algorithm is explained in detail using the adapted SIRD+ψ model structure. In Section 3, the model identification is performed, validated, and applied to simulate scenarios in the United States at different periods of the COVID-19 pandemic. In Section 4, perspectives on the design of optimal control techniques that could use the epidemiological model to mitigate the pandemic are discussed. The conclusions are stated in Section 5.

The Structure Model and Algorithm Description
The Susceptible-Infected-Recovered-Deceased (SIRD) model is used in this work since its equations adequately describe the dynamic behavior of SARS-CoV-2 [28,29]. As shown in the following, a dynamic variable representing the social distancing response to isolation policies is included in the model, which is denoted as SIRD+ψ.

Epidemiological Model
The SIRD model examined in this paper describes the dynamics [...]. Thus, the SIRD+ψ model is expressed by the following nonlinear differential equations: [...] The size of the total exposed population is denoted by N(t) and, in this work, it is assumed that natural deaths balance births, so that [...], wherein N_0 is the initial population size (before the contagion).
Moreover, the term [...]. As will be seen later, the effective reproduction number is obtained from the calculated epidemiological parameters and the social distancing index recorded over time.
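As an illustration of how a model of this kind can be simulated with a one-day step, the sketch below assumes a common SIRD form in which the social distancing index ψ scales the contact term by (1 − ψ): dS/dt = −(1 − ψ)βSI/N_0, dI/dt = (1 − ψ)βSI/N_0 − γI − ρI, dR/dt = γI, dD/dt = ρI. This form, the forward-Euler discretization, and the name simulate_sird_psi are assumptions made for illustration and are not taken from the equations of this paper.

```python
def simulate_sird_psi(s0, i0, r0, d0, beta, gamma, rho, psi, days):
    """Forward-Euler simulation of an ASSUMED SIRD+psi form, 1-day step.

    psi: sequence of daily social distancing values in [0, 1]
         (0 = no distancing, 1 = no contacts); needs at least `days` entries.
    Returns a list of (S, I, R, D) tuples of length days + 1.
    """
    n0 = s0 + i0 + r0 + d0  # total population, kept constant (assumption)
    s, i, r, d = float(s0), float(i0), float(r0), float(d0)
    trajectory = [(s, i, r, d)]
    for k in range(days):
        new_infected = (1.0 - psi[k]) * beta * s * i / n0
        new_recovered = gamma * i
        new_deceased = rho * i
        s -= new_infected
        i += new_infected - new_recovered - new_deceased
        r += new_recovered
        d += new_deceased
        trajectory.append((s, i, r, d))
    return trajectory


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the function.
    run = simulate_sird_psi(999_000, 1_000, 0, 0,
                            beta=0.3, gamma=0.1, rho=0.01,
                            psi=[0.2] * 60, days=60)
    print("active infected after 60 days:", round(run[-1][1]))
```

Because every individual leaving one compartment enters another, the four compartments always sum to N_0 in this sketch, which matches the constant-population assumption stated above.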

Identification Algorithm
Recent numerical algorithms have been applied to calculate the parameters of COVID-19 models [30][31][32][33]. Nonetheless, due to the degrees of freedom in Eq. 1, different parameter values can produce equivalent results for the numbers of susceptible, infected, recovered, and deceased individuals. In other words, the numerical estimation problem has no unique solution for the parameters. Although mathematical and graphical criteria have been used to validate these dynamic models against real data, different values for the same parameters affect the characterization of the epidemic, for instance, the effective reproduction number (R_t). Therefore, in order to describe the pandemic behavior, especially for forecast and control, a two-step identification method is proposed for the parameters [...]. Furthermore, note that, for any generic parameter, [...]. Nevertheless, assuming that the data might be corrupted by minor issues, for instance, cases that are not reported on the day and are accumulated into the next day, or misreported cases, the calculated parameters are slightly adjusted by the second layer (optimization stage). This improves identification reliability, since such reporting issues have been seen in different countries [26]. Once again, the available real data from the same interval [...]. The algorithm also fits the SIRD+ψ model curves to the real data using the optimal parameters. It is worth mentioning that this is an innovative and fundamental advantage of the method proposed in this work: it identifies the relation between real social mobility, which may be used as a means of control, and the parameters of the current pandemic.
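The first layer can be pictured as inverting the discrete model equations sample by sample. The sketch below is a minimal illustration, assuming the forward-Euler SIRD form with a (1 − ψ) contact factor (dS = −(1 − ψ)βSI/N_0, dR = γI, dD = ρI per day) rather than the analytical expressions of this paper. It also assumes the daily series S, I, R, and D have already been built from the reported data, and it averages the per-sample values over each window of T_2 days. The function name layer1_window_parameters is hypothetical.

```python
def layer1_window_parameters(s, i, r, d, psi, n0, t2=14):
    """Invert the ASSUMED 1-day Euler equations and average per T2-day window.

    s, i, r, d: daily series of equal length; psi: daily index in [0, 1).
    Returns a list of (beta, gamma, rho) tuples, one per window.
    """
    results = []
    last = len(i) - 1  # the final sample has no successor to difference with
    for start in range(0, last, t2):
        betas, gammas, rhos = [], [], []
        for k in range(start, min(start + t2, last)):
            if i[k] <= 0 or s[k] <= 0 or psi[k] >= 1.0:
                continue  # sample carries no information about the rates
            betas.append(-(s[k + 1] - s[k]) * n0
                         / ((1.0 - psi[k]) * s[k] * i[k]))
            gammas.append((r[k + 1] - r[k]) / i[k])
            rhos.append((d[k + 1] - d[k]) / i[k])
        if betas:
            count = len(betas)
            results.append((sum(betas) / count,
                            sum(gammas) / count,
                            sum(rhos) / count))
    return results
```

With noise-free data generated by the same discrete equations, this inversion returns the generating parameters. With reported data it gives only a starting point, which is why a second adjustment stage is useful.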
As described in the work of Fernandez-Villaverde and Jones [34], when the SIRD+ψ model is treated with β associated with social distancing, the effects on the epidemic can be analyzed as a function of both the transmission rate inherent to the virus and the health policies, which ties the model more closely to reality and improves the accuracy of the data fit. As a result, the proposed methodology allows reliable forecasting, mainly regarding stringency formulations.
The parameters β, γ, ρ, and ψ are used to calculate the pandemic growth velocity since they are related to the rate of infected individuals. Thus, assuming that, at the beginning of the pandemic, S ≈ N, R_t can be approximated as follows: [...]
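For orientation only, the sketch below computes an effective reproduction number under an assumed SIRD form in which infected individuals leave the compartment at rate γ + ρ and new infections occur at rate (1 − ψ)βS/N per infected individual. Under that assumption the ratio of inflow to outflow is (1 − ψ)β(S/N)/(γ + ρ), which reduces to (1 − ψ)β/(γ + ρ) when S ≈ N. This is not necessarily the expression used in this paper, and the function name is hypothetical.

```python
def effective_reproduction_number(beta, gamma, rho, psi, s_over_n=1.0):
    """R_t for an ASSUMED SIRD+psi form: inflow / outflow of the I compartment.

    s_over_n defaults to 1.0, i.e. the early-pandemic approximation S ~= N.
    """
    outflow = gamma + rho
    if outflow <= 0:
        raise ValueError("gamma + rho must be positive")
    return (1.0 - psi) * beta * s_over_n / outflow


if __name__ == "__main__":
    # Hypothetical values: stronger distancing (larger psi) lowers R_t.
    for psi in (0.0, 0.3, 0.6):
        print(psi, round(effective_reproduction_number(0.3, 0.1, 0.01, psi), 3))
```

In this form, R_t falls below 1 once ψ exceeds 1 − (γ + ρ)/β, which is one way to read a social distancing target off the identified parameters.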

Design of Parameterization and Implementation
Since government data sources disclose daily samples of the pandemic dynamics, counting infection cases and deaths, the day is the time unit used as the calculation basis in the algorithm. Therefore, it is intuitive to choose a sampling time of T_1 = 1 day for the discrete-

Results and Discussion
The identification procedure considers the number of cumulative cases, cumulative deaths, and active infected cases, for which we use the available datasets from official entities [35][36][37]. In this scenario, we run the identification algorithm with T_1 = 1 day and T_2 = 14 days, which means that the model is simulated every day and the parameters are allowed to change slightly every two weeks. For the social distancing index, we consider the rates provided by OpenPath [35]. This metric considers people's access patterns and entries in workplaces and several locations, including business facilities, gyms, healthcare stations, government locations, and educational centers.
The methodology is to compare people's access to these locations with their access before the COVID-19 pandemic, which makes it possible to illustrate the social mobility trend around the country. [...] These effects can be seen in Figures 6-8. In these cases, we used the first 224 days to identify the model parameters with the same time window T_2 = 14 days, and we used the last 16 days to validate the estimated pandemic curves. As can be seen, the agreement between the model curves and the data improves, including for the curves of new cases, new deaths, active infected individuals, and recovered cases. Moreover, since the model fit improves with more available data, it allows the designer to use longer forecasting periods, which significantly influences the design of control strategies. Finally, we apply the identification procedure to the total available dataset of 327 days (also with T_2 = 14 days). In this last scenario, we still perform the validation stage using the last 20 days, keeping the last identified parameters constant over the forecast period. The parameters are identified from the first 307 days of data. Figure 9 shows the identified parameters for the last scenario [...]. The effective reproduction number over the whole period, including the three scenarios, is shown in Figure 13. Although the R_t values found are consistent, it is important to highlight that their estimation depends on the measured data and, therefore, unreported or late-reported cases introduce some error into these calculations. Furthermore, to quantify the model fit, we use the coefficient of determination, R-square, which measures how well the model curves explain the dataset in terms of the fraction of the data variation they account for. The closer the R-square is to 1, the more reliably the identified model represents the dataset. Table 1 reports the R-square coefficients for the three proposed scenarios.
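For reference, the R-square coefficient mentioned above can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses this standard definition. The function name and the sample numbers are illustrative only and are not taken from the datasets or tables of this paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed series is constant; R-square is undefined")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical validation window: reported versus simulated active cases.
    reported = [100.0, 120.0, 150.0, 190.0]
    simulated = [98.0, 123.0, 149.0, 192.0]
    print(round(r_squared(reported, simulated), 4))
```

A value of 1 means the model reproduces the data exactly, while a value of 0 means it does no better than predicting the mean of the data.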

Perspectives of Control Engineering Applications
As commented before, the field of control engineering may [...] the D-RTO operates processes from an economic viewpoint [38,39].
The fact is that, for any of these strategies, a valid model is crucial to assure reliable predictions and optimal conditions. Also, at least one manipulated variable is required to control other variables in any system. Thus, the identification algorithm presented in this work supports the application of control systems to the COVID-19 pandemic by using the social mobility index as the manipulated variable, updating the epidemiological parameters, and validating the model in real time from collected data.
The approach is to propose optimal levels of social mobility considering economic and health aspects. As can be seen in Figure 14, [...] [27] with relevant results. Therefore, the system presented here has a real possibility of being implemented in practice.

Conclusion
The paper proposed an identification algorithm able to calculate epidemiological parameters while considering the effects of social distancing.
The procedure uses a SIRD+ψ structure and discrete analytical solutions to obtain parameters that are consistent with the measured data and with epidemiological concepts. The pandemic's velocity can be estimated using the effective reproduction number, and simulations and forecasts are performed to support decision-making. The proposed scenarios have shown that the estimated model dynamics become more precise with more data. These results are achieved by identifying the parameters over a moving window associated with the real data. The more numerous and reliable the data provided to the algorithm, the better the model fit. Moreover, this benefits forecasting, allowing longer predictions with small errors. Furthermore, the parameter identification algorithm is adjustable and can be tuned in various ways to obtain finely adjusted curves. The cost function can be customized to include other compartments in the error being minimized, or different weight constants can be chosen to prioritize a given optimized variable.
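To make the tuning options concrete, the sketch below shows one possible shape for a weighted cost over several compartments and a bounded adjustment around nominal parameters. It is an illustration under stated assumptions: the normalization by the largest data value, the relative bound, and the exhaustive grid search are stand-ins chosen for simplicity, not the optimization formulation of this paper, and all function names are hypothetical.

```python
from itertools import product


def weighted_cost(model, data, weights):
    """Weighted sum of normalized squared errors over chosen compartments.

    model, data: dict mapping compartment name -> list of daily values.
    weights: dict mapping compartment name -> non-negative weight.
    """
    total = 0.0
    for name, weight in weights.items():
        scale = max(abs(v) for v in data[name]) or 1.0  # avoid dividing by 0
        total += weight * sum(((m - y) / scale) ** 2
                              for m, y in zip(model[name], data[name]))
    return total


def bounded_candidates(nominal, rel_bound, steps=5):
    """Evenly spaced values within +/- rel_bound (relative) of nominal."""
    low, high = nominal * (1.0 - rel_bound), nominal * (1.0 + rel_bound)
    return [low + (high - low) * j / (steps - 1) for j in range(steps)]


def adjust_parameters(nominal, rel_bound, simulate, data, weights, steps=5):
    """Grid search (a simple stand-in optimizer) inside the bounds.

    nominal: dict of parameter name -> first-layer value.
    simulate: callable taking a parameter dict and returning model curves
              in the same dict-of-lists layout as `data`.
    Returns (best_parameters, best_cost).
    """
    names = list(nominal)
    grids = [bounded_candidates(nominal[n], rel_bound, steps) for n in names]
    best = dict(nominal)
    best_cost = weighted_cost(simulate(best), data, weights)
    for combo in product(*grids):
        trial = dict(zip(names, combo))
        cost = weighted_cost(simulate(trial), data, weights)
        if cost < best_cost:
            best, best_cost = trial, cost
    return best, best_cost
```

Raising the weight of one compartment, for example deaths, pulls the adjusted parameters toward the fit of that curve, and widening rel_bound lets the adjustment move further from the first-layer values.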
The uncertainty bounds can also be changed according to the fidelity of the data used to estimate the model curves. The sampling time at which the parameters change can be chosen to suit the local data situation. Finally, it is demonstrated that control engineering can be an alternative for dealing with the pandemic while not enough vaccines are available. The proposed algorithm can be incorporated into a control system to offer an adaptive model and predictions that allow optimal social distancing guidance. Thus, future work including dynamic optimization and applications will investigate the impact in the U.S.