A Novel Non-Invasive Estimation of Arterial Blood Pressure from Electrocardiography and Photoplethysmography Signals using Machine Learning

Arterial blood pressure (BP) is one of the most important vital
signs of the cardiovascular system and it is often used as a measure
of health in general...

The oscillometric method is based on the detection of pressure pulses inside a cuff wrapped around the biceps or the wrist of the person [5][6][7]. Continuous methods to measure BP include the following: the pulse wave velocity method, the tonometry method, plethysmography and the pulse transit time-based method, to name a few [2,8]. The pulse wave velocity method measures the pulse wave's propagation velocity along the arterial tree. This method requires both knowing the exact distance between the sites in which the signals are captured and the presence of an expert operator to manually locate the carotid and femoral arteries. For this reason, pulse wave velocity methods are operator-dependent and prone to errors [8]. In the tonometry method, an orthogonal force is applied to the wall of a superficial artery against a bone.
A force sensor measures the pressure at the contact so that a local occlusion occurs in the superficial artery [8]. This method is extremely sensitive to movement, so it is not often utilized to measure BP [19,20]. Plethysmography is partially occlusive as it uses a small cuff around the index finger that maintains a constant blood flow at each heartbeat. BP is not directly measured; instead, one measures the change of blood volume in the arteries. This method has difficulties in patients with low peripheral perfusion, hypothermia or low blood flow states [8,19]. It is widely accepted that the pulse transit time (PTT) is an index of arterial stiffness and, as such, it is often used to indirectly estimate blood pressure [21]. In recent years, some alternatives to estimating blood pressure based on physiological parameters have been studied. These alternatives are mainly based on biomedical signal processing, which has become an essential technology in modern medicine for clinical and biomedical research [2,4,22]. Algorithms that estimate BP using physiological parameters, such as electrocardiograms (ECG), photoplethysmograms (PPG), PTT and heart rate, among others, have been recently proposed [23][24][25][26][27][28][29]. However, most of these approaches utilize several physiological signals and parameters to estimate the BP, whereas in this work we are interested in using the minimum number of physiological signals needed to perform a reliable BP estimation.
According to the Association for the Advancement of Medical Instrumentation (AAMI), the estimation of systolic and diastolic blood pressure must have a mean absolute error less than 5 [mmHg] and a mean absolute deviation less than 8 [mmHg]. Most researchers use this standard to verify their results [2] and it is the reference value we use in this work as well. Let us describe the most relevant methods for estimating the systolic and diastolic blood pressure using physiological signals.
Photoplethysmography (PPG) is the volumetric measurement of an organ using a pulse oximeter. PPG technology simply consists of a light source on one side of a specific tissue and a light detector on the opposite side [19,30,31]. Several works have extracted features from PPG signals, creating models to estimate blood pressure [24,32,33,34,35]. In general, they obtain better results in diastolic blood pressure (DBP) estimation than in systolic blood pressure (SBP) estimation.
Currently, the pulse transit time (PTT) is used as an indirect blood pressure estimation method [2,4,24,36,37]. Different studies have assessed the relationship between PTT and arterial BP [8], whereas other studies have generated models to estimate the blood pressure [21,23,24,36,37]. On average, their results were considered tolerable according to the AAMI standard. However, calibration is often needed and other physiological parameters such as the subject's weight and height must be measured, requiring extended processing time for each patient.
With an ECG signal, it is possible to calculate the instantaneous heart rate; therefore, it is possible to use this parameter to estimate blood pressure. Nonetheless, using only the heart rate, the results are not conclusive. It is, however, a good complement to other parameters such as PTT.
Studies have used machine learning tools such as artificial neural networks (ANN) and classification trees to estimate blood pressure [18,26,28,38,39,40,41,42,43,44,45,46]. These studies use variables such as body mass index (BMI), waist circumference (WC), age, level of sedentary lifestyle, smoking status and gender, among others, to produce a BP estimate.
In this work, we aim to continuously estimate the SBP and DBP using very few physiological signals (namely ECG and PPG signals only) and machine-learning algorithms. We are interested in using these two signals because they are regularly measured in a non-invasive way on hospitalized patients, and estimating the blood pressure continuously and non-invasively is highly desirable in hospitals. To the best of our knowledge, there are no machine-learning approaches for continuously estimating the blood pressure that rely solely on these two signals. Here we evaluate the performance of known models and propose new features that can be extracted from the ECG and PPG signals for estimating the blood pressure accurately.

Materials and Methods
The database (DB) used in this work is the MIMIC II Waveform Database downloaded from Physionet [47]. This DB contains several waveforms from multiple physiological signals collected from bedside patient monitors in adult and neonatal intensive care units [48]. Some waveforms included are ECGs, continuous arterial BP, fingertip PPG signals, respiration, amongst other signals. The DB is completely described and characterized in [48]. For the purpose of this study, we only selected those records that contained the continuous arterial BP signal, the ECG and the PPG signals. Here, we utilized only the best 239 records, selected based on their quality.
For example, if a record had a segment in which one of the signals was zero for any period of time, it was discarded from the set.

Database signal processing:
Once the records were selected, we performed signal processing over the waveforms to reduce their noise and improve the signal quality before addressing the main issue of this work, which is using the ECG and PPG signals to estimate DBP and SBP in a continuous manner. Here we explain all the details of the signal processing.
The ECG, PPG and BP signals can each be affected by motion noise, but not necessarily simultaneously. We implemented a motion-noise detection algorithm that first calculated the amplitude range (the maximum amplitude value minus the minimum amplitude value) every 1,500 samples of each signal. Second, the amplitude range (AR) was used to set a lower and an upper limit within which the range of each signal was allowed to vary. These limits are set by weighting the ARs by experimentally determined factors. Let us denote by ARECG, ARPPG and ARBP the amplitude ranges of the ECG, PPG and arterial BP signals, respectively. We set 0.83 ARECG and 1.17 ARECG as the lower and the upper limits for the ECG signal, 0.89 ARPPG and 1.11 ARPPG as the limits for the PPG signal, and 0.7 ARBP and 1.3 ARBP as the limits for the BP signal. Third, we scanned each signal using a moving window of 125 samples to evaluate whether the AR of the signal within the window fell outside the allowed range. If any one signal fell outside its range, the corresponding window in all three signals was marked as invalid and no feature extraction was performed for that section. (We explain the feature extraction process later.)
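The three steps of the motion-noise detector can be illustrated with a minimal Python sketch. It is not the authors' implementation: the text does not say how the per-block ARs are combined into one reference value or whether the 125-sample windows overlap, so the sketch assumes the median of the 1,500-sample block ARs as the reference and non-overlapping windows. The function names `amplitude_ranges` and `invalid_windows` are hypothetical, and the demo signals are synthetic.

```python
import numpy as np

# Weighting factors quoted in the text: (lower, upper) multipliers of the AR.
FACTORS = {"ecg": (0.83, 1.17), "ppg": (0.89, 1.11), "bp": (0.70, 1.30)}


def amplitude_ranges(x, size):
    """Max minus min over consecutive, non-overlapping chunks of `size` samples."""
    x = np.asarray(x, dtype=float)
    n = len(x) // size
    chunks = x[: n * size].reshape(n, size)
    return chunks.max(axis=1) - chunks.min(axis=1)


def invalid_windows(signals, block=1500, window=125):
    """Boolean mask over windows; True where any signal leaves its allowed range."""
    masks = []
    for name, x in signals.items():
        low, high = FACTORS[name]
        # Assumption: the reference AR is the median of the per-block ARs.
        ref = np.median(amplitude_ranges(x, block))
        ar = amplitude_ranges(x, window)
        masks.append((ar < low * ref) | (ar > high * ref))
    n = min(len(m) for m in masks)
    # A window is rejected in all three signals if it fails in any one of them.
    return np.logical_or.reduce([m[:n] for m in masks])


# Synthetic demo: three clean periodic signals, one artifact injected into the ECG.
t = np.arange(4500)
clean = np.sin(2 * np.pi * t / 125)
ecg = clean.copy()
ecg[700] += 5.0  # spike inside window 5 (samples 625 to 749)
mask = invalid_windows({"ecg": ecg, "ppg": clean, "bp": clean})
print(np.flatnonzero(mask))  # indices of the windows marked as invalid
```

Combining the per-signal masks with a logical OR mirrors the rule that a window flagged in one signal is discarded in all three, so that features and ground-truth pressures always come from the same valid segments.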
We show the result of applying our detection algorithm over the three signals in Figure 2, where the noisy segments of the signals are clearly marked by our algorithm.

Heartbeat Phase Shift Compensation:
Before feature extraction and systolic and diastolic pressure calculation, each heartbeat per window must be correctly identified. The identification problem arises from the phase shift of the PPG signal with respect to the ECG signal, and of the BP signal with respect to the PPG signal. This phase shift is shown in Figure 3, where the shaded area marks one corresponding heartbeat.
A maximum of 700 heartbeats per patient were processed, and the minimum depended on the signal noise level. Out of the 239 patients, a total of 68,262 heartbeats were processed.
Once the phase shift was corrected over the signals, the SBP and DBP were computed and the records were split into training and testing sets, as detailed later. Next, we describe the features we used in this work to estimate the SBP and DBP from ECG and PPG signals.

Calculation of blood pressure values
The pressure calculation is performed for each heartbeat and it is used as the ground truth for the regression algorithms. To obtain the systolic pressure, a local maximum of the BP is determined. To find the diastolic pressure, the minimum value between BP peaks is determined. This process is repeated for each of the 68,262 heartbeats, yielding the input matrix mentioned above and its corresponding expected outcomes.
It was experimentally determined that the Bayesian regularization algorithm was better for both estimators. After training, the mean absolute error, mean absolute deviation, mean squared error and the correlation between predicted and actual output were calculated.
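The per-heartbeat ground-truth extraction described in this section (a local maximum of the BP for the systolic value, the minimum between consecutive peaks for the diastolic value) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the peak picker, the midline threshold used to skip small secondary bumps, and the `min_distance` parameter are all hypothetical choices, and the demo waveform is a synthetic sinusoid rather than a real arterial BP trace.

```python
import numpy as np


def beat_pressures(bp, min_distance=50):
    """Return (peak_indices, sbp, dbp) from a continuous arterial BP trace.

    sbp[i] is the BP value at the i-th systolic peak; dbp[i] is the minimum
    BP between peak i and peak i + 1.
    """
    bp = np.asarray(bp, dtype=float)
    # Candidate local maxima: higher than the previous sample, not lower than the next.
    cand = np.flatnonzero((bp[1:-1] > bp[:-2]) & (bp[1:-1] >= bp[2:])) + 1
    # Assumption: true systolic peaks lie above the midline of the trace.
    cand = cand[bp[cand] > (bp.max() + bp.min()) / 2]
    # Assumption: peaks closer than `min_distance` samples belong to the same beat;
    # keep the higher of the two.
    peaks = []
    for i in cand:
        if peaks and i - peaks[-1] < min_distance:
            if bp[i] > bp[peaks[-1]]:
                peaks[-1] = i
        else:
            peaks.append(i)
    peaks = np.array(peaks, dtype=int)
    sbp = bp[peaks]
    dbp = np.array([bp[a:b].min() for a, b in zip(peaks[:-1], peaks[1:])])
    return peaks, sbp, dbp


# Synthetic demo: a 100-sample "beat" oscillating between 80 and 120 mmHg.
t = np.arange(500)
bp_demo = 100 + 20 * np.sin(2 * np.pi * t / 100)
peaks, sbp, dbp = beat_pressures(bp_demo)
print(peaks, sbp.round(1), dbp.round(1))
```

Each (SBP, DBP) pair produced this way would serve as the expected output for the feature vector of the same heartbeat.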

Support Vector Machine (SVM):
With Matlab we implemented the linear epsilon-insensitive SVM. We performed a 5-fold validation to test each regression (one for systolic and one for diastolic pressure estimation). Three kernel functions were tested in each model: Gaussian functions, linear functions and radial basis functions (RBFs), in order to assess the performance of the regression, since there is no previous knowledge of the distribution of the data. Each time the regressions were trained with a different kernel, their performance was evaluated with the corresponding test set for each fold. Then, the predicted outputs of all 5 folds were saved to calculate the mean absolute error, mean absolute deviation, mean squared error and correlation. In total, 3 models for systolic pressure and 3 for diastolic pressure were tested, varying the kernel function used.
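The evaluation protocol (pool the test-fold predictions of a 5-fold split, then compute the four metrics and compare them with the AAMI limits quoted in the introduction) can be sketched in Python. Several points are assumptions: the text does not define the mean absolute deviation, so it is taken here as the mean absolute deviation of the errors around their mean; the regressor is an ordinary least-squares stand-in, not an SVM; and all function names and the demo data are hypothetical.

```python
import numpy as np


def error_metrics(y_true, y_pred):
    """MAE, MAD, MSE and Pearson correlation between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    return {
        "mae": np.mean(np.abs(err)),
        # Assumption: MAD is the mean absolute deviation of the errors about their mean.
        "mad": np.mean(np.abs(err - err.mean())),
        "mse": np.mean(err ** 2),
        "corr": np.corrcoef(y_true, y_pred)[0, 1],
    }


def meets_aami(metrics, mae_limit=5.0, mad_limit=8.0):
    """Thresholds in mmHg, as quoted in the text."""
    return metrics["mae"] < mae_limit and metrics["mad"] < mad_limit


def kfold_predictions(X, y, fit_predict, k=5, seed=0):
    """Pool the test-fold predictions of a k-fold split into one vector."""
    order = np.random.default_rng(seed).permutation(len(y))
    pred = np.empty(len(y))
    for test_idx in np.array_split(order, k):
        train_idx = np.setdiff1d(order, test_idx)
        pred[test_idx] = fit_predict(X[train_idx], y[train_idx], X[test_idx])
    return pred


def least_squares(X_train, y_train, X_test):
    """Stand-in regressor (ordinary least squares with intercept), not an SVM."""
    A = np.column_stack([X_train, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([X_test, np.ones(len(X_test))]) @ coef


# Synthetic demo: noiseless linear targets, so the stand-in should fit them closely.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([4.0, -2.0, 1.0]) + 120.0
pooled = kfold_predictions(X, y, least_squares)
scores = error_metrics(y, pooled)
print(scores, meets_aami(scores))
```

Any regressor with the same `fit_predict(X_train, y_train, X_test)` signature could replace `least_squares`, which would allow the different kernel choices to be compared on identical folds.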

Results
Results for systolic and diastolic blood pressure estimation using regression models are presented as follows. The implementation of the aforementioned algorithms was performed in Matlab version R2016a.

Discussion
Estimation of SBP and DBP using ECG and PPG signals through machine learning algorithms is a field with very few results. Using two well-known algorithms (namely ANNs and SVM), SBP was estimated with results that were very close to the AAMI standard, whereas the estimation of DBP satisfactorily complied with that standard.
Despite the fact that the signals used were from 239 different patients, they all correspond to a population of ICU patients. It was not explicit which patients in the DB were sick and which were healthy, and therefore it would be interesting to have databases that identify the condition of individual patients, either to create specific models for each group or to be considered for a transversal model.
Noise is generally expected with physiological signals. For this work, the noise detector was robust, because the chosen signals were mostly clean, facilitating the noise detection. The three signals (ECG, PPG and BP) were evaluated separately because they presented noise in different samples throughout the record and, as these signals were acquired at different body locations of the patients, different kinds of noise may or may not be present simultaneously.
Feature extraction was performed manually as the first step in using ECG and PPG for pressure estimation. To improve results, a next step could be to deliver the full heartbeat to models that automatically extract the features.
Three neural network models were created for the systolic pressure and three for the diastolic pressure using the Matlab "Neural Network" tool. The model parameters that could be varied with this tool included the number of neurons used in the hidden layer and the learning algorithm. The learning algorithm that provided the best results in the experiments was 'Bayesian Regularization', leaving most of the analysis to determining the number of neurons in the hidden layer. Systolic pressure estimation was able to partially fulfill the AAMI standard because the mean absolute deviation was within permitted limits, but the mean absolute error was not. On the other hand, the estimation model was capable of fully meeting the standard for diastolic pressure.

Conclusion
According to the above study, it can be concluded that machine learning algorithms are useful tools for estimating blood pressure values. Systolic blood pressure estimation showed promising results and was close to meeting the AAMI standard, though it will be necessary to improve upon the regression models implemented. In addition, diastolic blood pressure estimation already met the AAMI standard.
With artificial neural networks, the estimation of systolic blood pressure can be improved by changing the parameters that were not modified in the implemented models.
Regarding the SVM algorithm, it can be concluded that the Gaussian kernel function obtains better results for the estimation of both pressures. Even so, improvements are necessary for the systolic blood pressure estimation model, so that the mean absolute error can meet the AAMI standard. As with the neural networks, the diastolic pressure estimation obtained excellent results according to the AAMI standard, and therefore only minor changes should be made to improve the model.
It is not possible to conclude which algorithm is better for the estimation of blood pressures, but both show promising results.
With the obtained results, it would be interesting to evaluate which feature has a stronger correlation with the estimation of pressure and use it to create new regression models. As mentioned earlier, determining whether automatic feature extraction improves these results is pending.
Despite not being the focus of this work, the implementation of new noise detection algorithms is important to estimate blood pressure values in real time and thus to obtain the pressure while the patients are being monitored. Finally, besides improving the models with the algorithms that were already used, searching for new machine learning algorithms for blood pressure estimation is an area that can be investigated. In the case of neural networks, the work was carried out with fully connected artificial neural networks with a single hidden layer. To potentially enhance the obtained results, it is possible to implement neural networks with a greater abstraction capacity, such as deep neural network architectures.
