Deep Learning Analysis for Blood Glucose Monitoring Using Near Infrared Spectroscopy

Diabetes Mellitus (DM) is an intractable condition in which blood glucose levels cannot be regulated normally by the body alone [1] it has many complications, including heart disease and stroke, kidney failure, blindness or vision problems, diabetic neuropathy and diabetic foot. Treatment methods include dietary regulation to control blood glucose levels, oral medication, and insulin injection, and all of these treatments should rely on blood glucose measurement. Diabetic patients are encouraged to check their blood glucose levels several times per day [1]. Currently, the most common means of checking is by using a self-monitoring blood glucose meter [2]. In this way, diabetic patients can obtain a clear picture of their blood glucose levels for therapy optimization and for insulin dosage adjustment for those who need daily Received: September 11, 2019

injections. However many people dislike using sharp objects and seeing blood because there is a risk of infection. Over the long term, this practice may also result in damage to finger tissue. Given these realities, the advantages of a non-invasive technology are easily understood. The objective of this study is to construct a noninvasive blood glucose monitoring device based on Near-Infrared (NIR) sensing that is stable and easy to use to detect the spectral response from human tissue. NIR spectroscopy is selected because of its promising technology, among others, for non-invasive blood glucose monitoring [3]. This study adopts the four-stage conceptual framework of bio-signal processing [4] which shows that processing bio-signals should normally consist of a.
Signal acquisition or measurement b.
Signal transformation or signal pre-processing c.
Parameter selection or variable/feature selection d.

Signal classification or signal interpretation
In the first stage, signal acquisition, a non-invasive blood glucose monitoring device based on NIR spectroscopy is set up.
NIR light is transmitted through an optical cable to the human skin surface and the reflected light is collected for computing blood glucose concentration in human blood. Second stage, also called pre-processing, the signals are transformed in a way that such problems can be simplified through a suitable choice of pre-processing method, and misleading results can be avoided.
Since the pre-processing of NIR spectral data is important for the subsequent multivariate analysis, various normalization schemes and pre-processing techniques are evaluated and reported. The third stage delivers relevant variables, also called features, which can be used for decision making. The features are extracted by some complex searching algorithms to distinguish those signal features that contain discriminatory power, and normally they can reduce the size of data so as to compute the diagnostically most significant parameters. Once the signal parameters have been obtained, they are used for further decision making in the interpretation stage.
The classification stage of bio-signal processing is the stage of identifying which set of categories a new measurement belongs to, on the basis of a set of training data containing relevant signal parameters whose category membership is known. This study presents an improved algorithm to deal with the signal classification stage that can provide more accurate classification results as well as enhance the stability of signal interpretation.

Signal Acquisition
This study was based on clinical validation of blood glucose levels between laboratory results and NIR absorption spectroscopy analysis. In the in-vitro validation process, the human blood glucose level was employed as the gold standard, with the spectrometry method used for the testing. Subjects were recruited from the community by convenience. All subjects who were interested in the study and able to provide a voluntary informed consent were recruited to the study. All known infectious disease subjects were

Signal Transformation
Pre-processing of spectral data is an important procedure before chemometric analysis according to the four-stage framework of bio-signal processing because scaling differences arise from path-length effects, scattering effects, source variations, or other instrumental sensitivity effects in NIR spectroscopy that will influence the measurement of the recorded sample. Various pre-processing techniques such as normalization process and prefiltering process are explored and evaluated.

Normalization
In spectroscopic measurement, scaling differences arise from path length effects, scattering effects, source variations, or other instrumental sensitivity effects. A normalization process attempts to correct for these kinds of effects by identifying some aspect of each sample that should essentially be constant from one sample to the next and correcting the scaling of all variables based on this characteristic. Several different types of norms exist. 1-norm is used to divide each variable by the sum of the absolute value of all variables for the given samples [5]. Another commonly used norm in 2-dimensional Euclidean space is the 2-norm. The Standard Normal Variate (SNV) normalization process is a weighted normalization [6]. It is different from the 1-norm or 2-norm mentioned above because not all samples contribute to the normalization equally.
Multiplicative Scatter correction (MSC) is another method for normalization [6]. It is based on the idea of correcting the scatter level of all spectra of a group of samples to the level of an average spectrum [7].

Performance Comparison -Normalization
The performance analysis is assessed by the root mean square error (RMSE) and the correlation coefficient. RMSE helps describe the fit of the model to the training data. It is defined as follows: where y _i are the values of the predicted result and n is the number of training samples.
The correlation coefficient (R) which is used for comparing the correlation between the predicted value and the actual value. It is a measure of how well the predicted values from a forecast model fit with the data. As the strength of the relationship between the predicted values and actual values increases so does the correlation coefficient. Therefore, the higher the correlation coefficient the better the predicted model. The correlation coefficient is given by the formula: where x ̅ and y ̅ are the samples means of variables x and y The PLS regression model [8] was used to study the performance of various normalization processes as it is widely  (Table 1) shows the results for the four normalization processes. The results shows that 2-norm performs best, and that 1-norm and 2-norm provide very similar RMSE and R, while SNV and MSC deliver relatively higher RMSE than 1-norm and 2-norm and the correlation coefficients are relatively smaller than they are. Since 2-norm is a form of weighted normalization where larger values are weighted more heavily in the scaling and it is also the most commonly used normalization process in 2-dimensional Euclidean space, 2-norm was used in the following analysis to take advantage of the uneven weighting mechanism.

Pre-Filter Analysis
Before training the quantitative model, it is possible to manipulate the spectra using various pre-filter methods to try to improve the performance of the quantitative model. Many of the interferences, often described as noise, are caused by known background signals and artefacts. Pre-filtering the signals to remove this noise or the effects of signal variance can be useful for obtaining a more accurate quantitative model [9]. A variety of prefilter methods exist to remove the interferences [10]. This section evaluates various kinds of pre-filter methods for the NIR spectra to enhance performance of the PLS prediction model. The Savitzky-Golay (SG) derivation is commonly used for numerical derivation because it includes a smoothing step when it takes the derivative, in which it can improve the utility of data [11].
This method requires selection of the window size (filter width) and the order of the polynomial [12]. The Goal of a Generalized Least Squares (GLS) weighting pre-filter is to down-weight the differences between similar samples and thus make them appear more similar. A GLS weighting pre-filter can be used to remove variance from the spectral responses that are mostly orthogonal to the concentration information. The GLS weighting pre-filter method involves the calculation of a covariance matrix from the differences between similar samples. Orthogonal Signal Correction (OSC) was introduced to remove systematic variation from the spectral responses that is unrelated or orthogonal to concentration information [13]. Such variance is identified as some number of components of the spectral responses that have been made orthogonal to the concentration.

Performance Comparison -Pre-Filter Analysis
Various pre-filter methods mentioned previously were compared; each method had its own parameter settings and the results are listed in Table 2 The first SG derivative and second SG derivative methods were implemented. The GLS weighting pre-filter method with various α values were evaluated. Moreover, the OSC pre-filter with various PCs and the tolerance values is listed as well.
The results with and without normalization are shown together for comparison. This experiment considered performing the pre-filter with and without normalization for their own parameter settings.
Comparing the results with and without normalization showed that the normalization process can improve the RMSE and R for all prefilter methods, particularly for the GLS weighting pre-filter method.
Therefore the results further revealed the benefit of implementing the normalization process and the pre-filter process thereafter so as to provide better performance for analysis. According to the simulation results, GLS weighting pre-filter performed the best amongst all methods. It yielded the smallest RMSEcv and highest Rcv when α=0.005.

Variable Selection
The high dimensionality of spectral data increases the difficulty of using quantitative regression models. Reducing this high dimensionality helps reduce variable numbers, potentially improves the accuracy by removing irrelevant spectral information, and reduces computation time and cost. In this section, two heuristic optimization algorithms, sequential Floating Selection (SFS) and Genetic Algorithm (GA), are applied to variable selection for the spectral data. Sequential Floating Selection (SFS) is a well-known suboptimal search algorithm that is very efficient and effective even for problems of high dimensionality involving non-monotonic Feature Selection [14]. Genetic Algorithm (GA) is a search heuristic that mimics the process of natural evolution [15]. This heuristic is routinely used to generate useful solutions to optimization and search problems. The findings from previous section were applied to perform these features selection methods. Considering the cross validation results of SFS and GA feature selections in Table 3, the study found that the GA method that selected 120 features (wavelength) provided the smallest RMSEcv and the highest Rcv amongst other results. In addition, the runtime of SFS and GA were also noted.

Performance Comparison -Variable Selection
The analysis results must be interpreted with caution, especially for the GA method; it usually considered the entire set of selected features (120 wavelengths) to be used as a whole in the experiment because a feature may only be helpful to prediction when used in conjunction with other features included in an individual feature subset.

Signal Classification
In the classification approach, the sample properties that algorithms [17]. Support Vector Machine (SVM) is a supervised learning system that uses a hypothesis space of linear functions in a high dimensional feature space, trained with a learning algorithm from optimization theory that has originally been used for classification analysis [18]. With a suitable kernel, SVM can separate in the feature space the data that were non-separable in the original input space. PLS regression is a well-established tool in chemometric analysis [8]. This technique is commonly used in spectral quantitative analysis. Scientific research often involves using variables that are easily (or cheaply) measured to explain or predict the behavior of response variables that are often much more difficult (or expensive) to acquire. When the factors are many in number and are highly collinear such as in spectroscopy, PLS is a robust method used to construct predictive models. The advantage of the PLS is that the spectral and concentration information are included in the calculation of the factors and the scores. The idea of the MC-PLS-DA [19] is to create a large number of PLS-DA prediction models using different training subsets that are selected randomly from the whole training data set by the Monte Carlo method [20].
The stability of the corresponding coefficients is calculated by using the regression coefficients of these models. The MC-PLS-DA prediction model is then obtained by averaging the PLS-DA models so that the outliers can be removed. In the proposed method, the raw data are randomly divided into two parts -the training set

Performance Evaluation -Signal Classification
In this study, the blood glucose level (BGL) was divided into two classes, with BGL falling between 4mmol/L and 7mmol/L considered as normal BGL, and BGL greater than 7mmol/L defined as high BGL; that is, suffering from DM. The performance of the MC-PLS-DA model can be evaluated by the sensitivity, specificity, and overall accuracy of the model. Sensitivity is a measure of the ability of the classifier to identify normal BGL. Specificity is a measure of the ability to identify high BGL. Accuracy is a measure of the ability to identify the correct BGL. 840 Samples were randomly divided into the training data set and the prediction data set. The former contained 756 samples (90% of the total number of samples) while the latter contained 84 samples. 530 samples were then randomly taken from the training data set to create a PLS-DA model, where the correlation between the predicted and reference values was maximized.
All the samples in the prediction data set were used to fit the training model and evaluate the classification rate. By the same token, another 530 samples were drawn the same subset of training data to create a new PLS-DA model. The procedure was the carried out repeatedly until 1,000 PLS-DA models were built. The top 5% (50) of the models, in terms of classification rate, were selected to create the MC-PLS-DA model with the same training and prediction data sets. The entire algorithm was then executed repeatedly for a total of 20 runs. At each run, the 840 samples were randomly divided to produce different training and prediction data sets, so that a total of 20 MC PLS DA models were obtained. The sensitivity, specificity, and the accuracy of these 20 MC-PLS-DA models were averaged to evaluate the performance of the algorithm. Based on the findings obtained from the previous experiments, a MC-PLS-DA model was created to predict the BGL for NIR spectral data collected. The number of LVs and PLS-DA models was 6 and 5,000 respectively with reference to [19][20][21][22][23][24][25].

Discussion
This study successfully adopted the four-stage framework of bio-signal processing to handle the NIR spectroscopy that is used to measure blood glucose level. A vigorous systematic approach from signal acquisition to signal interpretation has been established.
These include, in the first stage, the assembly of the specific equipment for NIR spectroscopy and suggested the left fourth finger as the measurement site to acquire NIR spectral data. In addition, the study proposed using a 2-norm normalization process, together with the GLS weighting pre-filtering process, during the signal transformation stage so as to minimize the disturbance noise caused by the previous stage. After that, the study suggested using the genetic algorithm for wavelength selection in the variable selection stage. Finally, in the signal interpretation stage, improved methods based on the Monte Carlo approach for the PLS discriminant analysis were introduced, which were called MC-PLS-DA for classification analysis to predict the result (Figure 1) shows the proposed methods together with the four-stage framework for easy reference. The capability to determine blood glucose noninvasively in humans is a complex and difficult analytical problem.
Many researchers and industry groups have attempted to measure blood glucose by various non-invasive methods (Table 5) lists the manuscripts that the experimental setup used NIR spectroscopy with multivariate analysis in testing human subjects. It is relatively simple to measure data and study the correlation with glucose solution tests or blood glucose levels under controlled conditions in research laboratories with only a few subjects/samples. The challenge here is to measure these variables in normal and practical environments with large datasets of different subjects.   The classification output results for the relationship between the response and the independent variables is more accurate, hence enhancing the reliability of the classification model, especially for such a large training set in this study. These advantages make MC-PLS-DA a possible way for non-invasive blood glucose measurement using NIR spectroscopy.

Conclusion
In this century, particularly in wealthy developed countries, diabetes is a complex group of syndromes that have a disturbance in the body's use of glucose. It can be controlled by an appropriate regimen that includes patient education, weight management, diet control, sensible exercise, oral medication, and insulin therapy.
However, these diabetes management and medications rely heavily on blood glucose measurement. With ever-improving advances in diagnostic technology, the race for the next generation of bloodless, painless, accurate and consistent blood glucose measurement instruments has begun. Nevertheless, many hurdles remain before these products reach commercial markets. Considerable progress has been made in the development of non-invasive blood glucose monitoring devices. In principle, the approach can be used for screening purposes for DM prevention. However, for diabetes that needs frequent testing, using invasive blood glucose measurement via finger pricking remains a practical way to provide suitable information for diabetes management at this current stage.