Chemometric-Infrared Spectroscopic Model for the Taxonomy of Medicinal Herbs - The Case of Perennial Sideritis Species

The use of medicinal plants and herbs is increasing worldwide, and consumers increasingly demand to know the botanical and geographical origin of the herbal raw material. Sideritis is a genus that comprises several of the most commonly used medicinal herbs in Greece and the Balkans. The taxonomy of the medicinal Sideritis species found in the Greek flora was investigated using Diffuse Reflectance Fourier Transform Infrared Spectroscopy (DRIFTS) in combination with discriminant analysis. Powdered dried flowers from 44 samples of seven Sideritis species and subspecies were analyzed, using the 1700-1200 cm⁻¹ spectral region in first-derivative form to build the statistical model. Forty-two samples (95.4 %) were correctly classified. A separate set of 14 Sideritis samples validated the statistical model, with 85.7 % correctly recognized. The proposed method is simple, rapid, non-destructive, economical, and environmentally friendly.

(Lamiaceae) comprises a taxonomically difficult group of perennial species distributed in the Eastern Mediterranean region [1][2][3][4][5]. All members of this section contain essential oils and are traditionally and widely used as tea-producing medicinal plants. In Greece, Sideritis species are intensively picked from the wild; they are collectively named 'tsai tou vounou' (mountain tea), are highly cherished, and occupy a distinguished position in folk medicine. It is worth mentioning that the name of the genus derives from the Greek word 'sidero' (iron), denoting a herb that heals wounds caused by iron weapons [6]. All the perennial species of the genus are well known for their anti-inflammatory, anti-ulcerogenic, digestive, antimicrobial and antioxidant properties. In folk medicine the dried plant material is used mainly as a boiled decoction for the treatment of cold, influenza, feverishness, cough and sore throat [4][7][8][9][10].
The cytotoxic properties of Sideritis species, as well as their essential oils, have been investigated quite extensively over the last decades [7,11].
Perennial Sideritis taxa exhibit morphological similarities to a high degree and often their identification is problematic.
Moreover, they are characterized by a strong tendency to hybridize and, consequently, well-defined species in the East Mediterranean sect. Empedoclia are few [2][3][4][5]. Nevertheless, works have been published on chemotaxonomy [5,10] and DNA barcoding [12]. The chemical composition and antioxidant activity of Sideritis are related to its taxonomic placement [8]. At the same time, the growing demand of consumers and pharmaceutical merchandisers to know the geographical origin as well as the botanical identity of medicinal and aromatic plants drives research toward efficient methods for the differentiation and determination of species. In recent years Fourier Transform Infrared (FTIR) spectroscopy has been used in combination with chemometrics for the discrimination of botanical origin. Examples include the differentiation of Leishmania species [13], honey samples from different botanical origins [14], and vegetable oil [15].
The aim of this work is to investigate the botanical origin of seven Sideritis taxa using Diffuse Reflectance Fourier Transform Infrared Spectroscopy (DRIFTS) in combination with discriminant analysis.

FTIR Spectroscopy
Triplicate FTIR spectra of each sample were recorded in DRIFTS mode using a Thermo Nicolet 6700 spectrophotometer equipped with a deuterated triglycine sulfate (DTGS) detector, at a resolution of 4 cm⁻¹ and with 100 scans per sample, using a Spectra Tech microcup (diameter 3 mm, height 2 mm) DRIFTS accessory. Pure dried KBr in powder form was used for the background spectra. The collected spectra were smoothed, and the baselines were corrected with the automatic functions of the software accompanying the spectrophotometer (OMNIC 7.3, Thermo Fisher Scientific Inc.).
Then the average spectrum of each sample was calculated, and its absorbance axis was normalized to the range 0 to 1, using the above software.
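The averaging and normalization step can be sketched as follows. This is a minimal illustration in Python with NumPy, not the OMNIC procedure itself: the function name `average_and_normalize` and the synthetic spectra are assumptions made for the example, and min-max scaling is assumed as the meaning of normalizing the absorbance axis from 0 to 1.

```python
import numpy as np


def average_and_normalize(replicates):
    """Average replicate spectra and rescale the absorbance to [0, 1].

    `replicates` is a sequence of equally sampled spectra (hypothetical input
    layout: one row per replicate, one column per wavenumber point).
    """
    spectra = np.asarray(replicates, dtype=float)
    mean_spectrum = spectra.mean(axis=0)
    lowest, highest = mean_spectrum.min(), mean_spectrum.max()
    if highest == lowest:
        raise ValueError("a flat spectrum cannot be normalized")
    # Min-max scaling: the minimum maps to 0 and the maximum to 1.
    return (mean_spectrum - lowest) / (highest - lowest)


# Synthetic stand-in for three replicate scans of one sample (not real data).
rng = np.random.default_rng(0)
wavenumbers = np.linspace(4000, 400, 1801)
band = np.exp(-((wavenumbers - 1650) / 40) ** 2)
replicates = [band + rng.normal(0, 0.01, band.size) for _ in range(3)]

normalized = average_and_normalize(replicates)
```

Averaging before normalizing keeps the three replicates on a common scale, so noise in a single scan does not set the 0 and 1 reference points.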

Discrimination Analysis
The TQ Analyst software (ver. 8.0.0.245, Thermo Fisher Scientific Inc.) was used for the discrimination analysis. Seven

Spectroscopic Study
A typical mid-FTIR spectrum extends from 4000 to 400 cm⁻¹ (Figure 1). The spectral range of 4000-1700 cm⁻¹ is similar for every Sideritis sample. The most interesting spectral range is the 1700-  2) S. scardica;

Chemometrics
The discriminant analysis shows that 42 of 44 samples (95.4 %) were correctly classified (Table 1). Table 2 shows the corresponding Mahalanobis distances for the validation set. The validation shows that 12 of the 14 samples (85.7 %) were correctly recognized. Figure 2, as extracted from the software, is a representative plot of the discriminant analysis based on Mahalanobis distance between Sideritis clandestina, Sideritis raeseri and Sideritis peloponnesiaca.
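Classification by Mahalanobis distance can be illustrated with a short sketch. It is an assumption-laden example, not the TQ Analyst algorithm: a pooled within-class covariance is assumed, the function names are hypothetical, and the two synthetic classes "A" and "B" stand in for species scores.

```python
import numpy as np


def fit_classes(X, y):
    """Return per-class means and the inverse pooled within-class covariance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    means = {c.item(): X[y == c].mean(axis=0) for c in classes}
    centered = np.vstack([X[y == c] - means[c.item()] for c in classes])
    pooled_cov = centered.T @ centered / (len(X) - len(classes))
    # Pseudo-inverse tolerates a singular covariance matrix.
    return means, np.linalg.pinv(pooled_cov)


def mahalanobis_distances(x, means, inv_cov):
    """Mahalanobis distance from sample x to every class mean."""
    return {
        label: float(np.sqrt((x - mean) @ inv_cov @ (x - mean)))
        for label, mean in means.items()
    }


def classify(x, means, inv_cov):
    """Assign x to the class whose mean is nearest in Mahalanobis distance."""
    distances = mahalanobis_distances(x, means, inv_cov)
    return min(distances, key=distances.get)


# Synthetic, well-separated scores for two hypothetical classes (not real data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (10, 3)), rng.normal(5, 0.5, (10, 3))])
y = ["A"] * 10 + ["B"] * 10

means, inv_cov = fit_classes(X, y)
predictions = [classify(x, means, inv_cov) for x in X]
accuracy = float(np.mean([p == t for p, t in zip(predictions, y)]))
```

The fraction of correctly assigned samples computed this way corresponds to the kind of success rate reported above (for example, 42 correct out of 44).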

Spectroscopic Study
A typical mid-FTIR spectrum is presented in Figure 1. Differences between the spectra are expected in the range 1700-1200 cm⁻¹, because proteins, flavonoids, terpenes, polyphenols, nucleic acids, lignin and polysaccharides absorb there. These compounds differ qualitatively and/or quantitatively depending on the Sideritis species [5,12]. Additionally, the range 1500-1200 cm⁻¹ is the most important part of the "fingerprint" region and characterizes every sample. In the 1700-1200 cm⁻¹ region of the FT-IR spectra of Sideritis species, the differences between spectra are small and consist mainly of small shifts of the maxima, changes in peak width, and changes in the relative heights of the peaks.
Eight major peaks appear in the above spectral region. The first peak at 1655-1647 cm⁻¹ has been correlated with the C=O stretching of proteins (amide I) [16]. In the same spectral region water [16,17], the C=O of flavones [16], the bases of nucleic acids [18,19] and the C=C [16] absorb as well. The peak centered at 1618-1607 cm⁻¹ has been assigned to -COO⁻ asymmetric stretching [19,20].
Also, the N-H bending and C-N stretching vibrations of proteins (amide II) and the aromatic C=C absorb [16]. The third absorption at 1509-1508 cm⁻¹ has been attributed to the deformation of the phenyl ring [16] and has been associated with the existence of lignin [21,22].
The next peak, with a maximum at 1454-1446 cm⁻¹, consists of several overlapping peaks: the stretching of C=C, the deformation of -CH₃ and -CH₂CO-, C-N stretching and N-H bending (amide III) [16]. The fifth spectral region, centered at 1429-1417 cm⁻¹, is a result of -CH₂- bending, -OH deformation, -COH bending of phenols, -COO⁻ symmetric stretching vibrations and C=O of uronic acids [16,23]. The next absorption at 1380-1371 cm⁻¹ corresponds to -CH₂- bending [16], in-plane O-H deformation and C-O combination, and C-C skeletal vibration [16]. The seventh peak at 1325-1317 cm⁻¹ is a convolution of skeletal vibrations of C-C and C-O, mainly of polysaccharides [16,23]. Finally, the last peak at 1263-1252 cm⁻¹ has been assigned to the -OH of polysaccharides, asymmetric stretching of PO₂⁻ of nucleic acids and C-O stretching [16,19,22].
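For quick reference, the eight band assignments listed above can be collected in a small lookup structure. The sketch below is only a convenience for reading the text: the names `BAND_ASSIGNMENTS` and `assign_band` are hypothetical, and each label abbreviates the assignments given in the paragraph rather than adding new ones.

```python
# (upper, lower) band limits in cm^-1, paired with a shortened assignment.
BAND_ASSIGNMENTS = [
    ((1655, 1647), "C=O stretching of proteins (amide I)"),
    ((1618, 1607), "-COO- asymmetric stretching; amide II; aromatic C=C"),
    ((1509, 1508), "phenyl ring deformation (lignin)"),
    ((1454, 1446), "C=C stretching; -CH3 and -CH2CO- deformation; amide III"),
    ((1429, 1417), "-CH2- bending; -OH deformation; -COO- symmetric stretching"),
    ((1380, 1371), "-CH2- bending; in-plane O-H deformation; C-C skeletal"),
    ((1325, 1317), "C-C and C-O skeletal vibrations (polysaccharides)"),
    ((1263, 1252), "-OH of polysaccharides; PO2- asymmetric stretching; C-O"),
]


def assign_band(wavenumber):
    """Return the assignment whose band limits contain `wavenumber`, else None."""
    for (upper, lower), label in BAND_ASSIGNMENTS:
        if lower <= wavenumber <= upper:
            return label
    return None


amide_i = assign_band(1650)
unassigned = assign_band(1500)
```

A maximum that falls between the listed limits, such as 1500 cm⁻¹ here, returns `None` rather than being forced into the nearest band.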

Chemometrics
As mentioned above, the most suitable spectral region for discriminating the seven Sideritis species and subspecies is 1700-1200 cm⁻¹. Indeed, the largest differences between the spectra were located in this spectral region, but most peaks appear as shoulders.
Using the first derivative (Savitzky-Golay method with a 15-point window and a second-order polynomial) resolves finer spectral features and maximizes the differences between the spectra. Twenty principal components (PC) were used, according to the cumulative eigenvalues diagnostic plot, with a 99.99 % cumulative value (Figure 3). TQ Analyst software creates one principal component spectrum (PCS) for each PC. Each PCS represents an independent source of variation in the data set and shows the variability described by its PC across the entire spectral range of the standards [24]. PCS20 (Figure 4) shows that the major changes are highlighted in the spectral range 1700-1200 cm⁻¹, which was chosen for the discrimination. This observation supports the selection of this spectral range to distinguish the samples.
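The derivative and principal-component steps can be approximated outside TQ Analyst as in the sketch below. It assumes SciPy's `savgol_filter` as a stand-in for the software's Savitzky-Golay routine, assumes the region is selected before the derivative is taken, and uses synthetic spectra; the helper names are hypothetical. The derivative is taken per data point, since no wavenumber spacing is passed.

```python
import numpy as np
from scipy.signal import savgol_filter


def select_region(wavenumbers, spectra, high=1700.0, low=1200.0):
    """Keep only the columns whose wavenumber lies in [low, high]."""
    mask = (wavenumbers <= high) & (wavenumbers >= low)
    return wavenumbers[mask], spectra[:, mask]


def first_derivative(spectra, window=15, polyorder=2):
    """Savitzky-Golay first derivative along the last (wavenumber) axis."""
    return savgol_filter(spectra, window_length=window, polyorder=polyorder,
                         deriv=1, axis=-1)


def cumulative_explained_variance(X):
    """Cumulative fraction of variance captured by successive PCs."""
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variance = singular_values ** 2
    return np.cumsum(variance) / variance.sum()


# Synthetic stand-in for 44 normalized spectra (not real data).
rng = np.random.default_rng(2)
wavenumbers = np.linspace(4000, 400, 1801)
centers = rng.uniform(1250, 1650, size=44)
spectra = np.exp(-((wavenumbers[None, :] - centers[:, None]) / 30) ** 2)
spectra += rng.normal(0, 0.005, spectra.shape)

region_wn, region = select_region(wavenumbers, spectra)
deriv = first_derivative(region)
cum = cumulative_explained_variance(deriv)
# Smallest number of PCs whose cumulative variance reaches 99.99 %.
n_components = int(np.searchsorted(cum, 0.9999) + 1)
```

With real spectra, `n_components` would play the role of the twenty PCs read from the cumulative eigenvalues plot; the value obtained from the synthetic data here has no bearing on that figure.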

Conclusion
In this study the use of DRIFTS in combination with chemometrics was investigated for the taxonomic determination of seven Sideritis species and subspecies. The spectroscopic model developed showed a 95.4 % success rate and was validated at 85.7 % using unknown samples. The results are considered very satisfactory. Furthermore, the proposed method is rapid, simple, non-destructive for the samples, economical, and environmentally friendly.