Machine learning is a branch of artificial intelligence that provides algorithms able to learn automatically, improve from experience, and make predictions. In recent years, several machine learning algorithms have been developed in the medical field, from imaging to big data analysis, with applications in both diagnosis and prognosis. In this mini review, we report three of our applications of machine learning in medicine: the first concerns the detection and classification of pulmonary nodules in computed tomography studies; the second, based on magnetic resonance studies, provides a classification method to be used as an aid in the diagnosis of multiple sclerosis; the third estimates the probability of a positive myocardial perfusion imaging study from the demographic and clinical data of patients.
Machine learning algorithms are having a large impact on life science and medical research. In recent years, much progress has been made in the development of machine learning for clinical diagnosis, prognosis and drug development. In this mini review, we report our experience with three pattern recognition and classification methods that we developed and applied to imaging and clinical data as decision support tools.
What is Machine Learning?
Machine Learning (ML) is a branch of artificial intelligence that provides algorithms able to learn automatically and improve from experience without being explicitly programmed. ML tasks are usually categorized as supervised, unsupervised or semi-supervised. In supervised learning, the algorithm builds a mathematical model from a dataset in which both the inputs and the outputs are known. In unsupervised learning, by contrast, the outputs are not known. Semi-supervised learning algorithms build mathematical models from incomplete training data, in which part of the outputs is missing.
Starting from the analysis of a known training dataset, a supervised learning algorithm infers a function that it can apply to new data to predict their outputs. During training, this kind of algorithm can also compare its output with the correct output and modify the model accordingly. By exploring a dataset, an unsupervised learning algorithm can infer a function that describes a hidden structure. Semi-supervised learning is chosen when labeling the dataset requires particular skills and considerable resources.
Arthur Samuel first coined the term Machine Learning in the late 1950s. At present, the most widely used supervised ML algorithms are: Support Vector Machine (SVM), Decision Tree (DT), Artificial Neural Network (ANN), Fuzzy Logic (FL), Naive Bayes, Random Forest, and k-Nearest Neighbors. Among unsupervised ML algorithms we mention Hierarchical Clustering and k-means [1,2]. Semi-supervised learning can be viewed as an extension of either supervised or unsupervised learning, and one of its most commonly used algorithms is the Transductive Support Vector Machine.
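To make the distinction between supervised and unsupervised learning concrete, the following minimal sketch implements one algorithm of each kind named above: k-Nearest Neighbors (labels are known) and k-means (labels are not used). The function names, the toy two-dimensional points and the class labels are illustrative assumptions, not data or code from the studies discussed in this review.

```python
import math

def knn_predict(train_X, train_y, x, k=3):
    """Supervised: classify x by majority vote among its k nearest labeled points."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def kmeans(points, centers, n_iter=10):
    """Unsupervised: group unlabeled points around centers (Lloyd's iterations)."""
    clusters = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its closest center.
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Update step: each center moves to the mean of its cluster.
        centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical toy data: two well-separated groups of points.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
y = ["class_a", "class_a", "class_a", "class_b", "class_b", "class_b"]

predicted = knn_predict(X, y, (1.1, 1.0))                    # uses the labels y
centers, clusters = kmeans(X, [(0.0, 0.0), (6.0, 6.0)])      # ignores the labels
```

In this toy case both approaches recover the same two groups, but only the supervised one can attach a class name to a new point.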
Machine Learning in Medicine
Machine learning algorithms are having a large impact on life science and medical research. In recent years, much progress has been made in the development of ML for clinical diagnosis, prognosis and drug development [4-6]. Several factors influence the design of ML models for medical use: data type and volume, model interpretability, and inference time. Another important issue is how the model balances overfitting and underfitting. Typically, two classes of metrics are used to evaluate an ML model for healthcare: discrimination and calibration. Discrimination metrics measure the ability to correctly rank or distinguish the membership class; the most common threshold-free discrimination metric is the area under the receiver operating characteristic curve. Calibration metrics evaluate how well the predicted probabilities match the actual probabilities; such metrics (e.g., the Hosmer-Lemeshow statistic) are crucial for real-world use. Another approach is based on decision curve analysis, which puts benefits and harms on the same scale to evaluate the expected cost-benefit according to threshold probabilities.
Evaluating the accuracy of a clinical model is a necessary step of the translational process, but good performance may not be sufficient to generate clinical impact. Implementing an ML model in clinical practice poses several challenges, and no model will be 100% accurate in real-world scenarios. Another risk in using an ML model is over-reliance or under-reliance, i.e. indiscriminately accepting or entirely ignoring its predictions. Finally, the implementation of these technologies can be complicated by factors such as the need for expensive hardware or the lack of supporting infrastructure, such as reliable network access.
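The three evaluation approaches mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names and the eight toy predictions are hypothetical, the Hosmer-Lemeshow groups are formed as equal-size blocks of sorted predictions (deciles are the usual choice on real data), and the net benefit follows the standard decision curve definition, TP/n - FP/n * pt/(1 - pt), where pt is the threshold probability.

```python
def roc_auc(y_true, y_prob):
    """Discrimination: probability that a random positive is ranked above a
    random negative (ties count one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def hosmer_lemeshow(y_true, y_prob, n_groups=2):
    """Calibration: compare observed and expected event counts per risk group.
    Smaller values indicate better agreement."""
    pairs = sorted(zip(y_prob, y_true))
    size = len(pairs) // n_groups
    stat = 0.0
    for g in range(n_groups):
        end = (g + 1) * size if g < n_groups - 1 else len(pairs)
        group = pairs[g * size:end]
        n_g = len(group)
        expected = sum(p for p, _ in group)
        observed = sum(y for _, y in group)
        stat += (observed - expected) ** 2 / (expected * (1.0 - expected / n_g))
    return stat

def net_benefit(y_true, y_prob, threshold):
    """Decision curve analysis: benefit of true positives minus harm of false
    positives, weighted by the odds of the threshold probability."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)

# Hypothetical outcomes (1 = event) and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

auc = roc_auc(y_true, y_prob)           # 0.75
hl = hosmer_lemeshow(y_true, y_prob)    # close to 0: well calibrated per group
nb = net_benefit(y_true, y_prob, 0.5)   # 0.25
```

The toy example shows why the two metric classes are complementary: the predictions are well calibrated within each risk group even though their ranking of individual cases is imperfect.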
We Briefly Report our ML Applications, both in Imaging and Clinical Data
A Computer Aided Detection (CAD) system was developed for classifying suspect nodules in pulmonary computed tomography (CT) scans. It consisted of several pattern recognition modules, based on statistical and adaptive algorithms:
i) Lung parenchyma segmentation;
ii) Detection of nodule candidates;
iii) Morphological and statistical features extraction;
iv) False positive reduction;
v) Classification of nodule candidates.
The CAD system produced a large number of false positives, which were reduced through a linear filter on morphological features (volume and roundness); the remaining nodule candidates were classified by means of SVM and ANN, both trained and tested in cross-validation mode. The CAD system was tested on two low-dose CT image databases acquired at different institutions for different purposes: 20 CT images made available by the Pisa centre of the ITALUNG-CT randomized controlled trial and 83 CT images from the publicly available LIDC research database. The CAD system reached an efficiency of about 68% at 4 false positives for both ITALUNG-CT and LIDC-CT, with a slightly better performance of ANN with respect to SVM.
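The false positive reduction and the cross-validation scheme can be sketched as follows. This is only an illustration: the filter is assumed here to be a set of simple cuts on volume and roundness, and the threshold values, the `Candidate` type and the function names are hypothetical, since the text does not report the actual filter parameters or the number of folds.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    volume_mm3: float
    roundness: float  # assumed convention: 1.0 for a perfect sphere

def passes_filter(c, min_volume=15.0, max_volume=15000.0, min_roundness=0.4):
    """Keep only candidates whose volume and roundness fall inside the cuts.
    The cut values are placeholders, not those of the published system."""
    return min_volume <= c.volume_mm3 <= max_volume and c.roundness >= min_roundness

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs so that every sample is tested exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Hypothetical nodule candidates produced by the detection step.
candidates = [
    Candidate(120.0, 0.8),   # plausible nodule
    Candidate(5.0, 0.9),     # too small
    Candidate(300.0, 0.2),   # too elongated
    Candidate(900.0, 0.6),   # plausible nodule
]
kept = [c for c in candidates if passes_filter(c)]

# The surviving candidates would then be split for training and testing a classifier.
splits = list(k_fold_indices(n_samples=10, k=5))
```

Filtering before classification keeps the classifier's training set focused on candidates that cannot be rejected by simple geometry alone.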
A multivariate FL analysis of brain tissue volumes and relaxation rates (R1 = 1/T1 and R2 = 1/T2, where T1 is the spin-lattice and T2 the spin-spin relaxation time) was developed on a retrospective Magnetic Resonance Imaging (MRI) dataset to support the diagnosis of Relapsing-Remitting Multiple Sclerosis (RR-MS). The study population comprised 81 patients with a diagnosis of RR-MS and 29 healthy subjects. The MRI studies were segmented with a previously developed multiparametric relaxometric method for tissue classification, based on the relaxation rates and proton density of the voxels. For the FL inference process, several operators were employed, and one- and multi-dimensional models were generated and compared with those obtained with other state-of-the-art methods, confirming the validity of the FL-based method in terms of performance, interpretability and robustness. The two-dimensional model, based on abnormal white matter fraction and gray matter R2, was the most promising, with an accuracy of 89%, a sensitivity of 94%, a negative predictive value of 82%, and a positive predictive value of 92%.
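For readers less familiar with the performance measures quoted above, the sketch below shows how they are derived from the four cells of a confusion matrix. The counts are hypothetical round numbers chosen for easy arithmetic; they are not the counts of the RR-MS study, which are not reported in this text.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures from true/false positive/negative counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # all correct / all cases
        "sensitivity": tp / (tp + fn),                # detected among the diseased
        "specificity": tn / (tn + fp),                # cleared among the healthy
        "ppv": tp / (tp + fp),                        # diseased among positive calls
        "npv": tn / (tn + fn),                        # healthy among negative calls
    }

# Hypothetical counts for 100 subjects.
metrics = diagnostic_metrics(tp=40, fp=10, tn=45, fn=5)
```

Note that the predictive values depend on the proportion of patients and healthy subjects in the sample, whereas sensitivity and specificity do not.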
We also developed a DT classifier in a study evaluating the temporal trends of abnormal myocardial perfusion imaging. The study comprised 8886 patients who underwent stress myocardial perfusion single photon emission computed tomography (MPS) over a 12-year period for the evaluation of suspected or known Coronary Artery Disease (CAD). The algorithm classified patients as having a normal or abnormal stress MPS, using demographic data and traditional cardiovascular risk factors as features. The decision tree for predicting abnormal MPS produced several terminal groups. The initial split was on known or suspected CAD, followed by gender. For women without known CAD, no further split was performed, and the overall prevalence of abnormal MPS was only 12%. Conversely, the highest prevalence of an abnormal study (75%) was found in men with known CAD, typical angina, and hypercholesterolemia.
In recent decades, ML algorithms applied to medicine have slowly entered the decision-making process. Depending on the data type, several tools have been developed for data analysis, imaging, and data mining, with the aim of making predictions and discovering new relationships among diseases. Currently, SVM and ANN are the most widely used algorithms, but newer tools such as Deep Learning (DL), based on ANNs with many layers, are increasingly used. Indeed, specific DL models have been built for complex diseases such as congenital heart disease in adult patients. Future challenges lie in the development of more sophisticated algorithms, such as DL, to solve complex problems, but a further objective is an increasingly personalized medicine [13,14].
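How a decision tree chooses its first split can be illustrated with a small sketch. It assumes the Gini impurity as splitting criterion, which is a common choice but is not stated in the text for this study; the eight patient records, the field names and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 = pure, 0.5 = evenly mixed)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def split_gain(rows, feature, target="abnormal"):
    """Impurity reduction obtained by splitting rows on a binary feature."""
    yes = [r[target] for r in rows if r[feature]]
    no = [r[target] for r in rows if not r[feature]]
    n = len(rows)
    parent = gini([r[target] for r in rows])
    children = len(yes) / n * gini(yes) + len(no) / n * gini(no)
    return parent - children

def best_split(rows, features):
    """Pick the feature with the largest impurity reduction."""
    return max(features, key=lambda f: split_gain(rows, f))

# Hypothetical patients: binary risk factors and the imaging outcome.
patients = [
    {"known_cad": 1, "male": 1, "abnormal": 1},
    {"known_cad": 1, "male": 0, "abnormal": 1},
    {"known_cad": 1, "male": 1, "abnormal": 1},
    {"known_cad": 1, "male": 0, "abnormal": 0},
    {"known_cad": 0, "male": 1, "abnormal": 0},
    {"known_cad": 0, "male": 0, "abnormal": 0},
    {"known_cad": 0, "male": 1, "abnormal": 0},
    {"known_cad": 0, "male": 0, "abnormal": 0},
]

first_split = best_split(patients, ["known_cad", "male"])
```

A full tree repeats this selection recursively inside each resulting group until a stopping rule is met, which is how terminal groups with different prevalences of abnormal studies arise.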
1. Alpaydın E (2010) Introduction to machine learning. The MIT Press, Cambridge, Massachusetts; London, England, pp. 316.
2. Zhu X (2008) Semi-supervised learning literature survey. Computer Sciences TR 1530.
3. Samuel A (1959) Some studies in machine learning using the game of checkers. IBM Journal of Research and Development 3(3): 210-229.
4. Chen P, Liu Y, Peng L (2019) How to develop machine learning models for healthcare. Nature Materials 18: 410-427.
5. Deo RC (2015) Machine learning in medicine. Circulation 132: 1920-1930.
6. Rajkomar A, Dean J, Kohane I (2019) Machine learning in medicine. N Engl J Med 380: 1347-1358.
7. Vickers AJ, Elkin EB (2006) Decision curve analysis: A novel method for evaluating prediction models. Med Decis Making 26: 565-574.
8. Gargano G, Bellotti R, De Carlo F, Megna R, Tangaro S, et al. (2010) A new CAD system for lung nodule detection on low dose CT validated on publicly research database. Int J CARS 5(Suppl 1): S207-S215.
9. Pota M, Esposito M, Megna R, De Pietro G, Quarantelli M, et al. (2019) Multivariate fuzzy analysis of brain tissue volumes and relaxation rates for supporting the diagnosis of relapsing-remitting multiple sclerosis. Biomedical Signal Processing and Control 53: 101591.
10. Alfano B, Brunetti A, Larobina M, Quarantelli M, Tedeschi E, et al. (2000) Automated segmentation and measurement of global white matter lesion volume in patients with multiple sclerosis. J Magn Reson Imaging 12(6): 799-807.
11. Megna R, Zampella E, Assante R, Nappi C, Gaudieri V, et al. (2019) Temporal trends of abnormal myocardial perfusion imaging in a cohort of Italian subjects: Relation with cardiovascular risk factors. J Nucl Cardiol Vol 7.
12. Diller GP, Kempny A, Babu-Narayan SV, Henrichs M, Brida M, et al. (2019) Machine learning algorithms estimating prognosis and guiding therapy in adult congenital heart disease: Data from a single tertiary centre including 10019 patients. Eur Heart J 40(13): 1069-1077.
13. Handelman GS, Kok HK, Chandra RV, Razavi AH, Lee MJ, et al. (2018) eDoctor: Machine learning and the future of medicine. J Intern Med 284(6): 603-619.
14. Shah P, Kendall F, Khozin S, Goosen R, Hu J, et al. (2019) Artificial intelligence and machine learning in clinical development: A translational perspective. NPJ Digit Med 2: 69.