Investigating the Influence of ROIs Selection in Breast Ultrasound Segmentation Using the EICAMM Technique

False-negative interpretations are a serious problem in the diagnosis of breast lesions. To reduce the number of such cases and increase diagnostic sensitivity, computational tools have been developed to aid the early detection of breast cancer. However, such computer schemes can be influenced, directly or indirectly, by the user, mainly regarding the selection of the type of image to be processed. In this context, this work evaluates how the lack of standardization in cutting regions of interest (ROIs) from the image can affect the computed detection and segmentation steps. A total of 54 lesions recorded in breast ultrasonography images were used for the tests. An experienced radiologist cropped each lesion three times, varying the amount of surrounding tissue, so that three different sets were formed; a test group containing 18 lesions from each set was then added to the study. A previously developed segmentation procedure based on the EICAMM technique was applied to the images. The most accurate result with the EICAMM technique was obtained in the first set, in which the clipping was made as close as possible to the lesion: there, the agreement between the computational segmentation and the radiologist's delineation of the lesion was highest, with lower rates of over- and under-segmentation.


Introduction
Mammography is the best method for early detection of breast cancer, and its interpretation remains a challenge to the specialist [1]. However, in women with dense breasts, the sensitivity of mammography may be low, so that about 10% of all cancers may be missed [2][3]. Breast ultrasonography has emerged as an important adjunct to diagnostic mammography, and it has been used to distinguish between mammographically identified cystic and solid masses. Some problems in diagnostic accuracy are related to missed lesions, with possible causes including: dense parenchyma obscuring a lesion, poor positioning, the noisy nature of the images, perception error, incorrect interpretation of a suspect finding, subtle features of malignancy, and slow growth of a lesion [4]. In this sense, about 10-30% of breast lesions are missed in routine exams due to limitations of human observers [1].
With the advance of digital technology, mainly digital image processing (including pattern recognition and artificial intelligence), radiologists have the opportunity to improve diagnostic accuracy with the aid of computer systems. Computer-Aided Detection (CAD) is a relatively new technology which has been implemented in some mammography centers with the purpose of providing double reading, working as a second opinion. CAD schemes are useful when there is high interobserver variability, an absence of trained observers, or no possibility of performing double reading with two or more radiologists. Clinical studies have demonstrated that CAD increases radiologists' sensitivity in the detection of breast cancer by up to 20% [1].
Some CAD schemes allow user interaction in performing particular procedures. For such processes, interobserver variability becomes a problem, which in most cases is related to the lack of standardization in carrying out these tasks. One of them is the selection of the region of interest (ROI), because the way a particular lesion is delimited has a direct influence on the system output. This selection can therefore change the lesion segmentation and hence its classification, which is highly dependent on the result of the previous procedure [5]. The effect of ROI selection on these two steps is investigated here by using a novel methodology for mass segmentation based on the Enhanced Independent Component Analysis Mixture Model (EICAMM) [6][7]. This work also proposes a study of how the lack of standardization in manual ROI selection may affect the automatic detection process in breast ultrasound procedures.

Database
For the investigation, 54 breast ultrasound images containing suspicious lesions were selected. These images were obtained at the Diagnosis Imaging Integrated Center, Santa Casa Hospital, in Sao Carlos, SP, Brazil. The images were acquired with Siemens G50 equipment, operating in B-mode with a 7.5 MHz linear array transducer. An experienced radiologist carried out the manual contour delineation of the lesions using a Wacom Cintiq 13 HD graphical monitor with a digital pen, which provides high accuracy, easy and fast manual drawing, pressure sensitivity and tilt recognition. Thereafter, the same radiologist was instructed to perform three different selections for the same lesion, varying the amount of surrounding tissue included in the cut. In order to study how the lack of standardization in performing the cut can influence the computational segmentation, we created a differentiated group containing 18 images from each of the three sets, for a total of 54 ROIs, so that each lesion was included only once. This group was called the Test Group.
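The composition of the Test Group can be illustrated with a minimal sketch. The function name `build_test_group`, the category labels, and the use of a seeded random shuffle are assumptions made for illustration; the text does not state how lesions were assigned to each crop category, only that 18 came from each set and that no lesion was repeated.

```python
import random


def build_test_group(lesion_ids, categories=("a", "b", "c"), seed=0):
    """Assign every lesion to exactly one crop category, in equal shares.

    Hypothetical helper: random assignment is an assumption, not the
    procedure reported in the paper.
    """
    ids = list(lesion_ids)
    if len(ids) % len(categories) != 0:
        raise ValueError("lesions cannot be split evenly across categories")
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    share = len(ids) // len(categories)
    # Consecutive slices of the shuffled list are disjoint by construction,
    # so each lesion appears in one category only.
    return {
        category: sorted(ids[i * share:(i + 1) * share])
        for i, category in enumerate(categories)
    }


# 54 lesions, three crop categories -> 18 ROIs taken from each set.
test_group = build_test_group(range(54))
```

With 54 lesions and three categories, each category contributes 18 ROIs and the union covers all 54 lesions exactly once.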

Digital Processing
The EICAMM method [6] is based on adaptations of the Independent Component Analysis Mixture Model (ICAMM) technique [7], incorporating improvements in some aspects of nonlinear optimization in order to overcome some limitations of the latter [7]. The major changes include: (i) a more informative learning rule for the bias term, derived from maximizing the mutual information between the output of the neural network processor and its input (in the current iteration, the network takes into account the results obtained in previous iterations); (ii) a modified update rule that incorporates the second derivative, using the Newton and Levenberg-Marquardt methods, to accelerate the convergence of the algorithm and ensure that a local minimum of the function is reached; and (iii) orthogonalization of the basis matrix [6][7][8]. The EICAMM model considers that the points to be clustered are generated by a mathematical process described as a mixture of k probability density classes.
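As a hedged illustration of this mixture assumption, the density of a sample can be written in the form commonly used for ICA mixture models. The notation below (class labels C_j, parameters θ_j, basis matrix A_j, bias b_j, sources s_j) is assumed for illustration and is not taken from [6]; the EICAMM-specific learning rules are not reproduced here.

```latex
% Mixture of k class-conditional densities (notation assumed for illustration)
p(\mathbf{x} \mid \Theta) = \sum_{j=1}^{k} p(\mathbf{x} \mid C_j, \theta_j)\, p(C_j),
\qquad \sum_{j=1}^{k} p(C_j) = 1

% Within class j, the data are modeled by an ICA generative model
\mathbf{x} = A_j \mathbf{s}_j + \mathbf{b}_j

% Posterior probability of class j for a sample x, by Bayes' rule
p(C_j \mid \mathbf{x}, \Theta) =
  \frac{p(\mathbf{x} \mid C_j, \theta_j)\, p(C_j)}
       {\sum_{i=1}^{k} p(\mathbf{x} \mid C_i, \theta_i)\, p(C_i)}
```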
In this way, the aim of the clustering is to find the parameters of each class distribution and to assign each sample to the class with the highest probability [7]. The input data X corresponds to the coordinates of each pixel of the original images and its neighbors, as exemplified in Figure 1. Convolution is used to obtain the neighborhood values for pixels located at the edge of the image [7]. For the convolution, Ribeiro [6] chose to take the values of the edge pixels, with a neighborhood of size 3x3, in order to reduce possible errors in edge information. After segmentation, the morphological operations of opening and closing are applied to smooth the edges and remove erroneously connected pixels that join the lesion to the background. This procedure was validated using another segmentation technique, described by Marcomini [9], which provided similar results regarding the noisy appearance of the contour. A further operation was then performed on the image: the internal valleys ("holes") were added to the segmented region, and areas not connected to the object of interest were eliminated [10].
The crop size and a standard criterion for performing the crop are extremely important for lesion detection in medical images. Variations in the gray scale change the mathematical calculations that aim to cluster similar regions and, consequently, to differentiate the object of interest from the background. In automatic processes this influence is even greater, because the initial parameters of the segmentation algorithm are not adjusted according to the characteristics of the input image. With this aim
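The neighborhood construction and the morphological post-processing described in this section can be sketched as follows. This is an illustration under stated assumptions, not the implementation from [6]: the function names are hypothetical, the features are taken to be the gray values of each 3x3 neighborhood, the border is handled by repeating the edge pixel values, the structuring element is a 3x3 square, and "areas not connected to the object of interest" is interpreted as keeping only the largest connected component.

```python
import numpy as np
from scipy import ndimage


def neighborhood_features(image, size=3):
    """Return one row per pixel holding its size x size neighborhood.

    Assumption: border pixels reuse the nearest edge value (edge padding).
    """
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    rows, cols = image.shape
    # One shifted view of the padded image per neighborhood offset.
    shifted = [
        padded[dy:dy + rows, dx:dx + cols]
        for dy in range(size)
        for dx in range(size)
    ]
    return np.stack(shifted, axis=-1).reshape(rows * cols, size * size)


def clean_mask(mask, structure=np.ones((3, 3), dtype=bool)):
    """Smooth a binary segmentation, fill its holes, keep the largest region.

    Assumption: the object of interest is the largest connected component.
    """
    mask = np.asarray(mask, dtype=bool)
    opened = ndimage.binary_opening(mask, structure=structure)
    closed = ndimage.binary_closing(opened, structure=structure)
    filled = ndimage.binary_fill_holes(closed)
    labels, count = ndimage.label(filled)
    if count == 0:
        return filled  # nothing was segmented
    sizes = np.bincount(labels.ravel())[1:]  # skip the background label 0
    return labels == (np.argmax(sizes) + 1)
```

In a pipeline of this kind, `neighborhood_features` would prepare the input matrix for the clustering step, and `clean_mask` would be applied to the binary map obtained from the class assigned to the lesion.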

Results
a) a cut with plenty of tissue adjacent to the lesion; b) an intermediate amount of surrounding tissue; and c) a cut as close as possible to the lesion.