- Original article
- Open Access
A machine learning approach for differentiating malignant from benign enhancing foci on breast MRI
European Radiology Experimental volume 4, Article number: 5 (2020)
To differentiate malignant from benign enhancing foci on breast magnetic resonance imaging (MRI) through a radiomic signature.
Forty-five enhancing foci in 45 patients were included in this retrospective study, with needle biopsy or imaging follow-up serving as the reference standard. There were 12 malignant and 33 benign lesions. Eight benign lesions confirmed by more than 5 years of negative follow-up and 15 histopathologically confirmed malignant lesions were added to the dataset to provide reference cases for the machine learning analysis. All MRI examinations were performed with a 1.5-T scanner. One three-dimensional T1-weighted unenhanced sequence was acquired, followed by four dynamic sequences after intravenous injection of 0.1 mmol/kg of gadobenate dimeglumine. Enhancing foci were segmented by an expert breast radiologist, over 200 radiomic features were extracted, and an evolutionary machine learning method (“training with input selection and testing”) was applied. For each classifier, sensitivity, specificity and accuracy were calculated as point estimates with 95% confidence intervals (CIs).
A k-nearest neighbour classifier based on 35 selected features was identified as the best performing machine learning approach. Considering both the 45 enhancing foci and the 23 additional cases, this classifier showed a sensitivity of 27/27 (100%, 95% CI 87–100%), a specificity of 37/41 (90%, 95% CI 77–97%), and an accuracy of 64/68 (94%, 95% CI 86–98%).
This preliminary study showed the feasibility of a radiomic approach for the characterisation of enhancing foci on breast MRI.
A radiomic signature could distinguish malignant from benign enhancing foci on magnetic resonance imaging of the breast.
In this study, we applied a “training with input selection and testing” machine learning algorithm to 45 foci, using 8 confirmed benign lesions and 15 confirmed malignant lesions as reference cases.
Over 200 radiomic features were extracted.
Overall, a k-nearest neighbour classifier based on 35 selected features showed an accuracy above 90%.
Contrast-enhanced magnetic resonance imaging (MRI) has emerged as a non-invasive radiation-free imaging technique for the detection and diagnosis of breast lesions, substantially influencing the diagnosis, prognosis, and treatment of patients with breast cancer [1,2,3,4,5,6].
This technique can also detect small enhancing lesions, with a maximum diameter of 5 mm or less, which may be difficult to characterise further. These small findings are defined by the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) as enhancing foci. Depending on the spatial resolution, their morphology and dynamic behaviour are difficult to evaluate, and their small size makes MRI-guided needle biopsy difficult to perform, so they are commonly monitored longitudinally with serial examinations to reach a conclusive diagnosis. Foci are frequently associated with increased hormonal stimulation, but they can also represent the early onset of a malignant lesion [9, 10]. Studies addressing the malignancy rate of foci have shown highly variable results, ranging from 2 to 23% [9,10,11,12,13]. Thus, the best management of foci is still under discussion. The issue is of particular interest in high-risk women, especially considering the importance of early diagnosis in this group of patients.
Until the recent rise of radiomics, computer-based medical image analysis focused on computer-aided detection systems, which support the identification of suspicious lesions deserving the attention of the radiologist, and on computer-aided diagnosis systems, which assist radiologists in decision-making. Although radiomics is a natural evolution of these systems, the objectives of the two approaches differ. While computer-aided detection or diagnosis systems aim at delivering a single answer (i.e. presence/absence of lesions; malignant versus benign differentiation), radiomics is designed to combine radiomic data from images with patient history, risk factors, clinical investigation, and other patient information to provide more powerful decision support models. Radiomics assumes that medical images contain quantitative information that radiologists are not able to perceive and that, based on big data, may be correlated to clinical end-points (such as lesion nature or evolution as well as predictive information about treatment efficacy). Although there are no universally recognised guidelines yet, the radiomic workflow usually consists of the following main steps: clinical data and image collection; image segmentation; feature extraction (i.e. obtaining quantitative information about the tissue, also called “descriptors”); definition of a machine learning (ML) model; and model validation, preferably against an independent dataset.
The aim of this observational retrospective study was to test the ability to differentiate malignant from benign foci on breast MRI through a radiomic signature.
Study design and population
The local Ethics Committee of Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico approved this retrospective study (protocol code CE-MRm; approved on December 13, 2018). This study was supported by local research funds of CDI Centro Diagnostico Italiano, a clinical diagnostic centre. Due to the retrospective nature of this study, no specific informed consent was necessary. In this observational retrospective study, we reviewed contrast-enhanced breast MRI examinations performed at our Institution between January 2012 and December 2017, to create training/testing sets on which to apply and evaluate the performance of our algorithms.
This data set consisted of:
Patients with contrast-enhancing breast foci (enhancing lesions smaller than 5 mm in diameter) with definitive characterisation (benign or malignant) confirmed by histopathology or with breast MRI or ultrasound examination performed after at least 1 year
Patients with benign breast lesions with 5 years of MRI stability (unambiguous cases)
Patients with malignant breast lesions histopathologically confirmed (unambiguous cases)
Patients with incomplete or negative breast MRI examinations were excluded. Breast foci were defined following the ACR BI-RADS Atlas® 5th edition as tiny dots of enhancement that do not clearly represent a space-occupying lesion or mass and do not clearly show a mass on unenhanced imaging.
Images were acquired on a 1.5-T scanner (Philips Achieva, Philips Medical Systems, Best, The Netherlands). According to clinical practice, examinations were performed with the patient lying in the prone position, with the breasts inserted into a 16-channel phased-array surface coil. The sequence taken into account was an axial T1-weighted fast field-echo including inversion recovery with spectral attenuated fat suppression, with a repetition time of 5.1 ms, an echo time of 2.5 ms, a slice thickness of 1 mm, and a field of view of 340 × 340 mm (in-plane resolution 1 × 1 mm). The protocol consisted of one unenhanced and four contrast-enhanced sequences, with a temporal resolution of 60 s. Gadobenate dimeglumine (Multihance®, Bracco, Milan, Italy) was used as a contrast agent at a dose of 0.1 mmol/kg (0.2 mL/kg); the injection rate was 2 mL/s. The images acquired soon after contrast agent injection were compared to the unenhanced ones.
Image and data analysis
MRI series were slice-wise co-registered to compensate for patient motion. Registration was done using the ImageJ StackReg plugin, based on an automatic subpixel registration algorithm that minimises the mean square difference of intensities between a target and a floating image. Briefly, subvolumes of the MRI volume including lesions and surrounding tissues were cropped from the unenhanced and contrast-enhanced datasets. The five extracted subvolumes were then automatically co-registered slice-wise by a two-dimensional affine transform. The accuracy of co-registration was assessed by an expert radiologist.
Manual lesion segmentation was carried out on the co-registered images using ImageJ by one expert radiologist with more than 10 years of experience in breast MRI. Due to the spatial coherence of the unenhanced and enhanced images after co-registration, only one lesion mask was defined for each contrast-enhanced series. The images defining the segmented lesion areas for each slice are referred to as label images. Label and contrast-enhanced images were cropped to the bounding box containing the lesions to avoid analysing unnecessary parts of the image. Figure 1 shows an example of a focus on unenhanced (T0) and contrast-enhanced (T1–T4) images, with its segmentation.
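As a rough illustration of the registration criterion described above, the sketch below searches for the shift that minimises the mean square intensity difference between a target and a floating slice. It is a deliberately simplified toy: it only tries integer translations by exhaustive search, whereas StackReg works at subpixel precision and, in this study, with an affine transform. The function names and the 6 × 6 test images are hypothetical and are not taken from the study.

```python
def mean_square_difference(target, floating, dy, dx):
    """Mean squared intensity difference between `target` and `floating`
    shifted by (dy, dx), computed over the overlapping pixels only."""
    rows, cols = len(target), len(target[0])
    total, count = 0.0, 0
    for y in range(rows):
        for x in range(cols):
            src_y, src_x = y - dy, x - dx  # pixel of `floating` that lands on (y, x)
            if 0 <= src_y < rows and 0 <= src_x < cols:
                diff = target[y][x] - floating[src_y][src_x]
                total += diff * diff
                count += 1
    return total / count


def best_integer_shift(target, floating, max_shift=2):
    """Exhaustively test integer shifts and return the one with the lowest MSE."""
    shifts = [(dy, dx)
              for dy in range(-max_shift, max_shift + 1)
              for dx in range(-max_shift, max_shift + 1)]
    return min(shifts, key=lambda s: mean_square_difference(target, floating, *s))


def blank(size=6):
    """Hypothetical empty slice used only for this toy example."""
    return [[0.0] * size for _ in range(size)]


# Toy slices: a single bright pixel that moved by one row and one column.
target = blank()
target[2][3] = 1.0
floating = blank()
floating[3][2] = 1.0

shift = best_integer_shift(target, floating)
print(shift)  # (-1, 1): move the floating slice one row up and one column right
```

An exhaustive search is only practical for tiny search windows such as this one; it is used here because it makes the minimised quantity explicit.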
Feature extraction, selection, and classification
Features were calculated using dedicated software developed in C++ and based on the ITK framework. The extracted features were three-dimensional (3D) shape features, which describe the geometric shape and geometric properties of the segmented area [15, 19, 20]; intensity features; histogram-based features, which reduce the 3D information of a volume into a single histogram; and 3D texture-based features based on the grey level co-occurrence matrix [17, 19, 20] or the grey level run length matrix, also known as second-order statistics features, which are obtained by calculating the relationship between adjacent voxels (Table 1). For texture-based 3D features, the mean and standard deviation of the values calculated along all the 3D directions were computed. Features were extracted separately from the five images and distinguished using the codes T0, T1, T2, T3, and T4, thus providing a description of the dynamic evolution of features over time due to contrast wash-in/wash-out. Semeion’s training with input selection and testing (TWIST) algorithm is based on an evolutionary strategy aimed at solving the feature selection and training/test splitting problems simultaneously. To speed up the selection process, the performance of the selected features was evaluated by means of k-nearest neighbour (kNN), a fast and robust classification algorithm. The optimal feature set was used to build and validate the final kNN model.
The performance of the optimal classification model was expressed in terms of sensitivity, specificity, accuracy, positive predictive value, negative predictive value, area under the curve (AUC) at receiver operating characteristic (ROC) analysis, positive likelihood ratio, and negative likelihood ratio. For each parameter, 95% confidence intervals (CIs) were calculated according to the binomial distribution. For k-nearest neighbour analysis, a k value of 3 was chosen. The probability of a case belonging to the positive class, P(+), was calculated on the basis of the class of the three nearest neighbours. P(+) can therefore assume only four values: 1 if all three neighbours belong to the positive class, 0.66 if two neighbours belong to the positive class and one to the negative class, 0.33 if one neighbour belongs to the positive class, and 0 if all three neighbours belong to the negative class. The probability threshold applied to assign a case to the positive or negative class was set to 0.5. Probability values P(+) and class assignments were finally used to draw the ROC curve, from which the AUC value was obtained.
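The per-time-point naming scheme described above can be sketched as follows. This is only an illustrative Python stand-in for the C++/ITK software used in the study: it computes four simple intensity statistics per time point (the study extracted 43 features per time point), and the voxel values, function names, and feature names are hypothetical.

```python
from statistics import mean, pstdev

TIME_CODES = ("T0", "T1", "T2", "T3", "T4")  # unenhanced + four post-contrast series


def intensity_features(voxels):
    """A few first-order intensity statistics of the voxels inside the lesion mask."""
    return {
        "mean": mean(voxels),
        "std": pstdev(voxels),
        "min": min(voxels),
        "max": max(voxels),
    }


def dynamic_feature_vector(series):
    """Concatenate the per-time-point features, prefixing each name with its time code."""
    vector = {}
    for code in TIME_CODES:
        for name, value in intensity_features(series[code]).items():
            vector[f"{code}_{name}"] = value
    return vector


# Hypothetical masked voxel intensities showing wash-in followed by mild wash-out.
series = {
    "T0": [100, 110, 90, 100],
    "T1": [180, 200, 170, 190],
    "T2": [220, 240, 210, 230],
    "T3": [210, 230, 200, 220],
    "T4": [200, 220, 190, 210],
}

features = dynamic_feature_vector(series)
print(len(features))                                  # 4 features x 5 time points = 20
print(features["T2_mean"] - features["T0_mean"])      # enhancement at T2 relative to baseline
```

Keeping the time code in the feature name is what later allows a selected feature to be traced back to the acquisition it came from.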
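The scoring rule described above can be sketched in a few lines of Python. The example below computes P(+) as the fraction of positive labels among the k = 3 nearest neighbours, applies the 0.5 threshold, and derives an AUC from the resulting scores. The Euclidean distance, the rank-based AUC formula, and the one-dimensional toy data are assumptions made for illustration; the source does not specify the distance metric or how the ROC curve was computed.

```python
from math import dist


def knn_positive_probability(train_x, train_y, query, k=3):
    """Fraction of the k nearest training cases that are labelled positive (1)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    nearest_labels = [label for _, label in ranked[:k]]
    return sum(nearest_labels) / k


def classify(p_positive, threshold=0.5):
    """Assign the positive class when P(+) exceeds the threshold."""
    return 1 if p_positive > threshold else 0


def roc_auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical one-feature training set: 0 = benign, 1 = malignant.
train_x = [(0.0,), (1.0,), (2.0,), (4.0,), (5.0,), (6.0,)]
train_y = [0, 0, 0, 1, 1, 1]

queries = [(0.5,), (2.8,), (3.2,), (5.5,)]
true_y = [0, 0, 1, 1]

probs = [knn_positive_probability(train_x, train_y, q) for q in queries]
preds = [classify(p) for p in probs]
auc = roc_auc(probs, true_y)

print(probs)  # the four possible values with k = 3: 0, 1/3, 2/3, 1
print(preds)  # [0, 0, 1, 1]
print(auc)    # 1.0 on this separable toy set
```

Because P(+) takes only four distinct values with k = 3, the ROC curve built from it has at most a handful of operating points, which is worth bearing in mind when interpreting the AUC.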
A total of 1538 contrast-enhanced breast MRI examinations in female patients performed at our institution were retrieved from our database. After selection, following the inclusion criteria, a total of 68 lesions were analysed using the TWIST algorithm. Results are summarised in Table 2. The dataset obtained consisted of:
Forty-five patients (median age 49 years, interquartile range [IQR] 44–54 years) had one enhancing focus each: 33 foci were benign with ≥ 1-year negative imaging follow-up, while 12 were malignant lesions. Among the 12 malignant lesions, histopathology reports were retrieved for 8 (2 ductal cribriform, 5 ductal no special type, 1 ductal solid, and 1 ductal papillary), all of which were in situ; for the other 4 malignant lesions, no detailed pathological information was available.
Eight patients (median age 46 years, IQR 44–56 years; median size 10 mm, IQR 7–14 mm) had one benign nodule each, confirmed after 5 years of MRI stability.
Fifteen patients had one malignant lesion each (median age 55 years, IQR 45–66 years; median size 15 mm, IQR 10–24 mm) confirmed at histopathology: 4 ductal cribriform, 7 ductal no special type, 2 ductal solid, 1 ductal papillary, and 1 tubular; of these 15 malignant lesions, 13 were invasive and 2 in situ.
The final feature set obtained using the TWIST algorithm is reported in Table 3. In total, 43 features were extracted at each of the five time points, resulting in 215 features for each case, which described the dynamic evolution of the contrast agent in the focus. The most discriminating features selected by the TWIST algorithm are also summarised in Table 3. The selected features were intensity- and texture-based, which proved to be the most important for the ML systems to differentiate benign from malignant lesions. As shown in Table 3, three of the selected features refer to the pre-contrast image and 32 refer to images acquired after contrast injection, split over the different time points.
The second result of the TWIST algorithm was the subdivision of the overall dataset into two groups that were statistically homogeneous in terms of features: group A consisted of 37 lesions (16 malignant and 21 benign) and group B of 31 lesions (11 malignant and 20 benign). On these groups, classification performance was calculated twice: first using A as the training set and B as the test set, and then vice versa. Results of the final kNN model built with the 35 selected input variables are shown in Table 4. The classifier showed a sensitivity of 27/27 (100%, 95% CI 87–100%), a specificity of 37/41 (90%, 95% CI 77–97%), and an accuracy of 64/68 (94%, 95% CI 86–98%). In particular, 3 of the 4 misclassified cases were enhancing foci and one was an unambiguous benign case. All errors were false positives.
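The point estimates and intervals above can be checked from the reported counts (27 true positives, 0 false negatives, 37 true negatives, 4 false positives). The sketch below assumes that "according to the binomial distribution" means the exact (Clopper-Pearson) interval; the source does not name the method, so this is an assumption, and the helper names are hypothetical. The bounds are found by bisection on the binomial cumulative distribution, using only the Python standard library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, iterations=100):
    """Find p in [0, 1] with f(p) = target, for f monotonically decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # p is still too small
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    lower = 0.0 if k == 0 else solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Pooled counts reported in the text.
tp, fn, tn, fp = 27, 0, 37, 4
measures = {
    "sensitivity": (tp, tp + fn),
    "specificity": (tn, tn + fp),
    "accuracy": (tp + tn, tp + fn + tn + fp),
}

intervals = {}
for name, (k, n) in measures.items():
    intervals[name] = exact_binomial_ci(k, n)
    lo, hi = intervals[name]
    print(f"{name}: {k}/{n} = {k / n:.0%} (95% CI {lo:.0%} to {hi:.0%})")
```

For the sensitivity, the lower bound has a closed form, 0.025^(1/27) ≈ 0.872, which matches the reported 87% and is consistent with an exact interval having been used.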
This preliminary study demonstrated that ML associated with radiomics may successfully distinguish malignant from benign enhancing foci on breast MRI examinations, potentially outperforming human assessment.
After patient selection, the workflow of this study comprised image registration, manual lesion segmentation, and feature extraction, selection, and classification.
Feature selection and model validation are two significant methodological issues related to the application of ML, especially when dealing with small databases and a large number of variables. Feature selection is a procedure to identify and select the most informative variables to feed the statistical model. Validation is the evaluation step of the classification procedure, and its objective is to test whether the procedure is generally applicable or is fitted to the particular dataset used to build the classification system (overfitting). Validation can be carried out by splitting the dataset into two subsets, one used to train the classifier and one to test it. Splitting into training and testing sets is critical, especially when dealing with small datasets, because random splitting can lead to statistically different sets containing non-homogeneous information.
The proposed approach, ultimately based on a simple kNN classifier, provided 100% sensitivity and 90% specificity. Notably, all the misclassification errors were false positives, which are preferable to false negatives from a clinical perspective. Features selected by the TWIST algorithm were mainly from contrast-enhanced images (eight features/image) while only three were selected from the unenhanced images. This suggests that contrast enhancement provides information that can be beneficially exploited by ML methods. Interestingly, the imaging time-point with the highest prediction relevance for the proposed ML system was the second after injection (T2), with 12 features selected from this time-point, which was obtained 140 s after injection, taking into consideration our temporal resolution (60 s) and the initial 20 s of waiting time between the contrast agent injection and the first acquisition. According to our breast radiologists, this result is consistent with what happens in human-based diagnosis, where the first one or two subtracted series, usually displayed as maximum intensity projections, are the basis for diagnosis.
These preliminary results should be evaluated in the general framework of breast cancer management. GLOBOCAN estimated 2,088,849 new breast cancer cases and 626,679 deaths worldwide in 2018. In the USA alone, 138,000 women die every year. In general, a woman has a 1 in 8 chance of developing breast cancer in her lifetime. A high tumour stage at diagnosis is related to a worse prognosis for the patient and to higher costs for health care systems [22, 23]. In fact, early breast cancer detection and prediction of response to treatment have become the main objectives of current clinical practice and research. In recent years, breast MRI has been included among the diagnostic methodologies as a third-level examination. Technical improvements, the increasing availability of breast coils, and increasing care to minimise radiation have expanded the number of breast MRI examinations performed.
However, breast MRI can detect equivocal lesions, especially small enhancing foci, with imaging features that do not allow a clear human-based malignant/benign differentiation. The impact of the proposed ML method could be positive from the clinical, economic, and psychological points of view. Classifying an enhancing focus as likely benign would allow the patient to approach the next follow-up more serenely. Conversely, classifying an enhancing focus as probably malignant would suggest carrying out a targeted biopsy.
In this study, only data from the dynamic data set was used to build the statistical model. However, additional clinical data, not necessarily derived from imaging examinations, could be added to the dataset to enhance the performance and robustness of the method.
The small sample size was the main limitation of this study. We are aware that with small samples and unbalanced datasets (i.e. datasets containing many more features than patients), the assessment of model reliability is weak and models are associated with a high risk of overfitting. In these cases, cross-validation methods can mitigate the risk of overfitting and provide a more reliable estimate of model performance. Cross-validation methods are generally based on the random splitting of the available data into two subsets, used for parameter estimation and testing respectively. TWIST, instead, adopts a statistically driven approach to split the available dataset into training and test sets; this approach has been shown to outperform traditional methods such as the k-fold approach and has been successfully applied to several clinical datasets. Another common problem with ML is imbalanced population samples, in which cases are not equally distributed across classes. To avoid this problem, this study adopted a biased patient selection, with a high percentage of malignant cases included to balance the benign cases. As a consequence, the malignancy rate in the current study dataset was higher than in other studies, which reported a malignancy rate for foci of 2 to 23% [9,10,11,12,13].
Despite these limitations, this preliminary study suggests that ML could support the radiologist in clinical decision making for enhancing foci on breast MRI. To turn this result into a robust clinical tool, two further steps should be carried out: first, the variability associated with differences in MRI sequences, devices, and contrast agents should be addressed, and second, the interobserver variability in tumour segmentation as well as the patient-related variability must be investigated. The results of this work, if confirmed on a larger scale, might reduce the uncertainty in clinical decision making regarding enhancing foci on breast MRI.
Availability of data and materials
The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
AUC: Area under the curve
BI-RADS: Breast Imaging Reporting and Data System
MRI: Magnetic resonance imaging
ROC: Receiver operating characteristic
TWIST: Training with input selection and testing
Lee-Felker SA, Tekchandani L, Thomas M et al (2017) Newly diagnosed breast cancer: comparison of contrast-enhanced spectral mammography and breast MR imaging in the Evaluation of Extent of Disease. Radiology 285:389–400. https://doi.org/10.1148/radiol.2017161592
Fallenberg E, Dromain C, Diekmann F et al (2014) Contrast-enhanced spectral mammography versus MRI: initial results in the detection of breast cancer and assessment of tumour size. Eur Radiol 24:256–264. https://doi.org/10.1007/s00330-013-3007-7
Fallenberg EM, Schmitzberger FF, Amer H et al (2017) Contrast-enhanced spectral mammography vs. mammography and MRI--clinical performance in a multi-reader evaluation. Eur Radiol 27:2752–2764. https://doi.org/10.1007/s00330-016-4650-6
Mauri G, Sconfienza LM, Pescatori LC et al (2017) Technical success, technique efficacy and complications of minimally-invasive imaging-guided percutaneous ablation procedures of breast cancer: a systematic review and meta-analysis. Eur Radiol 27:3199–3210. https://doi.org/10.1007/s00330-016-4668-9
Lewis TC, Pizzitola VJ, Giurescu ME et al (2017) Contrast-enhanced digital mammography: a single-institution experience of the first 208 cases. Breast J 23:67–76. https://doi.org/10.1111/tbj.12681
Ali-Mucheru M, Pockaj B, Patel B et al (2016) Contrast-enhanced digital mammography in the surgical management of breast cancer. Ann Surg Oncol 23:649–655. https://doi.org/10.1245/s10434-016-5567-7
Clauser P, Cassano E, De Nicolò A et al (2016) Foci on breast magnetic resonance imaging in high-risk women: cancer or not? Radiol Med 121:611–617. https://doi.org/10.1007/s11547-016-0644-3
Cheng L, Li X (2012) Breast magnetic resonance imaging: focus/foci. Gland Surg 1:136–138. https://doi.org/10.3978/j.issn.2227-684X.2012.08.06
Ha R, Sung J, Lee C, Comstock C, Wynn R, Morris E (2014) Characteristics and outcome of enhancing foci followed on breast MRI with management implications. Clin Radiol 69:715–720. https://doi.org/10.1016/j.crad.2014.02.007
Liberman L, Mason G, Morris EA, Dershaw DD (2006) Does size matter? Positive predictive value of MRI-detected breast lesions as a function of lesion size. AJR Am J Roentgenol 186:426–430. https://doi.org/10.2214/AJR.04.1707
Bahrs SD, Baur A, Hattermann V et al (2014) BI-RADS® 3 lesions at contrast-enhanced breast MRI: is an initial short-interval follow-up necessary? Acta Radiol 55:260–265. https://doi.org/10.1177/0284185113501304
Jansen SA, Shimauchi A, Zak L, Fan X, Karczmar GS, Newstead GM (2011) The diverse pathology and kinetics of mass, nonmass, and focus enhancement on MR imaging of the breast. J Magn Reson Imaging 33:1382–1389. https://doi.org/10.1002/jmri.22567
Sardanelli F, Boetes C, Borisch B et al (2010) Magnetic resonance imaging of the breast: recommendations from the EUSOMA working group. Eur J Cancer 46:1296–1316. https://doi.org/10.1016/j.ejca.2010.02.015
Lambin P, Rios-Velazquez E, Leijenaar R et al (2012) Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 48:441–446. https://doi.org/10.1016/j.ejca.2011.11.036
Bland K, Copeland E, Klimberg VS, Gradishar W (2018) The Breast - 5th Edition. https://www.elsevier.com/books/the-breast/bland/978-0-323-35955-9. Accessed 6 Mar 2018
Gillies RJ, Kinahan PE, Hricak H (2015) Radiomics: images are more than pictures, they are data. Radiology 278:563–577. https://doi.org/10.1148/radiol.2015151169
ACR BI-RADS Atlas® 5th Edition https://www.acr.org/Clinical-Resources/Reporting-and-Data-Systems/Bi-Rads. Accessed 6 Mar 2019
(2018) StackReg. Available via http://bigwww.epfl.ch/thevenaz/stackreg/. Accessed 6 Mar 2018
(2018) Image J. Available via https://imagej.nih.gov/ij/. Accessed 6 Mar 2018
(2018) ITK - Segmentation & Registration Toolkit. Available via https://itk.org/. Accessed 6 Mar 2018
(2018) ITK: itk::LabelGeometryImageFilter< TLabelImage, TIntensityImage > Class Template Reference. Available via https://itk.org/Doxygen/html/classitk_1_1LabelGeometryImageFilter.html. Accessed 6 Mar 2018
Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A (2018) Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. https://doi.org/10.3322/caac.21492
(2018) Female Breast Cancer - Cancer Stat Facts. Available via https://seer.cancer.gov/statfacts/html/breast.html. Accessed 12 Mar 2018
Blumen H, Fitch K, Polkus V (2016) Comparison of treatment costs for breast cancer, by tumor stage and type of service. Am Health Drug Benefits 9:23–32
Buscema M, Breda M, Lodwick W (2013) Training With Input Selection and Testing (TWIST) algorithm: a significant advance in pattern recognition performance of machine learning. J Intell Learn Syst Appl 5:29. https://doi.org/10.4236/jilsa.2013.51004
This study was supported by local research funds of CDI Centro Diagnostico Italiano S.p.A., a Clinical Diagnostic Center.
Ethics approval and consent to participate
The local Ethics Committee of Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico approved this retrospective study (protocol code CE-MRm; approved on December 13, 2018). Due to the retrospective nature of this study, specific informed consent was waived.
Consent for publication
Due to the retrospective nature of this study, specific informed consent was waived.
Enzo Grossi and Giovanni Valbusa are employees of Bracco Imaging. The Centro Diagnostico Italiano is a company subjected to management and coordination by Bracco corporation. The remaining authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
D’Amico, N.C., Grossi, E., Valbusa, G. et al. A machine learning approach for differentiating malignant from benign enhancing foci on breast MRI. Eur Radiol Exp 4, 5 (2020). https://doi.org/10.1186/s41747-019-0131-4
- Artificial intelligence
- Breast neoplasms
- Gadobenic acid
- Machine learning
- Magnetic resonance imaging