Institutional review board approval
In this retrospective study, all patient data were Health Insurance Portability and Accountability Act-compliant and acquired under institutional review board approval with a waiver of the need for informed consent.
Patient population
We retrieved data from an open-source de-identified database, The Cancer Imaging Archive (TCIA) [12], which is the imaging counterpart of TCGA. TCGA, in brief, is a coordinated effort led by the National Cancer Institute to accelerate the molecular and genomic understanding of cancer [13]. The TCGA program performs genomic sequencing and characterisation of tissue from cancers diagnosed and treated at cancer centres around the United States.
Breast cancer (both invasive ductal and lobular) was one of the cancers selected for study. By December 16, 2014, TCGA had collected 1100 breast cancer cases, of which 1098 also had genomic and clinical data available (https://gdc.cancer.gov/; accessed February 17, 2016). To complement the genomic and clinical data, the National Cancer Institute enriched the TCGA open-source data portal by collecting MRI studies for storage and analysis in TCIA. However, in 2014, only 108 breast cancers had MRI studies in TCIA (pre-operative examinations with pathologically verified cancer) that could be correlated with the available TCGA data. Clinical, pathologic and genomic data were extracted using the TCGA assembler, an open-source, freely available tool [14].
Because our patients were drawn from TCGA, an open-source database, some patients in our cohort were also included in several previously published studies [9,10,11, 15, 16]. However, this paper has no scientific overlap with those studies. We report new findings from a study designed to investigate whether CEIP of breast cancer can replicate HEIP, using MRI from TCGA. Thus, whereas we assessed the correlation between CEIP and HEIP, the other studies assessed the correlation between CEIP and clinical features.
MRI acquisition
All breast MRI data were downloaded from the Breast Cancer Risk Assessment collection within TCIA (http://www.cancerimagingarchive.net). The data came from MRI studies originally performed between 1999 and 2004 at four institutions: the Mayo Clinic, Memorial Sloan Kettering Cancer Center, Roswell Park Cancer Institute and the University of Pittsburgh Medical Center. To avoid differences in image quality caused by different equipment vendors, only MR images obtained with 1.5-T systems from a single vendor (General Electric Medical Systems) were included in our study, which limited the number of patients considered for inclusion in our analysis to 93. Of these 93 patients, two were excluded, one because of missing genomic data and one because of missing images.
Therefore, in total, 91 female patients with pre-operative breast MRI studies were analysed in our study. All images were acquired with a 1.5-T system (Signa or Signa HDX; General Electric Medical Systems, Waukesha, WI, USA). In all patients, a dedicated surface breast coil was used. T1-weighted fat-suppressed images were acquired before and after intravenous administration of a gadolinium-based contrast agent (gadodiamide, Omniscan®; Nycomed-Amersham, Princeton, NJ, USA); the number of post-contrast acquisitions ranged from three to five, depending on the institution’s protocol. In-plane spatial resolution ranged from 0.53 to 0.86 mm, and slice spacing ranged from 2 to 3 mm. Only pre- and post-contrast T1-weighted fat-suppressed images were included in our study. Further information, including full clinical breast MRI protocols, can be accessed from the open-source TCIA.
HEIP
To generate HEIP, a pool of eleven board-certified breast-imaging radiologists, with experience ranging from 4 to 29 years (ESB, 14 years; GJW, 25 years; EJS, 4 years; JMN, 5 years; MG, 29 years; and EAM, 25 years; the other five radiologists were non-authors), participated in the manual assessment of MRI data. For each patient, three radiologists from this pool were randomly assigned to review the imaging data. Each radiologist identified the location of the index breast cancer and its maximal size on the first post-contrast image and annotated them using an open-source and open-access software platform [17]. Visual assessments of tumour characteristics were made according to BI-RADS 5 descriptors (lesion shape, internal enhancement and margin), yielding HEIP. In patients with multifocal or multicentric disease, the largest mass was used as the index lesion.
Four HEIP characteristics were assessed: lesion size (largest size as per Response Evaluation Criteria in Solid Tumours 1.1 recommendation [18]), lesion shape (whether the lesion was irregularly shaped or round/oval), internal enhancement (whether the enhancement was heterogeneous or homogeneous) and margin (whether the margin was circumscribed, irregular or spiculated). The radiologists performed their reviews independently and were blinded to all health information. Enhancement kinetics/curves were not generated, owing to the variable temporal resolution across sites and over time, which would have significantly impacted the accuracy of the results. HEIP for each feature from the three radiologists was summarised into a single representative consensus value for each patient; these summary values served as the case-based HEIP in the subsequent statistical analysis. The decision rule for consensus was simple majority. There were no cases where all readers disagreed.
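The simple-majority rule for the categorical descriptors can be illustrated with a minimal Python sketch. This is not the authors' code; the function name and the example reads are hypothetical, and the sketch covers categorical descriptors only (it does not address how size measurements were summarised).

```python
from collections import Counter


def majority_consensus(ratings):
    """Return the label chosen by a strict majority of readers, or None if there is none."""
    label, count = Counter(ratings).most_common(1)[0]
    return label if count > len(ratings) / 2 else None


# Hypothetical margin reads from the three radiologists assigned to one patient.
reads = ["irregular", "spiculated", "irregular"]
consensus = majority_consensus(reads)
```

With three readers and a binary descriptor (shape, internal enhancement) a majority always exists; for the three-level margin descriptor a three-way split is possible in principle, which is why the function returns None in that case.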
CEIP (quantitative radiomics)
Given the approximate tumour centre location, each index breast tumour was automatically segmented in the three-dimensional (3D) space. The quantitative radiomics workstation used for this study was able to yield 38 CEIP from dynamic contrast-enhanced MRI scans to characterise tumour size, shape, margin, enhancement texture, kinetics, and variance kinetics [6, 7, 19,20,21]. However, because kinetic characteristics were not assessed by the radiologists (i.e., as HEIP) for this study, only 24 phenotypes (from four categories) were used to compare the HEIP and CEIP. The 14 kinetics-related CEIP were excluded. All measurements were extracted from the first post-contrast MR images.
The 24 CEIP were calculated on the basis of automatically derived 3D tumour segmentations [19]. They were further separated into four phenotypic categories: (a) size, measuring tumour dimensions (4 CEIP); (b) shape, quantifying the 3D tumour geometry (3 CEIP); (c) morphology, combining tumour shape and margin characteristics such as margin sharpness (3 CEIP); and (d) enhancement texture, describing the texture of the contrast uptake (heterogeneity of the uptake) in the tumour on the first post-contrast MR image (14 CEIP) (Fig. 1). Owing to our small sample size, we report only the CEIP extracted for each HEIP BI-RADS descriptor and do not provide the coefficients.
Statistical methods
Inter-observer agreement in HEIP
Inter-observer agreement for human-extracted lesion size was based on inferences on the (−log 1.20, log 1.20) coverage probability of the natural logarithm of the size. This is the probability that the natural logarithms of the size measurements from any two radiologists differ by less than log 1.20 in absolute value, or equivalently, that the ratio of the two measurements lies between 1/1.20 and 1.20 (i.e., the measurements are within approximately 20% of each other). Inter-observer agreement for human-extracted tumour shape, margin and heterogeneity was based on inferences on Krippendorff’s α [22]. The 95% CIs for Krippendorff’s α and the coverage probability were constructed using the nonparametric bootstrap [23].
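The coverage probability and its bootstrap interval can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names and the size values (in mm) are hypothetical, patients are assumed to be the resampling unit, and a percentile interval is assumed because the text does not specify the bootstrap variant.

```python
import math
import random
from itertools import combinations


def coverage_probability(sizes_by_patient, ratio=1.20):
    """Fraction of within-patient reader pairs whose log sizes differ by less than log(ratio)."""
    bound = math.log(ratio)
    hits = total = 0
    for sizes in sizes_by_patient:
        for a, b in combinations(sizes, 2):
            total += 1
            hits += int(abs(math.log(a) - math.log(b)) < bound)
    return hits / total


def bootstrap_ci(sizes_by_patient, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling patients with replacement (assumed design)."""
    rng = random.Random(seed)
    n = len(sizes_by_patient)
    stats = sorted(
        coverage_probability([sizes_by_patient[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = stats[int((1 - level) / 2 * n_boot)]
    upper = stats[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Hypothetical maximal sizes (mm) from three readers for four patients.
sizes = [[20, 22, 21], [15, 19, 16], [30, 31, 40], [12, 12, 13]]
cp = coverage_probability(sizes)
ci_low, ci_high = bootstrap_ci(sizes)
```

Resampling whole patients keeps the three reads of a lesion together, so the within-patient correlation between readers is preserved in each bootstrap replicate.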
Associations between HEIP and CEIP
Associations for lesion size were assessed through inferences on the Kendall τ rank correlation coefficient [24]; p values were obtained through permutation tests. Associations for both shape and internal enhancement were assessed through the Mann-Whitney U test [25]. Associations for margin were likewise assessed through inferences on the Kendall τ rank correlation coefficient, with p values obtained through permutation tests [26]. The Benjamini-Hochberg procedure was used to correct for multiple hypothesis testing. All tests of associations between HEIP and individual CEIP were performed at the α = 0.05 level [27].
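A permutation test on the Kendall τ coefficient can be sketched in a few lines of Python. The sketch makes assumptions the text does not confirm: it uses the tie-corrected τ-b variant (relevant for the ordinal margin descriptor), a two-sided alternative, and Monte Carlo sampling of permutations; all names and data are hypothetical.

```python
import math
import random
from itertools import combinations


def kendall_tau_b(x, y):
    """Tie-corrected Kendall rank correlation (tau-b), computed over all pairs."""
    conc = disc = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        ties_x += dx == 0
        ties_y += dy == 0
        if dx * dy > 0:
            conc += 1
        elif dx * dy < 0:
            disc += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denom = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    return (conc - disc) / denom if denom else 0.0


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p value for tau-b (y is shuffled against x)."""
    rng = random.Random(seed)
    observed = abs(kendall_tau_b(x, y))
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(kendall_tau_b(x, y_perm)) >= observed - 1e-12:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical radiologist sizes and one size-related CEIP for eight lesions.
heip_size = [1, 2, 3, 4, 5, 6, 7, 8]
ceip_size = [1, 3, 2, 4, 6, 5, 7, 8]
tau = kendall_tau_b(heip_size, ceip_size)
p_value = permutation_p_value(heip_size, ceip_size)
```

Shuffling one variable against the other breaks any association while leaving both marginal distributions intact, which is what makes the permutation distribution a valid null reference.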
Replicating HEIP using CEIP
We analysed the feasibility of replicating each of the four HEIP using the corresponding CEIP. These were treated as either prediction or classification problems that we evaluated in terms of accuracy. The replication of HEIP of tumour size given the CEIP of tumour size was treated as a prediction problem. A predictor for the HEIP of tumour size based on the size-related CEIP was constructed using multivariate linear regression subject to elastic net constraints [28]. Because of the geometry of the constraint region, the elastic net constraints set some coefficients in the model exactly to zero, thus performing variable selection, and simultaneously stabilise the coefficient estimates in the presence of substantial correlation between the CEIP. Ten-fold cross-validation [29] was used to select the values of the tuning parameters controlling the severity of the elastic net constraints.
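For orientation, one common way of writing the elastic net criterion for the size predictor is shown here in its penalised (Lagrangian) form. The notation is assumed for illustration and does not come from the source: y_i is the HEIP size for patient i, x_i is the vector of the p size-related CEIP, and λ1, λ2 ≥ 0 are the tuning parameters. The parameterisation and software actually used are not stated in the text.

```latex
(\hat{\beta}_0, \hat{\beta}) =
\underset{\beta_0,\,\beta}{\arg\min}\;
\sum_{i=1}^{n} \bigl( y_i - \beta_0 - x_i^{\top}\beta \bigr)^2
+ \lambda_1 \sum_{j=1}^{p} \lvert \beta_j \rvert
+ \lambda_2 \sum_{j=1}^{p} \beta_j^{2}
```

The absolute-value term is responsible for coefficients being set exactly to zero, and the squared term shrinks correlated coefficients towards each other, which stabilises the estimates. For the classifiers, the squared-error term would be replaced by the corresponding negative log-likelihood.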
The replications of tumour shape, internal enhancement, and margin were treated as classification problems. Classifiers for tumour shape and internal enhancement were constructed using multiple logistic regression subject to elastic net constraints. Classifiers for tumour margin were constructed using ordinal logistic regression, also subject to elastic net constraints. For these classifiers, tuning parameters for the elastic net constraints were also selected using ten-fold cross-validation.
These predictors and classifiers were then assessed in terms of their prediction and classification accuracy through nested ten-fold by ten-fold cross-validation [30]. For tumour size, performance was evaluated in terms of the mean squared deviation [31]. For tumour shape and internal enhancement, performance was evaluated in terms of AUC in ROC analysis. For tumour margin, performance was evaluated in terms of the Kendall τ rank correlation coefficient between the classifier score and the actual value of the HEIP of tumour margin [24].
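The structure of nested cross-validation can be sketched generically in Python. The sketch is illustrative only: the helper names are hypothetical, the toy model (a shrunken mean with one tuning parameter) merely stands in for the elastic net predictors, and the synthetic data are not study data. The key point it shows is that the tuning parameter is chosen in an inner loop that sees only the outer training folds.

```python
import random


def k_folds(indices, k, rng):
    """Shuffle indices and deal them into k roughly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def cv_loss(indices, k, rng, fit, loss, param):
    """Mean held-out loss of fit(train, param) over a k-fold split of indices."""
    total = 0.0
    for fold in k_folds(indices, k, rng):
        held = set(fold)
        train = [i for i in indices if i not in held]
        total += loss(fit(train, param), fold)
    return total / k


def nested_cv(n, params, fit, loss, k_outer=10, k_inner=10, seed=0):
    """Outer loop estimates performance; inner loop tunes on outer-training data only."""
    rng = random.Random(seed)
    outer_losses = []
    for test in k_folds(range(n), k_outer, rng):
        held = set(test)
        train = [i for i in range(n) if i not in held]
        best = min(params, key=lambda p: cv_loss(train, k_inner, rng, fit, loss, p))
        outer_losses.append(loss(fit(train, best), test))
    return sum(outer_losses) / len(outer_losses)


# Synthetic outcome for 91 cases (toy data, not from the study).
data_rng = random.Random(1)
y = [20 + data_rng.gauss(0, 5) for _ in range(91)]


def fit(train, lam):
    """Toy model: training mean shrunk towards zero by the tuning parameter lam."""
    return sum(y[i] for i in train) / len(train) / (1 + lam)


def loss(model, test):
    """Mean squared deviation on the held-out indices."""
    return sum((y[i] - model) ** 2 for i in test) / len(test)


msd = nested_cv(len(y), [0.0, 0.1, 1.0], fit, loss)
```

Because the held-out outer fold never influences the choice of tuning parameter, the resulting performance estimate is not optimistically biased by the tuning step.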
For each of these four tumour characteristics (size, shape, internal enhancement, and margin), p values for the predictive signal contained within the CEIP were computed through permutation tests. The Benjamini-Hochberg procedure was applied to these p values to correct for multiple hypothesis testing [27].
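The Benjamini-Hochberg step-up adjustment is short enough to write out in full. The following is a minimal Python sketch with hypothetical p values, one per tumour characteristic; the function name is illustrative and the 0.05 threshold is taken from the significance level stated earlier.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw permutation p values for size, shape, internal enhancement and margin.
p_raw = [0.001, 0.012, 0.030, 0.200]
p_adj = benjamini_hochberg(p_raw)
significant = [p <= 0.05 for p in p_adj]
```

Comparing the adjusted p values with 0.05 is equivalent to the usual step-up rule of rejecting all hypotheses up to the largest rank k for which the k-th smallest p value is at most 0.05·k/m.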