Enhancing diagnostic deep learning via self-supervised pretraining on large-scale, unlabeled non-medical images

Tayebi Arasteh, Soroosh; Misera, Leo; Kather, Jakob Nikolas; Truhn, Daniel; Nebelung, Sven

doi:10.1186/s41747-023-00411-3

European Radiology Experimental

Table 5 Comparison of pretrained weights: self-supervised learning with large non-medical images versus supervised learning with a large, task-specific chest radiograph dataset


Labels	VinDr-CXR (DINOv2)	VinDr-CXR (MIMIC-CXR)	ChestX-ray14 (DINOv2)	ChestX-ray14 (MIMIC-CXR)	CheXpert (DINOv2)	CheXpert (MIMIC-CXR)	UKA-CXR (DINOv2)	UKA-CXR (MIMIC-CXR)	PadChest (DINOv2)	PadChest (MIMIC-CXR)
Cardiomegaly	94.53 ± 0.52	97.17 ± 0.34	88.51 ± 0.47	89.54 ± 0.44	87.96 ± 0.31	87.27 ± 0.31	85.86 ± 0.18	85.45 ± 0.18	92.30 ± 0.27	92.68 ± 0.26
Pleural effusion	97.62 ± 0.68	98.31 ± 0.52	81.01 ± 0.32	82.00 ± 0.32	87.81 ± 0.20	87.64 ± 0.20	91.23 ± 0.19	91.41 ± 0.19	95.66 ± 0.26	95.85 ± 0.24
Pneumonia	91.99 ± 0.98	94.46 ± 0.66	70.17 ± 1.03	69.85 ± 1.04	76.42 ± 0.88	76.29 ± 0.84	92.15 ± 0.18	91.94 ± 0.18	83.93 ± 0.67	84.96 ± 0.66
Atelectasis	88.55 ± 1.71	92.21 ± 1.48	75.56 ± 0.43	75.87 ± 0.41	69.57 ± 0.40	69.28 ± 0.39	86.36 ± 0.23	86.30 ± 0.24	83.62 ± 0.58	83.59 ± 0.55
Consolidation	91.35 ± 1.56	94.82 ± 0.74	73.60 ± 0.57	75.11 ± 0.54	75.14 ± 0.56	74.13 ± 0.56	N/A	N/A	88.26 ± 0.82	89.95 ± 0.76
Pneumothorax	90.96 ± 2.91	97.39 ± 1.27	84.70 ± 0.38	85.93 ± 0.37	87.29 ± 0.33	86.03 ± 0.34	N/A	N/A	86.37 ± 2.01	92.89 ± 1.00
Lung opacity	86.86 ± 1.27	87.89 ± 1.26	N/A	N/A	73.98 ± 0.28	73.62 ± 0.29	N/A	N/A	N/A	N/A
Lung lesion	N/A	N/A	N/A	N/A	76.56 ± 0.73	75.79 ± 0.73	N/A	N/A	N/A	N/A
Fracture	N/A	N/A	N/A	N/A	77.93 ± 0.67	76.92 ± 0.66	N/A	N/A	N/A	N/A
No finding (healthy)	90.79 ± 0.56	93.51 ± 0.46	72.37 ± 0.33	72.48 ± 0.33	87.61 ± 0.30	87.53 ± 0.31	86.86 ± 0.18	86.49 ± 0.18	85.11 ± 0.26	85.20 ± 0.26
Average	91.58 ± 3.45	94.47 ± 3.30	77.99 ± 6.38	78.68 ± 6.77	80.03 ± 6.60	79.45 ± 6.60	88.49 ± 2.65	88.32 ± 2.77	87.89 ± 4.30	89.30 ± 4.45
p-value	0.001		0.001		0.001		0.001		0.001

The table reports the area under the receiver operating characteristic curve (ROC-AUC), in percent, for each label across five datasets: VinDr-CXR, ChestX-ray14, CheXpert, UKA-CXR, and PadChest. For each dataset, models were initialized with weights pretrained either by self-supervised learning (SSL) on non-medical images (DINOv2) or by fully supervised learning on a dedicated chest radiograph dataset (MIMIC-CXR). The numbers of fine-tuning training images for VinDr-CXR, ChestX-ray14, CheXpert, UKA-CXR, and PadChest were n = 15,000, n = 86,524, n = 128,356, n = 153,537, and n = 88,480, respectively; the corresponding numbers of test images were n = 3,000, n = 25,596, n = 39,824, n = 39,824, and n = 22,045. p-values refer to the comparison between the average ROC-AUCs obtained with DINOv2 and MIMIC-CXR weights. For details about each dataset’s labels, refer to Table 3.
N/A: Not available
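
The Average row can be checked against the per-label rows. The sketch below recomputes it for the UKA-CXR columns. The table does not say how the ± of the Average row was obtained, so the use of the population standard deviation across labels is an assumption, and the names `uka_cxr` and `summarize` are illustrative only.

```python
from statistics import mean, pstdev

# Per-label ROC-AUC values (percent) copied from the UKA-CXR columns of the table,
# in row order: cardiomegaly, pleural effusion, pneumonia, atelectasis, no finding.
uka_cxr = {
    "DINOv2": [85.86, 91.23, 92.15, 86.36, 86.86],
    "MIMIC-CXR": [85.45, 91.41, 91.94, 86.30, 86.49],
}


def summarize(values):
    """Return (mean, population standard deviation), each rounded to two decimals.

    Assumption: the spread in the Average row is the population standard
    deviation across labels; the source does not state this explicitly.
    """
    return round(mean(values), 2), round(pstdev(values), 2)


summary = {name: summarize(vals) for name, vals in uka_cxr.items()}

for name, (avg, spread) in summary.items():
    print(f"{name}: {avg:.2f} ± {spread:.2f}")
```

Under that assumption the script gives 88.49 ± 2.65 for DINOv2 and 88.32 ± 2.77 for MIMIC-CXR, which agrees with the Average row for UKA-CXR. The sample standard deviation (`statistics.stdev`) would give larger values that do not match the table.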
