
Table 4 Comparative evaluation of pretraining with self-supervision on non-medical images versus full supervision on non-medical images

From: Enhancing diagnostic deep learning via self-supervised pretraining on large-scale, unlabeled non-medical images

 

| Metric | Pretraining | VinDr-CXR | ChestX-ray14 | CheXpert | MIMIC-CXR | UKA-CXR | PadChest |
|---|---|---|---|---|---|---|---|
| ROC-AUC | DINOv2 | 88.92 ± 4.59 | 79.79 ± 6.55 | 80.02 ± 6.60 | 80.52 ± 6.17 | 89.74 ± 3.57 | 87.62 ± 4.86 |
| ROC-AUC | ImageNet-21K | 86.38 ± 6.27 | 79.10 ± 6.34 | 79.56 ± 6.51 | 79.92 ± 6.35 | 89.45 ± 3.62 | 87.12 ± 5.05 |
| Accuracy | DINOv2 | 82.49 ± 6.92 | 72.81 ± 7.43 | 72.37 ± 8.29 | 73.08 ± 5.32 | 80.68 ± 4.00 | 79.82 ± 6.69 |
| Accuracy | ImageNet-21K | 81.92 ± 6.50 | 71.69 ± 7.29 | 71.36 ± 8.39 | 73.00 ± 5.37 | 79.94 ± 4.29 | 78.73 ± 7.49 |
| Sensitivity | DINOv2 | 83.58 ± 6.93 | 73.14 ± 8.94 | 75.68 ± 6.45 | 74.87 ± 10.01 | 83.42 ± 4.57 | 81.66 ± 6.91 |
| Sensitivity | ImageNet-21K | 78.50 ± 8.97 | 73.04 ± 8.23 | 75.43 ± 6.00 | 73.91 ± 9.51 | 83.76 ± 4.37 | 81.80 ± 5.30 |
| Specificity | DINOv2 | 81.69 ± 7.37 | 73.32 ± 8.00 | 70.95 ± 9.69 | 72.25 ± 6.04 | 80.32 ± 4.44 | 79.49 ± 6.97 |
| Specificity | ImageNet-21K | 81.80 ± 6.88 | 72.10 ± 7.94 | 70.23 ± 9.33 | 72.30 ± 6.16 | 79.39 ± 4.61 | 78.37 ± 7.80 |
| ROC-AUC p-value | | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |

  1. The metrics reported are the area under the receiver operating characteristic curve (ROC-AUC), accuracy, sensitivity, and specificity, given as percentages and averaged over all labels for each dataset. The models compared are those pretrained with self-supervision on non-medical images (DINOv2 [18]) and those pretrained with full supervision on non-medical images (ImageNet-21K [13]). The datasets are VinDr-CXR, ChestX-ray14, CheXpert, MIMIC-CXR, UKA-CXR, and PadChest, with n = 15,000, n = 86,524, n = 128,356, n = 170,153, n = 153,537, and n = 88,480 fine-tuning training images, respectively, and n = 3,000, n = 25,596, n = 39,824, n = 43,768, n = 39,824, and n = 22,045 test images, respectively. For the labels used for each dataset, refer to Table 3. p-values are given for the comparison between the ROC-AUC results obtained with DINOv2 and ImageNet-21K pretraining weights.
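To make the phrase "averaged over all labels" concrete, the following is a minimal sketch of how per-label ROC-AUC, accuracy, sensitivity, and specificity could be computed and then averaged for one dataset. It is an illustration under stated assumptions, not the authors' evaluation code: the function names, the fixed decision threshold of 0.5, and the toy inputs are all hypothetical, and the source does not state how thresholds or the ± spread were obtained.

```python
from statistics import mean


def roc_auc(y_true, y_score):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair (rank-sum formulation).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binary_metrics(y_true, y_score, threshold=0.5):
    """Percentage metrics for a single label.

    The 0.5 threshold is an assumption made for this sketch only.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "roc_auc": 100.0 * roc_auc(y_true, y_score),
        "accuracy": 100.0 * (tp + tn) / len(y_true),
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
    }


def average_over_labels(labels_true, labels_score, threshold=0.5):
    """Average each metric over all labels of one dataset."""
    per_label = [
        binary_metrics(t, s, threshold) for t, s in zip(labels_true, labels_score)
    ]
    return {name: mean(m[name] for m in per_label) for name in per_label[0]}


# Toy data: two labels, four test images each (made up for illustration).
labels_true = [[1, 1, 0, 0], [1, 0, 0, 1]]
labels_score = [[0.9, 0.4, 0.6, 0.1], [0.8, 0.2, 0.3, 0.7]]
result = average_over_labels(labels_true, labels_score)
print(result)
# {'roc_auc': 87.5, 'accuracy': 75.0, 'sensitivity': 75.0, 'specificity': 75.0}
```

In this toy case the first label scores 75% ROC-AUC and 50% on the thresholded metrics, the second label scores 100% throughout, and the reported value per metric is the mean of the two. ROC-AUC is threshold-free, whereas accuracy, sensitivity, and specificity depend on the operating point chosen for each label.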