From: Are deep models in radiomics performing better than generic models? A systematic review
| Network characteristic | | Internal validation cohorts: median gain in AUC | Internal: better | Internal: equal | Internal: worse | External validation cohorts: median gain in AUC | External: better | External: equal | External: worse |
|---|---|---|---|---|---|---|---|---|---|
| Dimension | Two-dimensional | +0.05 | 78% (31/40) | 8% (3/40) | 15% (6/40) | +0.08 | 82% (9/11) | 0% (0/11) | 18% (2/11) |
| Dimension | Three-dimensional | +0.02 | 69% (18/26) | 4% (1/26) | 27% (7/26) | +0.00 | 44% (4/9) | 11% (1/9) | 44% (4/9) |
| Weights | Pretrained | +0.07 | 86% (24/28) | 7% (2/28) | 7% (2/28) | +0.09 | 67% (6/9) | 22% (2/9) | 11% (1/9) |
| Weights | Trained from scratch | +0.02 | 66% (25/38) | 5% (2/38) | 29% (11/38) | +0.01 | 64% (7/11) | 9% (1/11) | 27% (3/11) |
| Approach | End-to-end | +0.05 | 72% (26/36) | 8% (3/36) | 19% (7/36) | +0.02 | 60% (6/10) | 10% (1/10) | 30% (3/10) |
| Approach | Feature extractor | +0.05 | 77% (23/30) | 3% (1/30) | 20% (6/30) | +0.09 | 70% (7/10) | 20% (2/10) | 10% (1/10) |
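
Each Better/Equal/Worse cell in the table pairs a percentage with the count it comes from. As a minimal sketch of that arithmetic, the snippet below recomputes one row from its three counts. The function names `percent_half_up` and `summarize` are hypothetical, and rounding halves upward is an assumption that happens to reproduce the published figures, not a rule stated in the source.

```python
def percent_half_up(count: int, total: int) -> int:
    """Return count/total as a whole-number percentage, rounding halves up.

    Integer arithmetic avoids floating-point surprises at exact .5 values.
    The half-up rule is an assumption, not something the source states.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (200 * count + total) // (2 * total)


def summarize(better: int, equal: int, worse: int) -> dict:
    """Format the three comparison counts the way the table cells are written."""
    total = better + equal + worse
    counts = (("Better", better), ("Equal", equal), ("Worse", worse))
    return {
        name: f"{percent_half_up(k, total)}% ({k}/{total})"
        for name, k in counts
    }


# Counts taken from the two-dimensional, internal-validation row.
row = summarize(31, 3, 6)
print(row)
```

Because each share is rounded independently, the three percentages in a row need not sum to exactly 100 (this row gives 78 + 8 + 15 = 101).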