Machine learning (ML) tools and artificial neural networks, the latter having nowadays progressed into deep learning (DL), are data-driven models often treated as black boxes. They are currently employed in many fields of human life, including healthcare and, in particular, medical image analysis [1,2,3].
DL models are characterised by a set of parameters and hyperparameters (e.g., network topology and optimisation parameters), which together define a non-linear mathematical function that maps input data to target values [4, 5].
During model development, the massive set of parameters is iteratively tuned either to fit an annotated training set (supervised methods) or to achieve optimal clustering performance on a non-annotated one (unsupervised methods), while model hyperparameters are empirically chosen by applying grid or random search strategies on the validation set. Next, the model is tested on the testing set to assess its generalisability. Therefore, DL models are the indissoluble result of all the steps involved in the training and validation phases, which include data collection and preparation, augmentation, dataset splitting, and the training and validation pipeline itself [4, 6]. Indeed, even with frozen hyperparameters, changing the training dataset results in a completely different model.
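As an illustration only, the following Python sketch mimics this train/validate/test workflow on synthetic data, using scikit-learn's MLPClassifier as a stand-in for a deep model; the dataset, split proportions and hyperparameter grid are our own assumptions and do not correspond to any model discussed in this article.

```python
# Minimal sketch of the workflow described above (illustrative assumptions throughout).
from itertools import product

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=20, random_state=0)

# Split the data into training, validation and testing sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Hyperparameters are chosen empirically by grid search on the validation set.
grid = {"hidden_layer_sizes": [(32,), (64, 32)], "alpha": [1e-4, 1e-3]}
best_score, best_model = -1.0, None
for hidden, alpha in product(grid["hidden_layer_sizes"], grid["alpha"]):
    model = MLPClassifier(hidden_layer_sizes=hidden, alpha=alpha,
                          max_iter=500, random_state=0).fit(X_train, y_train)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_score, best_model = score, model

# The held-out test set is used once, to assess generalisability.
print("validation accuracy:", best_score)
print("test accuracy:", best_model.score(X_test, y_test))
```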
This whole process exploits limited or no a-priori knowledge about the physical/biological behaviour of the modelled system, without being explicitly programmed for a specific task [7]. However, versatility of use and the ability to model complex relationships within data are achieved through the design of extremely complex models.
The data-driven DL approach stands in opposition to internal modelling, in which the mathematical structure of the model is defined from physiological a-priori knowledge and parametrised with a few physically/physiologically meaningful variables. Indeed, the final DL parameters and hyperparameters have no meaning other than their contribution to the high classification performance of the trained model.
Not surprisingly, the overall outcome of DL models is rather obscure, apart from the fact that it works, that is to say “the proof of the pudding is in the eating”. However, we should admit that data-driven and internal models share many issues concerning insight into the underlying mechanisms when real clinical cases are under analysis. Indeed, the necessary simplifications and approximations are transparent to only a few scholars. Moreover, even the most renowned and established models in medicine are practically useless unless the statistics of biological variability are included.
Many issues have arisen about the use of data-driven black-box classifiers in diagnostic decision making, such as the possible reduction of physician skills, the reliability of digital data, the intrinsic uncertainty in medicine, and the need to open the DL black box [8]. These concerns involve the real usefulness, reliability, safety and effectiveness of models in a clinical environment [9, 10].
While clinical standards may be defined to test model safety and effectiveness, model opacity remains an open issue. Indeed, the General Data Protection Regulation introduced by the European Union (articles 13, 14, and 15) includes clauses about the right of all individuals to obtain a “meaningful explanation of the logic involved” when automated decision-making takes place [11]. Thus, the development of enabling technologies and/or good practices able to explain the opaque inner workings of these algorithms is mandatory to comply with the important principles behind these clauses [12].
We assume that model opacity may be alleviated by enriching the DL outcomes, in a user-friendly way, with the information that the model derives from its training and validation datasets, letting radiologists make their final decision with due criticism.
Paradoxically, the learning strategies of black-box DL models facilitate this task. As mentioned above, trained DL models are defined by their architecture and by a massive set of parameters encoding the information of the training and validation sets. The training/validation sets and the trained model are therefore assumed to be strongly linked as a pair, which bears the non-trivial consequence that the training/validation datasets should also be made available to users. In our vision, if data is the only prior of a black-box model, this should be made transparent in the same way as physical/physiological priors must be stated for internal (alias white-box) models. Accordingly, we illustrate a transparency principle based on highlighting annotated cases (ACs) proximal to the current case (CC), drawn from a library linked to the DL model. The basic requirements are as follows: (i) to furnish the library of ACs (training and/or validation sets) as an annex to the trained algorithm; (ii) to save the coordinates of the ACs in the classification space, to be used as an index within the library; (iii) to define a metric in the classification space that univocally determines the proximity of ACs to the CC. A minimal sketch of these requirements is given below.
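The sketch below illustrates the three requirements under assumptions of our own choosing: the “classification space” is taken to be a fixed-length embedding vector per case, the library stores one such vector per annotated case, and Euclidean distance serves as the proximity metric; class names, dimensions and identifiers are hypothetical.

```python
# Illustrative sketch of an annotated-case (AC) library and proximity retrieval.
import numpy as np

class AnnotatedCaseLibrary:
    def __init__(self, embeddings: np.ndarray, labels: np.ndarray, case_ids: list):
        # (i) the library of ACs shipped as an annex to the trained model;
        # (ii) their coordinates in the classification space, used as an index.
        self.embeddings = embeddings   # shape (n_cases, n_dims)
        self.labels = labels           # annotations (e.g., 0/1)
        self.case_ids = case_ids       # keys into the image archive

    def nearest(self, cc_embedding: np.ndarray, k: int = 5):
        # (iii) a metric that univocally ranks ACs by proximity to the CC.
        dists = np.linalg.norm(self.embeddings - cc_embedding, axis=1)
        order = np.argsort(dists)[:k]
        return [(self.case_ids[i], self.labels[i], dists[i]) for i in order]

# Toy usage: 100 ACs in a 16-dimensional classification space.
rng = np.random.default_rng(0)
library = AnnotatedCaseLibrary(rng.normal(size=(100, 16)),
                               rng.integers(0, 2, size=100),
                               [f"case_{i:03d}" for i in range(100)])
current_case = rng.normal(size=16)
for case_id, label, dist in library.nearest(current_case, k=3):
    print(case_id, label, round(float(dist), 3))
```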
In this article, we describe our approach focusing on a specific DL model, namely a convolutional multi-layer neural network used to perform a binary classification task.
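For concreteness, the following sketch shows the kind of model considered here: a small convolutional network producing a binary prediction. The layer sizes, input resolution and use of PyTorch are illustrative assumptions, not the architecture evaluated in this article.

```python
# Minimal sketch of a convolutional multi-layer network for binary classification.
import torch
import torch.nn as nn

class BinaryCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, 1)  # assumes 64x64 single-channel inputs

    def forward(self, x):
        z = self.features(x).flatten(1)            # coordinates in the feature space
        return torch.sigmoid(self.classifier(z))   # probability of the positive class

model = BinaryCNN()
batch = torch.randn(4, 1, 64, 64)  # four single-channel 64x64 images
print(model(batch).shape)          # torch.Size([4, 1])
```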