Evaluating Reliability of Single Classifications of Neural Networks
Current machine learning algorithms perform well in many problem domains, but in risk-sensitive decision making, for example in medicine and finance, common evaluation methods that give only an overall assessment of a model fail to gain the trust of experts, because they provide no information about individual predictions. We continue previous work on approaches for evaluating the reliability of single classifications, focusing on model-independent methods. These methods have been shown to be successful in their narrow fields of application, so we constructed a testing methodology to evaluate them in straightforward, general-use test cases. For the evaluation, we derived a statistical reference function that enables comparison between the reliability estimators and the model’s own predictions. We compare five different approaches and evaluate them on a simple neural network in several artificial and real-world domains. The results indicate that the reliability estimators CNK and LCV can be used to improve the model’s predictions.
Keywords: Reliability estimation, Classification, Prediction accuracy, Prediction error