Authors: Emma Beauxis-Aussalet, Lynda Hardman
Abstract: Machine Learning techniques can automatically extract information from a variety of multimedia sources, e.g., image, text, sound, video. However, they produce imperfect results, since the multimedia content can be misinterpreted. Machine Learning errors are commonly measured using confusion matrices, which encode the type I and type II errors for each class of information to extract. Non-expert users encounter difficulties in understanding and using confusion matrices: they need to be read both column- and row-wise, which is tedious and error-prone, and their technical concepts need explanation. Further, the visualizations commonly used by Machine Learning experts rely on complex metrics derived from confusion matrices (e.g., Precision/Recall, F1 scores). These can be overwhelming and misleading for non-experts. Derived metrics convey specific types of errors, and may be inappropriate for specific use cases. For instance, type II errors (False Negatives) are critical for medical diagnosis, while type I errors (False Positives) are more tolerated. In the case of optical sorting of manufactured products (defect detection), the sensitivity to errors can be the opposite. Non-experts may use metrics that are inappropriate for their use case, or misinterpret them. We propose a novel visualization design that addresses these issues for non-expert users. We specify the potential misinterpretations that can arise in typical use cases of Machine Learning applications. We argue that our visualization is likely to be easier to understand and to minimize the risk of misinterpretation, for all kinds of use cases. We conclude by discussing future empirical evaluations of our design.
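The derived metrics discussed above can be illustrated with a minimal sketch (function names are ours, not from the paper) of Precision, Recall, and F1 computed from binary confusion-matrix counts. It shows how each metric penalizes only one error type, and how F1 can give two classifiers with opposite error profiles the same score, which is one way non-experts can be misled:

```python
def precision(tp, fp):
    # Fraction of positive predictions that are correct;
    # penalizes only type I errors (false positives).
    return tp / (tp + fp)

def recall(tp, fn):
    # Fraction of actual positives that are retrieved;
    # penalizes only type II errors (false negatives).
    return tp / (tp + fn)

def f1(tp, fp, fn):
    # Harmonic mean of precision and recall: a single score
    # that hides which error type dominates.
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Two classifiers with opposite error profiles but identical F1:
many_false_positives = f1(tp=90, fp=30, fn=10)  # bad for defect detection
many_false_negatives = f1(tp=90, fp=10, fn=30)  # bad for medical diagnosis
```

Here `many_false_positives` and `many_false_negatives` are equal, even though the first classifier would be the worse choice for optical sorting and the second for medical diagnosis.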