Authors: Olga Kazakova, Po-shen Lee, Jevin West, Bill Howe
Abstract: Scientific communication depends on visual representations of data, results, analysis, and models. Given the important role of these information objects, we have built a figure-centric search engine called VizioMetrics.org. We have used millions of figures from PubMed Central to better understand effective visual communication and scholarly impact. In this poster, we present preliminary results on automatically identifying key figures in scholarly papers. We conducted a large-scale survey asking authors to identify their central figure, the single visualization that encapsulates key aspects of a paper. Participants who were able to identify such a figure were then asked to indicate what it represents. Our results show that for over 90% of the evaluated papers, the authors were able to identify a single central figure. In most cases, these figures represent results; the most common figure class is the composite, followed by the diagram. We then use the survey responses as a training set to test early-stage algorithms for identifying these graphical abstracts.