Abstract
Machine learning is widely used for mining collections, such as images, sounds, or texts, by classifying their elements into categories. Automatic classification based on supervised learning requires groundtruth datasets, both for modeling the elements to classify and for testing the quality of the classification. Because collecting groundtruth is tedious, a method is needed for estimating the potential errors in large datasets from limited groundtruth. We propose a method that improves classification quality by using limited groundtruth data to extrapolate the potential errors in larger datasets. It significantly improves the counting of elements per class. We further propose visualization designs for understanding and evaluating classification uncertainty. They help end-users consider the impact of potential misclassifications when interpreting the classification output. This work was developed to address the needs of ecologists studying fish population abundance using computer vision, but it generalizes to a wider range of applications. Our method is applicable to a variety of machine learning technologies, and our visualizations further support their transfer to end-users.
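To make the idea of extrapolating errors from limited groundtruth concrete, the sketch below shows one generic way to correct per-class counts with a confusion matrix. It is an illustration only and not necessarily the method proposed in the article: it assumes the misclassification rates measured on the groundtruth also hold for the larger dataset, and the function names (`error_rates`, `solve`, `corrected_counts`) and the example numbers are hypothetical.

```python
def error_rates(confusion):
    """Row-normalize a groundtruth confusion matrix.

    confusion[i][j] = number of groundtruth items of true class i
    that the classifier labeled as class j.
    Returns rates[i][j], an estimate of P(predicted j | true i).
    """
    rates = []
    for row in confusion:
        total = sum(row)
        rates.append([count / total for count in row])
    return rates


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination.

    Uses partial pivoting; raises ZeroDivisionError if a is singular.
    """
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def corrected_counts(confusion, predicted_counts):
    """Estimate true per-class counts from classifier output counts.

    Assumption (hypothetical setup): the error rates observed on the
    groundtruth apply unchanged to the larger dataset, so that
    predicted_j = sum_i true_i * P(predicted j | true i).
    """
    rates = error_rates(confusion)
    n = len(rates)
    # Transpose so that each equation corresponds to one predicted class.
    transposed = [[rates[i][j] for i in range(n)] for j in range(n)]
    return solve(transposed, predicted_counts)


# Hypothetical groundtruth: 100 items per true class.
groundtruth_confusion = [[90, 10],
                         [20, 80]]
# Hypothetical classifier output on a larger, unlabeled dataset.
estimated = corrected_counts(groundtruth_confusion, [760, 240])
```

With these made-up numbers, the raw output counts of 760 and 240 are corrected to estimates close to 800 and 200. Such an estimate can be unstable when the groundtruth is small or the classes are easily confused, which is one reason to accompany corrected counts with a view of their uncertainty.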
Notes
A measure of feature similarity between an item to classify and a class model (Sect. 3.1).
Cite this article
Boom, B.J., Beauxis-Aussalet, E., Hardman, L. et al. Uncertainty-aware estimation of population abundance using machine learning. Multimedia Systems 22, 737–749 (2016). https://doi.org/10.1007/s00530-015-0479-0
DOI: https://doi.org/10.1007/s00530-015-0479-0