Abstract
In this paper, we study (normalized) disjoint information as a metric for image comparison and its applications to perceptual image quality assessment, image registration, and video tracking. Disjoint information is the joint entropy of a set of random variables minus their mutual information. As a measure of statistical dependence and information redundancy, it satisfies more rigorous metric conditions than mutual information, including self-similarity, minimality, symmetry, and the triangle inequality. It applies to two or more random variables and can be computed by vector histogramming, by vector Parzen window density approximation, or by an upper-bound approximation involving fewer variables. We show that this theoretical advantage has practical implications. For digital images and video, multiple visual features are extracted, and (normalized) compound disjoint information is derived from a set of marginal densities of the image distributions, thus enriching the vocabulary of content representation. The proposed metric matching functions are applied to several domain applications to demonstrate their efficacy.
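As an illustration of the two-variable case, the sketch below estimates disjoint information from a joint intensity histogram, following the definition above: joint entropy minus mutual information, i.e. H(A,B) - I(A;B). It is a minimal sketch rather than the authors' implementation. The function names, the bin count, and the choice to normalize by the joint entropy (which maps the value into [0, 1]) are assumptions made for this example.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in bits of a probability array (empty bins are skipped)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))


def disjoint_information(a, b, bins=32, normalized=True):
    """Histogram estimate of H(A,B) - I(A;B) for two images of equal size.

    `bins` and the normalization by H(A,B) are assumptions of this sketch.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("images must have the same number of pixels")

    # Joint histogram of corresponding pixel intensities, as a probability table.
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()

    h_ab = entropy(p_ab)              # joint entropy H(A,B)
    h_a = entropy(p_ab.sum(axis=1))   # marginal entropy H(A)
    h_b = entropy(p_ab.sum(axis=0))   # marginal entropy H(B)

    mutual = h_a + h_b - h_ab         # mutual information I(A;B)
    disjoint = h_ab - mutual          # joint entropy excluding mutual information

    if not normalized:
        return disjoint
    # Assumed normalization: divide by H(A,B); define the result as 0 when H(A,B) = 0.
    return disjoint / h_ab if h_ab > 0 else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(64, 64))
    unrelated = rng.integers(0, 256, size=(64, 64))
    print("identical images:", disjoint_information(image, image))
    print("unrelated images:", disjoint_information(image, unrelated))
```

Under these assumptions, an image compared with itself yields a value of zero up to rounding, and swapping the two arguments leaves the result unchanged, matching the self-similarity and symmetry conditions listed in the abstract. A Parzen window estimate of the joint density could replace the histogram without changing the rest of the computation.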
Additional information
The work was done when A. Hoogs was with GE Global Research, Niskayuna, New York.
Cite this article
Sun, Z., Hoogs, A. Image Comparison by Compound Disjoint Information with Applications to Perceptual Visual Quality Assessment, Image Registration and Tracking. Int J Comput Vis 88, 461–488 (2010). https://doi.org/10.1007/s11263-010-0316-z
DOI: https://doi.org/10.1007/s11263-010-0316-z