Multimedia Tools and Applications, Volume 30, Issue 1, pp 89–108

Spatial interest pixels (SIPs): useful low-level features of visual media data

  • Qi Li
  • Jieping Ye
  • Chandra Kambhamettu


Visual media data, such as images, are the raw data representation for many important applications. Reducing the dimensionality of raw visual media data is desirable because high dimensionality degrades both the effectiveness and the efficiency of visual recognition algorithms. We present a comparative study of detectors for spatial interest pixels (SIPs), including eight-way (a novel SIP detector), Harris, and Lucas-Kanade; SIP extraction is an important step in reducing the dimensionality of visual media data. Extensive case studies show the usefulness of SIPs as low-level features of visual media data. A class-preserving dimension reduction algorithm based on the generalized singular value decomposition (GSVD) is then applied to further reduce the dimension of the SIP-based feature vectors. In the experiments, this algorithm outperformed principal component analysis (PCA).
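
As a rough illustration of what an interest-pixel detector computes, the sketch below scores every pixel with the textbook Harris response and with a minimum-eigenvalue score of the same local gradient matrix, which is a criterion often associated with Lucas-Kanade-style tracking. It is a minimal sketch under stated assumptions, not the implementation evaluated in the paper: the function names, the 3×3 box window, and the constant k = 0.04 are illustrative choices, and the eight-way detector is omitted because the abstract does not define it.

```python
import numpy as np

def box_sum(a, r=1):
    """Sum each pixel's (2r+1)x(2r+1) neighbourhood, zero-padded at the border."""
    h, w = a.shape
    p = np.pad(a, r, mode="constant")
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out

def structure_tensor(img, r=1):
    """Windowed sums of gradient products: (Sxx, Sxy, Syy)."""
    gy, gx = np.gradient(img.astype(float))  # derivatives along rows, columns
    return box_sum(gx * gx, r), box_sum(gx * gy, r), box_sum(gy * gy, r)

def harris_response(img, k=0.04, r=1):
    """Harris score det(M) - k * trace(M)^2; k = 0.04 is an assumed default."""
    sxx, sxy, syy = structure_tensor(img, r)
    return (sxx * syy - sxy ** 2) - k * (sxx + syy) ** 2

def min_eigenvalue(img, r=1):
    """Smaller eigenvalue of the 2x2 gradient matrix at every pixel."""
    sxx, sxy, syy = structure_tensor(img, r)
    return 0.5 * (sxx + syy - np.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2))

def top_pixels(score, n):
    """(row, col) coordinates of the n highest-scoring pixels."""
    idx = np.argsort(score, axis=None)[::-1][:n]
    return np.column_stack(np.unravel_index(idx, score.shape))

# Toy input (hypothetical): a bright square on a dark background.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
sips = top_pixels(harris_response(img), 4)
```

Keeping only the coordinates returned by `top_pixels`, or features sampled at them, is one simple way to turn a full image into a much shorter feature vector, which is the kind of dimensionality reduction the abstract refers to.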


Dimension reduction · Low-level features · Spatial interest pixels · Facial expression recognition · Face recognition



Copyright information

© Springer Science + Business Media, LLC 2006

Authors and Affiliations

  1. Video/Image Modeling and Synthesis Lab, Computer and Information Sciences, University of Delaware, Newark, USA
  2. Computer Science & Engineering, Arizona State University, Tempe, USA
