International Journal of Computer Vision, Volume 60, Issue 2, pp 111–134

Modelling and Interpretation of Architecture from Several Images

  • A.R. Dick
  • P.H.S. Torr
  • R. Cipolla


This paper describes the automatic acquisition of three-dimensional architectural models from short image sequences. The approach is Bayesian and model-based. Bayesian methods require a prior distribution to be formulated; however, designing a generative model for buildings is a difficult task. To overcome this, a building is described as a set of walls together with a ‘Lego’ kit of parameterised primitives, such as doors or windows. A prior on wall layout and a prior on the parameters of each primitive can then be defined. Part of this prior is learnt from training data and part comes from expert architects. The validity of the prior is tested by generating example buildings using MCMC and verifying that plausible buildings are generated under varying conditions. The same MCMC machinery can also be used to optimise the structure recovery, this time generating a range of possible solutions from the posterior. Because a range of solutions can be presented, the user can select the best one when the structure recovery is ambiguous.
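To make the idea of testing a prior by sampling from it concrete, the following is a minimal sketch, not the authors' implementation. It uses plain random-walk Metropolis sampling (a stand-in for whichever MCMC scheme the paper uses) to draw window sizes from a toy prior. The function names, the Gaussian form of the prior, and all numeric values are assumptions made purely for illustration.

```python
import math
import random


def log_prior(width, height):
    """Toy log-prior over a window's size in metres (hypothetical values)."""
    if width <= 0 or height <= 0:
        return -math.inf  # impossible shapes get zero probability
    # Independent Gaussians: width near 1.0 m, height near 1.5 m (assumed).
    return (-0.5 * ((width - 1.0) / 0.2) ** 2
            - 0.5 * ((height - 1.5) / 0.3) ** 2)


def metropolis(log_density, start, n_samples, step=0.1, seed=0):
    """Random-walk Metropolis sampler over a tuple of real parameters."""
    rng = random.Random(seed)
    current = start
    current_lp = log_density(*current)
    samples = []
    for _ in range(n_samples):
        # Propose a small Gaussian perturbation of every parameter.
        proposal = tuple(x + rng.gauss(0.0, step) for x in current)
        proposal_lp = log_density(*proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


# Draw window shapes from the prior and summarise them.
samples = metropolis(log_prior, start=(1.0, 1.5), n_samples=5000)
mean_width = sum(w for w, _ in samples) / len(samples)
mean_height = sum(h for _, h in samples) / len(samples)
```

Inspecting the sampled shapes plays the role of checking that the prior generates plausible primitives. Under the same assumptions, passing a function that returns the log-prior plus an image log-likelihood would make the sampler draw from the posterior instead, yielding a range of candidate solutions rather than a single estimate.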

architectural modelling, structure and motion, object recognition





Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • A.R. Dick (1)
  • P.H.S. Torr (2)
  • R. Cipolla (1)

  1. Department of Engineering, University of Cambridge, Cambridge, UK
  2. Department of Computing, Oxford Brookes University, Wheatley, Oxford, UK
