Learning Structural Similarity of User Interface Layouts Using Graph Networks

Manandhar, Dipu; Ruta, Dan; Collomosse, John

doi:10.1007/978-3-030-58542-6_44

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12367)

Included in the following conference series:

European Conference on Computer Vision

Abstract

We propose a novel representation learning technique for measuring the similarity of user interface designs. A triplet network is used to learn a search embedding for layout similarity, with a hybrid encoder-decoder backbone comprising a graph convolutional network (GCN) encoder and a convolutional (CNN) decoder. The properties of interface components and their spatial relationships are encoded via a graph, which also models the containment (nesting) relationships of interface components. We supervise training with a dual reconstruction and pair-wise loss, using an auxiliary measure of layout similarity based on intersection over union (IoU) distance. The resulting embedding is shown to exceed state-of-the-art performance for visual search of user interface layouts over the public Rico dataset and an auto-annotated dataset of interface layouts collected from the web. We release the code and dataset (https://github.com/dips4717/gcn-cnn).
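To make the two ingredients named in the abstract concrete, the following is a minimal sketch of an IoU score between two axis-aligned component boxes and of a margin-based triplet loss over embedding vectors. It is an illustration under stated assumptions, not the authors' implementation: the function names `box_iou` and `triplet_loss`, the `(x1, y1, x2, y2)` box format, the squared Euclidean distance, and the margin value of 0.2 are all hypothetical choices, and the paper's layout-level IoU measure may be computed differently.

```python
from typing import Sequence, Tuple

# Assumed box format (not from the paper): (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def triplet_loss(anchor: Sequence[float],
                 positive: Sequence[float],
                 negative: Sequence[float],
                 margin: float = 0.2) -> float:
    """Hinge-style triplet loss: the positive should be closer to the anchor
    than the negative by at least `margin` (an assumed value)."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(triplet_loss([0.0, 0.0], [0.1, 0.0], [3.0, 4.0]))
```

In a setup of this kind, an IoU-style score between layouts could be used to decide which training examples count as positives and negatives for a given anchor, while the triplet loss shapes the embedding so that similar layouts lie close together.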

Author information

Authors and Affiliations

CVSSP, University of Surrey, Guildford, UK
Dipu Manandhar, Dan Ruta & John Collomosse
Adobe Research, Creative Intelligence Lab, San Jose, CA, USA
John Collomosse

Corresponding author

Correspondence to Dipu Manandhar.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

About this paper

Cite this paper

Manandhar, D., Ruta, D., Collomosse, J. (2020). Learning Structural Similarity of User Interface Layouts Using Graph Networks. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12367. Springer, Cham. https://doi.org/10.1007/978-3-030-58542-6_44

DOI: https://doi.org/10.1007/978-3-030-58542-6_44
Published: 17 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58541-9
Online ISBN: 978-3-030-58542-6
eBook Packages: Computer Science, Computer Science (R0)
