A Method for Integrating Interfaces Based on Cluster Ensemble in Digital Library Federation

Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 269)


Digital library federations increasingly need to integrate multiple query interfaces into a single interface for users. Because different interfaces describe the same concept in different ways and the number of interfaces is large, it is hard to provide complete and exact domain knowledge. Clustering methods are therefore usually adopted to generate an integrated interface. However, over the same set of properties, clustering results may differ depending on the clustering algorithm used or on the parameter settings of a single algorithm. A cluster ensemble, which merges multiple clustering results, can yield a more complete and exact integrated interface. In this paper, based on the principle of cluster ensemble, we propose a single clustering algorithm with uncertainty, which accounts for the fact that one property may belong to more than one possible cluster division during integration. We also propose a cluster-fusing algorithm that produces a cluster ensemble satisfying the requirements of interface integration, and it shows better performance than existing methods.
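The idea of merging several clustering results can be illustrated with a generic co-association consensus. The sketch below is not the algorithm proposed in the paper: the function name `consensus_clusters`, the majority threshold, and the sample attribute names are all assumptions made for illustration. Two properties are placed in the same final cluster when they share a cluster in more than half of the input clusterings.

```python
from itertools import combinations


def consensus_clusters(items, partitions, threshold=0.5):
    """Merge several clusterings of the same items by majority co-association.

    Hypothetical helper for illustration only; not the paper's method.
    `partitions` is a list of label lists, one label per item.
    """
    n = len(items)
    # co[i][j] counts the partitions in which items i and j share a label.
    co = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                co[i][j] += 1

    # Union-find: link pairs co-clustered in more than `threshold` of the runs.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] / len(partitions) > threshold:
            parent[find(j)] = find(i)

    groups = {}
    for idx, item in enumerate(items):
        groups.setdefault(find(idx), []).append(item)
    return sorted(groups.values())


# Assumed sample data: five interface properties and three clustering runs.
items = ["title", "book title", "author", "writer", "isbn"]
partitions = [
    [0, 0, 1, 1, 2],  # run 1
    [0, 0, 1, 2, 2],  # run 2 groups "writer" with "isbn"
    [0, 1, 2, 2, 3],  # run 3 splits "title" from "book title"
]
print(consensus_clusters(items, partitions))
```

In this sample, each run disagrees with the others on one pair, yet the majority vote still groups "title" with "book title" and "author" with "writer". A hard majority vote discards the uncertainty that the abstract highlights, so it should be read only as a baseline for the ensemble idea.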


Keywords: Interface integration; Deep web; Cluster ensemble; Uncertainty; Digital library



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. School of Computing Science and Technology, Shandong University, Shandong, China
  2. School of Information Science and Engineering, Shandong Normal University, Shandong, China
