Carbon: Domain-Independent Automatic Web Form Filling

  • Samur Araujo
  • Qi Gao
  • Erwin Leonardi
  • Geert-Jan Houben
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6189)


Web forms are the main input mechanism for users to supply data to web applications. Users fill out forms to, for example, sign up for social network applications or run advanced searches in search-based web applications. This process is highly repetitive and can be optimized by reusing the user’s data across web forms. In this paper, we present a novel framework for domain-independent automatic form filling. The main task is to automatically fill in a correct value for each field in a new form, based on web forms the user has previously filled out. The key innovation of our approach is that we are able to extract relevant metadata from the previously filled forms, semantically enrich it, and use it for aligning fields between web forms.
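The abstract does not spell out how fields are aligned, so the following is only a minimal sketch of the general idea, not the Carbon implementation. It assumes that field labels are the only metadata available, that a small hand-written synonym table (`SYNONYM_GROUPS`, hypothetical) can stand in for semantic enrichment, and that plain string similarity from Python's `difflib` is good enough for alignment. All names, the threshold, and the sample values are illustrative.

```python
from difflib import SequenceMatcher

# Hypothetical synonym groups standing in for semantic enrichment.
# Entries are already in normalized form (lowercase, space-separated).
SYNONYM_GROUPS = [
    {"surname", "last name", "family name"},
    {"email", "e mail", "email address"},
    {"zip", "zip code", "postal code", "postcode"},
]


def normalize(label):
    """Lowercase a field label and turn '_' and '-' into single spaces."""
    cleaned = label.lower().replace("_", " ").replace("-", " ")
    return " ".join(cleaned.split())


def enrich(label):
    """Return the normalized label plus any synonyms from its group."""
    norm = normalize(label)
    terms = {norm}
    for group in SYNONYM_GROUPS:
        if norm in group:
            terms |= group
    return terms


def similarity(label_a, label_b):
    """Best string similarity between any pair of enriched terms (0.0 to 1.0)."""
    return max(
        SequenceMatcher(None, x, y).ratio()
        for x in enrich(label_a)
        for y in enrich(label_b)
    )


def fill_form(new_fields, history, threshold=0.8):
    """Suggest a value for each new field whose best-matching
    previously filled field scores at least `threshold`."""
    filled = {}
    for field in new_fields:
        best_label, best_score = None, 0.0
        for known in history:
            score = similarity(field, known)
            if score > best_score:
                best_label, best_score = known, score
        if best_label is not None and best_score >= threshold:
            filled[field] = history[best_label]
    return filled


# Made-up data: values the user entered in earlier forms, keyed by field label.
history = {"last_name": "Doe", "e-mail": "user@example.org", "city": "Delft"}
suggestions = fill_form(["Surname", "Email address", "Phone"], history)
print(suggestions)
```

In this sketch, "Surname" and "Email address" are aligned with previously filled fields through the synonym table, while "Phone" has no sufficiently similar counterpart and is left empty. A real system would draw on richer metadata and an external knowledge source rather than a fixed table.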


Keywords: auto-filling, auto-completion, concept mapping, web forms, semantic web



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Samur Araujo (1)
  • Qi Gao (1)
  • Erwin Leonardi (1)
  • Geert-Jan Houben (1)
  1. Delft University of Technology, Delft, The Netherlands
