Fast, Accurate Creation of Data Validation Formats by End-User Developers

  • Chris Scaffidi
  • Brad Myers
  • Mary Shaw
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5435)


Inputs to web forms often contain typos or other errors. However, existing web form design tools require end-user developers to write regular expressions (“regexps”) or even scripts to validate inputs, which is slow and error-prone because of the poor match between common data types and the regexp notation. We present a new technique enabling end-user developers to describe data as a series of constrained parts, and we have incorporated our technique into a prototype tool. Using this tool, end-user developers can create validation code more quickly and accurately than with existing techniques, finding 90% of invalid inputs in a lab study. This study and our evaluation of the technique’s generality have motivated several tool improvements, which we have implemented and now evaluate using the Cognitive Dimensions framework.
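To make the idea of describing data "as a series of constrained parts" concrete, here is a minimal sketch. It is not the authors' tool or notation: the `Part` class, the `validate` function, and the phone-number rules are hypothetical names and assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Part:
    """One named part of a format (hypothetical structure, not from the paper)."""
    name: str
    length: int                        # exact number of characters in this part
    constraint: Callable[[str], bool]  # rule the part's text must satisfy
    separator: str = ""                # literal text expected right after the part


def validate(text: str, parts: List[Part]) -> Optional[str]:
    """Return None if text matches the parts in order, else a message naming the failure."""
    pos = 0
    for part in parts:
        chunk = text[pos:pos + part.length]
        if len(chunk) < part.length:
            return f"{part.name}: input too short"
        if not part.constraint(chunk):
            return f"{part.name}: '{chunk}' violates constraint"
        pos += part.length
        if not text.startswith(part.separator, pos):
            return f"{part.name}: expected '{part.separator}' after it"
        pos += len(part.separator)
    if pos != len(text):
        return "unexpected trailing characters"
    return None


def is_digits(s: str) -> bool:
    return s.isascii() and s.isdigit()


# Assumed example format: a US-style phone number written as three parts.
PHONE = [
    Part("area code", 3, lambda s: is_digits(s) and s[0] not in "01", "-"),
    Part("exchange", 3, lambda s: is_digits(s) and s[0] not in "01", "-"),
    Part("line number", 4, is_digits),
]

if __name__ == "__main__":
    for candidate in ["412-268-3000", "012-268-3000", "412 268-3000"]:
        print(candidate, "->", validate(candidate, PHONE) or "valid")
```

Unlike a single regular expression, each part in this sketch carries its own name and rule, so a failed check can report which part of the input is wrong.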


Keywords: data validation, web macros, web applications





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Chris Scaffidi, Carnegie Mellon University, Pittsburgh, USA
  • Brad Myers, Carnegie Mellon University, Pittsburgh, USA
  • Mary Shaw, Carnegie Mellon University, Pittsburgh, USA
