InfoHarness: Use of automatically generated metadata for search and retrieval of heterogeneous information

  • Leon Shklar
  • Amit Sheth
  • Vipul Kashyap
  • Kshitij Shah
User Interface Issues
Part of the Lecture Notes in Computer Science book series (LNCS, volume 932)


The InfoHarness system is aimed at providing integrated and rapid access to huge amounts of heterogeneous information independent of its type, representation, and location. This is achieved by extracting metadata and associating it with the original information. The metadata extraction methods ensure rapid and largely automatic creation of information repositories. A stable hierarchy of abstract classes is proposed to organize the processing and representation needs of different kinds of information. An extensible hierarchy of terminal classes simplifies support for new information types and utilization of new indexing technologies. InfoHarness repositories may be accessed through Mosaic or any other HyperText Transfer Protocol (HTTP) compliant browser.
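The two-level class organization described in the abstract can be sketched as follows. This is only an illustration and not the InfoHarness implementation: the abstract does not name its classes, so `Metadata`, `InformationType`, `TextType`, `PlainText`, `MailMessage`, `register`, and `build_repository` are all hypothetical names, and the extraction rules are deliberately trivial. The point it tries to show is that the abstract layer stays fixed while terminal classes are added by registration, and that each metadata record keeps a pointer to the original information rather than a copy of it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Metadata:
    """Hypothetical metadata record associated with the original information."""
    location: str     # where the original information lives (it is not copied)
    kind: str         # name of the terminal class that produced this record
    attributes: dict  # extracted attribute/value pairs


class InformationType(ABC):
    """Abstract layer (assumed): what every information type must provide."""

    @abstractmethod
    def extract(self, location: str, raw: str) -> Metadata:
        """Produce a metadata record for one piece of information."""


class TextType(InformationType):
    """Abstract class for textual information; shared processing lives here."""

    def word_count(self, raw: str) -> int:
        return len(raw.split())


# Terminal classes are looked up by name, so supporting a new information
# type means registering one more class without touching the abstract layer.
TERMINAL_CLASSES: dict = {}


def register(cls):
    TERMINAL_CLASSES[cls.__name__] = cls
    return cls


@register
class PlainText(TextType):
    def extract(self, location, raw):
        lines = raw.splitlines()
        title = lines[0].strip() if lines else ""
        return Metadata(location, "PlainText",
                        {"title": title, "words": self.word_count(raw)})


@register
class MailMessage(TextType):
    def extract(self, location, raw):
        subject = ""
        for line in raw.splitlines():
            if line.lower().startswith("subject:"):
                subject = line.split(":", 1)[1].strip()
                break
        return Metadata(location, "MailMessage",
                        {"subject": subject, "words": self.word_count(raw)})


def build_repository(items):
    """Build a list of metadata records from (kind, location, raw) triples."""
    return [TERMINAL_CLASSES[kind]().extract(location, raw)
            for kind, location, raw in items]
```

Under these assumptions, a repository is built by calling `build_repository` with one triple per item, for example `("MailMessage", "/mail/1", "Subject: Status\n\nAll good")`; the returned records can then be indexed or searched without reading the originals again.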


Keywords: Digital Medium, Class Hierarchy, Latent Semantic Indexing, Information Unit, Abstract Class
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 1995

Authors and Affiliations

  • Leon Shklar (1, 3)
  • Amit Sheth (2)
  • Vipul Kashyap (2, 3)
  • Kshitij Shah (1, 3)
  1. Bell Communications Research, Piscataway
  2. LSDIS, Department of Computer Science, University of Georgia, Athens
  3. Computer Science Department, Rutgers University, New Brunswick
