Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Masanès, J. (2006). Archiving the Hidden Web. In: Web Archiving. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-46332-0_5
DOI: https://doi.org/10.1007/978-3-540-46332-0_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23338-1
Online ISBN: 978-3-540-46332-0
eBook Packages: Computer Science (R0)