Abstract
The World-Wide Web hosts numerous communities, each focusing on a particular topic. As such communities proliferate, so do efforts to build community portals. Most current portals are organized according to topic taxonomies. Recently, however, there has been a growing effort to build structured data portals (e.g., IMDB, Citeseer) that present a unified view of entities and relationships in the community. Such portals can prove extremely valuable in a wide range of domains. But how can we build them efficiently?
In this talk, I will present a new research vision that addresses this question. The goal is to develop a system that a small team (or ideally just one person) can quickly deploy to build an initial (but already useful) structured portal, and then to leverage the entire community, through mass collaboration, to improve and expand this portal. This research agenda therefore requires combining and extending research in information extraction, information integration, and Web 2.0 technologies, among others. The agenda is being actively pursued in the Cimple project, a joint effort between the University of Wisconsin and Yahoo Research. In the talk I will describe recent progress in Cimple, portal prototypes, lessons learned, and future directions. I will focus in particular on how Cimple raises interesting and novel challenges for both AI and database research. More information about Cimple can be found at www.cs.wisc.edu/~anhai/projects/cimple.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Doan, AH. (2008). Building Structured Web Community Portals Via Extraction, Integration, and Mass Collaboration. In: Ho, TB., Zhou, ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science, vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_3
DOI: https://doi.org/10.1007/978-3-540-89197-0_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89196-3
Online ISBN: 978-3-540-89197-0
eBook Packages: Computer Science, Computer Science (R0)