Beyond Search: Web-Scale Business Analytics
We discuss the novel problem of supporting analytical business intelligence queries over web-based textual content, e.g., BI-style reports based on hundreds of thousands of documents from an ad-hoc web search result. Neither conventional search engines nor conventional Business Intelligence and ETL tools address this problem, which lies at the intersection of their capabilities. This application is an exciting challenge that should appeal to and benefit from several research communities, most notably the database, text analytics, and distributed systems communities. For example, to provide fast answers to such queries, cloud computing techniques need to be combined with text analytics, data cleansing, query processing, and query refinement methods. However, the envisioned path for OLAP-style query processing over textual web data may take a long time to mature. Two recent developments have the potential to become key components of such an ad-hoc analysis platform: significant improvements in cloud computing query languages and advances in self-supervised information extraction techniques. In this talk, I will give an informative and practical look at the underlying research challenges in supporting "Web-Scale Business Analytics" applications, with a focus on their key components, and will highlight recent projects.