We are pleased to present a special issue of Data Science and Engineering (DSE), which contains a collection of four extended papers from the APWeb-WAIM 2019 conference. Besides these four special issue papers, this DSE issue also has one survey paper and one regular research paper.

The APWeb-WAIM conferences focus on research, development, and applications related to Web information management, covering a wide range of topics such as text analysis, graph data processing, social networks, recommender systems, information retrieval, data streams, knowledge graphs, data mining and applications, query processing, machine learning, database and Web applications, big data, and blockchain. APWeb-WAIM 2019 was held in Chengdu during August 1–3, 2019, and attracted a total of 180 research paper submissions. The conference program committee selected 42 full research papers, 17 short papers, and 6 demonstration papers to be presented at the conference and published in the conference proceedings [1, 2]. The conference program also included keynote presentations by Prof. Divesh Srivastava (AT&T Labs-Research, USA), Prof. Xindong Wu (Mininglamp Technology, China), Prof. Christian S. Jensen (Aalborg University, Denmark), and Prof. Guoliang Li (Tsinghua University, China).

The four extended papers for this special issue were selected from among all the accepted papers by the special issue guest editors Dongxiang Zhang, Wei Wang, Bin Cui, and Heng Tao Shen, based on their relevance to the journal and the reviews of the conference versions of the papers. The authors were asked to revise their conference papers for journal publication, following the customary practice of adding 30% new material. The revised papers then went through the review process again in accordance with DSE guidelines and are presented to the readers in their final form.

The four extended papers in this special issue cover a variety of topics related to data science and engineering. In the first paper, “FreshJoin: An Efficient and Adaptive Algorithm for Set Containment Join,” the authors revisit the set containment join (SCJ) problem and propose a new adaptive, parameter-free, in-memory algorithm that significantly reduces space overhead without harming query execution time. The second paper, “Which Category Is Better: Benchmarking Relational and Graph Database Management Systems,” presents an experimental study that benchmarks relational and graph database management systems to identify the queries for which each category is better suited. In the third paper, “Leveraging Domain Context for Question Answering over Knowledge Graph,” the authors leverage additional context information for questions and candidate answers and design a cross-attention mechanism to improve performance. Finally, in “Discovering Latent Threads in Entity Histories,” the authors develop an effective approach to entity categorization based on historical similarity.

We hope that readers enjoy this special issue. We would like to acknowledge the work done by all the authors and their willingness to contribute their papers to this special issue. We thank all the reviewers for their expert comments and timely reviews. Finally, we thank the DSE editors X. Sean Wang and Elisa Bertino for their guidance and support throughout this process.

Dongxiang Zhang

Wei Wang

Bin Cui

Heng Tao Shen