Adaptive Rule Adaptation in Unstructured and Dynamic Environments

Tabebordbar, Alireza; Beheshti, Amin; Benatallah, Boualem; Barukh, Moshe Chai

doi:10.1007/978-3-030-34223-4_21

Conference paper
First Online: 29 October 2019


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11881)

Abstract

Rule-based systems have been used to augment machine-learning-based algorithms for annotating data in unstructured and dynamic environments. Rules can alleviate many of the shortcomings inherent in purely algorithmic approaches. However, rule adaptation is a challenging and error-prone task: in a rule-based system, an analyst must adapt rules to keep them applicable and precise. In this paper, we present an approach for adapting data annotation rules in unstructured and constantly changing environments. Our approach relieves analysts of adapting rules and autonomically identifies the optimal modification for rules using a Bayesian multi-armed-bandit algorithm. We conduct experiments on different curation domains and compare the performance of our approach with that of systems relying on analysts. The experimental results show that our approach achieves performance comparable to that of analysts in adapting rules.
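To illustrate the kind of Bayesian multi-armed-bandit selection the abstract refers to, the following is a minimal sketch and not the authors' implementation. It assumes Thompson sampling with a Beta-Bernoulli model, which the abstract does not specify. The arm names (`add_keyword`, `remove_keyword`, `keep_rule`), the class `BetaBernoulliBandit`, the function `simulate`, and the simulated precision values are all hypothetical. Each arm stands for one candidate modification of a rule, and a reward of 1 means an annotation produced by the modified rule was judged correct.

```python
import random


class BetaBernoulliBandit:
    """Thompson sampling over a fixed set of candidate rule modifications (hypothetical)."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior, i.e. uniform, on each arm's success probability.
        self.alpha = {arm: 1.0 for arm in arms}
        self.beta = {arm: 1.0 for arm in arms}

    def select(self):
        # Draw one sample per arm from its posterior and play the largest draw.
        draws = {
            arm: self.rng.betavariate(self.alpha[arm], self.beta[arm])
            for arm in self.alpha
        }
        return max(draws, key=draws.get)

    def update(self, arm, reward):
        # reward is 1 if the annotation was judged correct, otherwise 0.
        if reward:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0

    def posterior_mean(self, arm):
        return self.alpha[arm] / (self.alpha[arm] + self.beta[arm])


def simulate(true_precision, rounds=2000, seed=0):
    """Run the bandit against simulated feedback (assumed, not real annotation data)."""
    bandit = BetaBernoulliBandit(list(true_precision), seed=seed)
    feedback = random.Random(seed + 1)
    pulls = {arm: 0 for arm in true_precision}
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1 if feedback.random() < true_precision[arm] else 0
        bandit.update(arm, reward)
        pulls[arm] += 1
    return bandit, pulls


if __name__ == "__main__":
    # Hypothetical precisions for three candidate modifications of one rule.
    precisions = {"add_keyword": 0.9, "remove_keyword": 0.5, "keep_rule": 0.3}
    bandit, pulls = simulate(precisions)
    for arm in precisions:
        print(arm, pulls[arm], round(bandit.posterior_mean(arm), 3))
```

In this sketch the bandit gradually concentrates its pulls on the modification with the highest observed precision while still occasionally trying the others, which is the exploration and exploitation trade-off that makes a bandit formulation attractive when feedback arrives incrementally.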

Notes

1. Annotates data with a precision below \(\jmath\).
2. As feature \(f_1\) is the root feature, it annotates data above the average and thus satisfies the restriction condition.
3. Features \(\{f_3,f_4\}\) are siblings of feature \(f_2\).
4. Annotates data below the average number of items annotated by its siblings.
5. https://www.figure-eight.com/.
6. @lucianaberger: Mental health services facing serious shortages of mental health nurses decrease of 12% since 2010 psychiatrists.
7. If i have to hire a car and drive home from belgium i am going to go mental stupid french air traffic control wanks on strike.
8. We set the values of \(\epsilon\) and Q experimentally, using simulated data.

Acknowledgements

We acknowledge the AI-enabled Processes (AIP) Research Centre for funding part of this research.

We acknowledge the Data to Decisions CRC (D2D CRC) and the Cooperative Research Centres Program for funding part of this research.

Author information

Authors and Affiliations

University of New South Wales, Sydney, Australia
Alireza Tabebordbar, Boualem Benatallah & Moshe Chai Barukh
Macquarie University, Sydney, Australia
Amin Beheshti

Corresponding author

Correspondence to Alireza Tabebordbar.

Editor information

Editors and Affiliations

University of Hong Kong, Hong Kong SAR, China
Reynold Cheng
University of Ioannina, Ioannina, Greece
Nikos Mamoulis
University of California, Los Angeles, CA, USA
Yizhou Sun
Hong Kong Baptist University, Kowloon Tong, Hong Kong SAR, China
Xin Huang

Cite this paper

Tabebordbar, A., Beheshti, A., Benatallah, B., Barukh, M.C. (2019). Adaptive Rule Adaptation in Unstructured and Dynamic Environments. In: Cheng, R., Mamoulis, N., Sun, Y., Huang, X. (eds) Web Information Systems Engineering – WISE 2019. WISE 2020. Lecture Notes in Computer Science, vol 11881. Springer, Cham. https://doi.org/10.1007/978-3-030-34223-4_21


DOI: https://doi.org/10.1007/978-3-030-34223-4_21
Published: 29 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34222-7
Online ISBN: 978-3-030-34223-4
eBook Packages: Computer Science, Computer Science (R0)
