
Adaptive Rule Adaptation in Unstructured and Dynamic Environments

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11881)

Abstract

Rule-based systems have been used to augment machine-learning-based algorithms for annotating data in unstructured and dynamic environments. Rules can alleviate many of the shortcomings inherent in purely algorithmic approaches. However, rule adaptation is a challenging and error-prone task: in a rule-based system, an analyst must adapt rules to keep them applicable and precise. In this paper, we present an approach for adapting data annotation rules in unstructured and constantly changing environments. Our approach relieves analysts of the task of adapting rules and autonomically identifies the optimal modification for rules using a Bayesian multi-armed-bandit algorithm. We conduct experiments on different curation domains and compare the performance of our approach with systems relying on analysts. The experimental results show that our approach performs comparably to analysts in adapting rules.
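To make the idea of choosing a rule modification with a Bayesian multi-armed bandit more concrete, the following is a minimal sketch, not the paper's implementation. It assumes Thompson sampling with Beta-Bernoulli posteriors as the Bayesian bandit variant, which the abstract does not specify. The arm names (`add_keyword`, `remove_keyword`, `keep_rule`) and the success rates are hypothetical, and the reward is simulated: a reward of 1 stands for an annotation judged correct after applying a candidate modification.

```python
import random


class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over a fixed set of arms.

    Each arm stands for one candidate rule modification (hypothetical names).
    """

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior: one pseudo-success and one pseudo-failure per arm.
        self.successes = {arm: 1 for arm in arms}
        self.failures = {arm: 1 for arm in arms}

    def select(self):
        """Draw one sample from each arm's posterior and pick the largest."""
        samples = {
            arm: self.rng.betavariate(self.successes[arm], self.failures[arm])
            for arm in self.successes
        }
        return max(samples, key=samples.get)

    def update(self, arm, reward):
        """Record a binary reward (True = annotation judged correct)."""
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1

    def posterior_mean(self, arm):
        a, b = self.successes[arm], self.failures[arm]
        return a / (a + b)


def simulate(true_rates, rounds=2000, seed=0):
    """Run the sampler against simulated Bernoulli feedback.

    true_rates maps each arm to an assumed probability that the
    modification yields a correct annotation (illustrative values only).
    """
    feedback_rng = random.Random(seed)
    sampler = ThompsonSampler(list(true_rates), seed=seed + 1)
    pulls = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        arm = sampler.select()
        reward = feedback_rng.random() < true_rates[arm]
        sampler.update(arm, reward)
        pulls[arm] += 1
    return sampler, pulls


if __name__ == "__main__":
    rates = {"add_keyword": 0.8, "remove_keyword": 0.4, "keep_rule": 0.5}
    sampler, pulls = simulate(rates)
    for arm in rates:
        print(arm, pulls[arm], round(sampler.posterior_mean(arm), 3))
```

In this sketch the sampler gradually concentrates its choices on the modification with the highest observed success rate while still occasionally trying the others, which is the exploration and exploitation trade-off a bandit formulation is meant to handle.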


Notes

  1. Annotates data with a precision below \(\jmath\).

  2. As feature \(f_1\) is the root feature, it annotates data above the average and thus satisfies the restriction condition.

  3. Features \(\{f_3, f_4\}\) are siblings of feature \(f_2\).

  4. Annotates fewer items than the average number annotated by its siblings.

  5. https://www.figure-eight.com/

  6. @lucianaberger: Mental health services facing serious shortages of mental health nurses decrease of 12% since 2010 psychiatrists.

  7. If i have to hire a car and drive home from belgium i am going to go mental stupid french air traffic control wanks on strike.

  8. We set the values of \(\epsilon\) and \(Q\) experimentally, using simulated data.


Acknowledgements

We acknowledge the AI-enabled Processes (AIP) Research Centre for funding part of this research.

We acknowledge the Data to Decisions CRC (D2D CRC) and the Cooperative Research Centres Program for funding part of this research.

Author information

Corresponding author

Correspondence to Alireza Tabebordbar.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Tabebordbar, A., Beheshti, A., Benatallah, B., Barukh, M.C. (2019). Adaptive Rule Adaptation in Unstructured and Dynamic Environments. In: Cheng, R., Mamoulis, N., Sun, Y., Huang, X. (eds) Web Information Systems Engineering – WISE 2019. WISE 2020. Lecture Notes in Computer Science, vol 11881. Springer, Cham. https://doi.org/10.1007/978-3-030-34223-4_21

  • DOI: https://doi.org/10.1007/978-3-030-34223-4_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34222-7

  • Online ISBN: 978-3-030-34223-4

  • eBook Packages: Computer Science, Computer Science (R0)
