A FCA framework for inference control in data integration systems

Distributed and Parallel Databases

Abstract

Specifying a global access control policy in a data integration system using traditional methods does not necessarily offer a sound and efficient solution to the inference problem, because data dependencies between distributed data sets are not taken into account when local policies are defined. In this paper, we propose a methodology, together with a set of algorithms, that helps to detect inferences efficiently by considering semantic constraints. The approach uses formal concept analysis (FCA) as a representation framework. Given a set of local policies, an initial global policy, and data dependencies, the methodology allows the security administrator to derive the sets of queries that, when combined, could disclose sensitive information. Such a set of queries is said to constitute an inference channel. We use FCA to identify these illegal query sets, known as disclosure transactions. We then propose a run-time solution that neutralizes all suspicious queries while ensuring a trade-off between data protection and data availability. By combining prime numbers with lattice theory, we keep traces of previously executed queries so that inferences are blocked at run-time. We also discuss a set of experiments that we conducted.

Notes

  1. https://goo.gl/1G7igX.

  2. https://goo.gl/AC8aDY.

  3. A disclosure transaction (DT) is a sequence of queries such that, if they are evaluated and their results are combined, they lead to a security breach and thus violate an access control policy.

  4. http://www.bankmark.de.

Acknowledgements

This work is supported by Thomson Reuters in the framework of the Partner University Fund project: “Cybersecurity Collaboratory: Cyberspace Threat Identification, Analysis and Proactive Response”. The Partner University Fund is a program of the French Embassy in the United States and the FACE Foundation and is supported by American donors and the French government.

Author information

Correspondence to Mokhtar Sellami.

Appendix

A Security proofs

A.1 Correctness

Proof

We prove Theorem 1 by showing that if \(\exists Q* \subset Q\) such that \(Q*\in DT\), for which neither P.1 nor P.2 holds, then a contradiction arises. If P.1 does not hold, there is a label \(\ell u\{r,\{LockedQuery\}\}\) assigned to the top concept and \( Q* \notin \ell u\{r,\{LockedQuery\}\}\); that is, the query is not definitively locked for the user r and must be checked against the previously executed queries (lines 2–3). The algorithm then checks whether a disclosure transaction is completed by first computing the cumuli of previously executed queries (line 6). When the cumuli, composed of the set of queries labeled with \(\ell u\{r\}\), together with the current query \(Q*\) form a disclosure transaction, the if condition in line 7 is not satisfied. Thus P.2 does not hold: the algorithm stops and returns a message stating that the query is not allowed (revoked). Thus, a contradiction arises. \(\square \)

A.2 Completeness

Proof

We prove Theorem 2 by showing that if there exists a suspicious query Q such that \(Q* \subset Q\) and \(Q*\in DT\), for which neither P.1 nor P.2 holds, then a contradiction arises. If P.1 does not hold then, as in the proof of Theorem 1, there is a label \(\ell u\{r,\{LockedQuery\}\}\) assigned to the top concept and \( Q* \notin \ell u\{r,\{LockedQuery\}\}\); that is, the query is not definitively locked for the user r and must be checked against the previously executed queries (lines 2–3). The algorithm then checks whether a disclosure transaction is completed by first computing the cumuli of previously executed queries (line 6). When the cumuli of the previously executed queries (those labeled with \(\ell u\{r\}\)) and the suspicious query form a disclosure transaction, the if condition in line 7 is not satisfied. Thus P.2 does not hold: the algorithm should definitively lock the query by assigning a label \(\ell u\{r\}\) to the top concept and return a message stating that the query is not allowed (revoked). Thus, a contradiction arises, since the user is not authorized to execute the query. \(\square \)
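
The run-time check that both proofs walk through can be illustrated with a small Python sketch. It is not the authors' algorithm, whose listing is not part of this preview. It assumes one plausible reading of the abstract's "prime numbers" remark: each query gets a distinct prime, a user's history (the "cumuli") is the product of the primes of the queries already executed, and a disclosure transaction is completed when that product, extended with the current query, is divisible by the transaction's own product. The names QUERY_PRIME, DISCLOSURE_TRANSACTIONS and InferenceMonitor, and the example query sets, are hypothetical.

```python
from math import prod

# Hypothetical encoding: every query identifier maps to a distinct prime.
QUERY_PRIME = {"q1": 2, "q2": 3, "q3": 5, "q4": 7}

# Hypothetical disclosure transactions, each stored as the product of the
# primes of its member queries.
DISCLOSURE_TRANSACTIONS = [
    prod(QUERY_PRIME[q] for q in ("q1", "q3")),
    prod(QUERY_PRIME[q] for q in ("q2", "q3", "q4")),
]


class InferenceMonitor:
    """Tracks executed queries per user and blocks completed transactions."""

    def __init__(self):
        self.history = {}  # user -> product of primes of executed queries
        self.locked = {}   # user -> set of definitively locked queries

    def submit(self, user, query):
        """Return True if the query may run, False if it is revoked."""
        # First check (assumed): the query is already locked for this user.
        if query in self.locked.get(user, set()):
            return False

        prime = QUERY_PRIME[query]
        cumulus = self.history.get(user, 1)
        # Re-running a query must not multiply its prime in twice.
        candidate = cumulus if cumulus % prime == 0 else cumulus * prime

        # Second check (assumed): would history plus this query cover
        # every member of some disclosure transaction?
        if any(candidate % dt == 0 for dt in DISCLOSURE_TRANSACTIONS):
            self.locked.setdefault(user, set()).add(query)
            return False

        self.history[user] = candidate
        return True


monitor = InferenceMonitor()
print(monitor.submit("r", "q1"))  # allowed: no transaction is completed
print(monitor.submit("r", "q3"))  # revoked: q1 and q3 together form one
```

Under this encoding the membership test reduces to a single modulo operation per disclosure transaction, which is presumably the appeal of using primes. The sketch leaves out the concept lattice entirely and keeps the locked set in a plain dictionary rather than as a label on the top concept.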

Cite this article

Sellami, M., Hacid, MS. & Gammoudi, M.M. A FCA framework for inference control in data integration systems. Distrib Parallel Databases 37, 543–586 (2019). https://doi.org/10.1007/s10619-018-7241-5

  • DOI: https://doi.org/10.1007/s10619-018-7241-5
