Abstract
Specifying a global access control policy in a data integration system using traditional methods does not necessarily offer a sound and efficient solution to the inference problem, because data dependencies between distributed data sets are not taken into account when local policies are defined. In this paper, we propose a methodology, together with a set of algorithms, that helps to detect inferences efficiently by considering semantic constraints. The approach uses formal concept analysis (FCA) as a representation framework. Given a set of local policies, an initial global policy, and data dependencies, the methodology allows the security administrator to derive sets of queries that, when combined, could disclose sensitive information. Such a set of queries constitutes an inference channel. We use FCA to identify these illegal query sets, known as disclosure transactions. We then propose a run-time solution that neutralizes all suspicious queries while balancing data protection against data availability. By combining prime numbers with lattice theory, we keep traces of previously executed queries so that inferences are blocked at run-time. We also discuss a set of experiments that we conducted.
Notes
A disclosure transaction (DT) is a sequence of queries such that, if they are evaluated and their results are combined, they lead to a security breach and thus violate an access control policy.
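As an illustration of this definition, the following minimal sketch treats each disclosure transaction as a set of query identifiers and reports which transactions are fully covered by the queries a user has already executed. The function name `completed_transactions`, the identifiers `q1` to `q3`, and the simplification that the order of the queries does not matter are assumptions made for the example; they are not taken from the paper.

```python
def completed_transactions(executed, disclosure_transactions):
    """Return every disclosure transaction whose queries have all been executed.

    executed: iterable of query identifiers already evaluated for one user.
    disclosure_transactions: iterable of sets of query identifiers
        (hypothetical representation; query order is ignored here).
    """
    executed = set(executed)
    # A transaction is completed when it is a subset of the executed queries.
    return [dt for dt in disclosure_transactions if set(dt) <= executed]


# Hypothetical policy: q1 and q2 together would disclose sensitive data.
dts = [frozenset({"q1", "q2"})]
safe = completed_transactions(["q1", "q3"], dts)
breach = completed_transactions(["q1", "q3", "q2"], dts)
```

In this sketch `safe` is empty because `q2` has not been executed, whereas `breach` contains the transaction once `q2` joins the history.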
Acknowledgements
This work is supported by Thomson Reuters in the framework of the Partner University Fund project: “Cybersecurity Collaboratory: Cyberspace Threat Identification, Analysis and Proactive Response”. The Partner University Fund is a program of the French Embassy in the United States and the FACE Foundation and is supported by American donors and the French government.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
A Security proofs
A.1 Correctness
Proof
We prove Theorem 1 by showing that if \(\exists Q* \subset Q\) such that \(Q*\in DT\), for which neither P.1 nor P.2 holds, then a contradiction arises. If P.1 does not hold, then there is a label \(\ell u\{r,\{LockedQuery\}\}\) assigned to the top concept and \( Q* \notin \ell u\{r,\{LockedQuery\}\}\); that is, the query is not definitively locked for the user r and must be checked against the previously executed queries (lines 2–3). The algorithm then checks whether a disclosure transaction is achieved by first computing the cumuli of the previously executed queries (line 6). When the cumuli, which are composed of the set of queries labeled with \(\ell u\{r\}\), and the current query \(Q*\) together form a disclosure transaction, the if condition in line 7 is not satisfied. Thus, P.2 does not hold. The algorithm stops and returns a message stating that the query is not allowed (revoked). Thus, a contradiction arises. \(\square \)
A.2 Completeness
Proof
We prove Theorem 2 by showing that if there exists a suspicious query Q such that \(Q* \subset Q\) and \(Q*\in DT\), for which neither P.1 nor P.2 holds, then a contradiction arises. If P.1 does not hold then, as in the proof of Theorem 1, there is a label \(\ell u\{r,\{LockedQuery\}\}\) assigned to the top concept and \( Q* \notin \ell u\{r,\{LockedQuery\}\}\); that is, the query is not definitively locked for the user r and must be checked against the previously executed queries (lines 2–3). The algorithm then checks whether a disclosure transaction is achieved by first computing the cumuli of the previously executed queries (line 6). When the cumuli of the previously executed queries (those labeled with \(\ell u\{r\}\)) and the suspicious query together form a disclosure transaction, the if condition in line 7 is not satisfied. Thus, P.2 does not hold. The algorithm should then lock the query definitively by assigning a label \(\ell u\{r\}\) to the top concept and return a message stating that the query is not allowed (revoked). Thus, a contradiction arises, since the user is not authorized to execute the query. \(\square \)
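The run-time behaviour that both proofs rely on (reject a locked query, otherwise combine the user's past queries with the current one and test for a disclosure transaction) can be sketched as follows. This is not the paper's algorithm: the class `InferenceMonitor`, its fields, and the encoding (one distinct prime per query, a user's history stored as the product of those primes, and a divisibility test for each disclosure transaction) are assumptions chosen to illustrate how a prime-number encoding could keep a trace of executed queries. The concept lattice and its labels are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class InferenceMonitor:
    """Hypothetical per-user monitor; all names are illustrative assumptions."""
    query_primes: dict            # query id -> distinct prime number
    disclosure_transactions: list  # list of sets of query ids
    history: dict = field(default_factory=dict)  # user -> product of primes
    locked: dict = field(default_factory=dict)   # user -> set of locked queries

    def _code(self, queries):
        # Encode a set of queries as the product of their primes.
        product = 1
        for q in queries:
            product *= self.query_primes[q]
        return product

    def submit(self, user, query):
        """Return True if the query may run, False if it is revoked."""
        # First check: the query was already locked for this user.
        if query in self.locked.get(user, set()):
            return False
        prime = self.query_primes[query]
        past = self.history.get(user, 1)
        # Combine the past queries with the current one (no duplicate factor).
        candidate = past if past % prime == 0 else past * prime
        # Second check: the combination covers a whole disclosure transaction.
        for dt in self.disclosure_transactions:
            if query in dt and candidate % self._code(dt) == 0:
                self.locked.setdefault(user, set()).add(query)
                return False
        self.history[user] = candidate
        return True


monitor = InferenceMonitor(
    query_primes={"q1": 2, "q2": 3, "q3": 5},
    disclosure_transactions=[{"q1", "q2"}],
)
results = [monitor.submit("r", q) for q in ("q1", "q3", "q2", "q2")]
```

Because the primes are distinct, a transaction is completed exactly when the product encoding it divides the product encoding the user's history plus the current query. Histories are kept per user, so a query revoked for user `r` is still available to a user with a different history.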
Cite this article
Sellami, M., Hacid, MS. & Gammoudi, M.M. A FCA framework for inference control in data integration systems. Distrib Parallel Databases 37, 543–586 (2019). https://doi.org/10.1007/s10619-018-7241-5