A Maximum Variance Approach for Graph Anonymization

Nguyen, Hiep H.; Imine, Abdessamad; Rusinowitch, Michaël

doi:10.1007/978-3-319-17040-4_4

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 8930)

Abstract

Uncertain graphs, a form of uncertain data, have recently attracted considerable attention because they can represent the inherent uncertainty in collected data. Uncertain graphs pose challenges to conventional data processing techniques and open new research directions. Going in the reverse direction, this paper focuses on the problem of anonymizing a deterministic graph by converting it into an uncertain form. The paper first analyzes the drawbacks of a recent uncertainty-based anonymization scheme and then proposes Maximum Variance, a novel approach that provides a better trade-off between privacy and utility. To enable a fair comparison between graph anonymization schemes, the second contribution of this paper is a quantifying framework that assesses the privacy and utility scores of typical schemes in a unified space. Extensive experiments on three large real graphs show the effectiveness and efficiency of Maximum Variance.

Author information

Authors and Affiliations

LORIA/INRIA Nancy-Grand Est, Villers-lès-Nancy, France
Hiep H. Nguyen, Abdessamad Imine & Michaël Rusinowitch

Corresponding author

Correspondence to Hiep H. Nguyen.

Editor information

Editors and Affiliations

TELECOM Bretagne, Cesson Sévigné, France
Frédéric Cuppens
TELECOM SudParis, Evry, France
Joaquin Garcia-Alfaro
Dalhousie University, Halifax, Nova Scotia, Canada
Nur Zincir Heywood
University of Calgary, Calgary, Canada
Philip W. L. Fong

A Proof of Theorems

A.1 Proof of Theorem 1

Proof

We prove the result by induction on the number of edges $k$.

When $k=1$, there are two cases for $G_1$: $E_{G_1}=\{e_1\}$ and $E_{G_1}=\emptyset $. In both cases the distance $D(\mathcal {G}_1,G_1)$ takes only the values 0 and 1, with probabilities $p_1$ and $1-p_1$ in one order or the other, so $Var[D(\mathcal {G}_1,G_1)] = p_1(1-p_1)$, independent of $G_1$.

Assume that the result holds for up to $k-1$ edges, i.e. $Var[D(\mathcal {G}_{k-1},G_{k-1})] = \sum _{i=1}^{k-1} p_i(1-p_i)$ for all $G_{k-1} \sqsubseteq \mathcal {G}_{k-1}$. We need to prove that it also holds for $k$ edges. We use the subscripted notation $\mathcal {G}_k, G_k$ for the case of $k$ edges and consider two cases for $G_k$: $e_k \in G_k$ and $e_k \notin G_k$.

Case 1 ($e_k \in G_k$). By definition, $Var[D(\mathcal {G}_k, G_k)]$ is

$$\begin{aligned}
Var[D(\mathcal {G}_k,G_k)] &= \sum _{G'_k \sqsubseteq \mathcal {G}_k} Pr(G'_k) \left[ D(G'_k,G_k) - E[D(\mathcal {G}_k,G_k)] \right]^2 \\
&= \sum _{e_k \in G'_k} Pr(G'_k) \left[ D(G'_k,G_k) - E[D_k] \right]^2 + \sum _{e_k \notin G'_k} Pr(G'_k) \left[ D(G'_k,G_k) - E[D_k] \right]^2
\end{aligned}$$

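Here we use the shortened notations $D_k$ for $D(G'_k,G_k)$ and $E[D_k]$ for $E[D(\mathcal {G}_k,G_k)]$. In this case $E[D_k] = E[D_{k-1}] + (1-p_k)$, because $e_k$ is missing from $G'_k$ with probability $1-p_k$.

The first sum ranges over samples that contain $e_k$, for which $Pr(G'_k) = p_k Pr(G'_{k-1})$ and $D_k = D_{k-1}$. It equals $\sum _{G'_{k-1} \sqsubseteq \mathcal {G}_{k-1}} p_k Pr(G'_{k-1})[D_{k-1} - E[D_{k-1}] - (1-p_k)]^2$.

The second sum ranges over samples that omit $e_k$, for which $Pr(G'_k) = (1-p_k) Pr(G'_{k-1})$ and $D_k = D_{k-1} + 1$. It equals $\sum _{G'_{k-1} \sqsubseteq \mathcal {G}_{k-1}} (1-p_k) Pr(G'_{k-1})[D_{k-1} - E[D_{k-1}] + p_k]^2$.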

Expanding the squares, in which the cross terms cancel, and applying the induction hypothesis, we have $Var[D(\mathcal {G}_k,G_k)] = Var[D(\mathcal {G}_{k-1},G_{k-1})] + p_k(1-p_k) = \sum _{i=1}^{k} p_i(1-p_i)$.
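As a sketch of the algebra behind this step, write $X = D_{k-1} - E[D_{k-1}]$. The symbol $X$ is a shorthand introduced here for illustration and does not appear in the original proof. Adding the two sums and expanding the squares for each fixed $G'_{k-1}$ gives

$$\begin{aligned}
p_k \left[ X - (1-p_k) \right]^2 + (1-p_k) \left[ X + p_k \right]^2 &= X^2 + 2X \left[ -p_k(1-p_k) + (1-p_k)p_k \right] + p_k(1-p_k)^2 + (1-p_k)p_k^2 \\
&= X^2 + p_k(1-p_k)
\end{aligned}$$

Summing over all $G'_{k-1} \sqsubseteq \mathcal {G}_{k-1}$ with weights $Pr(G'_{k-1})$, the $X^2$ term yields $Var[D(\mathcal {G}_{k-1},G_{k-1})]$ and the constant term yields $p_k(1-p_k)$, since the weights sum to 1.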

Case 2 ($e_k \notin G_k$) is similar to Case 1. $\square $
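The statement proved above can also be checked numerically on a small instance. The following is a minimal sketch, not the authors' code. It assumes that edges exist independently with probabilities $p_i$ and that $D$ counts the edges on which two sample graphs differ, as the two sums in the proof suggest. The function names and the probability values are hypothetical.

```python
from itertools import product
from math import isclose, prod


def edit_distance_variance(probs, reference):
    """Exact Var[D(G', reference)] over all sample graphs G'.

    probs[i] is the existence probability p_i of edge e_i (assumed independent).
    reference[i] is True when e_i belongs to the fixed sample graph G.
    """
    mean = 0.0
    second_moment = 0.0
    for sample in product([False, True], repeat=len(probs)):
        # Probability of drawing this sample graph.
        pr = prod(p if present else 1.0 - p for p, present in zip(probs, sample))
        # Assumed distance: number of edges on which the two graphs differ.
        d = sum(s != r for s, r in zip(sample, reference))
        mean += pr * d
        second_moment += pr * d * d
    return second_moment - mean * mean


def closed_form(probs):
    """Closed form from the theorem: sum of p_i * (1 - p_i)."""
    return sum(p * (1.0 - p) for p in probs)


# Hypothetical edge probabilities for a 4-edge uncertain graph.
probs = [0.9, 0.5, 0.2, 0.7]

# Try every possible reference sample graph G (2^4 of them).
variances = [
    edit_distance_variance(probs, ref)
    for ref in product([False, True], repeat=len(probs))
]

# The variance should not depend on the choice of G.
assert all(isclose(v, closed_form(probs), abs_tol=1e-12) for v in variances)
```

The enumeration is exponential in the number of edges, so it only serves as a sanity check on toy inputs.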

Cite this paper

Nguyen, H.H., Imine, A., Rusinowitch, M. (2015). A Maximum Variance Approach for Graph Anonymization. In: Cuppens, F., Garcia-Alfaro, J., Zincir Heywood, N., Fong, P. (eds) Foundations and Practice of Security. FPS 2014. Lecture Notes in Computer Science, vol 8930. Springer, Cham. https://doi.org/10.1007/978-3-319-17040-4_4

DOI: https://doi.org/10.1007/978-3-319-17040-4_4
Published: 05 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17039-8
Online ISBN: 978-3-319-17040-4
eBook Packages: Computer Science, Computer Science (R0)
