Introduction

The Internet is reshaping traditional social interaction at a remarkable rate. It is estimated that billions of users have actively used online social networks (OSNs) so far [1]. As online social applications have grown in popularity, the scale of the social networks formed around WeChat, Weibo, Facebook, Twitter and many other applications has expanded, greatly enriching the diversity of human interaction and providing platforms for people to express their feelings [2]. These networks have also accumulated a vast amount of user information and interaction content, which can expose personal information and has in turn driven research on social networks. In OSNs, users set up profiles that include both structured and unstructured data, and the way these personal attributes are configured on different social platforms is an indicator of personal privacy. Moreover, the interaction between users across multiple social network platforms also poses a potential threat of privacy disclosure. It should be emphasized that when users share information on OSNs, a lack of privacy awareness may lead them to overlook access rights and sharing scope [3], which can result in negative privacy-related experiences [4].

To prevent privacy disclosure, social platforms implement a series of privacy policies to protect the security of personal data [7], such as restricting access to users’ data and designing general-sum stochastic game models and access-control policies [5, 6]. However, users’ social information is real-time and complex, and the information of different users is interrelated. Although these policies are intended to protect users’ privacy to the greatest extent possible, social platforms still cannot provide absolute security for users [8,9,10]. In addition, profile privacy settings on social platforms are subjective. Users differ in the access levels they choose, in their privacy awareness when setting attributes, and in their privacy requirements for individual attributes, which results in a range of privacy statuses. To measure the degree of privacy leakage more accurately from the users’ perspective, the best solution is to quantify each user’s privacy status and provide reasonable attribute configuration suggestions for each user.

In this study, a new privacy measurement model, PSFT, is proposed for solving the privacy measurement problem in multiple OSNs. The proposed approach is highly general and measures the factors that influence privacy disclosure scores more comprehensively and accurately. Existing privacy measurement methods do not consider the importance of attributes on different social platforms. We take attribute weight fully into account based on fuzzy theory [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44], enabling users to know their privacy status. At the same time, users can adjust their profile attribute settings according to the weight values. Compared with traditional methods, the main contributions of this work are as follows.

  • By evaluating the influence of attribute weight on privacy scores and combining it with the fuzzy TOPSIS decision method [12], we propose a computing model, PSFT, to obtain the ranking of users’ final privacy disclosure scores, which measures users’ privacy disclosure more accurately.

  • According to the ranking, users can gain a clearer understanding of their privacy disclosure. Users with a high degree of disclosure can reduce the disclosure risk by changing their profile attribute settings.

  • Users can reset the visibility of attributes according to attribute weight, thus increasing privacy awareness and reducing privacy concerns.

  • Extensive experiments have demonstrated the effectiveness of the PSFT model and its important improvements in reducing the privacy risk of users.

The rest of this paper is organized as follows. Existing work on privacy measurement is reviewed in section “Related work”. Section “Attributes privacy scores evaluation in OSNs” presents the evaluation of attribute privacy scores. In “TOPSIS-based online social network privacy scores calculation model”, the TOPSIS decision method and the overall privacy score calculation model PSFT are provided. Then “Evaluation” describes the data sets and the experimental analysis. Finally, a summary and future work are given in “Conclusions”.

Related work

Privacy measurement is a challenging issue. Several papers have provided definitions of privacy measurement [13,14,15]. In privacy protection research, privacy measurement usually reveals the risk of private information leakage through metrics or measurement methods. It reflects the strength of privacy protection methods from the perspective of the private information disclosed.

In social networks, many works have measured the degree of privacy by calculating privacy disclosure scores. Michael et al. [16] first proposed a model for calculating privacy scores based on sensitivity and visibility in 2010. However, they did not verify the effectiveness of the model on a data set. Liu and Terzi [19] proposed a mathematical model based on sensitivity and visibility using Item Response Theory [17], and demonstrated the effectiveness and practical utility of the model on synthetic and real data.

However, these models do not consider the relationships between attributes or whether they can be used to infer additional information about users. Petkos et al. presented Pscore in 2015 [18]. The framework can use data-driven mechanisms to infer information that is not explicitly displayed. Inferred information is particularly useful for helping users understand what attackers may learn from analysing their OSN presence data. Pensa and Blasi proposed a further data protection framework based on the model of Liu and Terzi [19, 20]. They calculated users’ privacy scores and set a threshold for each user. When the threshold was exceeded, the user was notified, and a learning-based mechanism helped exposed users customize their privacy settings. On this basis, Erfan applied a fuzzy inference system [29] and extended the calculation of users’ privacy scores to multiple social networks.

With regard to privacy setting configuration and privacy awareness in OSNs, providing indicators and mechanisms that promote personal privacy management has become an important issue [21, 22]. Related applications typically provide only a mechanism for configuring users’ privacy profiles. Most approaches focus on protecting profile information rather than the visibility of other information posted by users. Pensa made some suggestions for addressing these problems through the automation of privacy settings [20]. However, these methods often require some intervention from users and do not solve the problem of raising privacy awareness. To quantify the amount of information leaked inadvertently by users, Becker proposed privacy awareness in 2009 [24] and provided solutions to reduce information loss. Privacy awareness motivates users to take measures that lessen privacy risk in addition to decreasing information loss. The Xue Feng model calculated the impact of attribute content and privacy awareness on privacy scores and was verified on multiple social platforms. Experiments show that the Xue Feng model can reduce information loss [27].

Other studies have assessed users’ privacy in ways other than sensitivity and visibility. Srivastava et al. described and calculated the Privacy Quotient [23], which measures the privacy of a user’s profile using the naive approach. They also used the Item Response Theory model to measure the privacy leakage of messages and text. Talukder et al. proposed Privometer [25], a tool that measures privacy leakage in social networks, including leakage caused by malicious applications installed in the profiles of a user’s friends. Privometer ranks a user’s friends according to their individual contributions to privacy leakage and offers self-sanitization recommendations to reduce the leakage. Moreover, many studies in recent years have transformed real-world problems into mathematical models [38,39,40]. Kumar applied fuzzy theory to transportation problems and transformed the fuzzy transportation problem into a crisp problem. Furthermore, he presented algorithms for solving optimization problems using fuzzy and intuitionistic fuzzy sets, and found the optimal objective value [41,42,43].

In our model, we assess the impact of attribute weights on privacy disclosure scores across multiple social platforms, and introduce the fuzzy TOPSIS decision method to rank privacy scores. According to the ranking results and the weight values, users with higher privacy scores are more likely to leak private information, and they can reset the attributes with higher weight.

Attributes privacy scores evaluation in OSNs

On social media networks, user attributes can be extracted from profiles, posts, photographs, and videos. We evaluate attribute privacy disclosure by calculating attribute privacy scores. This research focuses on two metrics that influence attribute privacy scores, namely sensitivity and visibility. In our model, we measure the sensitivity of the information and then calculate the visibility based on four factors, namely accessibility of information, difficulty of data extraction, reliability of data and privacy awareness, all of which have a direct impact on visibility. Using Euclidean distance, the four factors are aggregated to find the visibility cluster center of each user [27]. Next, by aggregating sensitivity and visibility we obtain the attribute privacy scores. The calculation of users’ overall privacy scores and their ranking is introduced in the next section.

Consider a set of users \(\Upsilon =\{u_{1},u_{2},\ldots , u_{m}\}\), where each user has n attributes on s OSN platforms. By calculating the sensitivity \(\theta \) and visibility v, the attribute privacy scores p are obtained. The overall privacy scores are denoted by R. For ease of understanding, the main symbols used in this paper and their meanings are listed in Table 1.

Table 1 Notation

Sensitivity

Sensitivity refers to the influence of various attributes on different social platforms, reflects the most intuitive impact of attribute settings on privacy scores, and represents the potential harm of attackers to target users. The higher the sensitivity of the attribute, the stronger the concern after the disclosure of the attribute content. Srivastava et al. [23] calculated the sensitivity scores of 11 attributes based on the quotient model to measure the privacy scores. The results showed that address, political views and contact number were highly sensitive, while birthdate and current town were less sensitive. We use the sensitivity values derived by Srivastava et al. [23], as shown in Table 2.

Table 2 Attribute sensitivity

Visibility

Visibility represents how public the attribute information is. It depends on four factors: accessibility is the degree of access to a user’s information; extraction difficulty covers the difficulty of extracting both structured and unstructured data; reliability concerns the attributes and other information on which the accuracy of privacy measurement depends; and privacy awareness refers to the importance users attach to their privacy on social platforms. By evaluating accessibility, extraction difficulty, reliability and privacy awareness, we obtain the visibility of each attribute. The four factors have been quantified in [27], and the details are not repeated here. Euclidean distance is used to calculate users’ visibility [28]. Each user’s accessibility, extraction difficulty, reliability and privacy awareness form a four-dimensional vector. The vector is assigned to the classification center to which its Euclidean distance is smallest, and its visibility is the value represented by that center.

The formula for n-dimensional Euclidean distance is:

$$\begin{aligned} d(x,y)&=\sqrt{(x_{1}-y_{1})^{2}+(x_{2}-y_{2})^{2}+\cdots +(x_{n}-y_{n})^{2}}\nonumber \\&=\sqrt{\sum \nolimits _{i=1}^{n}(x_{i}-y_{i})^{2}}. \end{aligned}$$
(1)

where x, y denote two n-dimensional vectors respectively.

After the classification centers are designated, the visibility of any sample is the value represented by its nearest classification center [28]. We set the range of visibility values to [1,6]. Finally, the visibility value of each attribute is obtained. The higher the visibility of an attribute, the greater the risk of information leakage.
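The nearest-center assignment described above can be sketched in a few lines of Python. This is only an illustration: the function names and the center coordinates in `CENTERS` are hypothetical and are not taken from [27] or [28].

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors (formula 1)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def visibility(sample, centers):
    """Return the visibility value whose center is nearest to `sample`.

    `centers` maps a visibility value to a four-dimensional center
    (accessibility, extraction difficulty, reliability, privacy awareness).
    """
    return min(centers, key=lambda value: euclidean(sample, centers[value]))


# Hypothetical classification centers, for illustration only.
CENTERS = {
    1: (1.0, 1.0, 1.0, 1.0),
    3: (2.0, 2.0, 2.0, 2.0),
    6: (4.0, 4.0, 4.0, 4.0),
}

user_vector = (3.5, 4.0, 3.0, 4.0)
print(visibility(user_vector, CENTERS))
```

Here the sample lies closest to the center labelled 6, so it receives the highest visibility value in the assumed range.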

Calculation of attribute privacy scores

Formula (2) can be used to calculate users’ attribute privacy scores through visibility and sensitivity.

$$\begin{aligned} \begin{aligned} p=\frac{\sum \nolimits _{i=1}^{m}v_{i} \times \theta _{i}}{n}. \end{aligned} \end{aligned}$$
(2)

where \(v_{i}\) is visibility of attributes and \(\theta _{i}\) is sensitivity of attributes of user i. The higher the attribute privacy scores, the more serious the privacy leakage of users.
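As a small worked example of formula 2, the sketch below averages the products of visibility and sensitivity. It assumes that the sum and the denominator range over the same number of terms (for instance, the platforms on which an attribute appears); the function name and the input values are hypothetical.

```python
def attribute_privacy_score(visibilities, sensitivities):
    """Average of visibility x sensitivity over the supplied terms.

    Assumption: one (visibility, sensitivity) pair per term, and the
    denominator equals the number of pairs.
    """
    if not visibilities or len(visibilities) != len(sensitivities):
        raise ValueError("need equally long, non-empty inputs")
    total = sum(v * theta for v, theta in zip(visibilities, sensitivities))
    return total / len(visibilities)


# Hypothetical values: visibility in [1, 6], sensitivity 0.5 in each term.
print(attribute_privacy_score([2, 4, 6], [0.5, 0.5, 0.5]))  # 2.0
```

Raising either the visibility or the sensitivity of any term raises the score, which matches the statement that higher scores indicate more serious leakage.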

TOPSIS-based online social network privacy scores calculation model

Fuzzy logic

Fuzzy set theory is widely used in the fuzzy modeling of human thinking, and it also addresses the uncertainty of decision information based on multiple criteria. In fuzzy sets, fuzzy terms are used to describe the mapping from linguistic variables to numerical variables [26]. The relevant definition of a fuzzy set is given as follows.

Definition 1 Let the set \(X\ne \emptyset \). The fuzzy set F is an object having the form [33]:

$$\begin{aligned} \begin{aligned} F=\{(x, \mu _{F}(x))|x\in X\}. \end{aligned} \end{aligned}$$
(3)

where \(\mu _{F}\) is the membership function of fuzzy set F and \(\mu _{F}:X\rightarrow [0,1]\) represents the degree of relevance of x in F.

In the fuzzy set F, the experts’ evaluation of attribute weight has no non-membership degree, which means that the experts only rate attributes by their degree of importance. In addition, since it is difficult to quantify the importance of attributes on social platforms with precise values, and considering the requirements of the application field of the present work, triangular membership functions were adopted. The processes of fuzzification and defuzzification come next. A triangular fuzzy number is specified by the triple (a, b, c) together with its membership function. Defuzzification is the process of converting a fuzzy number into a single crisp number.
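For reference, the textbook form of a triangular membership function for a triple (a, b, c) with \(a<b<c\) is given below. The text does not state the exact form used in PSFT, so this standard definition is an assumption rather than a confirmed detail of the model.

$$\begin{aligned} \mu _{F}(x)={\left\{ \begin{array}{ll} \dfrac{x-a}{b-a}, &{} a\le x\le b,\\ \dfrac{c-x}{c-b}, &{} b<x\le c,\\ 0, &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

The membership degree rises linearly from 0 at \(x=a\) to 1 at \(x=b\) and falls back to 0 at \(x=c\).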

Treatment of attribute weight

Diverse social attributes have different effects on privacy disclosure scores. The configuration of attributes directly determines whether their privacy will be disclosed. Whether an attribute is important depends on the impact of attribute characteristics on privacy. Attributes that have a greater impact on privacy will have the greater weight, while attributes with a lesser impact will have the correspondingly lower weight. So we evaluate the weight values to generate more accurate privacy scores.

The evaluation of attribute weight is a difficult step in computing privacy scores, and the weights of different attributes have a considerable effect on the privacy scores. The importance assigned to each attribute depends on the experience and knowledge of each expert. The treatment of attribute weight constitutes the central contribution of this work.

Table 3 provides a detailed breakdown of how linguistic terms [33,34,35,36,37] supplied by experts were converted into fuzzy triangular numbers. To measure the information disclosure, we collect data on 11 attributes for each user in multiple OSNs.

Table 3 Degree of importance
Table 4 Expert evaluation

Table 4 simulates how four experts define the degree of importance of each attribute using the linguistic terms. The purpose of this evaluation is to assign different degrees of importance to the users’ attributes.

Table 5 shows the calculation process of the attribute weight values. First, the fuzzy average values of the attributes are computed, and then the fuzzy values are defuzzified using the average method. Finally, normalization is performed. The attribute weight values are then applied when calculating the overall privacy scores.

Table 5 Calculation of attribute weight
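The three stages of the weight calculation (fuzzy averaging, defuzzification, normalization) can be sketched as follows. The linguistic scale `SCALE` and the expert ratings are hypothetical stand-ins for Tables 3 and 4, and the "average method" is assumed here to mean the arithmetic mean of the three parameters of the triangular number.

```python
# Hypothetical linguistic scale; the actual mapping is given in Table 3.
SCALE = {
    "low": (0.0, 0.25, 0.5),
    "medium": (0.25, 0.5, 0.75),
    "high": (0.5, 0.75, 1.0),
}


def fuzzy_average(terms):
    """Component-wise mean of the triangular numbers chosen by the experts."""
    triples = [SCALE[term] for term in terms]
    return tuple(sum(t[k] for t in triples) / len(triples) for k in range(3))


def defuzzify(triple):
    """Assumed 'average method': mean of the parameters a, b and c."""
    return sum(triple) / 3


def attribute_weights(evaluations):
    """Map each attribute to a normalized crisp weight (weights sum to 1)."""
    crisp = {
        attr: defuzzify(fuzzy_average(terms))
        for attr, terms in evaluations.items()
    }
    total = sum(crisp.values())
    return {attr: value / total for attr, value in crisp.items()}


# Hypothetical ratings from four experts for two attributes.
evaluations = {
    "address": ["high", "high", "medium", "high"],
    "birthdate": ["low", "medium", "low", "low"],
}
weights = attribute_weights(evaluations)
print(weights)
```

With these assumed ratings, the attribute that most experts rate as "high" receives the larger share of the total weight.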

Overall privacy scores calculation model

Multiple criteria decision making (MCDM) is a sub-discipline of operations research [30, 31] that explicitly evaluates multiple conflicting criteria in decision-making. The difficulty of the problem stems from the existence of multiple criteria. MCDM focuses on structuring and solving decision-making and planning problems involving multiple criteria. Evaluating users’ privacy can be regarded as a decision-making problem. The TOPSIS method is a common comprehensive evaluation method that can accurately reflect the gaps between the evaluated alternatives.

TOPSIS, developed by Hwang and Yoon, is one of the typical decision-making methods [32]. It makes full use of the information in the data and can judge the advantages and disadvantages of each solution. It is based on two fundamental ideas, the positive ideal solution and the negative ideal solution. The positive ideal solution is the best conceivable alternative: each of its criteria takes the best value found among the alternatives. The negative ideal solution is the worst conceivable alternative: each of its criteria takes the worst value found among the alternatives. Alternatives are ranked by comparing each of them with both the positive and negative ideal solutions. An alternative is the best if it is both closest to the positive ideal solution and farthest from the negative ideal solution [30].

TOPSIS takes into account not only the closest distance to the positive ideal solution but also the longest distance to the negative ideal solution, so as to determine the optimal solution. Moreover, the TOPSIS procedure itself does not incorporate any subjective factors and yields a unique optimal solution. It is therefore well suited to the decision-making environment of social networks, which requires complete objectivity in assessing privacy leakage.

Fig. 1 Users’ overall privacy scores measurement model based on TOPSIS

Based on the distance from each user’s extraction difficulty, accessibility, reliability and privacy awareness to the clustering center, we obtain the users’ visibility according to formula 1, and the attribute privacy scores are obtained by aggregating visibility with sensitivity according to formula 2. The general model of the proposed method, which combines attribute privacy scores with attribute weight, is presented in Fig. 1. First, the attribute privacy score matrix is constructed from the user list and the users’ attribute privacy scores. The matrix is normalized according to formula 4. The attribute weights are obtained from the evaluation of decision experts. Then the attribute score matrix and the weights are aggregated to obtain the attribute decision matrix in step 4. Next, the PIS (positive ideal solution) and NIS (negative ideal solution) of the decision matrix are calculated, and the distances from the attribute decision matrix to the PIS and NIS are calculated in step 6. The minimum overall privacy score is the most ideal scheme in the PSFT model. Finally, by calculating the relative closeness of a particular user’s privacy scores to the minimum privacy score, we obtain the overall privacy score for each user. The ranking of overall privacy disclosure scores is obtained according to the relative closeness. The steps of the TOPSIS-based online social network privacy scores calculation can be summarized as follows.

Step 1. Construct a matrix \([u_{ij}]_{m \times n}\) composed of m users(alternatives) and n attributes(criteria). \(u_{ij}\) represents the privacy score of the j-th attribute of the i-th user, i = 1, 2, \(\ldots \), m; j = 1, 2, \(\ldots \), n.

Step 2. The matrix \([u_{ij}]_{m\times n}\) is normalized to form the matrix \([NU_{ij}]_{m \times n}\). The values in this matrix range from 0 to 1, where 1 means that the user’s attribute privacy is completely leaked and 0 means that there is no leakage. The normalization method used is as follows:

$$\begin{aligned} \begin{aligned} NU_{ij}=\frac{u_{ij}}{\sqrt{\sum _{i=1}^{m}u_{ij}^{2}}} \end{aligned} \end{aligned}$$
(4)

where i = 1, 2, \(\ldots \), m; j = 1, 2, \(\ldots \), n.

Step 3. A set of attribute weights \(w_{j}(j=1, 2,\ldots , n)\), such that \(\sum \nolimits _{j=1}^{n} w_{j}=1\), has to be decided by experts.

Step 4. Calculate the weighted normalized decision matrix \(T=[t_{ij}]_{m \times n}\).

\(t_{ij}=w_{j}NU_{ij}\), i=1, 2, \(\ldots \), m; j=1, 2, \(\ldots \), n.

Step 5. Determine the ideal alternative \(t_{jI}\) and the non-ideal alternative \(t_{jN}\) for each attribute parameter.

\(t_{jI}=\min \limits _{i = 1}^m t_{ij} \), \(j=1,2,\ldots ,n\).

\(t_{jN}=\max \limits _{i = 1}^m t_{ij} \), \(j=1,2,\ldots ,n\).

Step 6. The distances between \(t_{ij}\) and the ideal and non-ideal alternatives are given by the Euclidean distance in the following equations:

$$\begin{aligned} D_{i+}&=\sqrt{\sum \nolimits _{j=1}^{n}(t_{ij}-t_{jI})^{2}} \end{aligned}$$
(5)
$$\begin{aligned} D_{i-}&=\sqrt{\sum \nolimits _{j=1}^{n}(t_{ij}-t_{jN})^{2}} \end{aligned}$$
(6)

where \(i=1,2,\ldots ,m\); j=1, 2, \(\ldots \), n.

Step 7. Calculate the relative closeness between privacy scores of the particular user to the minimum privacy score as overall privacy scores \(R_{i}\).

$$\begin{aligned} \begin{aligned} R_{i}=\frac{D_{i-}}{D_{i+}+D_{i-}}, i=1,2,\ldots ,m. \end{aligned} \end{aligned}$$
(7)

The higher the privacy score, the higher the ranking and the worse the user’s privacy status. According to the ranking results, users can learn how well their privacy is protected. By combining the weight values with their overall privacy scores, users can narrow the sharing range of attributes with higher weight values and raise the privacy level of those attributes.
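Steps 1 to 7 can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that the ideal and non-ideal values are taken per attribute (column-wise), as in standard TOPSIS, and it exposes a hypothetical flag `ideal_is_min` so that either the column minima (as written in Step 5) or the column maxima can serve as the ideal reference. The input scores and weights are invented for the example.

```python
import math


def topsis_scores(scores, weights, ideal_is_min=True):
    """Return the relative closeness R_i for each user (row) in `scores`."""
    n = len(weights)
    if any(len(row) != n for row in scores):
        raise ValueError("every row needs one score per weight")

    # Step 2: vector normalization of each attribute column (formula 4).
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) for j in range(n)]
    normalized = [
        [row[j] / norms[j] if norms[j] else 0.0 for j in range(n)]
        for row in scores
    ]

    # Step 4: weighted normalized decision matrix t_ij = w_j * NU_ij.
    weighted = [[weights[j] * row[j] for j in range(n)] for row in normalized]

    # Step 5: per-attribute reference values (column minima and maxima).
    col_min = [min(row[j] for row in weighted) for j in range(n)]
    col_max = [max(row[j] for row in weighted) for j in range(n)]
    ideal, non_ideal = (col_min, col_max) if ideal_is_min else (col_max, col_min)

    # Steps 6 and 7: distances to both references, then relative closeness.
    closeness = []
    for row in weighted:
        d_plus = math.sqrt(sum((t - p) ** 2 for t, p in zip(row, ideal)))
        d_minus = math.sqrt(sum((t - q) ** 2 for t, q in zip(row, non_ideal)))
        total = d_plus + d_minus
        closeness.append(d_minus / total if total else 0.0)
    return closeness


# Hypothetical attribute privacy scores: 3 users x 2 attributes.
scores = [[1.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
weights = [0.5, 0.5]
r = topsis_scores(scores, weights)
ranking = sorted(range(len(r)), key=lambda i: r[i], reverse=True)
print(r, ranking)
```

With `ideal_is_min=True`, a user whose weighted scores all equal the column minima obtains a closeness of 1. Passing `ideal_is_min=False` uses the column maxima as the reference instead, so that a larger closeness corresponds to scores nearer the maxima.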

Evaluation

This section shows the degree of privacy protection by ranking the privacy disclosure scores on multiple social platforms. We validate the PSFT model on a synthetic data set and a real data set covering multiple platforms.

Case study I

In this experiment, we use the disclosed data of two users [29] together with randomly generated data for 13 other users. The data set contains the information of 15 users with the 11 most representative profile attributes on four online social platforms, namely Facebook, ResearchGate, LinkedIn, and Google+, and contains the accessibility values with attribute content and the data extraction difficulty of the two users [27]. In this paper, the accessibility and data extraction difficulty values of these two users are used as the input values for users a and b. Then, we comprehensively quantify the visibility of user a and user b, find the shortest Euclidean distance from each user to the clustering centers, and obtain the classification result. The visibility of each sample is the value represented by its center, as shown in Table 6.

Table 6 Two users’ visibility

Following the calculation steps, we calculate the 11 attribute privacy scores of the 15 users, shown in Table 7. Combining these with the attribute weights \(\omega _{j}\) in Table 5, the attribute decision matrix T is obtained. We determine that the ideal alternatives (PIS) for the attributes are [2.4000, 0.7332, 4.2500, 0.6996, 0.9000, 0.6996, 1.2000, 2.4996, 1.8000, 3.3996, 4.0998] and the non-ideal alternatives (NIS) are [0.6000, 0.3666, 1.7000, 0.2332, 0.4500, 0.2332, 0.4000, 0.8332, 0.9000, 1.1332, 2.0499]. According to formulas 5–7, we obtain the values of \(D_{i+}\) and \(D_{i-}\), and finally get the privacy score ranking of the 15 users, l > n > o > i > d > b > h > k > c > j > f > e > g > a > m, shown in Table 8. The higher the score, the more serious the privacy disclosure. Conversely, the lower the score, the greater the degree of privacy protection and the stronger the privacy security.

Table 7 Users attribute privacy scores calculation
Table 8 The final overall privacy scores \(R_{i}\)

The comparison of the PSFT model, the XueFeng model [27] and the Erfan model [29] is shown in Fig. 2, which illustrates the overall privacy disclosure scores of the 15 users. Overall, the privacy disclosure scores of the PSFT model are generally lower, because we consider the weight of attributes, namely the importance of the privacy disclosure of different attributes on different platforms. Most users show lower privacy disclosure scores under the PSFT model, except for ‘d’, ‘i’ and ‘l’. Users ‘d’ and ‘i’ allow a wider range of access to attributes with larger weight, which leads to higher overall privacy scores. User ‘l’ has the same content for the attributes with higher weight on different platforms, and therefore has a higher privacy disclosure score and is easier to attack. The FIS model of Erfan [29] considers neither the impact of attribute content on accessibility nor the impact of privacy awareness on visibility. Although the XueFeng model [27] considers attribute content, its overall privacy scores are higher because users often enter the same content on different platforms. The PSFT model, however, considers both the content and the weight values of the attributes. For attributes with greater weight, users can make reasonable adjustments to their profile attributes. As a result, users’ privacy on various social networks is better protected, and they obtain a better social experience on different platforms.

Fig. 2 The overall privacy scores of synthetic data set in PSFT model, Xue Feng model and Erfan model

Table 9 Attribute content and accessibility for six users
Fig. 3 The overall privacy scores of the real data set

Fig. 4 Comparison of privacy scores on three platforms and four platforms

Case study II

We apply our model to multiple platforms that are widely used in real-world social networking. According to the survey, the most widely used social platforms in China are QQ, WeChat, Sina Weibo and Taobao. Users have registered attribute data on these four platforms. The sensitivity of the attributes is obtained through a questionnaire survey and analysis [27]. To ensure the authenticity and reliability of the experimental data, we investigate and collect the real social information of 100 college students and record their attribute information on the four social platforms. Due to limited space, we select the information of 6 users whose use of social platforms is more diversified, namely ‘p’, ‘q’, ‘r’, ‘s’, ‘t’ and ‘u’, as the experimental data in Table 9. As in related studies, we use 1–4 to represent the accessibility of an attribute, where 1 means visible only to oneself, 2 means visible to oneself and family, 3 means visible to oneself, family and friends, and 4 means the attribute content is public [29]. The letters A–Z represent the content of the attributes [27]. Similarly, after expert review, the weights of username, avatar, political belief and education are 0.53, 0.42, 0.65 and 0.71, and the weights of the other attributes are shown in Table 5.

Table 10 Profile change

The privacy scores of these six users under the PSFT model, the Xue Feng model [27] and the Erfan model [29] are shown in Fig. 3. Due to insufficient privacy awareness, user ‘t’ allows a wider range of access to the more important attributes, whereas the other users have low privacy scores in the PSFT model. The Erfan model yields the highest privacy scores, because the attribute content filled in by default is the same on different platforms. The Xue Feng model does not consider that users’ attributes differ in importance. In our model, users can set a smaller sharing range for an attribute, so as to effectively prevent malicious attackers from prying into their privacy. To demonstrate the advantage of our model across multiple social networks, Fig. 4 compares the privacy scores of users ‘m’ and ‘n’ under the PSFT model, the Xue Feng model and the Erfan model on three and four social platforms respectively. As the number of platforms increases, attribute reliability and privacy awareness continue to improve. However, as shown in Fig. 4, the privacy scores of all three models increase with the number of platforms, which means that the more platforms are used, the greater the possibility of exposing private information. In other words, attacks on users become easier and the degree of privacy protection becomes worse. Overall, the privacy scores of the PSFT model are much lower than those of the Xue Feng model or the Erfan model, because users’ awareness of protecting important attributes increases as they use more social platforms and the sharing scope of attribute information with large weight shrinks, resulting in smaller privacy scores.

Change profile

To further verify the usability of the proposed model in practical social networking, we give a confirmatory example. According to the evaluation of attribute weight, contact number, e-mail, address, religious views, and political views have higher weight. We therefore set these attributes to more restrictive access levels. The experiment shows how to use the PSFT model to reduce privacy leakage. Taking user a and user b as examples, users can reasonably change their profile settings on the basis of the attribute weights, as shown in Table 10. For example, the sharing range of \(a \)’s contact number on Facebook and ResearchGate is set to be accessible only to himself, the sharing range of \(a \)’s e-mail on Facebook and LinkedIn is set to be accessible only to himself and family, the sharing range of \(b \)’s address on Facebook is set to be accessible only to himself, and so on.

As shown in Fig. 5, after the settings are modified, the privacy scores of both user a and user b are significantly reduced. Because the users set the profile attributes with larger weight to a stricter access level, the overall privacy scores are lower, which meets the users’ demand for stronger privacy protection. In the real world, people have different requirements for privacy protection, so there is no general protection method. The fundamental purpose of calculating privacy scores is to help users understand their privacy status and reasonably set their personal data attributes according to attribute weight, so that their privacy is protected from the outset.
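The profile adjustment described in this subsection can be illustrated with a short sketch that proposes stricter access levels for the highest-weight attributes. It uses the 1–4 accessibility scale from Case study II (1 = visible only to oneself, 4 = public). The function name, the parameters `top_k` and `target_level`, and all weights and settings below are hypothetical and are not taken from Table 5 or Table 10.

```python
def suggest_restrictions(weights, access_levels, top_k=3, target_level=1):
    """Suggest a stricter access level for the `top_k` heaviest attributes.

    Only attributes whose current level is more open than `target_level`
    are returned, mapped to the suggested new level.
    """
    ranked = sorted(weights, key=weights.get, reverse=True)[:top_k]
    return {
        attr: target_level
        for attr in ranked
        if access_levels.get(attr, target_level) > target_level
    }


# Hypothetical weights and current access levels for one user.
weights = {
    "contact number": 0.14,
    "address": 0.13,
    "e-mail": 0.12,
    "birthdate": 0.05,
}
access_levels = {
    "contact number": 4,
    "address": 1,
    "e-mail": 3,
    "birthdate": 4,
}
print(suggest_restrictions(weights, access_levels))
```

In this example the address is already restricted to the user and the low-weight birthdate is left unchanged, so only the contact number and e-mail are flagged. After applying such changes, the attribute privacy scores and the overall score would be recomputed to confirm the reduction.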

Fig. 5 Comparison of privacy scores before and after changing users’ profile settings

Fig. 6 Comparison of privacy scores before and after changing profile settings in Facebook dataset

We also verify the model on a single-platform data set. This data set is published by the Stanford Network Analysis Project (SNAP) [45]; we use the Facebook data set, which includes 4039 users and 10 user attributes. Due to space limitations, 8 users were selected to validate the model on the Facebook data set. By calculating the overall privacy disclosure scores, users learn their privacy status and which attributes are vulnerable to attackers. They then adjust the sharing range of their profile attributes according to the weights, so that the visible range of the attributes is narrower. The experimental results in Fig. 6 show that the privacy scores of the users are significantly reduced, and that the model is also suitable for a single-platform social network.

We propose a privacy measurement model combining TOPSIS to calculate privacy scores. In references [46,47,48], privacy measurement methods based on anonymity, information entropy and differential privacy are respectively proposed. In view of the existing privacy measurement methods and the privacy measurement model proposed in this paper, Table 11 explains the privacy measurement indicators, the relationship between measures and privacy intensity, and the advantages and disadvantages of different privacy measurement methods in detail.

Table 11 Comparative analysis of privacy measurement

Conclusions

In this paper, we propose a more accurate privacy score calculation model, PSFT, which reflects the strength of privacy measurement through the disclosure of private information. To quantify the privacy leakage of users on social network platforms, we first combine sensitivity and visibility to calculate attribute privacy scores. Then, considering the importance of attribute weight, we propose a TOPSIS-based online social network privacy score calculation model that obtains the ranking of users’ overall privacy scores by means of the TOPSIS decision method. Finally, based on the ranking of the users’ privacy scores and the weight values of the attributes, users can reset their profiles, thus effectively reducing the risk of privacy leakage. We compare our method with the XueFeng model and the Erfan model on the same data set and across multiple OSNs. The results show that the privacy disclosure scores of the PSFT model are generally lower and that the model better protects users’ privacy from disclosure.

The PSFT privacy measurement model is effective for solving the privacy measurement problem. However, the experts’ evaluation of attribute weight is not objective enough, and attacks based on background knowledge should also be considered. To defend against inference attacks, the (2,1)-fuzzy set could be used to measure the degree of privacy disclosure more comprehensively, through the membership evaluation of users’ background knowledge for attackers and the non-membership evaluation of attributes for users. Because social networks are large-scale and dynamic, existing decision-making methods cannot yet keep up with the dynamic changes of attribute settings in social networks. The decision model of triangular fuzzy soft level with dynamic intervals deserves further study for privacy score disclosure.