Privacy-preserving worker allocation in crowdsourcing

Zheng, Libin; Chen, Lei; Cheng, Peng

doi:10.1007/s00778-021-00713-1

Regular Paper
Published: 16 January 2022

The VLDB Journal, Volume 31, pages 733–751 (2022)


Abstract

Crowdsourcing is a prevalent way to obtain answers for tasks that need human intelligence. In general, a crowdsourcing platform is responsible for allocating workers to each received task, giving priority to high-quality workers. However, the allocation results can in turn reveal knowledge about workers’ quality. For example, unallocated workers are presumably less qualified. Workers may be upset if such information becomes public, which is an invasion of their privacy. To alleviate such concerns, we study the privacy-preserving worker allocation problem in this paper, aiming to properly allocate the workers while protecting their privacy. We propose worker allocation methods that satisfy differential privacy; they proceed by first computing a weight for each potential allocation and then sampling an allocation according to the weights. In our experiments, the Markov Chain Monte Carlo-based method improves over the trivial random allocation method by 18.9% in terms of worker quality on synthetic data. On real data, it achieves differential privacy with less than 20% loss in quality even when $\epsilon = \frac{1}{3}$.
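The abstract describes weighting each candidate allocation and then sampling by weight, with a Markov Chain Monte Carlo variant. The sketch below shows one way such a sampler could look; it is not taken from the paper. It assumes that the score $f(A)$ is the sum of the reliabilities of the workers in $A$, that the target weight is $e^{\epsilon f(A)}$, and that the chain proposes single-worker swaps (a standard Metropolis construction). The names `mcmc_allocate`, `pool`, `steps` and `seed` are hypothetical.

```python
import math
import random

def mcmc_allocate(reliability, B, eps, steps=2000, seed=0):
    """Metropolis sampler over size-B worker subsets, target weight exp(eps * f(A))."""
    rng = random.Random(seed)
    workers = sorted(reliability)
    current = set(rng.sample(workers, B))
    for _ in range(steps):
        # Symmetric proposal: swap one allocated worker for one unallocated worker.
        out_w = rng.choice(sorted(current))
        in_w = rng.choice([w for w in workers if w not in current])
        # ASSUMPTION: f(A) is the sum of reliabilities, so a swap changes f by delta.
        delta = reliability[in_w] - reliability[out_w]
        # Accept with probability min(1, exp(eps * delta)).
        if rng.random() < math.exp(min(0.0, eps * delta)):
            current.remove(out_w)
            current.add(in_w)
    return current

# Hypothetical worker pool: worker id -> reliability in [0, 1].
pool = {"a": 0.9, "b": 0.8, "c": 0.6, "d": 0.5, "e": 0.3, "f": 0.2, "g": 0.95}
print(sorted(mcmc_allocate(pool, B=2, eps=1.0)))
```

A chain of finite length only approximates the target distribution, so any privacy guarantee for a sampler of this kind depends on how close the chain is to convergence. A larger $\epsilon$ concentrates the samples on the most reliable workers, and $\epsilon = 0$ reduces to uniform random allocation.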



References

Amazon mechanical turk. https://www.mturk.com/
Ele.me. https://www.ele.me/
Uber. https://www.uber.com/
Abadi, M., Chu, A., Goodfellow, I.J., McMahan, H.B., Mironov, I., Talwar, K., Zhang, L.: Deep learning with differential privacy. In: Proc. of the ACM CCS, pp. 308–318 (2016)
Andrés, M.E., Bordenabe, N.E., Chatzikokolakis, K., Palamidessi, C.: Geo-indistinguishability: differential privacy for location-based systems. In: Proc. of the ACM CCS, pp. 901–914 (2013)
Beimel, A., Nissim, K., Stemmer, U.: Private learning and sanitization: Pure vs. approximate differential privacy. In: Raghavendra, P., Raskhodnikova, S., Jansen, K., Rolim, J.D.P. (eds.) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pp. 363–378 (2013)
Béziaud, L., Allard, T., Gross-Amblard, D.: Lightweight privacy-preserving task assignment in skill-aware crowdsourcing. In: DEXA (2), 10439 of Lecture Notes in Computer Science, pp. 18–26 (2017)
Bhaskar, R., Laxman, S., Smith, A.D., Thakurta, A.: Discovering frequent patterns in sensitive data. In: Proc. of the KDD, pp. 503–512. ACM (2010)
Borodin, A., El-Yaniv, R.: Online computation and competitive analysis. Cambridge University Press (2005)
Duguépéroux, J., Allard, T.: From task tuning to task assignment in privacy-preserving crowdsourcing platforms. Trans. Large Scale Data Knowl. Centered Syst. 44, 67–107 (2020)
Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., Naor, M.: Our data, ourselves: privacy via distributed noise generation. Proc. of the EUROCRYPT 4004, 486–503 (2006)
Dwork, C., McSherry, F., Nissim, K., Smith, A.D.: Calibrating noise to sensitivity in private data analysis. Proc. of the TCC 3876, 265–284 (2006)
Dwork, C., Roth, A.: The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9(3–4), 211–407 (2014)
Fan, J., Li, G., Ooi, B.C., Tan, K., Feng, J.: icrowd: An adaptive crowdsourcing framework. In: Proc. of the SIGMOD, pp. 1015–1030 (2015)
Fisher, R.A., et al.: Statistical methods for research workers, 5th edn. (1934)
Geweke, J., et al.: Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments, volume 196. Federal Reserve Bank of Minneapolis, Research Department Minneapolis, MN (1991)
Hu, H., Zheng, Y., Bao, Z., Li, G., Feng, J., Cheng, R.: Crowdsourced poi labelling: location-aware result inference and task assignment. In: Proc. of the ICDE, pp. 61–72. IEEE (2016)
Ipeirotis, P.G., Provost, F., Wang, J.: Quality management on amazon mechanical turk. In: Proc. of the SIGKDD, pp. 64–67 (2010)
Kajino, H., Arai, H., Kashima, H.: Preserving worker privacy in crowdsourcing. Data Mining and Knowl. Dis. 28(5–6), 1314–1335 (2014)
Karger, D.R., Oh, S., Shah, D.: Budget-optimal task allocation for reliable crowdsourcing systems. Oper. Res. 62(1), 1–24 (2014)
Karp, R.M.: On-line algorithms versus off-line algorithms: How much. In: Algorithms, Software, Architecture: Information Processing 92: Proceedings of the IFIP 12th World Computer Congress, volume 1, p. 416 (1992)
Khattak, F.K., Salleb-Aouissi, A.: Quality control of crowd labeling through expert evaluation. In: Proc. of the NIPS 2nd Workshop on Computational Social Science and the Wisdom of Crowds, volume 2, p. 5 (2011)
Li, H., Liu, Q.: Cheaper and better: Selecting good workers for crowdsourcing. In: Proc. of the HCOMP, pp. 20–21 (2015)
Liu, Y., Guo, B., Chen, C., Du, H., Yu, Z., Zhang, D., Ma, H.: Foodnet: toward an optimized food delivery network based on spatial crowdsourcing. IEEE Trans. Mobile Comput. 18(6), 1288–1301 (2018)
Hall, M., Jr.: Combinatorial theory. Blaisdell, Waltham, Mass, 196 (1986)
McSherry, F.: Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In: Proc. of the SIGMOD, pp. 19–30 (2009)
McSherry, F., Talwar, K.: Mechanism design via differential privacy. In: Proc. of the FOCS, pp. 94–103 (2007)
Shen, E., Yu, T.: Mining frequent graph patterns with differential privacy. In: Proc. of the KDD, pp. 545–553 (2013)
Shu, J., Jia, X., Yang, K., Wang, H.: Privacy-preserving task recommendation services for crowdsourcing. IEEE Trans. Services Comput. (2018)
Tao, Q., Tong, Y., Zhou, Z., Shi, Y., Chen, L., Xu, K.: Differentially private online task assignment in spatial crowdsourcing: A tree-based approach. In: Proc. of the ICDE, pp. 517–528 (2020)
To, H., Ghinita, G., Shahabi, C.: A framework for protecting worker location privacy in spatial crowdsourcing. Proc. of the VLDB Endowment 7(10), 919–930 (2014)
Tong, Y., Zhou, Z., Zeng, Y., Chen, L., Shahabi, C.: Spatial crowdsourcing: a survey. The VLDB J. 29(1), 217–250 (2020)
Varshney, L.R.: Privacy and reliability in crowdsourcing service delivery. In: Annual SRII Global Conference, pp. 55–60 (2012)
Wang, J., Kraska, T., Franklin, M.J., Feng, J.: Crowder: Crowdsourcing entity resolution. Proc. of the VLDB 5(11), 1483–1494 (2012)
Zhao, Z., Wei, F., Zhou, M., Chen, W., Ng, W.: Crowd-selection query processing in crowdsourcing databases: A task-driven approach. In: Proc. of the EDBT, pp. 397–408 (2015)
Zheng, L., Chen, L.: DLTA: A framework for dynamic crowdsourcing classification tasks. IEEE Trans. Knowl. Data Eng. 31(5), 867–879 (2019)
Zheng, Y., Li, G., Li, Y., Shan, C., Cheng, R.: Truth inference in crowdsourcing: is the problem solved? Proc. of the VLDB 10(5), 541–552 (2017)


Acknowledgements

Libin Zheng’s work is supported by the National Natural Science Foundation of China No. 62102463, Basic and Applied Basic Research Project of Guangzhou Basic Research Program (202102080401), and Zhuhai Industry-University-Research Cooperation Project (ZH22017001210010PWC). Peng Cheng is sponsored by Shanghai Pujiang Program 19PJ1403300 and the National Natural Science Foundation of China No. 62102149. Lei Chen’s work is partially supported by National Key Research and Development Program of China Grant No. 2018AAA0101100, the Hong Kong RGC GRF Project 16209519, CRF Project C6030-18G, C1031-18G, C5026-18G, AOE Project AoE/E-603/18, Theme-based project TRS T41-603/20R, China NSFC No. 61729201, Guangdong Basic and Applied Basic Research Foundation 2019B151530001, Hong Kong ITC ITF grants ITS/044/18FX and ITS/470/18FX, Microsoft Research Asia Collaborative Research Grant, HKUST-NAVER/LINE AI Lab, Didi-HKUST joint research lab, HKUST-Webank joint research lab grants.

Author information

Authors and Affiliations

Guangdong Key Laboratory of Big Data Analysis and Processing, Sun Yat-sen University, Guangdong, China
Libin Zheng
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Lei Chen
School of Software Engineering, East China Normal University, Shanghai, China
Peng Cheng

Authors

Libin Zheng
Lei Chen
Peng Cheng

Corresponding author

Correspondence to Libin Zheng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A Proof of Lemma 1

Proof

We show that $g(\cdot )$ can be constructed to meet (i) and (ii), respectively.

(i)
We construct a bipartite graph with $\varPhi '$ and $\varPhi $ as the left and right sides, respectively. In addition, each $\phi \in \varPhi $ has $\lfloor \frac{|W| - (k -1 )}{k} \rfloor $ copies on its side. Each copy of $\phi $ is linked to every $\phi ' \in \varPhi '$ such that $|\phi \setminus \phi ' | = 1$. An example is given in Fig. 10. We show that there is a matching covering all $\phi \in \varPhi $, which directly implies the feasibility of constructing $g(\cdot )$ meeting (i). Hall’s Theorem [25] states that a bipartite graph has a matching covering $\varPhi $ if and only if
$$\begin{aligned} \forall X \subseteq {\varPhi }, \quad |X| \le |Nbor(X)|\,, \end{aligned}$$
where $Nbor(X)$ contains all the neighbor nodes of $X$ in $\varPhi '$. Given any $X \subseteq \varPhi $, we construct $X^*$ on top of $X$ by including all copies of every $\phi \in X$, so that each $\phi \in X$ has $\lfloor \frac{|W| - (k -1 )}{k} \rfloor $ copies in $X^*$. Obviously, we have $Nbor(X) = Nbor(X^*)$. There are $\frac{|X^*|}{\lfloor \frac{|W| - (k -1 )}{k} \rfloor }$ distinct $\phi $’s in $X^*$, each of which has $|W| - (k -1 )$ neighbors in $\varPhi '$. Different $\phi \in X^*$ can be adjacent to the same node in $Nbor(X^*)$. However, each $\phi '\in Nbor(X^*)$ is adjacent to at most $k$ distinct $\phi $’s in $X^*$. As a result,
$$\begin{aligned} |Nbor(X^*)| \ge \frac{|X^*|}{\lfloor \frac{|W| - (k -1 )}{k} \rfloor } \cdot (|W|-(k-1)) \cdot \frac{1}{k} \ge |X^*|. \end{aligned}$$
Then $|Nbor(X)| = |Nbor(X^*)| \ge |X^*| \ge |X| $.
(ii)
We again construct a bipartite graph with $\varPhi $ and $\varPhi '$, but this time each $\phi \in \varPhi $ has $\lceil \frac{|W| - (k -1) }{k} \rceil $ copies. An example is given in Fig. 11. We show that there is a matching covering all $\phi ' \in \varPhi '$, which directly implies the feasibility of constructing $g(\cdot )$ meeting (ii). By Hall’s Theorem, we need to show that
$$\begin{aligned} \forall X \subseteq {\varPhi '}, \quad |X| \le |Nbor(X)|\,. \end{aligned}$$
Given any $X \subseteq \varPhi '$, each $\phi ' \in X$ has $k \cdot \lceil \frac{|W| - (k -1) }{k} \rceil $ neighbors in $\varPhi $. Each $\phi \in Nbor(X)$ has at most $|W| -(k - 1)$ neighbors in $X$. As a result,
$$\begin{aligned} |Nbor(X)| \ge |X| \cdot k \cdot \lceil \frac{|W| - (k -1) }{k} \rceil \cdot \frac{1}{|W| -(k - 1)} \ge |X|\,. \end{aligned}$$
$\square $
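The matching argument in part (i) can be checked on small instances. The sketch below is an illustration, not the authors' construction. It assumes, as the use of Lemma 1 in Appendix B suggests, that one side of the graph consists of the size-$(k-1)$ subsets of $W$ and the other of the size-$k$ subsets, and that a node is adjacent to its supersets. It replicates each smaller subset $\lfloor \frac{|W| - (k-1)}{k} \rfloor$ times, finds a matching that covers every copy with a simple augmenting-path search, and reads a mapping $g$ off the matching. The name `build_mapping` and the encoding of $W$ as `range(n)` are hypothetical.

```python
from itertools import combinations

def build_mapping(n, k):
    """Map each size-k subset of range(n) to one of its size-(k-1) subsets so that
    every size-(k-1) subset is hit at least floor((n - (k - 1)) / k) times."""
    copies = (n - (k - 1)) // k
    small = list(combinations(range(n), k - 1))
    large = list(combinations(range(n), k))
    # Each copy of a small subset is adjacent to all of its size-k supersets.
    nbrs = {s: [l for l in large if set(s) <= set(l)] for s in small}
    match = {}  # large subset -> (small subset, copy index) matched to it

    def augment(node, seen):
        # Standard augmenting-path step for bipartite matching.
        for l in nbrs[node[0]]:
            if l in seen:
                continue
            seen.add(l)
            if l not in match or augment(match[l], seen):
                match[l] = node
                return True
        return False

    for s in small:
        for c in range(copies):
            if not augment((s, c), set()):
                raise ValueError("no matching covers every copy")
    # Matched subsets follow the matching; unmatched ones take any of their subsets.
    return {l: match[l][0] if l in match else l[:-1] for l in large}

g = build_mapping(5, 2)
hits = list(g.values())
print(min(hits.count(s) for s in combinations(range(5), 1)))
```

The search raises an error if some copy cannot be matched, which would contradict Hall's condition as established in the proof. Part (ii) could be checked in the same way by using $\lceil \cdot \rceil$ copies and covering the larger subsets instead.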

B Proof of Theorem 3

Proof

Given two neighboring worker pools $W_1$ and $W_2$ such that $W_1 \setminus W_2 = w_1$, and a query Q, let $P_1(A_h) $ and $P_2(A_h) $ denote the probabilities of obtaining the output $A_h$ when Q is run over $W_1$ and $W_2$, respectively. Let $\varPhi _1$ and $\varPhi _2$ denote the sets of all size-$B$ subsets of $W_1$ and $W_2$, respectively. We need to bound $\frac{P_1(A_h)}{P_2(A_h) }$ and $\frac{P_2(A_h)}{P_1(A_h) }$ for any $A _h\in \varPhi =\varPhi _1 \cup \varPhi _2$.

Let us first consider $A _h$ such that $w_1 \notin A _h$. Then, according to Algorithm 1,

$$\begin{aligned} P_1(A _h) = \frac{e^{\epsilon f(A _h)} }{\sum \limits _{A \in \varPhi _1} e^{\epsilon f(A) }} \,. \end{aligned}$$

$P_2(A _h)$ can be derived similarly. Since $\varPhi _2 \subset \varPhi _1$, we have $\frac{P_1(A _h)}{P_2(A _h) } <\frac{P_2(A _h)}{P_1(A _h) }$, and thus we focus on $\frac{P_2(A _h)}{P_1(A _h) }$.

$$\begin{aligned} \frac{P_2(A _h)}{P_1(A _h) }= & {} \frac{\sum \limits _{A \in \varPhi _2 } e^{\epsilon f(A)} + \sum \limits _{A \in \varPhi _1 \setminus \varPhi _2 } e^{\epsilon f(A) } }{ \sum \limits _{A \in \varPhi _2} e^{\epsilon f(A)} } \\= & {} 1 + \frac{\sum \limits _{A \in \varPhi _1 \setminus \varPhi _2 } e^{\epsilon f(A)} }{\sum \limits _{A \in \varPhi _2 } e^{\epsilon f(A) } } \,. \end{aligned}$$

Let $\varDelta \varPhi = \varPhi _1 \setminus \varPhi _2$. Each answer $A_p \in \varDelta \varPhi $ can be written as $(\tilde{A},w_1)$, where $\tilde{A}$ is a size-$(B-1)$ worker subset of $W_2$, i.e., $\tilde{A}\subset W_2 \wedge |\tilde{A}| = B - 1$. Given $\tilde{A}$, for any $A_p \in \varDelta \varPhi $ and any $A_q \in \varPhi _2$ such that $\tilde{A} \subset A_p $ and $\tilde{A} \subset A_q $, we have $f(A_p) \le f(A_q) + 1$. Note that $\tilde{A}$ and $A_q$ are size-$(B-1)$ and size-$B$ subsets of $W_2$, respectively. According to Lemma 1, we can construct a mapping $g: \varPhi _2 \rightarrow \varDelta \varPhi $ such that for each $A_q \in \varPhi _2$, we have $g(A_q) \setminus A_q = w_1$, and each $A_p \in \varDelta \varPhi $ is reached via $g(\cdot )$ at least $\lfloor {\frac{|W_2| - (B-1)}{B}} \rfloor $ times. Let $\alpha = \lfloor {\frac{|W_2| - (B-1)}{B}} \rfloor $; then we have

$$\begin{aligned} \frac{\sum \limits _{A \in \varDelta \varPhi } e^{\epsilon f(A)} }{\sum \limits _{A \in \varPhi _2 } e^{\epsilon f(A) } }\le & {} \frac{\frac{1}{\alpha } \sum \limits _{A \in \varPhi _2 } e^{\epsilon f(g(A)) } }{\sum \limits _{A \in \varPhi _2 } e^{\epsilon f(A)} } \\\le & {} \frac{\frac{1}{\alpha } \sum \limits _{A \in \varPhi _2 } e^{\epsilon (f(A) + 1) } }{\sum \limits _{A \in \varPhi _2 } e^{\epsilon f(A)} } \\= & {} \frac{e^\epsilon }{\alpha } \,. \end{aligned}$$

Then,

$$\begin{aligned} \frac{P_2(A _h)}{P_1(A _h) } \le 1 + \frac{e^\epsilon }{\alpha } \le e^{\epsilon + \ln (\frac{1}{\alpha } + 1)} \,. \end{aligned} \tag{5}$$

We then consider the other case, $w_1 \in A _h$, which implies $A _h \notin \varPhi _2$ and hence $P_2(A _h) = 0$. For any $t \in \{0...B\}$, let $\varPhi _1^t = \{A\in \varPhi _1 : |A \cap A_h |= t\} $. We have $|\varPhi _1^t| = \binom{B}{t} \binom{|W_1| - B}{B - t}$ and $\forall A \in \varPhi _1^t\,,~ f(A) \ge f(A_h) - (B - t)$. Then,

$$\begin{aligned} P_1(A _h)&= \frac{e^{\epsilon f(A_h)}}{\sum \limits _{t \in \{0...B\}}\sum \limits _{A \in \varPhi _1^t} e^{\epsilon f(A)} } \\&\le \frac{e^{\epsilon f(A_h) } }{\sum \limits _{t \in \{0...B\}} \binom{B}{t} \binom{|W_1| - B}{B - t} e^{\epsilon \left( f(A_h) - (B - t)\right) }} \\&\le \frac{ 1 }{ \sum \limits _{t \in \{0...B\}} \binom{B}{t} \binom{|W_1| - B}{B - t} e^{\epsilon ( t - B ) }} \,. \end{aligned} \tag{6}$$

Combining Equations (5) and (6), we obtain for any $A_h$,

$$\begin{aligned} P_1(A _h) \le ~&e^{\epsilon + \ln (\frac{1}{\alpha } + 1) } \cdot P_2(A _h) + \frac{ 1 }{ \sum \limits _{t \in \{0...B\}} \binom{B}{t} \binom{|W_1| - B}{B - t} e^{\epsilon ( t - B ) }} \,, \\ P_2(A _h) \le ~&e^{\epsilon + \ln (\frac{1}{\alpha } + 1)} \cdot P_1(A _h) + \frac{ 1 }{ \sum \limits _{t \in \{0...B\}} \binom{B}{t} \binom{|W_1| - B}{B - t} e^{\epsilon ( t - B ) }} \,. \end{aligned}$$

Finally, since $|W_1| - |W_2| = 1$, we have ${{\mathcal {O}}}(|W_1|) = {{\mathcal {O}}}(|W_2|) $. We simply use the same ‘W’ in the theorem. $\square $
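The two bounds in this proof can be checked numerically on a small pool by enumerating every size-$B$ subset. The sketch below is an illustration under stated assumptions: $f(A)$ is taken to be the sum of the reliabilities of the workers in $A$, with each reliability in $[0, 1]$ so that $f$ changes by at most 1 when one worker is swapped. The function names and the pool are hypothetical. It compares the largest ratio $P_2(A_h)/P_1(A_h)$ with the bound of Equation (5), and the largest $P_1(A_h)$ over subsets containing $w_1$ with the bound of Equation (6).

```python
from itertools import combinations
from math import comb, exp, log

def allocation_probs(reliability, B, eps):
    """Distribution over all size-B subsets with weights exp(eps * f(A))."""
    workers = sorted(reliability)
    # ASSUMPTION: f(A) is the sum of the reliabilities of the workers in A.
    weights = {A: exp(eps * sum(reliability[w] for w in A))
               for A in combinations(workers, B)}
    total = sum(weights.values())
    return {A: wt / total for A, wt in weights.items()}

def check_bounds(reliability, removed, B, eps):
    """Return (max P2/P1, bound of Eq. (5), max P1 on new subsets, bound of Eq. (6))."""
    p1 = allocation_probs(reliability, B, eps)
    pool2 = {w: r for w, r in reliability.items() if w != removed}
    p2 = allocation_probs(pool2, B, eps)
    alpha = (len(pool2) - (B - 1)) // B  # must be at least 1 for the bound to apply
    ratio = max(p2[A] / p1[A] for A in p2)  # subsets that exclude w_1
    ratio_bound = exp(eps + log(1 / alpha + 1))
    n1 = len(reliability)
    delta = 1 / sum(comb(B, t) * comb(n1 - B, B - t) * exp(eps * (t - B))
                    for t in range(B + 1))
    new_max = max(p for A, p in p1.items() if removed in A)  # subsets with w_1
    return ratio, ratio_bound, new_max, delta

# Hypothetical pool W_1; removing worker "g" gives the neighboring pool W_2.
pool = {"a": 0.9, "b": 0.8, "c": 0.6, "d": 0.5, "e": 0.3, "f": 0.2, "g": 0.95}
ratio, ratio_bound, new_max, delta = check_bounds(pool, "g", B=2, eps=1.0)
print(ratio <= ratio_bound, new_max <= delta)
```

Exhaustive enumeration is only practical for small pools, since the number of subsets grows as $\binom{|W|}{B}$.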

C Proof of Theorem 4

Proof

Case $\mathbf {B \le \frac{W+1}{2}}$. Let us consider the worker $\hat{w}$ with the largest reliability. Let ${{\mathcal {W}}}^+ = \{W^+\}$ and ${{\mathcal {W}}^-} = \{W^-\}$ denote the sets of all size-$B$ worker subsets that include and exclude $\hat{w}$, respectively. Then, we have

$$\begin{aligned} |{{\mathcal {W}}}^+| = \left( {\begin{array}{c}|W| - 1\\ B-1\end{array}}\right) \text {~~and~~} |{{\mathcal {W}}}^-| = \left( {\begin{array}{c}|W| - 1\\ B\end{array}}\right) \,. \end{aligned}$$

According to Lemma 1 (with $W\leftarrow W \setminus \hat{w}$), we can construct a mapping $g: {{\mathcal {W}}^-} \rightarrow {{\mathcal {W}}}^+$ such that $ g(W^-) \setminus W^-= \hat{w}$ and each $ W^+ \in {{\mathcal {W}}}^+$ is mapped to at most $\lceil \frac{|W| - B}{B} \rceil = \lceil \frac{|W|}{B} \rceil - 1 $ times.

Obviously we have $\forall W^-, ~~ f(W^-) \le f( g(W^-))$. Then, the probability that a worker allocation in ${{\mathcal {W}}}^-$ is selected by Algorithm 1 is

$$\begin{aligned} Prob(\widehat{A} \in {{\mathcal {W}}}^-)&= \frac{\sum \limits _{W^-\in {{\mathcal {W}}}^-} e^{\epsilon f(W^-)}}{\sum \limits _{W^+\in { {{\mathcal {W}}}^+}} e^{\epsilon f(W^+)} + \sum \limits _{W^-\in {{\mathcal {W}}}^-} e^{\epsilon f(W^-)}}\\&\le \frac{\sum \limits _{W^- \in {{\mathcal {W}}}^-} e^{\epsilon f(W^-)}}{ \frac{1}{\lceil \frac{|W|}{B} \rceil - 1}\sum \limits _{W^- \in {{\mathcal {W}}}^-} e^{\epsilon f(g(W^-)) } + \sum \limits _{W^-\in {{\mathcal {W}}}^-} e^{\epsilon f(W^-)} } \\&\le \frac{\sum \limits _{W^- \in {{\mathcal {W}}}^-} e^{\epsilon f(W^-)}}{\frac{1}{\lceil \frac{|W|}{B} \rceil - 1 } \sum \limits _{W^- \in {{\mathcal {W}}}^-} e^{\epsilon f(W^-) } + \sum \limits _{W^-\in {{\mathcal {W}}}^-} e^{\epsilon f(W^-)} } \\&= 1 - \frac{1}{\lceil \frac{|W|}{B} \rceil } \,. \end{aligned}$$

Hence, the probability that $\widehat{A} $ excludes $\hat{w}$ is at most $ 1 - \frac{1}{\lceil \frac{|W|}{B} \rceil } $. Conditional on this event, the probability that $\widehat{A} $ also excludes the worker with the second-largest reliability is at most $1 - \frac{1}{\lceil \frac{|W|- 1}{B - 1} \rceil } $, by the same analysis with $W$ and $B$ replaced by $W \setminus \hat{w}$ and $B-1 $, respectively. In general, letting $\hat{w}_k$ denote the worker with the $k$-th largest reliability for $1 \le k \le B $, we have

$$\begin{aligned} Prob(\hat{w}_{k}\notin \widehat{A} |\hat{w}_1...\hat{w}_{k-1} \notin \widehat{A} ) \le 1 - \frac{1}{\lceil \frac{|W| - (k - 1)}{B - (k - 1)} \rceil } \,. \end{aligned}$$

Therefore, $ Prob(\hat{w}_1...\hat{w}_{b} \notin \widehat{A} ) \le \prod \limits _{ k \in \{0...b-1\} }( 1 - \frac{1}{\lceil \frac{|W| - k}{B - k } \rceil } )$.

Case $\mathbf {B > \frac{W+1}{2}}$. Let $B' = |W| - B$. The algorithm opts to select the top-$B' $ workers with respect to the reversed reliabilities. Under the reversed reliabilities, the originally most reliable worker $\hat{w}$ now has the smallest $r'$. As before, denote the worker subsets including and excluding $\hat{w} $ by ${{\mathcal {W}}}^+ $ and ${{\mathcal {W}}}^- $, respectively. Missing $\hat{w}$ in the final allocation $W \setminus \widehat{A} $ is then equivalent to selecting $\hat{w}$ in $\widehat{A} $. We have

$$\begin{aligned} \frac{|{{\mathcal {W}}}^+|}{|{{\mathcal {W}}}^-|} = \frac{B'}{|W| - B'} = \frac{|W| - B}{B} \,. \end{aligned}$$

Similarly, we can construct a mapping $g: {{\mathcal {W}}^+} \rightarrow {{\mathcal {W}}}^-$ such that $ W^+ \setminus g(W^+) = \hat{w}$, with each $ W^- \in {{\mathcal {W}}}^-$ mapped to at most $\lceil \frac{|W| - B}{B} \rceil $ times. In addition, since $\hat{w}$ has the smallest $r'$, we have $\forall W^+, ~~ f'(W^+) \le f'( g(W^+))$. We can then derive the bound for $Prob(\widehat{A} \in {{\mathcal {W}}}^+) $ in the same way as the bound for $Prob(\widehat{A} \in {{\mathcal {W}}}^-) $ in the case $B \le \frac{W+1}{2}$, with $f(\cdot )$ replaced by $f'(\cdot )$. The subsequent derivation for $ Prob(\hat{w}_1...\hat{w}_{b} \in \widehat{A} )$ follows the previous case as well. $\square $
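For the first case of this proof, the bound $1 - \frac{1}{\lceil |W|/B \rceil }$ on the probability of missing the most reliable worker can be compared with the exact probability on a small pool. The sketch below is a sanity check under stated assumptions, not part of the proof: it takes $f(A)$ to be the sum of the reliabilities in $A$ and enumerates all size-$B$ subsets. The names `prob_top_excluded` and `pool` are hypothetical.

```python
from itertools import combinations
from math import ceil, exp

def prob_top_excluded(reliability, B, eps):
    """Exact probability that the sampled size-B allocation misses the top worker."""
    top = max(reliability, key=reliability.get)
    included = excluded = 0.0
    for A in combinations(reliability, B):
        # ASSUMPTION: f(A) is the sum of the reliabilities of the workers in A.
        weight = exp(eps * sum(reliability[w] for w in A))
        if top in A:
            included += weight
        else:
            excluded += weight
    return excluded / (included + excluded)

# Hypothetical pool with |W| = 7 and B = 2, so B <= (|W| + 1) / 2 holds.
pool = {"a": 0.9, "b": 0.8, "c": 0.6, "d": 0.5, "e": 0.3, "f": 0.2, "g": 0.95}
B = 2
bound = 1 - 1 / ceil(len(pool) / B)
for eps in (0.0, 1.0, 5.0):
    print(eps, prob_top_excluded(pool, B, eps) <= bound)
```

With $\epsilon = 0$ the allocation is uniform and the exact probability equals $\binom{|W|-1}{B} / \binom{|W|}{B}$; larger values of $\epsilon$ make it less likely that the top worker is missed.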


About this article

Cite this article

Zheng, L., Chen, L. & Cheng, P. Privacy-preserving worker allocation in crowdsourcing. The VLDB Journal 31, 733–751 (2022). https://doi.org/10.1007/s00778-021-00713-1


Received: 01 January 2021
Revised: 20 August 2021
Accepted: 27 October 2021
Published: 16 January 2022
Issue Date: July 2022
DOI: https://doi.org/10.1007/s00778-021-00713-1
