Abstract
Homogeneous team formation is the task of grouping individuals into teams, each of which consists of members who fulfill the same set of prespecified properties. In this theoretical work, we propose, motivate, and analyze a combinatorial model where, given a matrix over a finite alphabet whose rows correspond to individuals and columns correspond to attributes of individuals, the user specifies lower and upper bounds on team sizes as well as combinations of attributes that have to be homogeneous (that is, identical) for all members of the corresponding teams. Furthermore, the user can define a cost for assigning any individual to a certain team. We show that some special cases of our new model lead to NP-hard problems while others allow for (fixed-parameter) tractability results. For example, the problem is already NP-hard even if (i) there are no lower and upper bounds on the team sizes, (ii) all costs are zero, and (iii) the matrix has only two columns. In contrast, the problem becomes fixed-parameter tractable for the combined parameter “number of possible teams” and “number of different individuals”, the latter being upper-bounded by the number of rows.
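To make the homogeneity requirement concrete, the following minimal sketch checks whether a candidate team satisfies one attribute combination together with its size bounds. The encoding is an assumption made for illustration only: a pattern is represented as a tuple of booleans (True meaning "this column must be identical for all team members"), and the names `is_homogeneous` and `is_feasible_team` as well as the sample data are hypothetical and not taken from the article. Costs are ignored here.

```python
from typing import Sequence

Row = Sequence[str]


def is_homogeneous(team: Sequence[Row], must_match: Sequence[bool]) -> bool:
    """Return True if all rows agree on every column flagged in must_match.

    Assumed encoding: must_match[c] is True when column c has to be
    identical for all members of the team.
    """
    if not team:
        return True
    first = team[0]
    return all(
        row[col] == first[col]
        for row in team
        for col, flagged in enumerate(must_match)
        if flagged
    )


def is_feasible_team(
    team: Sequence[Row], must_match: Sequence[bool], lower: int, upper: int
) -> bool:
    """Check the size bounds and the homogeneity requirement for one team."""
    return lower <= len(team) <= upper and is_homogeneous(team, must_match)


# Hypothetical input matrix: rows are individuals, columns are attributes.
matrix = [
    ("math", "senior", "berlin"),
    ("math", "junior", "berlin"),
    ("cs", "senior", "jena"),
]

# Columns 0 and 2 must be homogeneous; column 1 may differ.
pattern = (True, False, True)

first_two_ok = is_feasible_team(matrix[:2], pattern, lower=2, upper=3)
all_three_ok = is_feasible_team(matrix, pattern, lower=2, upper=3)
```

In this sketch the first two rows form a feasible team, whereas adding the third row violates homogeneity in the first and third columns. Verifying a single team is easy; the hardness results concern choosing an assignment of all rows to teams.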
Notes
Informally speaking, a problem with input size $x$ and parameter $p$ is called fixed-parameter tractable if it can be solved in $f(p)\cdot x^{O(1)}$ time, where $f$ may be an arbitrary computable function depending solely on $p$.
Although the input table as well as the given patterns formally are matrices, we use different terms to distinguish between them: The “input matrix” consisting of “rows” and the “pattern mask” consisting of “pattern vectors”.
In (2), the modified Row Assignment$^{*}$ has a specific lower bound $\alpha_j$ and a specific upper bound $\beta_j$ for each $j \in T_{\mathrm{out}}$ instead of a uniform upper bound $k$.
As a consequence of the “binarization” in Lemma 1, the question whether Homogeneous Team Formation is fixed-parameter tractable with respect to the combined parameter (p,|Σ|) is equivalent to the question whether Homogeneous Team Formation is fixed-parameter tractable with respect to p alone.
References
Aamodt, M.G., Kimbrough, W.W.: Effect of group heterogeneity on quality of task solutions. Psychol. Rep. 50(1), 171–174 (1982)
Abdelsalam, H.: Multi-objective team forming optimization for integrated product development projects. In: Foundations of Computational Intelligence, Volume 3. Studies in Computational Intelligence, vol. 203, pp. 461–478. Springer, Berlin (2009)
Adodo, S.O., Agbayewa, J.O.: Effect of homogeneous and heterogeneous ability grouping class teaching on student’s interest, attitude and achievement in integrated science. Int. J. Psychol. Couns. 3(3), 48–54 (2011)
Aggarwal, G., Feder, T., Kenthapadi, K., Khuller, S., Panigrahy, R., Thomas, D., Zhu, A.: Achieving anonymity via clustering. ACM Trans. Algorithms 6(3), 1–19 (2010)
Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network Flows: Theory, Algorithms, and Applications. Prentice Hall, New York (1993)
Baykasoglu, A., Dereli, T., Das, S.: Project team selection using fuzzy optimization approach. Cybern. Syst. 38(2), 155–185 (2007)
Blocki, J., Williams, R.: Resolving the complexity of some data privacy problems. In: Proceedings of the 37th International Colloquium on Automata, Languages and Programming (ICALP’10). LNCS, vol. 6199, pp. 393–404. Springer, Berlin (2010)
Bodlaender, H.L.: Kernelization: new upper and lower bound techniques. In: Proceedings of the 4th International Workshop on Parameterized and Exact Computation (IWPEC’09). LNCS, vol. 5917, pp. 17–37. Springer, Berlin (2009)
Bodlaender, H.L., Thomassé, S., Yeo, A.: Kernel bounds for disjoint cycles and disjoint paths. Theor. Comput. Sci. 412(35), 4570–4578 (2011)
Bredereck, R., Nichterlein, A., Niedermeier, R., Philip, G.: The effect of homogeneity on the computational complexity of combinatorial data anonymization. Data Min. Knowl. Discov. (2012). Online available
Bredereck, R., Nichterlein, A., Niedermeier, R.: Pattern-guided k-anonymity. In: Proceedings of the Joint Conference of the 7th International Frontiers of Algorithmics Workshop and the 9th International Conference on Algorithmic Aspects of Information and Management (FAW-AAIM’13). LNCS, vol. 7924, pp. 350–361. Springer, Berlin (2013)
Cygan, M., Pilipczuk, M., Pilipczuk, M., Wojtaszczyk, J.: Solving the 2-disjoint connected subgraphs problem faster than $2^n$. In: Proceedings of the 10th Latin American Symposium on Theoretical Informatics (LATIN’12). LNCS, vol. 7256, pp. 195–206. Springer, Berlin (2012)
Downey, R.G., Fellows, M.R.: Parameterized Complexity. Springer, Berlin (1999)
Fellows, M.R., Jansen, B.M., Rosamond, F.: Towards fully multivariate algorithmics: parameter ecology and the deconstruction of computational complexity. Eur. J. Comb. 34, 541–566 (2013)
Flum, J., Grohe, M.: Parameterized Complexity Theory. Springer, Berlin (2006)
Fung, B.C.M., Wang, K., Chen, R., Yu, P.S.: Privacy-preserving data publishing: a survey of recent developments. ACM Comput. Surv. 42(4), 14:1–14:53 (2010)
Guo, J., Niedermeier, R.: Invitation to data reduction and problem kernelization. SIGACT News 38(1), 31–45 (2007)
Köhler, T.: Benutzergeführtes Anonymisieren von Daten mit Pattern Clustering: Algorithmen und Komplexität (in German, English title: User-guided data anonymization with pattern clustering: Algorithms and complexity). Diploma thesis, Friedrich-Schiller-Universität Jena (2011). Available at http://fpt.akt.tu-berlin.de/publications/pattern_D.pdf
Kuo, S., Fuchs, W.: Efficient spare allocation for reconfigurable arrays. IEEE Des. Test Comput. 4(1), 24–31 (1987)
Lappas, T., Liu, K., Terzi, E.: Finding a team of experts in social networks. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’09), pp. 467–476. ACM, New York (2009)
Lokshtanov, D., Misra, N., Saurabh, S.: Kernelization—preprocessing with a guarantee. In: The Multivariate Algorithmic Revolution and Beyond. LNCS, vol. 7370, pp. 129–161. Springer, Berlin (2012)
Majumder, A., Datta, S., Naidu, K.: Capacitated team formation problem on social networks. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’12), pp. 1005–1013. ACM, New York (2012)
Meyerson, A., Williams, R.: On the complexity of optimal k-anonymity. In: Proceedings of the 23rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS’04), pp. 223–228. ACM, New York (2004)
Niedermeier, R.: Invitation to Fixed-Parameter Algorithms. Oxford University Press, Oxford (2006)
Niedermeier, R.: Reflections on multivariate algorithmics and problem parameterization. In: Proceedings of the 27th International Symposium on Theoretical Aspects of Computer Science (STACS’10). Leibniz International Proceedings in Informatics (LIPIcs), vol. 5, pp. 17–32. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Wadern (2010)
Orlin, J.: A faster strongly polynomial minimum cost flow algorithm. In: Proceedings of the 20th Annual ACM Symposium on Theory of Computing (STOC’88), pp. 377–387. ACM, New York (1988)
Samarati, P.: Protecting respondents’ identities in microdata release. IEEE Trans. Knowl. Data Eng. 13(6), 1010–1027 (2001)
Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information. In: Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS’98), p. 188. ACM, New York (1998)
Sweeney, L.: Uniqueness of simple demographics in the U.S. population. Technical report, Carnegie Mellon University, School of Computer Science, Laboratory for International Data Privacy (2000)
Sweeney, L.: k-Anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 10(5), 557–570 (2002)
White, K.B.: A preliminary investigation of information systems team structures. Inf. Manag. 7(6), 331–335 (1984)
Wi, H., Oh, S., Mun, J., Jung, M.: A team formation model based on knowledge and collaboration. Expert Syst. Appl. 36(5), 9121–9134 (2009)
Zakarian, A., Kusiak, A.: Forming teams: an analytical approach. IIE Trans. 31(1), 85–97 (1999)
Acknowledgements
We are grateful to the anonymous referees of the MFCS’11 conference for helping to improve this work by spotting some flaws and providing the idea behind Corollary 4. Furthermore, we thank an anonymous referee for providing the idea of the proof of Theorem 2 which is significantly simpler than the one in the conference version of this paper. We are also grateful to two anonymous Algorithmica reviewers for their constructive feedback.
Additional information
R. Bredereck and T. Köhler were supported by the DFG, research project PAWS, NI 369/10.
Major parts of this work were done while G. Philip was with The Institute of Mathematical Sciences, Chennai, India, and visiting TU Berlin.
Parts of this work were done while T. Köhler was a student at Friedrich-Schiller-Universität Jena and visiting TU Berlin as a student research assistant.
An extended abstract appeared under the title “Pattern-Guided Data Anonymization and Clustering” in Proceedings of the 36th International Symposium on Mathematical Foundations of Computer Science (MFCS’11), volume 6907 of LNCS, pages 182–193, Springer 2011. That version concentrates on the anonymization aspects of the model. In the present version we slightly extend our model and show how it applies to (homogeneous) clustering of individuals, that is, to homogeneous team formation. Indeed, we now claim that the models and ideas fit these applications better than the previous data anonymization motivation. Apart from full proofs omitted in the extended abstract and the adaptation of our old ideas to the new extended model, the current article also contains a new and easier proof of NP-hardness, a new proof showing that polynomial-time data reduction in terms of so-called polynomial-size problem kernels is unlikely to exist with respect to certain parameterizations, and a new algorithm for the (still NP-hard) special case ignoring costs. Many of the new findings are part of the diploma thesis [18] of Thomas Köhler.
About this article
Cite this article
Bredereck, R., Köhler, T., Nichterlein, A. et al. Using Patterns to Form Homogeneous Teams. Algorithmica 71, 517–538 (2015). https://doi.org/10.1007/s00453-013-9821-0
DOI: https://doi.org/10.1007/s00453-013-9821-0