Abstract
Companies frequently outsource datasets to mining firms, and academic institutions create repositories or share datasets in the interest of promoting research collaboration. Still, many practitioners have reservations about sharing or outsourcing datasets, primarily because they fear losing the principal rights over the dataset. This work presents a way of convincingly claiming ownership rights over a trajectory dataset without destroying the salient dataset characteristics, which are important for accurate search operations and data-mining tasks. The digital watermarking methodology that we present imperceptibly distorts a collection of sequences, effectively embedding a secret key, while retaining as well as possible the neighborhood of each object, which is vital for operations such as similarity search, classification, or clustering. A key contribution of this methodology is a technique for discovering the maximum distortion that still maintains such desirable properties. We demonstrate both analytically and empirically that the proposed dataset marking techniques can withstand a number of attacks (such as translation, rotation, and noise addition) and can therefore provide a robust framework for facilitating the secure dissemination of trajectory datasets.
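As a rough illustration of why a frequency-domain mark can tolerate geometric attacks such as translation and rotation, the sketch below treats a 2-D trajectory as a complex sequence and checks a standard property of its Fourier descriptors: rotation leaves every magnitude unchanged, and translation changes only the first (DC) coefficient. This is not the paper's algorithm. The toy trajectory and the helper names `dft` and `magnitudes` are assumptions made for illustration, and the check is only relevant if the mark is carried by the descriptor magnitudes.

```python
import cmath
import math


def dft(z):
    """Naive discrete Fourier transform of a complex sequence."""
    n = len(z)
    return [sum(z[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
            for j in range(n)]


def magnitudes(points):
    """Magnitudes of the Fourier descriptors of a 2-D trajectory."""
    z = [complex(px, py) for px, py in points]
    return [abs(c) for c in dft(z)]


# Hypothetical toy trajectory (not taken from the paper).
traj = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.5), (3.0, 1.0), (4.0, 2.5), (5.0, 2.0)]

# Rotation by 30 degrees: multiply each complex point by exp(i * theta).
rot = cmath.exp(1j * math.radians(30))
rotated = [((complex(px, py) * rot).real, (complex(px, py) * rot).imag)
           for px, py in traj]

# Translation: add a constant offset to every point.
translated = [(px + 10.0, py - 3.0) for px, py in traj]

rho = magnitudes(traj)
rho_rot = magnitudes(rotated)
rho_tr = magnitudes(translated)

# Rotation changes only the phases, so all magnitudes match.
rotation_invariant = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(rho, rho_rot))

# Translation changes only the DC coefficient (index 0).
translation_invariant = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(rho[1:], rho_tr[1:]))
```

Noise addition has no such exact invariance, so its effect on a mark has to be assessed statistically rather than by an identity like the ones above.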
Abbreviations
- \({\mathcal{D}}\): Original dataset of trajectories
- \({\widehat{\mathcal{D}}}\): Watermarked dataset
- x: Trajectory in time-domain
- X: Trajectory in frequency domain
- n: Number of points in a sequence
- \({X_j = \rho_j e^{\phi_j i}}\): Fourier descriptor as a function of its magnitude and phase
- p: Embedding power
- \({\widehat{X_j} = \widehat{\rho_j}e^{\widehat{\phi_j}i}}\): Watermarked Fourier descriptor as a function of its watermarked magnitude and phase
- \({\mu_j(\mathcal{D})}\): Mean of \(\rho_j\) across the trajectories in \({\mathcal{D}}\)
- l: Number of non-zero elements of watermark
- χ: Correlation
- \({\widehat{D}_p(x,y)}\): Distance between two trajectories x, y after watermarking with power p
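The notation above can be exercised with a minimal sketch. The embedding rule used here, \(\widehat{\rho_j} = \rho_j(1 + p\,W_j)\) with unchanged phase, is an assumption chosen for illustration and is not stated in this section. The watermark W, the key, the toy trajectories, and the non-blind detector are likewise hypothetical. Under this assumed rule, the distance after watermarking stays within a factor of (1 ± p) of the original distance, which the sketch computes alongside the correlation χ.

```python
import cmath
import math
import random


def dft(z):
    """Naive discrete Fourier transform (time domain x -> frequency domain X)."""
    n = len(z)
    return [sum(z[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
            for j in range(n)]


def idft(Z):
    """Naive inverse discrete Fourier transform."""
    n = len(Z)
    return [sum(Z[j] * cmath.exp(2j * math.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]


def make_watermark(key, n, l):
    """Key-seeded watermark in {-1, 0, +1}^n with l non-zero entries.
    Leaving index 0 (the DC term) unmarked is an assumption."""
    rng = random.Random(key)
    w = [0] * n
    for j in rng.sample(range(1, n), l):
        w[j] = rng.choice((-1, 1))
    return w


def embed(x, w, p):
    """Assumed rule: rho_hat_j = rho_j * (1 + p * w_j); phase phi_j unchanged."""
    X = dft(x)
    X_hat = [cmath.rect(abs(c) * (1 + p * wj), cmath.phase(c))
             for c, wj in zip(X, w)]
    return idft(X_hat)


def detect(x_suspect, x_original, w, p):
    """Hypothetical non-blind detector: correlation chi between the watermark
    and the relative magnitude change, normalised by p and by l."""
    rho_s = [abs(c) for c in dft(x_suspect)]
    rho_o = [abs(c) for c in dft(x_original)]
    l = sum(1 for wj in w if wj != 0)
    return sum(wj * (s - o) / (p * o)
               for wj, s, o in zip(w, rho_s, rho_o) if wj != 0 and o > 0) / l


def distance(x, y):
    """Euclidean distance between two trajectories of equal length."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical toy trajectories, stored as complex numbers (x + iy).
n, l, p, key = 8, 4, 0.05, 42
x = [complex(t, math.sin(t)) for t in range(n)]
y = [complex(t, 0.5 * t + math.cos(t)) for t in range(n)]

w = make_watermark(key, n, l)
x_hat, y_hat = embed(x, w, p), embed(y, w, p)

chi_marked = detect(x_hat, x, w, p)    # close to 1 for the marked copy
chi_unmarked = detect(x, x, w, p)      # exactly 0 for the unmarked original
d, d_hat = distance(x, y), distance(x_hat, y_hat)
```

Because both trajectories are scaled by the same factor at each frequency, the difference of their descriptors is scaled by (1 + p W_j) as well, which is what bounds `d_hat` between `(1 - p) * d` and `(1 + p) * d` in this sketch.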
Additional information
This work is partially supported by the National Science Foundation under Grant No. IIS-0914934.
Cite this article
Lucchese, C., Vlachos, M., Rajan, D. et al. Rights protection of trajectory datasets with nearest-neighbor preservation. The VLDB Journal 19, 531–556 (2010). https://doi.org/10.1007/s00778-010-0178-6
DOI: https://doi.org/10.1007/s00778-010-0178-6