A k-anonymized Text Generation Method

Conference paper
Part of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, volume 7)


In this paper, we propose a method for automatically generating k-anonymized texts from texts that include sensitive information. Many texts are posted on social media, and these texts sometimes include sensitive information such as places of residence, phone numbers, and SSNs. Even if the sensitive information is removed from a text, readers may still be able to infer it from the information that remains. To solve this problem, we propose a method for anonymizing texts using techniques based on k-anonymization. Because this anonymization process is time-consuming, appropriate anonymized strings cannot be identified in real time. We therefore propose a method that generates an anonymization dictionary in advance and anonymizes texts using that dictionary. In our experiments, we confirmed that the proposed method can anonymize texts in a practical amount of time.
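
The dictionary idea in the abstract can be pictured with a small sketch. This is not the authors' implementation: the abstract does not say how the dictionary is built, so the generalization hierarchy (`PARENT`), the per-term counts (`POPULATION`), and the function names below are all hypothetical. The sketch assumes that a term is safe once it could describe at least k people, precomputes a replacement for every known term offline, and then anonymizes a text with plain lookups.

```python
import re

# Hypothetical generalization hierarchy: term -> broader term.
PARENT = {
    "Takayama": "Ikoma",
    "Ikoma": "Nara",
    "Nara": "Kansai",
    "Kansai": "Japan",
}

# Hypothetical number of people each term could describe.
POPULATION = {
    "Takayama": 3,
    "Ikoma": 40,
    "Nara": 500,
    "Kansai": 9000,
    "Japan": 100000,
}

SUPPRESSED = "[REDACTED]"


def generalize(term, k):
    """Walk up the hierarchy until the term covers at least k people."""
    current = term
    while current is not None:
        if POPULATION.get(current, 0) >= k:
            return current
        current = PARENT.get(current)
    # No ancestor is broad enough, so the term is suppressed entirely.
    return SUPPRESSED


def build_dictionary(k):
    """Offline step: precompute a replacement for every known term."""
    return {term: generalize(term, k) for term in POPULATION}


def anonymize(text, dictionary):
    """Online step: replace each known term using the precomputed dictionary."""
    if not dictionary:
        return text
    # Longest terms first so a longer term wins over a shorter one.
    terms = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, terms)) + r")\b")
    return pattern.sub(lambda m: dictionary[m.group(1)], text)


if __name__ == "__main__":
    dictionary = build_dictionary(k=100)
    print(anonymize("I live in Takayama, near Ikoma station.", dictionary))
```

Under these assumptions the expensive search for a sufficiently general term happens once, when the dictionary is built, and anonymizing a new text costs only a pattern match and a lookup per term.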


k-anonymization, Text processing



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Nara Institute of Science and Technology, Takayama, Ikoma, Japan
