A k-anonymized Text Generation Method
In this paper, we propose a method for automatically generating k-anonymized texts from texts that include sensitive information. Many texts are posted on social media, and some of them include sensitive information such as places of residence, phone numbers, and SSNs. Even if the sensitive information is removed from a text, readers may still be able to infer it from the information that remains. To solve this problem, we propose a method for anonymizing texts using techniques based on k-anonymization. Because this anonymization process is time-consuming, appropriate anonymized strings cannot be identified in real time. Therefore, we propose a method that generates an anonymization dictionary in advance and anonymizes texts using this dictionary. In our experiments, we confirmed that the proposed method can anonymize texts in a practical amount of time.
Keywords: k-anonymization, Text processing
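
The dictionary-based idea in the abstract (do the expensive k-anonymization step once, then anonymize each text by lookup) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation. The generalization hierarchy `PARENT`, the `SUPPRESSED` marker, whitespace tokenization, and the rule "a term is safe if it, or a term it generalizes to, occurs in at least k documents" are all hypothetical choices made for the example.

```python
from collections import Counter

# Hypothetical generalization hierarchy: each term maps to a broader parent.
PARENT = {
    "Shibuya": "Tokyo",
    "Shinjuku": "Tokyo",
    "Tokyo": "Japan",
    "Kyoto": "Japan",
}
SUPPRESSED = "[REDACTED]"  # assumed fallback when no generalization reaches k
TERMS = set(PARENT) | set(PARENT.values())


def build_dictionary(documents, k):
    """Precompute term -> replacement (the slow step, done once offline)."""
    # For each sensitive term, count the documents that mention it
    # directly or through one of its more specific descendants.
    coverage = Counter()
    for doc in documents:
        covered = set()
        for token in doc.split():
            if token not in TERMS:
                continue
            while token is not None:
                covered.add(token)
                token = PARENT.get(token)
        coverage.update(covered)

    dictionary = {}
    for term in TERMS:
        # Climb the hierarchy until the candidate covers at least k documents.
        candidate = term
        while candidate is not None and coverage[candidate] < k:
            candidate = PARENT.get(candidate)
        dictionary[term] = candidate if candidate is not None else SUPPRESSED
    return dictionary


def anonymize(text, dictionary):
    """Anonymize one text by dictionary lookup (the fast step)."""
    return " ".join(dictionary.get(token, token) for token in text.split())


if __name__ == "__main__":
    corpus = [
        "I live in Shibuya",
        "I work in Shinjuku",
        "I visited Kyoto",
        "I moved to Shibuya",
    ]
    table = build_dictionary(corpus, k=2)
    print(anonymize("I work in Shinjuku", table))
```

The split mirrors the motivation given in the abstract: `build_dictionary` scans the whole corpus and can be slow, while `anonymize` only performs one lookup per token, so it can run at posting time.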