The Research and Construction of Complaint Orders Classification Corpus in Mobile Customer Service

  • Junli Xu
  • Jiangjiang Zhao
  • Ning Zhao
  • Chao Xue
  • Linbo Fan
  • Zechuan Qi
  • Qiang Wei
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11109)


Complaint orders in mobile customer service are records of complaint descriptions that preserve professional knowledge and information about the customer's complaint intention. Classifying complaint orders is a necessary step for further mining and analysis and for improving the quality of customer service. A constructed corpus is the basis of such research, and the lack of a complaint orders classification corpus (COCC) in mobile customer service has limited research on complaint orders classification. This paper first employs the K-means algorithm together with professional knowledge to determine the classification labels for complaint orders. We then craft annotation rules for complaint orders and construct the complaint orders classification corpus, which consists of 130044 annotated complaint orders. Finally, we statistically analyze the constructed corpus; the agreement for each question class exceeds 91%. This indicates that the corpus can provide strong support for complaint orders classification and specialized analysis.
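The abstract reports a per-class agreement figure but does not say how it is computed. As a minimal sketch, assuming two annotators label the same complaint orders and that agreement for a class is measured as positive specific agreement (twice the number of items both annotators assign to the class, divided by the total number of times either assigns it), the calculation could look as follows. The function name, the label names, and the choice of measure are illustrative assumptions, not the authors' procedure.

```python
from collections import Counter


def per_class_agreement(labels_a, labels_b):
    """Positive specific agreement per class for two annotators.

    Assumed measure (not taken from the paper):
    2 * (items both label c) / (items A labels c + items B labels c).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Count the items on which both annotators chose the same class.
    both = Counter(a for a, b in zip(labels_a, labels_b) if a == b)
    classes = set(count_a) | set(count_b)
    return {c: 2 * both[c] / (count_a[c] + count_b[c]) for c in classes}


# Hypothetical labels for six complaint orders.
annotator_a = ["billing", "billing", "network", "network", "billing", "other"]
annotator_b = ["billing", "network", "network", "network", "billing", "other"]
scores = per_class_agreement(annotator_a, annotator_b)
for label in sorted(scores):
    print(f"{label}: {scores[label]:.2f}")
```

A class-level measure of this kind shows which labels annotators confuse, which a single overall agreement rate would hide.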


Mobile customer service; Complaint orders classification corpus; K-means; Annotation rules



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Junli Xu (1), email author
  • Jiangjiang Zhao (1)
  • Ning Zhao (1)
  • Chao Xue (1)
  • Linbo Fan (1)
  • Zechuan Qi (1)
  • Qiang Wei (1)
  1. IT System Department, China Mobile Online Services Company Limited, Zhengzhou, China
