Email Importance Evaluation in Mailing List Discussions

  • Conference paper
Data Quality and Trust in Big Data (QUAT 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11235)

Abstract

Mailing lists are widely used in teamwork for discussion and consultation. Identifying important emails in mailing list discussions could significantly benefit content summarization and opinion leader recognition. However, previous studies focus only on importance evaluation for personal emails, and there is no consensus on the definition of an important email. In this paper, we therefore consider the characteristics of mailing lists and study how to evaluate email importance in mailing list discussions. Our contributions are as follows. First, we propose ER-Match, an email conversation thread reconstruction algorithm that takes nested quotation relationships into account when constructing the email relationship network. Based on this network, we formulate the importance of emails in mailing list discussions. Second, we propose a feature-rich learning method to predict the importance of new emails. Furthermore, we characterize various factors affecting email importance in mailing list discussions. Experiments on publicly available mailing lists show that our prediction model outperforms the baselines by large margins.
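
To make the notion of nested quotation concrete, the following minimal sketch splits an email body into blocks by quotation depth. It is not the ER-Match algorithm, whose details are not given here. It only assumes the common plain-text convention that each quoting level adds one leading ">" marker, and the names quote_depth and split_by_depth are hypothetical helpers introduced for illustration.

```python
import re
from typing import List, Tuple

# Assumption: quoted lines start with one or more '>' markers,
# optionally separated by single spaces (e.g. "> > text" or ">> text").
QUOTE_PREFIX = re.compile(r"^\s*((?:>\s?)+)")


def quote_depth(line: str) -> int:
    """Return how many '>' markers prefix a line (0 for unquoted text)."""
    match = QUOTE_PREFIX.match(line)
    return match.group(1).count(">") if match else 0


def split_by_depth(body: str) -> List[Tuple[int, str]]:
    """Group consecutive non-empty lines with the same quotation depth."""
    blocks: List[Tuple[int, List[str]]] = []
    for line in body.splitlines():
        depth = quote_depth(line)
        # Strip the quotation prefix so only the text itself remains.
        text = QUOTE_PREFIX.sub("", line, count=1).strip()
        if not text:
            continue  # skip blank lines and bare '>' lines
        if blocks and blocks[-1][0] == depth:
            blocks[-1][1].append(text)
        else:
            blocks.append((depth, [text]))
    return [(depth, " ".join(lines)) for depth, lines in blocks]


if __name__ == "__main__":
    body = "> > Should we merge this?\n> I think so.\nAgreed, merging now."
    for depth, text in split_by_depth(body):
        print(depth, text)
```

Here depth 0 is the new text written by the sender, depth 1 is text quoted from the email being replied to, and depth 2 is text that email had itself quoted. A thread reconstruction method could try to match each quoted block against the bodies of earlier emails, but that matching step is outside this sketch.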

Notes

  1. https://lkml.org.

  2. https://lists.w3.org/.

  3. The body here refers to the email content without the header, signature and quotation.

  4. https://github.com/sloria/TextBlob.

  5. http://scikit-learn.org.

  6. For email importance prediction with XGBoost, we set learning_rate = 0.1, n_estimators = 1000, max_depth = 5, min_child_weight = 1, gamma = 0, subsample = 0.8, colsample_bytree = 0.8, objective = 'binary:logistic', scale_pos_weight = 1, seed = 27.
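
The XGBoost settings listed in note 6 could be written down as follows. This is a sketch under assumptions: it assumes the scikit-learn style wrapper xgboost.XGBClassifier was used, which the note does not state, and it maps seed onto that wrapper's random_state argument. XGB_PARAMS and build_classifier are hypothetical names introduced for illustration.

```python
# Hyperparameter values copied from note 6.
XGB_PARAMS = {
    "learning_rate": 0.1,
    "n_estimators": 1000,
    "max_depth": 5,
    "min_child_weight": 1,
    "gamma": 0,
    "subsample": 0.8,
    "colsample_bytree": 0.8,
    "objective": "binary:logistic",
    "scale_pos_weight": 1,
    "seed": 27,
}


def build_classifier(params=None):
    """Build an XGBClassifier from the settings above.

    Assumption: the scikit-learn wrapper is used; the 'seed' value is
    passed as 'random_state', the wrapper's name for the random seed.
    """
    from xgboost import XGBClassifier  # imported lazily; needs xgboost installed

    kwargs = dict(XGB_PARAMS if params is None else params)
    kwargs["random_state"] = kwargs.pop("seed")
    return XGBClassifier(**kwargs)
```

Calling build_classifier() returns an unfitted model. Fitting it would additionally require a feature matrix and binary importance labels, which are not described in this section.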

Acknowledgement

This work is supported by the National Key Research & Development Program (2016YFB1000503).

Author information

Corresponding author

Correspondence to Chunming Hu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Jiang, K., Hu, C., Sun, J., Shen, Q., Jiang, X. (2019). Email Importance Evaluation in Mailing List Discussions. In: Hacid, H., Sheng, Q., Yoshida, T., Sarkheyli, A., Zhou, R. (eds) Data Quality and Trust in Big Data. QUAT 2018. Lecture Notes in Computer Science, vol 11235. Springer, Cham. https://doi.org/10.1007/978-3-030-19143-6_3

  • DOI: https://doi.org/10.1007/978-3-030-19143-6_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-19142-9

  • Online ISBN: 978-3-030-19143-6

  • eBook Packages: Computer Science, Computer Science (R0)
