Multitask Learning for Sparse Failure Prediction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11439)

Abstract

Sparsity occurs inherently in many real-world datasets. It induces an imbalance in the data, which adversely affects machine learning and reduces predictive performance. Previously, domain experts overcame sparsity by making strong assumptions about the model parameters based on their experience, but such assumptions are subjective. In contrast, we propose a multi-task learning solution that automatically learns model parameters from a common latent structure in data from related domains. Although related, these datasets commonly have overlapping but dissimilar feature spaces and therefore cannot simply be combined into a single dataset. Our proposed model, a hierarchical Dirichlet process mixture of hierarchical beta process (HDP-HBP), uses a hierarchical Dirichlet process to learn tasks that share a common parameter of the failure prediction model. The model uses recorded failure history to predict failures in a water supply network. Multi-task learning draws additional information from the failure records of water supply networks managed by other utility companies to improve prediction in one network. We achieve higher accuracy on sparse predictions than previous state-of-the-art models and demonstrate that the model can be used in risk management to proactively repair critical infrastructure.
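
To make the idea of borrowing information across networks concrete, the following is a minimal sketch of beta-Bernoulli shrinkage in Python. It is not the HDP-HBP model described in the abstract; it is a much simpler stand-in that shows how a prior built from other networks' failure records can stabilise an estimate for a sparse group. The function names, the `strength` parameter, and all numbers are hypothetical and chosen purely for illustration.

```python
# Illustrative only: a beta-Bernoulli shrinkage sketch, NOT the HDP-HBP model.
# All names and numbers below are hypothetical.

def pooled_prior(counts, strength=10.0):
    """Build a Beta(a, b) prior from failures pooled over related networks.

    counts: list of (failures, pipe_years) tuples, one per network.
    strength: assumed prior weight, expressed in pseudo-observations.
    """
    total_failures = sum(f for f, _ in counts)
    total_exposure = sum(n for _, n in counts)
    mean = total_failures / total_exposure
    return mean * strength, (1.0 - mean) * strength


def posterior_mean(failures, exposure, a, b):
    """Posterior mean failure probability under a Beta(a, b) prior."""
    return (a + failures) / (a + b + exposure)


# Synthetic records: (failures, pipe-years observed) for three other utilities.
other_networks = [(12, 4000), (30, 9000), (8, 3000)]
a, b = pooled_prior(other_networks)

# A sparse target group: 0 failures in 50 pipe-years.
naive = 0 / 50
shrunk = posterior_mean(0, 50, a, b)
print(f"naive estimate:  {naive:.5f}")
print(f"shrunk estimate: {shrunk:.5f}")
```

With no observed failures, the naive estimate is exactly zero, whereas the shrunk estimate stays positive and below the pooled rate. The sketch assumes every network shares a single prior; the abstract's model instead learns which tasks share parameters.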

Author information

Corresponding author

Correspondence to Simon Luo.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Luo, S. et al. (2019). Multitask Learning for Sparse Failure Prediction. In: Yang, Q., Zhou, ZH., Gong, Z., Zhang, ML., Huang, SJ. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2019. Lecture Notes in Computer Science, vol 11439. Springer, Cham. https://doi.org/10.1007/978-3-030-16148-4_1

  • DOI: https://doi.org/10.1007/978-3-030-16148-4_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-16147-7

  • Online ISBN: 978-3-030-16148-4

  • eBook Packages: Computer Science, Computer Science (R0)
