On Applying Probabilistic Logic Programming to Breast Cancer Data

  • Joana Côrte-Real
  • Inês Dutra
  • Ricardo Rocha
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10759)


Medical data is an interesting subject for relational data mining because of the complex interactions that exist between different entities. Furthermore, the ambiguity of medical imaging makes interpretation complex and error-prone, and thus especially amenable to improvement through automated decision support. Probabilistic Inductive Logic Programming (PILP) is well suited to this task, since it makes it possible to combine the relational nature of this field with the ambiguity inherent in human interpretation of medical imaging. This work presents a PILP setting for breast cancer data, in which several clinical and demographic variables were collected retrospectively, and new probabilistic variables and rules reflecting domain knowledge were introduced. A PILP predictive model was built automatically from this data, and experiments show that it can not only match the predictions of a team of experts in the area, but also consistently reduce the error rate of malignancy prediction when compared to non-relational techniques.



This work was partially funded by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, and through the European Regional Development Fund (ERDF) as part of project NanoSTIMA (NORTE-01-0145-FEDER-000016). Joana Côrte-Real was funded by the FCT grant SFRH/BD/52235/2013. The authors would like to thank Dr. Elizabeth Burnside for making the dataset used in this paper available to us.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Faculty of Sciences and CRACS & INESC TEC, University of Porto, Porto, Portugal
