Journal of Intelligent Information Systems, Volume 36, Issue 3, pp 253–281

A new co-training-style random forest for computer aided diagnosis

  • Chao Deng
  • Maozu Guo


Abstract

Machine learning techniques used in computer-aided diagnosis (CAD) systems learn a hypothesis that helps medical experts make future diagnoses. Learning a well-performing hypothesis requires a large number of expert-diagnosed examples, which places a heavy burden on the experts. By exploiting large amounts of undiagnosed examples together with the power of ensemble learning, the co-training-style random forest (Co-Forest) relieves this burden while still producing well-performing hypotheses. However, Co-Forest may suffer from a problem common to other co-training-style algorithms: mislabeled examples may accumulate during the training process, because the limited number of originally-labeled examples usually yields component classifiers that lack both diversity and accuracy. In this paper, a new Co-Forest algorithm named Co-Forest with Adaptive Data Editing (ADE-Co-Forest) is proposed. It not only applies a specific data-editing technique to identify and discard possibly mislabeled examples throughout the co-labeling iterations, but also employs an adaptive strategy that decides, case by case, whether to trigger the editing operation. The adaptive strategy combines five preconditional theorems, each of which guarantees an iterative reduction of classification error and an increase in the size of the new training set under PAC learning theory. Experiments on UCI datasets and an application to the detection of small pulmonary nodules in chest CT images show that ADE-Co-Forest enhances the performance of a learned hypothesis more effectively than both Co-Forest and DE-Co-Forest (Co-Forest with Data Editing but without the adaptive strategy).


Keywords: Semi-supervised learning · Co-training · Co-Forest · Adaptive data editing · PAC theory · Pulmonary nodule detection · Computer-aided diagnosis



This work is supported by the National Natural Science Foundation of China under Grant Nos. 60702033, 60772076 and 2007307000189; the National High-Tech Research and Development Plan of China under Grant No. 2007AA01Z171; the Heilongjiang Science Foundation Key Project under Grant No. ZJG0705; and the Science Foundation for Distinguished Young Scholars of Heilongjiang Province, China, under Grant No. JC200611. The authors thank partners from the 2nd Affiliated Hospital of Harbin Medical University for collecting and labeling the CT images.



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
  2. China Mobile Research Institute, Beijing, China
