
Semi-Supervised Approach to Phase Identification from Combinatorial Sample Diffraction Patterns


Manual attribution of crystallographic phases from high-throughput x-ray diffraction studies is an arduous task and a rate-limiting step in the high-throughput exploration of new materials. Here, we demonstrate a semi-supervised machine learning technique, SS-AutoPhase, which uses a two-step approach to automatically identify phases from diffraction data. First, clustering analysis automatically selects a representative subset of samples for human analysis. Second, an AdaBoost classifier uses the labeled samples to identify the presence of the different phases in the diffraction data. SS-AutoPhase was used to identify the metallographic phases in 278 diffraction patterns from a FeGaPd composition spread sample. The accuracy of SS-AutoPhase exceeded 82.6% for every phase when 15% of the diffraction patterns were used for training. The phase diagram predicted by SS-AutoPhase showed excellent agreement with human expert analysis. Furthermore, SS-AutoPhase detected and correctly identified a previously unreported phase.
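
The two-step workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which clustering method is used, so plain k-means with a deterministic farthest-point seeding stands in for it; the classifier is a hand-written discrete AdaBoost over decision stumps; and the three-channel toy "patterns", the function names, and all parameter values are hypothetical.

```python
import math
import random


def farthest_point_init(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point farthest from every centroid chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iters=25):
    """Plain Lloyd iterations; returns the final centroids."""
    centroids = farthest_point_init(points, k)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def pick_representatives(points, k):
    """Step 1: indices of the patterns nearest each centroid (to be labeled)."""
    nearest = {min(range(len(points)), key=lambda i: math.dist(points[i], c))
               for c in kmeans(points, k)}
    return sorted(nearest)


def stump_predict(x, j, thr, sign):
    """Decision stump: vote `sign` if feature j exceeds thr, else -sign."""
    return sign if x[j] > thr else -sign


def fit_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        values = sorted({x[j] for x in X})
        # Candidate thresholds are midpoints between neighboring values.
        cuts = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
        for thr in cuts:
            for sign in (1, -1):
                err = sum(wi for wi, x, yi in zip(w, X, y)
                          if stump_predict(x, j, thr, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def adaboost_fit(X, y, rounds=10):
    """Step 2: discrete AdaBoost with labels in {-1, +1}."""
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(rounds):
        err, j, thr, sign = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, sign))
        # Up-weight misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(x, j, thr, sign))
             for wi, x, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, x):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(a * stump_predict(x, j, t, s) for a, j, t, s in model)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): 40 three-channel "patterns"; the phase is present
# (+1) when channel 0 carries the peak and absent (-1) when channel 2 does.
rng = random.Random(0)
patterns, truth = [], []
for i in range(40):
    peak = 1.0 if i % 2 == 0 else 0.0
    patterns.append([peak + rng.gauss(0, 0.05),
                     0.5 + rng.gauss(0, 0.05),
                     (1.0 - peak) + rng.gauss(0, 0.05)])
    truth.append(1 if peak else -1)

labeled = pick_representatives(patterns, k=6)  # at most 6 of 40 (15%) get labels
model = adaboost_fit([patterns[i] for i in labeled], [truth[i] for i in labeled])
predicted = [adaboost_predict(model, p) for p in patterns]
accuracy = sum(p == t for p, t in zip(predicted, truth)) / len(truth)
print(f"labeled {len(labeled)} patterns, accuracy on all 40: {accuracy:.2f}")
```

In this sketch the "human expert" is simulated by looking up the true label for each selected pattern. With several phases, one such binary classifier per phase would be a natural extension, though that is also an assumption rather than something the abstract states.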





The work is funded in part by the Advanced Research Projects Agency-Energy (ARPA-E), U.S. Department of Energy, under Award DE-AR0000492. We would like to acknowledge the support of the South Carolina SmartState Center for Strategic Approaches to the Generation of Electricity (SAGE).

Author information

Correspondence to Jason R. Hattrick-Simpers.



About this article


Cite this article

Bunn, J.K., Hu, J. & Hattrick-Simpers, J.R. Semi-Supervised Approach to Phase Identification from Combinatorial Sample Diffraction Patterns. JOM 68, 2116–2125 (2016).



  • Hierarchical Cluster Analysis
  • Human Expert
  • Dynamic Time Warping
  • Fe3Si
  • Training Sample Size