Abstract
Data mining has revolutionized sectors as diverse as pharmaceutical drug discovery, finance, medicine, and marketing, and has the potential to similarly advance materials science. In this paper, we describe advances in simulation-based materials databases, open-source software tools, and machine learning algorithms that are converging to create new opportunities for materials informatics. We discuss the data mining techniques of exploratory data analysis, clustering, linear models, kernel ridge regression, tree-based regression, and recommendation engines. We present these techniques in the context of several materials application areas, including compound prediction, Li-ion battery design, piezoelectric materials, photocatalysts, and thermoelectric materials. Finally, we demonstrate how new data and tools are making data mining easier and more accessible than ever by presenting a new analysis that learns trends in the valence and conduction band character of over 2500 compounds in the Materials Project database.
ACKNOWLEDGMENTS
This work was intellectually led by the Materials Project (DOE Basic Energy Sciences Grant No. EDCBEE). Work at the Lawrence Berkeley National Laboratory was supported by the U.S. Department of Energy Office of Science, Office of Basic Energy Sciences Department under Contract No. DE-AC02-05CH11231. GH acknowledges financial support from the European Union Marie Curie Career Integration (CIG) grant HT4TCOs PCIG11-GA-2012-321988. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility.
Supplementary Material
43578_2016_31080977_MOESM1_ESM.docx
Supplementary Information for “New Opportunities for Materials Informatics: Resources and Techniques for Uncovering Hidden Relationships” (approximately 177 KB)
Cite this article
Jain, A., Hautier, G., Ong, S.P. et al. New opportunities for materials informatics: Resources and data mining techniques for uncovering hidden relationships. Journal of Materials Research 31, 977–994 (2016). https://doi.org/10.1557/jmr.2016.80
DOI: https://doi.org/10.1557/jmr.2016.80