Constraint Reasoning and Kernel Clustering for Pattern Decomposition with Scaling

  • Ronan LeBras
  • Theodoros Damoulas
  • John M. Gregoire
  • Ashish Sabharwal
  • Carla P. Gomes
  • R. Bruce van Dover
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6876)


Motivated by an important and challenging task encountered in material discovery, we consider the problem of finding K basis patterns of numbers that jointly compose N observed patterns while enforcing additional spatial and scaling constraints. We propose a Constraint Programming (CP) model which captures the exact problem structure yet fails to scale in the presence of noisy data about the patterns. We alleviate this issue by employing Machine Learning (ML) techniques, namely kernel methods and clustering, to decompose the problem into smaller ones based on a global data-driven view, and then stitch the partial solutions together using a global CP model. Combining the complementary strengths of CP and ML techniques yields a more accurate and scalable method than the few found in the literature for this complex problem.
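
The data-driven grouping step described above can be illustrated with a minimal sketch. It assumes each observed pattern is a short numeric sequence, uses the classic dynamic time warping (DTW) recurrence as a dissimilarity, maps it to a similarity with an exponential, and groups patterns whose similarity exceeds a threshold. The function names, the `gamma` and `threshold` parameters, and the connected-components grouping are illustrative assumptions, not the authors' kernel, clustering method, or CP model.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def dtw_similarity(a, b, gamma=1.0):
    """Map a DTW distance to a similarity in (0, 1].

    Assumption: exp(-gamma * distance) is used only as a convenient
    similarity; it is not guaranteed to be a positive semidefinite kernel.
    """
    return math.exp(-gamma * dtw_distance(a, b))


def cluster_by_similarity(patterns, threshold=0.5, gamma=1.0):
    """Label patterns by connected components of the similarity graph.

    Two patterns are linked when their similarity is >= threshold.
    This is a simple stand-in for a kernel clustering step.
    """
    n = len(patterns)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dtw_similarity(patterns[i], patterns[j], gamma) >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


if __name__ == "__main__":
    # Two patterns with the same peak at shifted positions, plus one unrelated pattern.
    observed = [[0, 1, 0, 0], [0, 0, 1, 0], [5, 5, 5, 5]]
    print(cluster_by_similarity(observed))  # [0, 0, 1]
```

In this toy input the first two patterns differ only by a shifted peak, so DTW aligns them at zero cost and they land in the same group; each group could then be handed to a smaller subproblem before the partial solutions are combined.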


Keywords: Constraint Programming · Dynamic Time Warping · Basis Pattern · Pattern Decomposition · Missing Element
These keywords were added by machine and not by the authors.




Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Ronan LeBras (1)
  • Theodoros Damoulas (1)
  • John M. Gregoire (2)
  • Ashish Sabharwal (3)
  • Carla P. Gomes (1)
  • R. Bruce van Dover (4)

  1. Dept. of Computer Science, Cornell University, Ithaca, USA
  2. School of Engr. and Applied Sciences, Harvard University, Cambridge, USA
  3. IBM Watson Research Center, Yorktown Heights, USA
  4. Dept. of Materials Science and Engr., Cornell University, Ithaca, USA
