Discovering Rules That Govern Monotone Phenomena

Part of the book series: Massive Computing (MACO, volume 6)

Abstract

Unlocking the mystery of natural phenomena is a universal objective in scientific research. The rules governing a phenomenon can most often be learned by observing it under a sufficiently large number of conditions that are sufficiently high in resolution. The general knowledge discovery process is not always easy or efficient, and even if knowledge is produced it may be hard to understand, interpret, validate, remember, and use. Monotonicity is a pervasive property in nature: it applies when each predictor variable has a nonnegative effect on the phenomenon under study. Due to the monotonicity property, observing the phenomenon under specifically selected conditions may increase the accuracy and completeness of the knowledge at a faster rate than passive observation, which may not deliver the pieces relevant to the puzzle soon enough. This scenario can be thought of as learning by successively submitting queries to an oracle which responds with a Boolean value (phenomenon is present or absent). In practice, the oracle may take the shape of a human expert, or it may be the outcome of performing tasks such as running experiments or searching large databases. Our main goal is to pinpoint the queries that minimize the total number of queries used to completely reconstruct all of the underlying rules defined on a given finite set of observable conditions V = {0,1}^n. We summarize the optimal query selections in the simple form of selection criteria, which are near optimal and only take polynomial time (in the number of conditions) to compute. Extensive unbiased empirical results show that the proposed selection criterion approach is far superior to any of the existing methods. In fact, the average number of queries is reduced exponentially in the number of variables n and more than exponentially in the oracle’s error rate.
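The query-based setting described in the abstract can be illustrated with a minimal sketch. It assumes an error-free oracle and a simple balance heuristic for choosing the next query. The heuristic, the function names `leq` and `infer_monotone`, and the example oracle are all illustrative assumptions, not the chapter's selection criteria, and the sketch enumerates all of {0,1}^n, so it is only practical for small n.

```python
from itertools import product


def leq(a, b):
    """Componentwise partial order on {0,1}^n: True if a <= b."""
    return all(x <= y for x, y in zip(a, b))


def infer_monotone(n, oracle):
    """Reconstruct a monotone Boolean function on {0,1}^n by querying `oracle`.

    Assumes the oracle is error-free and the target function is monotone.
    Returns (values, queries): the inferred value of every vector and the
    number of oracle calls used.
    """
    unknown = set(product((0, 1), repeat=n))
    values = {}
    queries = 0

    def score(v):
        # Worst-case number of unknown vectors settled by querying v:
        # an answer of 1 settles everything above v, 0 everything below v.
        up = sum(1 for w in unknown if leq(v, w))
        down = sum(1 for w in unknown if leq(w, v))
        return min(up, down)

    while unknown:
        # Illustrative heuristic (an assumption): query the vector with the
        # best worst-case payoff; sorting makes tie-breaking deterministic.
        v = max(sorted(unknown), key=score)
        answer = bool(oracle(v))
        queries += 1
        if answer:
            settled = {w for w in unknown if leq(v, w)}
        else:
            settled = {w for w in unknown if leq(w, v)}
        for w in settled:
            values[w] = answer  # implied by monotonicity
        unknown -= settled
    return values, queries


# Hypothetical oracle: phenomenon present when x1 and (x2 or x3) hold.
def example_oracle(v):
    return bool(v[0] and (v[1] or v[2]))


values, queries = infer_monotone(3, example_oracle)
print(queries, "queries used to classify", len(values), "vectors")
```

Each answer classifies the queried vector and, by monotonicity, every still-unclassified vector above it (answer 1) or below it (answer 0), which is why fewer queries than vectors can suffice.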

Triantaphyllou, E. and G. Felici (Eds.), Data Mining and Knowledge Discovery Approaches Based on Rule Induction Techniques, Massive Computing Series, Springer, Heidelberg, Germany, pp. 149–192, 2006.




Copyright information

© 2006 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Torvik, V.I., Triantaphyllou, E. (2006). Discovering Rules That Govern Monotone Phenomena. In: Triantaphyllou, E., Felici, G. (eds) Data Mining and Knowledge Discovery Approaches Based on Rule Induction Techniques. Massive Computing, vol 6. Springer, Boston, MA. https://doi.org/10.1007/0-387-34296-6_4

  • DOI: https://doi.org/10.1007/0-387-34296-6_4

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-34294-8

  • Online ISBN: 978-0-387-34296-2

  • eBook Packages: Computer Science, Computer Science (R0)
