Tree Mining

Encyclopedia of Machine Learning

Definition

Tree mining is an instance of constraint-based pattern mining. It studies the discovery of tree patterns in data that is represented as a tree structure or as a set of tree structures. Minimum frequency is the most studied constraint.
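To make the minimum frequency constraint concrete, the following is a minimal sketch, not the method of any particular tree miner. It assumes rooted, labeled, ordered trees encoded as `(label, children)` tuples, counts a pattern once per database tree in which it occurs, and uses one specific notion of occurrence (parent-child edges preserved, sibling order preserved, siblings may be skipped). The names `matches_at`, `occurs_in`, `support`, and `is_frequent`, as well as the toy parse trees, are illustrative assumptions and do not come from the entry.

```python
from typing import List, Tuple

# A tree is (label, [child subtrees]); a leaf has an empty child list.
Tree = Tuple[str, list]


def matches_at(pattern: Tree, node: Tree) -> bool:
    """True if the pattern root maps to `node`, each pattern child maps to a
    distinct child of `node`, and left-to-right sibling order is preserved."""
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    i = 0  # index of the next pattern child still to be matched
    for child in n_children:
        # Greedy leftmost matching is sufficient for ordered subsequences.
        if i < len(p_children) and matches_at(p_children[i], child):
            i += 1
    return i == len(p_children)


def occurs_in(pattern: Tree, tree: Tree) -> bool:
    """True if the pattern matches at the root or at any descendant."""
    if matches_at(pattern, tree):
        return True
    return any(occurs_in(pattern, child) for child in tree[1])


def support(pattern: Tree, database: List[Tree]) -> int:
    """Number of database trees that contain the pattern at least once."""
    return sum(1 for tree in database if occurs_in(pattern, tree))


def is_frequent(pattern: Tree, database: List[Tree], min_support: int) -> bool:
    """The minimum frequency constraint: support must reach the threshold."""
    return support(pattern, database) >= min_support


# Toy database of parse-tree-like structures (hypothetical data).
database: List[Tree] = [
    ("S", [("NP", [("DT", []), ("NN", [])]),
           ("VP", [("VB", []), ("NP", [("NN", [])])])]),
    ("S", [("NP", [("NN", [])]), ("VP", [("VB", [])])]),
    ("NP", [("DT", []), ("NN", [])]),
]

noun_phrase: Tree = ("NP", [("NN", [])])
sentence: Tree = ("S", [("NP", []), ("VP", [("VB", [])])])

print(support(noun_phrase, database))
print(is_frequent(sentence, database, min_support=2))
```

A full miner would also enumerate candidate patterns rather than test given ones; this sketch covers only the constraint check. Other tree types (unordered, free) and other occurrence notions would need a different `matches_at`.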

Motivation and Background

Tree mining is motivated by the availability of many types of data that can be represented as tree structures. Tree types vary widely: ordered trees, unordered trees, rooted trees, unrooted (free) trees, labeled trees, unlabeled trees, and binary trees, each with its own application areas. One example is treebanks, which store sentences annotated with parse trees. In such data, it is of interest to find not only commonly occurring sets of words (for which frequent itemset miners could be used), but also commonly occurring parses of these words. Tree miners aim to find patterns in this structured information. The patterns can be interesting in their own right,...

Copyright information

© 2011 Springer Science+Business Media, LLC

About this entry

Cite this entry

Nijssen, S. (2011). Tree Mining. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_851
