Evaluate, Reorganize and Share: An Approach to Dynamically Organize Digital Hierarchies

Senra, Rodrigo Dias Arruda; Medeiros, Claudia Bauzer

doi:10.1007/s13740-014-0035-7

Original Article
Published: 06 March 2014

Volume 3, pages 225–236, (2014)
Journal on Data Semantics

Rodrigo Dias Arruda Senra¹ &
Claudia Bauzer Medeiros¹

Abstract

We are overwhelmed and overloaded with the data deluge brought by the digital age. Hierarchies are pervasive cognitive patterns that allow us to reorganize data and reduce the dimensionality of the information space to manageable levels (e.g., filesystems and navigational menus). In spite of their widespread adoption, such hierarchies can be improved to cope with the present needs of data sharing and reuse. First, we seldom use mechanisms to evaluate how well they partition the information space. Second, we build static and content-driven hierarchies instead of dynamic and context-driven (i.e., task-driven) ones. Third, we use ad hoc and implicit hierarchization criteria, whereas they should be explicit and shareable. This paper discusses the problems related to the construction of hierarchies, and presents a conceptual framework to turn them into reconfigurable and shareable artifacts. Moreover, it explores how dynamically reconfigurable hierarchies can better cope with the multi-faceted nature of content, illustrating these principles through a tool that validates our proposal.

Acknowledgments

This work was supported by the Microsoft Research FAPESP Virtual Institute (NavScales project), the Center for Computational Engineering and Sciences (FAPESP/CEPID 2013/08293-7), CNPq (MuZOO Project and PRONEX-FAPESP), INCT in Web Science (CNPq 557.128/2009-9) and CAPES. We also thank all LIS members from IC-Unicamp for their comments and suggestions. Last but not least, we thank the JODS reviewers for their valuable suggestions.

Author information

Authors and Affiliations

Institute of Computing, University of Campinas (UNICAMP), Campinas, SP, Brazil
Rodrigo Dias Arruda Senra & Claudia Bauzer Medeiros

Corresponding author

Correspondence to Rodrigo Dias Arruda Senra.

About this article

Cite this article

Senra, R.D.A., Medeiros, C.B. Evaluate, Reorganize and Share: An Approach to Dynamically Organize Digital Hierarchies. J Data Semant 3, 225–236 (2014). https://doi.org/10.1007/s13740-014-0035-7

Received: 25 October 2012
Revised: 08 September 2013
Accepted: 29 January 2014
Published: 06 March 2014
Issue Date: December 2014
DOI: https://doi.org/10.1007/s13740-014-0035-7
