Adopting XML for Large-Scale Information

Communicating with XML

Abstract

This book has presented many ways to encode information in XML and the purposes for doing so. In this concluding chapter we consider problems related to managing XML information assets and the methods available to address them. Approaches for persistently storing XML data can be divided into file storage and database storage, and the research community has been especially active in designing new solutions for XML databases. Adopting XML, however, often requires massive migration of legacy data into the XML format; examples of migration cases are given. While describing the problems related to adopting XML, we also give examples of the kinds of data for which XML is not suitable. As a case study we consider the large-scale adoption of XML in the public sector.

Author information

Correspondence to Airi Salminen.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Salminen, A., Tompa, F. (2011). Adopting XML for Large-Scale Information. In: Communicating with XML. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-0992-2_8

  • DOI: https://doi.org/10.1007/978-1-4614-0992-2_8

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4614-0991-5

  • Online ISBN: 978-1-4614-0992-2

  • eBook Packages: Computer Science, Computer Science (R0)
