ISMIS 2008: Foundations of Intelligent Systems, pp. 417-423
Development of the XML Digital Library from the Parliament of Andalucía for Intelligent Structured Retrieval
Abstract
This paper describes the development of an XML digital library, in Spanish, built from official documents published by the Parliament of Andalucía. These documents record discussions of important matters affecting citizens of the southern Spanish region of Andalucía. The original documents, which follow a very well-defined structure, were published in PDF format, so the complete conversion process is explained in detail. The main reason for this format change is to let users of the regional chamber’s website take full advantage of structured Information Retrieval.