Linked Open Piracy: A Story about e-Science, Linked Data, and Statistics
There is an abundance of semi-structured reports on events being written and made available on the World Wide Web on a daily basis. These reports are primarily meant for human use. A recent movement is the addition of RDF metadata to make automatic processing by computers easier. A fine example of this movement is the open government data initiative which, by representing data from spreadsheets and textual reports in RDF, strives to speed up the creation of geographical mashups and visual analytic applications. In this paper, we present a newly linked dataset and the method we used to automatically translate semi-structured reports on the Web to an RDF event model. We demonstrate how the semantic representation layer makes it possible to easily analyze and visualize the aggregated reports to answer domain questions through a SPARQL client for the R statistical programming language. We showcase our method on piracy attack reports issued by the International Chamber of Commerce (ICC-CCS). Our pipeline includes conversion of the reports to RDF, linking their parts to external resources from the linked open data cloud and exposing them to the Web.
Keywords: Information extraction, Metadata enrichment, Linked data
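The pipeline summarized in the abstract (translate semi-structured reports to an RDF event model, then aggregate them to answer domain questions) can be illustrated with a minimal sketch. This is not the authors' implementation: the `http://example.org/piracy/` namespace, the field names `attackType`, `date`, and `region`, and the sample records are all hypothetical, and a plain Python counter stands in for the SPARQL aggregation that the paper performs from R.

```python
from collections import Counter

# Hypothetical namespace and property names, chosen for illustration only.
EX = "http://example.org/piracy/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FIELDS = ("attackType", "date", "region")


def report_to_triples(report):
    """Translate one semi-structured report (a dict) into (s, p, o) triples."""
    event = EX + "event/" + report["id"]
    triples = [(event, RDF_TYPE, EX + "Event")]
    for field in FIELDS:
        if field in report:  # reports are semi-structured: fields may be absent
            triples.append((event, EX + field, report[field]))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples; objects starting with http:// are IRIs."""
    lines = []
    for s, p, o in triples:
        if o.startswith("http://"):
            obj = "<" + o + ">"
        else:
            # Escape backslashes and quotes inside plain literals.
            obj = '"' + o.replace("\\", "\\\\").replace('"', '\\"') + '"'
        lines.append("<" + s + "> <" + p + "> " + obj + " .")
    return "\n".join(lines)


def count_by(triples, predicate):
    """Count events per object value of one predicate (a GROUP BY analogue)."""
    return Counter(o for _, p, o in triples if p == predicate)


# Made-up sample records, not taken from the ICC-CCS reports.
reports = [
    {"id": "1", "attackType": "hijacked", "region": "Gulf of Aden"},
    {"id": "2", "attackType": "boarded", "region": "Gulf of Aden"},
    {"id": "3", "attackType": "hijacked"},
]
triples = [t for r in reports for t in report_to_triples(r)]
counts = count_by(triples, EX + "attackType")
print(to_ntriples(triples))
print(dict(counts))
```

Keeping the translation step separate from the aggregation step mirrors the division the abstract describes: once the reports exist as triples, any analysis can be phrased against the event model rather than against the original report layout.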
This work has been carried out as a part of the Poseidon project and the Agora project. Work in the Poseidon project was done in cooperation with Thales Nederland, under the responsibilities of the Embedded Systems Institute (ESI). The Poseidon project is partially supported by the Dutch Ministry of Economic Affairs under the BSIK03021 program. The Agora project is funded by NWO in the CATCH programme, grant 640.004.801. We would like to thank Davide Ceolin, Juan Manuel Coleto, and Vincent Osinga for their significant contributions. We thank the ICC-CCS IMB and the NGA for providing the open piracy reports.
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.