Tabular Data Cleaning and Linked Data Generation with Grafterizer
Over the past several years the amount of published open data has increased significantly. Most of it is tabular data, which requires powerful and flexible approaches to data cleaning and preparation before it can be converted into Linked Data. This paper introduces Grafterizer, a software framework developed to support data workers and data developers in converting raw tabular data into Linked Data. Its main components include Grafter, a software library and DSL for data cleaning and RDF-ization, and Grafterizer, a user interface for the interactive specification of data transformations together with a back-end for managing and executing them. The proposed demonstration will focus on Grafterizer's features for data cleaning and RDF-ization in a scenario using data about the risk of failure of transport infrastructure components due to natural hazards.
Keywords: Open data, Linked data, Tabular data cleaning and preparation, Data transformation
This work was partly funded by the European Commission within the following research projects: DaPaaS (FP7 610988), SmartOpenData (FP7 603824), InfraRisk (FP7 603960), and proDataMarket (H2020 644497).