Knowledge and Information Systems, Volume 53, Issue 3, pp 699–722

CITIESData: a smart city data management framework

  • Xiufeng Liu
  • Alfred Heller
  • Per Sieverts Nielsen
Regular Paper


Smart city data come from heterogeneous sources, including various types of Internet of Things devices, such as traffic, weather, pollution, and noise sensors and portable devices. The data exhibit diverse quality issues and contain different types of sensitive information, which makes processing and publishing them challenging. In this paper, we propose a framework to streamline smart city data management, including data collection, cleansing, anonymization, and publishing. The paper classifies smart city data into sensitive, quasi-sensitive, and open/public levels and then suggests different strategies to process and publish the data in each category. The paper evaluates the framework using a real-world smart city data set, and the results verify its effectiveness and efficiency. The framework can serve as a generic solution for managing smart city data.
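To make the three-level classification concrete, the following is a minimal sketch of how records might be routed to a different publishing strategy per sensitivity level. It is not the framework's implementation: the field names, the `FIELD_LEVELS` mapping, and the `generalize` and `prepare_for_publishing` helpers are all hypothetical, and the suppress/generalize/pass-through strategies are simple stand-ins chosen for illustration.

```python
from enum import Enum


class Sensitivity(Enum):
    SENSITIVE = "sensitive"
    QUASI_SENSITIVE = "quasi-sensitive"
    PUBLIC = "public"


# Hypothetical field-to-level mapping (an assumption for this sketch only).
FIELD_LEVELS = {
    "resident_name": Sensitivity.SENSITIVE,
    "postcode": Sensitivity.QUASI_SENSITIVE,
    "age": Sensitivity.QUASI_SENSITIVE,
    "temperature_c": Sensitivity.PUBLIC,
}


def generalize(field, value):
    """Coarsen a quasi-sensitive value (illustrative rules only)."""
    if field == "age":
        low = (int(value) // 10) * 10
        return f"{low}-{low + 9}"  # replace exact age with a 10-year band
    if field == "postcode":
        text = str(value)
        return text[:2] + "*" * (len(text) - 2)  # keep only the first two characters
    return value


def prepare_for_publishing(record):
    """Suppress sensitive fields, generalize quasi-sensitive ones, keep public ones."""
    published = {}
    for field, value in record.items():
        # Fields without a known level default to the most restrictive one.
        level = FIELD_LEVELS.get(field, Sensitivity.SENSITIVE)
        if level is Sensitivity.SENSITIVE:
            continue
        if level is Sensitivity.QUASI_SENSITIVE:
            value = generalize(field, value)
        published[field] = value
    return published
```

Calling `prepare_for_publishing` on a record drops the sensitive fields, coarsens the quasi-sensitive ones, and leaves public measurements unchanged. Defaulting unclassified fields to the sensitive level is a conservative design choice in this sketch, so that a newly added field is never published by accident.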


Data framework, Smart cities, Data privacy, Data quality, Data sensitivity



This research was supported by the CITIES Project (No. 1035-00027B) funded by Innovation Fund Denmark. The infrastructure components are partly supported by the Danish Electronic Infrastructure (DeIC) through the project “Science Cloud for Cities.”



Copyright information

© Springer-Verlag London 2017

Authors and Affiliations

  • Xiufeng Liu (1)
  • Alfred Heller (2)
  • Per Sieverts Nielsen (3)
  1. Danmarks Tekniske Universitet, Lyngby, Denmark
  2. Danmarks Tekniske Universitet, Lyngby, Denmark
  3. Danmarks Tekniske Universitet, Lyngby, Denmark
