Frontiers of Computer Science in China, Volume 4, Issue 3, pp 376–385

Development of foundation models for Internet of Things

Research Article


With the advent of the Internet of Things (IoT), which offers the capability to identify and connect physical objects worldwide into a unified system, modeling and processing IoT data has become increasingly important. IoT data is large in volume, noisy, heterogeneous, and inconsistent, and it arrives at the system in a streaming fashion. Because of these characteristics, using IoT data in practical applications raises many fundamental challenges, such as data modeling and processing. This paper proposes the infrastructure for an IoT prototype system that aims to develop foundation models for IoT data. We illustrate the major modules of the IoT prototype and their functionalities, and present our vision of the key techniques for tackling the critical problems in each module.


Keywords: Internet of Things (IoT), IoT pre-processing, IoT query processing, IoT event detection





Copyright information

© Higher Education Press and Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  1. Department of Computer Science Engineering, Hong Kong University of Science and Technology, Hong Kong, China
